Qwen-7B-Chat-GGUF

This repository contains the Qwen/Qwen-7B-Chat model converted to GGUF and quantized.

์ƒ์„ฑ ์ •๋ณด

  • ์›๋ณธ ๋ชจ๋ธ: Qwen/Qwen-7B-Chat
  • ์›๋ณธ revision: main
  • ์ƒค๋“œ ๋ณ‘ํ•ฉ ์‚ฌ์šฉ: False
  • ๊ธฐ๋ณธ GGUF ํƒ€์ž…: f16
  • ์–‘์žํ™” ํƒ€์ž…: Q2_K, Q3_K, Q4_K, Q5_K, Q6_K, Q8_0

ํŒŒ์ผ ๋ชฉ๋ก

  • Qwen-7B-Chat-f16.gguf
  • Qwen-7B-Chat-Q2_K.gguf
  • Qwen-7B-Chat-Q3_K.gguf
  • Qwen-7B-Chat-Q4_K.gguf
  • Qwen-7B-Chat-Q5_K.gguf
  • Qwen-7B-Chat-Q6_K.gguf
  • Qwen-7B-Chat-Q8_0.gguf
  • quantizer-manifest.json
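
As a rough guide to choosing among the files above, each file's size can be estimated from its quantization type's average bits per weight. The bits-per-weight values below are ballpark figures commonly cited for llama.cpp quant formats, not measurements of these files:

```python
# Rough GGUF size estimate: n_params * bits-per-weight / 8.
# Bits-per-weight values are approximate averages for llama.cpp
# quant types (assumption), not exact figures for this repository.
BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q3_K": 3.4,
    "Q4_K": 4.5,
    "Q5_K": 5.5,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
    "f16": 16.0,
}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Approximate GGUF file size in GB for a given quant type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# Using the 8B parameter count reported for this model:
for quant in BITS_PER_WEIGHT:
    print(f"{quant:>5}: ~{approx_size_gb(8e9, quant):.1f} GB")
```

Smaller quants trade quality for memory; Q4_K and Q5_K are common middle-ground choices.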

Usage example

./llama-cli -m Qwen-7B-Chat-f16.gguf -p "์•ˆ๋…•ํ•˜์„ธ์š”"

Notes

  • ๋ณ€ํ™˜/์–‘์žํ™”๋Š” llama.cpp ๋„๊ตฌ ์ฒด์ธ์„ ์‚ฌ์šฉํ–ˆ์Šต๋‹ˆ๋‹ค.
  • ์ƒ์„ฑ ๊ณผ์ • ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋Š” quantizer-manifest.json ํŒŒ์ผ์— ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.
Model details

  • Model size: 8B params
  • Architecture: qwen