metadata
language:
- zh
- en
tags:
- glm
- chatglm
- thudm
ChatGLM2 6b int4 g32 quantized model
See K024/chatglm-q for more details.
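The "int4 g32" in the model name presumably refers to 4-bit integer weights quantized in groups of 32 values; this reading is an assumption based on the name alone. Under that assumption, the idea can be sketched roughly as below. The function names are hypothetical, and the actual scheme in K024/chatglm-q (symmetric or asymmetric, how codes are packed) may differ.

```python
import random

GROUP_SIZE = 32  # assumed meaning of "g32"
LEVELS = 15      # a 4-bit code takes integer values 0..15


def quantize_group(values):
    """Map one group of floats to 4-bit codes plus a per-group scale and offset."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for constant groups
    codes = [min(LEVELS, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Recover approximate floats from the codes of one group."""
    return [c * scale + lo for c in codes]


def quantize(weights, group_size=GROUP_SIZE):
    """Quantize a flat list of weights one group at a time."""
    return [quantize_group(weights[i:i + group_size])
            for i in range(0, len(weights), group_size)]


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(64)]
groups = quantize(weights)
restored = [v for g in groups for v in dequantize_group(*g)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

Each group stores its own scale and offset, so the rounding error of any weight is bounded by half of that group's scale. Smaller groups track local weight ranges more closely at the cost of storing more scales.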
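import torch
from chatglm_q.decoder import ChatGLMDecoder, chat_template

# Load the quantized model onto the GPU (this example assumes CUDA is available).
device = torch.device("cuda")
decoder = ChatGLMDecoder.from_pretrained("K024/chatglm2-6b-int4g32", device=device)

# Build a prompt from an empty history list and a question ("Who am I?").
prompt = chat_template([], "我是谁?")
for text in decoder.generate(prompt):
    print(text)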
Model weights are released under the same license as ChatGLM2-6b; see MODEL LICENSE.