huawei-csl/Qwen3-14B-4bit-ASINQ

@@ -1,7 +1,10 @@
 ---
 language:
 - en
 license: apache-2.0
 tags:
 - quantization
 - sinq
@@ -11,7 +14,6 @@ tags:
 - qwen
 - llm
 - compression
-base_model: Qwen/Qwen3-14B
 base_model_relation: quantized
 ---
@@ -48,7 +50,7 @@ To support the project please put a star ⭐ in the official [SINQ](https://gith
 ---
-# 🚀 Usage</span>
 ## Prerequisite
 Before running the quantization script, make sure the **SINQ** library is installed.
@@ -58,6 +60,7 @@ Installation instructions and setup details are available in the [SINQ official
 You can load and use the model with our wrapper based on the 🤗 Transformers library:
 ```python
 from transformers import AutoTokenizer
 from sinq.patch_model import AutoSINQHFModel
@@ -74,7 +77,6 @@ inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
 with torch.inference_mode():
     out_ids = sinq_model.generate(**inputs, max_new_tokens=32, do_sample=False)
 print(tokenizer.decode(out_ids[0], skip_special_tokens=True))
 ```
 <details>
@@ -83,6 +85,7 @@ print(tokenizer.decode(out_ids[0], skip_special_tokens=True))
 The quantized model was obtained using the **SINQ** quantization library, following the steps below:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from sinq.patch_model import AutoSINQHFModel
 from sinq.sinqlinear import BaseQuantizeConfig

 ---
+base_model: Qwen/Qwen3-14B
 language:
 - en
 license: apache-2.0
+pipeline_tag: text-generation
+library_name: sinq
 tags:
 - quantization
 - sinq
 - qwen
 - llm
 - compression
 base_model_relation: quantized
 ---
 ---
+# 🚀 Usage
 ## Prerequisite
 Before running the quantization script, make sure the **SINQ** library is installed.
 You can load and use the model with our wrapper based on the 🤗 Transformers library:
 ```python
+import torch
 from transformers import AutoTokenizer
 from sinq.patch_model import AutoSINQHFModel
 with torch.inference_mode():
     out_ids = sinq_model.generate(**inputs, max_new_tokens=32, do_sample=False)
 print(tokenizer.decode(out_ids[0], skip_special_tokens=True))
 ```
 <details>
 The quantized model was obtained using the **SINQ** quantization library, following the steps below:
 ```python
+import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from sinq.patch_model import AutoSINQHFModel
 from sinq.sinqlinear import BaseQuantizeConfig