Instructions for using ertghiu256/Qwen3.5-2b-ReMix with libraries and local apps.

Libraries

How to use ertghiu256/Qwen3.5-2b-ReMix with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="ertghiu256/Qwen3.5-2b-ReMix")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("ertghiu256/Qwen3.5-2b-ReMix")
model = AutoModelForImageTextToText.from_pretrained("ertghiu256/Qwen3.5-2b-ReMix")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use ertghiu256/Qwen3.5-2b-ReMix with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ertghiu256/Qwen3.5-2b-ReMix"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ertghiu256/Qwen3.5-2b-ReMix",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
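
The same OpenAI-compatible request can also be sent from Python. The sketch below uses only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and it assumes the server started above is listening on `localhost:8000`.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, image_url):
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=120):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    base_url="http://localhost:8000",
    model="ertghiu256/Qwen3.5-2b-ReMix",
    prompt="Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
# Uncomment once the server is running:
# print(send_chat_request(request))
```

For the SGLang server described further down, only the port in `base_url` would change (30000 instead of 8000).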

Use Docker

docker model run hf.co/ertghiu256/Qwen3.5-2b-ReMix

SGLang

How to use ertghiu256/Qwen3.5-2b-ReMix with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ertghiu256/Qwen3.5-2b-ReMix" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ertghiu256/Qwen3.5-2b-ReMix",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ertghiu256/Qwen3.5-2b-ReMix" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ertghiu256/Qwen3.5-2b-ReMix",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Unsloth Studio

How to use ertghiu256/Qwen3.5-2b-ReMix with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ertghiu256/Qwen3.5-2b-ReMix to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ertghiu256/Qwen3.5-2b-ReMix to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ertghiu256/Qwen3.5-2b-ReMix to start chatting

Load model with FastModel

# Install Unsloth first (run this in a shell, not in Python):
#   pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="ertghiu256/Qwen3.5-2b-ReMix",
    max_seq_length=2048,
)

Docker Model Runner
How to use ertghiu256/Qwen3.5-2b-ReMix with Docker Model Runner:
```
docker model run hf.co/ertghiu256/Qwen3.5-2b-ReMix
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

🚀 Qwen3.5-2B-ReMix (Reasoning Mix)

This repository contains a fully merged, native Float16 (F16) fine-tune of Qwen/Qwen3.5-2B 🤖. The primary objective of this model is to significantly scale up performance on complex reasoning tasks, specifically targeting advanced mathematics 🧮, logical deduction, and structured coding problems 💻.

By leveraging open-source distillation data, it aims to achieve high-tier, "frontier-style" reasoning capabilities while keeping the footprint compact enough to run smoothly at native speeds on local, everyday consumer hardware 🏠 without the need to load external adapters.

🌟 Model Highlights

🏗️ Base Architecture: Qwen/Qwen3.5-2B (Dense, Hybrid Gated DeltaNet)
💾 Precision format: Native Float16 (F16) Merged Weights — No adapter required!
🎯 Main Goal: Advanced mathematical reasoning and complex code generation/debugging.
🛡️ Data Origin: 100% open-source distilled reasoning datasets natively hosted on Hugging Face. No proprietary data or closed APIs (OpenAI, Anthropic, Google) were used or involved in the collection or training process.
⚡ Target Environment: Local, high-efficiency edge execution with minimal hardware requirements.
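
As a rough guide to what "minimal hardware requirements" means for the weights alone, the sketch below estimates memory from the parameter count. It assumes the nominal 2B figure (the exact count may differ) and ignores activations and the inference cache, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 2_000_000_000  # assumption: nominal "2B" parameter count

for name, width in [("float16", 2), ("float32", 4)]:
    print(f"{name}: {weight_memory_gib(NOMINAL_PARAMS, width):.2f} GiB")
```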

🎛️ Recommended Generation Parameters

To unlock the best reasoning patterns and prevent the model from drifting into creative fluff, it is highly recommended to override the default sampler settings with the following values during local inference:

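| Parameter | Value | Note |
| --- | --- | --- |
| 🌡️ Temperature (`temp`) | `0.4` | Keeps logical thoughts focused and mathematically stable. |
| 🎯 Top K (`top_k`) | `30` | Expands token exploration for rich code structures. Listed originally as Top P `30.0`; nucleus sampling only accepts values in (0, 1], so the value is read here as a top-k count. |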
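
To make the effect of these sampler settings concrete, here is a minimal pure-Python sketch of temperature scaling and of top-k and top-p (nucleus) filtering. It illustrates the general technique, not the code path of any particular inference engine, and the function names are hypothetical. It also shows why a top-p value must lie in (0, 1], whereas a value such as 30 only makes sense as a top-k count.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(probs, k):
    """Keep the k most probable token indices."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def top_p_indices(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.5, -1.0]  # toy values for four candidate tokens
focused = softmax_with_temperature(logits, temperature=0.4)
neutral = softmax_with_temperature(logits, temperature=1.0)
print("temperature 0.4:", [round(p, 3) for p in focused])
print("temperature 1.0:", [round(p, 3) for p in neutral])
```

With the lower temperature, more probability mass concentrates on the top token, so fewer candidates survive a given top-p cutoff.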

💡 Prompting Note: Because this model is built on the Qwen3.5 small-model architecture, make sure your UI or inference wrapper is configured so that the model's internal chain-of-thought is isolated from the final answer and rendered correctly.
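
If your wrapper does not separate the reasoning for you, it can be done in post-processing. The sketch below assumes the reasoning is wrapped in `<think>...</think>` tags, as in earlier Qwen3 releases; this card does not confirm the exact delimiters, so check the chat template first. The helper `split_reasoning` is hypothetical.

```python
import re

# Assumption: reasoning is delimited by <think> ... </think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>\nThe answer is 4.")
print("reasoning:", reasoning)
print("answer:", answer)
```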

📊 Training & Merge Details

The model was adapted using Parameter-Efficient Fine-Tuning (PEFT), and the resulting adapter was then merged back into the base weights with Unsloth to produce a single, unified set of F16 weights.

🔄 Training Steps: 175
📉 Loss Profile: Reached a minimum of about 0.58 and stabilized around 0.85
📈 Learning Rate: 4e-5
📐 LoRA Rank ($R$) during training: 16
⚖️ LoRA Alpha ($\alpha$) during training: 32
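
With rank 16 and alpha 32, the standard LoRA formulation scales the low-rank update by alpha / rank = 2 before it is added to the base weight. The sketch below shows that merge on toy matrices in plain Python. It illustrates the textbook formula W' = W + (alpha / rank) * B * A and is not the Unsloth merge routine itself; the function names and the tiny shapes are assumptions chosen for readability.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Fold a LoRA update into the base weight: W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Scaling factor implied by the hyperparameters listed above.
print("scale:", 32 / 16)

# Toy example: a 2x2 weight with a rank-1 adapter and the same alpha/rank ratio.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]    # shape (2, 1)
lora_a = [[0.5, -0.5]]     # shape (1, 2)
merged = merge_lora(weight, lora_b, lora_a, alpha=2, rank=1)
print("merged:", merged)
```

After merging, inference needs only the merged matrix, which is why no adapter has to be loaded alongside this model.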

⚠️ Limitations & Risks

While this fine-tune aggressively pushes the boundaries of what a 2B parameter model can achieve locally, users should carefully account for the following behaviors before deployment:

🔮 Hallucinations: Like all highly compact language models, the model can still confidently present false calculations or logically flawed code snippets as facts. Always verify its outputs.
🎭 Inconsistent Styles: Because the underlying training data aggregates multiple distinct open-source distilled reasoning sets, the model may occasionally exhibit shifting output structures, stylistic variations, or unpredictable pacing across sequential prompts.
🛑 Logic Mismatches: For highly advanced mathematical proofs or incredibly niche programming languages, the model may occasionally produce broken syntax or reverse its logical assertions.
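
One practical way to verify generated code is to compare it against a slow but obviously correct reference on many inputs. In the sketch below, `fib_matrix` stands in for a function the model might produce, and `fib_reference` is a simple iterative baseline. All names are hypothetical and the example is illustrative only.

```python
def fib_reference(n):
    """Iterative Fibonacci used as the trusted baseline."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def mat_mul_2x2(x, y):
    """Multiply two 2x2 integer matrices stored as nested tuples."""
    return (
        (x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]),
        (x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]),
    )


def fib_matrix(n):
    """Candidate under test: Fibonacci via fast exponentiation of [[1, 1], [1, 0]]."""
    result = ((1, 0), (0, 1))  # identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n & 1:
            result = mat_mul_2x2(result, base)
        base = mat_mul_2x2(base, base)
        n >>= 1
    return result[0][1]


def first_mismatch(candidate, reference, inputs):
    """Return the first input where the two functions disagree, or None."""
    for n in inputs:
        if candidate(n) != reference(n):
            return n
    return None


print("first mismatch:", first_mismatch(fib_matrix, fib_reference, range(200)))
```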

📦 How to Use Natively

🐍 Using Hugging Face Transformers

Because this is a standalone model with the weights baked in, you load it directly without any PEFT wrapper boilerplate:

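from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Repository ID of this model on the Hugging Face Hub
model_path = "ertghiu256/Qwen3.5-2b-ReMix"

# Load the aligned tokenizer and model weights directly
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # Native F16 weight format
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a Python script to calculate the exact nth Fibonacci number using matrix exponentiation."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,     # illustrative budget; raise it for longer reasoning traces
    do_sample=True,          # sampling must be enabled for temperature to take effect
    temperature=0.4,         # recommended value from the parameter table
    top_k=30,                # assumption: top_p must lie in (0, 1], so the card's value of 30 is applied as top-k
    repetition_penalty=1.2,  # Transformers' name for what llama.cpp calls repeat_penalty
)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = generated_ids[0][model_inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))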

# Uploaded finetuned model

- **Developed by:** ertghiu256
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen3.5-2B

This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

Downloads last month: 89

- **Model size:** 2B params (Safetensors)
- **Tensor type:** F32, BF16
- **Model tree:** Qwen/Qwen3.5-2B-Base (base model) → Qwen/Qwen3.5-2B (finetuned) → this model (finetuned; 177 finetunes of Qwen/Qwen3.5-2B are listed)
- **Quantizations:** 2 models
