Qwen3 4B Thinking 2507 - MiniMax M2.1 Distill

This model was fine-tuned on a reasoning dataset generated by MiniMax M2.1.

  • 🧬 Datasets:

    • TeichAI/MiniMax-M2.1-8800x
  • 🏗 Base Model:

    • unsloth/Qwen3-4B-Thinking-2507
  • ⚡ Use cases:

    • Coding
    • Science
    • Deep Research
  • ∑ Stats (Dataset)

    • Cost: $42.94 (USD)
    • Total tokens (input + output): 39.2M
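
To put the dataset stats in perspective, the short sketch below derives the effective price per million tokens from the two figures above (about $1.10). The per-sample averages rely on an assumption that is not stated in this card: that "8800x" in the dataset name means 8,800 samples. Treat those two numbers as rough estimates only.

```python
# Figures taken from the dataset stats above.
total_cost_usd = 42.94
total_tokens = 39.2e6  # input + output tokens

# ASSUMPTION: "8800x" in the dataset name means 8,800 samples.
# This is not confirmed anywhere in the model card.
assumed_samples = 8800

# Effective price paid per one million tokens.
cost_per_million_tokens = total_cost_usd / (total_tokens / 1e6)

# Rough per-sample averages, valid only under the assumption above.
avg_tokens_per_sample = total_tokens / assumed_samples
avg_cost_per_sample = total_cost_usd / assumed_samples

print(f"Cost per 1M tokens: ${cost_per_million_tokens:.2f}")
print(f"Average tokens per sample (assumed): {avg_tokens_per_sample:.0f}")
print(f"Average cost per sample (assumed): ${avg_cost_per_sample:.4f}")
```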

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

An Ollama Modelfile is included for easy deployment.
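
As a minimal sketch of how the included Modelfile might be used, the snippet below builds the two standard Ollama CLI invocations: `ollama create` to register a model from a Modelfile, and `ollama run` to start it. The local model name and the Modelfile path are hypothetical placeholders, not values taken from this repository. The script only prints the commands, so it is safe to run without Ollama installed.

```python
import shlex

# HYPOTHETICAL values: pick any local model name, and point the path
# at wherever the Modelfile from this repository was downloaded.
MODEL_NAME = "qwen3-4b-minimax-distill"
MODELFILE_PATH = "./Modelfile"


def build_ollama_commands(model_name, modelfile_path):
    """Return the (create, run) Ollama CLI commands as argument lists."""
    create_cmd = ["ollama", "create", model_name, "-f", modelfile_path]
    run_cmd = ["ollama", "run", model_name]
    return create_cmd, run_cmd


if __name__ == "__main__":
    create_cmd, run_cmd = build_ollama_commands(MODEL_NAME, MODELFILE_PATH)
    # Print shell-ready commands instead of executing them.
    print(shlex.join(create_cmd))
    print(shlex.join(run_cmd))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(cmd, check=True)`, assuming the `ollama` binary is on the PATH and the GGUF file referenced by the Modelfile is available locally.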

Format: GGUF
Model size: 4B params
Architecture: qwen3

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

Repository: TeichAI/Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill-GGUF