# Qwen3.5-35B-Optimized-HauhauCS
Join the Discord for updates, roadmaps, projects, or just to chat.
An optimized build of Qwen3.5-35B-A3B by HauhauCS.
## Access

This is currently a Closed Beta release designed to lower (V)RAM requirements by up to 50% without sacrificing real-world capabilities.
## Downloads
| File | Type | Size |
|---|---|---|
| Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf | Q8_K_P | 25 GB |
| Qwen3.5-35B-Optimized-HauhauCS-40-Q4_K_P.gguf | Q4_K_P | 14 GB |
| mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf | mmproj (F16) | 858 MB |
## What are K_P quants?
K_P quants use model-specific importance analysis to selectively preserve quality where it matters most. Fully compatible with llama.cpp, LM Studio, and any GGUF runtime.
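The K_P algorithm itself isn't published here, but the idea of selectively preserving quality where it matters most can be sketched as importance-based mixed-precision quantization. This toy example is illustrative only — the function names, threshold, and bit widths are assumptions, not the actual K_P implementation:

```python
# Toy sketch of importance-based mixed-precision quantization.
# This is NOT the actual K_P algorithm (which is unpublished); it only
# illustrates spending more bits where importance is highest.

def quantize(values, bits):
    """Symmetric uniform quantization to the given bit width."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels
    if scale == 0:
        return list(values)
    return [round(v / scale) * scale for v in values]

def mixed_precision(rows, importance, threshold=0.5):
    """Keep high-importance rows at 8 bits, drop the rest to 4 bits."""
    return [
        quantize(row, 8 if imp >= threshold else 4)
        for row, imp in zip(rows, importance)
    ]

weights = [[0.9, -0.4, 0.1], [0.02, 0.03, -0.01]]
importance = [0.9, 0.1]  # e.g. derived from calibration activations
packed = mixed_precision(weights, importance)
```

In a real quantizer the importance scores would come from calibration data (activation statistics or a Hessian proxy), and the low-precision path would pack weights into blocks rather than keeping floats.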
## Specs
- 35B-A3B MoE (35B total, ~3B active per forward pass)
- 262K context
- Multimodal (vision support via mmproj)
- Based on Qwen3.5-35B-A3B
## Usage
Works with llama.cpp, LM Studio, Jan, koboldcpp, etc.
```shell
llama-cli -m Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf \
  --mmproj mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf \
  -ngl 99
```
Note: K_P quants may show as "?" in LM Studio's quant column. This is a display issue only; the file loads and runs fine.
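For serving rather than interactive use, llama.cpp's `llama-server` exposes an OpenAI-compatible HTTP API with the same model and mmproj files. The port and prompt below are assumptions; adjust to your setup:

```shell
# Serve the model over HTTP (OpenAI-compatible endpoints).
llama-server -m Qwen3.5-35B-Optimized-HauhauCS-40-Q4_K_P.gguf \
  --mmproj mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf \
  -ngl 99 --port 8080

# Then query it from another terminal:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```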