
Qwen3.5-35B-Optimized-HauhauCS

Join the Discord for updates, roadmaps, projects, or just to chat.

Optimized Qwen3.5-35B-A3B by HauhauCS.

Access

This is currently a closed-beta release, designed to lower (V)RAM requirements by up to 50% without sacrificing real-world capability.

Downloads

File                                               Type          Size
Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf      Q8_K_P        25 GB
Qwen3.5-35B-Optimized-HauhauCS-40-Q4_K_P.gguf      Q4_K_P        14 GB
mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf  mmproj (F16)  858 MB

What are K_P quants?

K_P quants use model-specific importance analysis to selectively preserve quality where it matters most. They are fully compatible with llama.cpp, LM Studio, and any GGUF runtime.
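
The K_P recipe itself isn't published. For intuition only, the sketch below shows the generic importance-matrix (imatrix) workflow that stock llama.cpp uses for the same goal; the file names are placeholders, and producing actual K_P tensors would require a llama.cpp build that defines those quant types:

# 1. Measure per-tensor activation importance on a calibration set (placeholder files).
llama-imatrix -m base-f16.gguf -f calibration.txt -o base.imatrix
# 2. Quantize, letting the importance matrix steer where precision is kept.
llama-quantize --imatrix base.imatrix base-f16.gguf base-Q4_K_M.gguf Q4_K_M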

Specs

  • 35B-A3B MoE (35B total, ~3B active per forward pass)
  • 262K context
  • Multimodal (vision support via mmproj)
  • Based on Qwen3.5-35B-A3B

Usage

Works with llama.cpp, LM Studio, Jan, KoboldCpp, and other GGUF runtimes. Example invocation with llama.cpp:

llama-cli -m Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf \
  --mmproj mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf \
  -ngl 99
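
To serve the model over an OpenAI-compatible HTTP API instead, llama-server accepts the same model and projector arguments. A minimal sketch; the context size and port below are illustrative, and --mmproj support requires a reasonably recent llama.cpp build:

# -c sets the context length, --port the HTTP port; adjust to taste.
llama-server -m Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf \
  --mmproj mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf \
  -ngl 99 -c 16384 --port 8080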

Note: K_P quants may show as "?" in LM Studio's quant column. This is a display issue only; the model loads and runs fine.
