โš ๏ธ SUPERSEDED โ€” use Outlier-Ai/Outlier-70B-V3.3 instead. These weights are retained live for reproducibility of earlier benchmark runs. All current research has moved to the successor.

Outlier 70B V3.2 (Superseded)

An earlier ternary MoE overlay on a frozen Qwen2.5-32B-Instruct base. Superseded by Outlier-70B V3.3 (alpha-fixed).

What changed

V3.2 MMLU was 81.49%. The V3.3 alpha fix adds +1.61 pp via a 280-scalar, 15 KB overlay trained in 18 minutes on a single B200, raising MMLU to 83.10%.
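For readers unfamiliar with the overlay idea, the sketch below illustrates the general pattern of a ternary delta applied to a frozen linear layer, with one trainable scale per module (the kind of scalar the alpha fix retrains). This is a conceptual illustration only, not the released Outlier code; the class name, shapes, and initialization are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryOverlayLinear(nn.Module):
    """Conceptual sketch: frozen base weights plus a fixed {-1, 0, +1}
    pattern scaled by one trainable scalar per module. Illustrative
    only; not the actual Outlier-70B overlay implementation."""

    def __init__(self, base: nn.Linear):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base model stays frozen
        # fixed ternary pattern (random here, purely for illustration)
        self.register_buffer(
            "ternary", torch.randint(-1, 2, base.weight.shape).float()
        )
        # one trainable scale per overlaid module; retraining ~280 such
        # scalars across the model is what the V3.3 alpha fix describes
        self.alpha = nn.Parameter(torch.tensor(0.01))

    def forward(self, x):
        w = self.base.weight + self.alpha * self.ternary
        return F.linear(x, w, self.base.bias)
```

With this kind of wrapper, only the per-module alpha scalars receive gradients, which is why such a fix can be tiny and fast to train relative to the base model.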

Historical benchmark (reference only)

| Benchmark | Score | Notes |
|---|---|---|
| MMLU (5-shot) | 81.49% (n=14,042, lm-evaluation-harness v0.4.9.1) | Pre-V3.3 measurement |
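If you need to re-run this number, the measurement can in principle be reproduced with the Python API of lm-evaluation-harness (the harness and version come from the row above; the model_args, dtype, and batch size are assumptions, and the overlay may require custom loading code not shown here).

```python
# Sketch of re-running the 5-shot MMLU measurement with
# lm-evaluation-harness v0.4.x. Assumes the checkpoint loads as a
# standard Hugging Face causal LM, which may not hold for the overlay.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Outlier-Ai/Outlier-70B-V3.2,dtype=bfloat16",
    tasks=["mmlu"],
    num_fewshot=5,
    batch_size="auto",
)
print(results["results"]["mmlu"])
```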

Use the successor
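This card does not include loading instructions, so the snippet below is only a hedged sketch of the usual Hugging Face pattern, pointed at the successor repo named in the banner. Whether the overlay needs trust_remote_code, a custom loader, or the Qwen2.5-32B-Instruct base fetched separately is not documented here.

```python
# Hedged sketch: standard transformers loading aimed at the successor
# repo. The ternary MoE overlay may in reality require custom loading
# code; treat this as a starting point, not a documented recipe.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Outlier-Ai/Outlier-70B-V3.3"  # successor named in the banner
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,  # in case the repo ships custom modeling code
)

prompt = "Briefly explain what a ternary weight overlay is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```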

Why this is still public

ML research norms: earlier checkpoints stay live so external benchmarks and papers that cite this URL remain reproducible. This is not dead weight; it is the historical record.

Related

Patents

Architecture covered by US provisional patent applications 64/026,886, 64/030,368, and 64/034,028 (Kerr & Company LLC, 2026).
