Outlier Vision (MLX 4-bit)

Image + text understanding on Apple Silicon. Runs natively on M1/M2/M3/M4 Macs via MLX. Upload an image and ask anything — no cloud round-trip, no per-token billing, fully offline.

Bundled in the Outlier desktop app. One-click install. Vision, chat, code agent, projects — all local.

Quick facts

Architecture: Qwen3_5MoeForConditionalGeneration (35B MoE, ~3.6B active params)
Format: MLX 4-bit
Peak RAM: ~14.69 GB
Text speed: 61.28 tok/s on a Mac Studio M1 Ultra 64GB (mlx-lm path)
Context length: 256K tokens
Image support: ✅ via mlx_vlm 0.4.4+
License: Apache 2.0
Compatible with: mlx_lm, mlx_vlm, Outlier desktop app
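
A quick way to sanity-check these facts is to read the repo's config.json without downloading the weights. A minimal sketch, assuming the standard MLX-community config layout (the exact key names may differ for this checkpoint):

import json
from huggingface_hub import hf_hub_download

# Fetch only the config file, not the model weights.
cfg_path = hf_hub_download("Outlier-Ai/Outlier-Vision", "config.json")
with open(cfg_path) as f:
    cfg = json.load(f)

print(cfg.get("architectures"))            # expected: Qwen3_5MoeForConditionalGeneration
print(cfg.get("quantization"))             # 4-bit settings (assumed key name)
print(cfg.get("max_position_embeddings"))  # context length (assumed key name)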

Quickstart — text only (mlx-lm)

pip install -U mlx-lm
python -m mlx_lm.generate \
  --model Outlier-Ai/Outlier-Vision \
  --prompt "Explain mixture-of-experts in one paragraph." \
  --max-tokens 256
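
The same generation can be driven from Python. A minimal sketch using the mlx-lm Python API, with the model and prompt reused from the CLI call above:

from mlx_lm import load, generate

model, tokenizer = load("Outlier-Ai/Outlier-Vision")

# Format the request with the model's chat template before generating.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}],
    tokenize=False,
    add_generation_prompt=True,
)
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)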

Quickstart — image + text (mlx_vlm)

pip install -U mlx-vlm torchvision

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "Outlier-Ai/Outlier-Vision"
model, processor = load(model_path)
config = load_config(model_path)

# REQUIRED: apply_chat_template injects the vision tokens the model expects
prompt = apply_chat_template(
    processor, config,
    "What is in this image?",
    num_images=1,
)

output = generate(
    model, processor,
    prompt,
    image=["path/to/image.jpg"],
    max_tokens=512,
    verbose=True,
)
print(output)

Important: always apply apply_chat_template(..., num_images=1) before passing an image. Skipping this step raises ValueError: Image features and image tokens do not match.
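
The same rule extends to multiple images: num_images must equal the number of images handed to generate. Whether this particular checkpoint accepts more than one image per prompt is not documented here, so treat this as a sketch (reusing model, processor, and config from the quickstart above):

images = ["left.jpg", "right.jpg"]
prompt = apply_chat_template(
    processor, config,
    "What differs between these two images?",
    num_images=len(images),  # must match len(images)
)
output = generate(model, processor, prompt, image=images, max_tokens=512)
print(output)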

Benchmarks (Mac Studio M1 Ultra 64GB)

Metric             Value          Date         Source
Text throughput    61.28 tok/s    2026-04-25   evidence/track_f/vision_tier_text_speed.json
Peak RAM           14.69 GB       2026-04-25   swebench sprint track_g
Image inference    ✅ working     2026-04-23   mlx_vlm 0.4.4 unblock
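
A minimal sketch for reproducing the throughput and peak-RAM rows locally. The memory calls assume a recent MLX; older versions expose mx.metal.get_peak_memory() / mx.metal.reset_peak_memory() instead:

import time
import mlx.core as mx
from mlx_lm import load, generate

model, tokenizer = load("Outlier-Ai/Outlier-Vision")
mx.reset_peak_memory()

start = time.perf_counter()
text = generate(
    model, tokenizer,
    prompt="Explain mixture-of-experts in one paragraph.",
    max_tokens=256,
)
elapsed = time.perf_counter() - start

# Includes prompt processing, so this reads slightly below the
# decode-only figure in the table.
print(f"{len(tokenizer.encode(text)) / elapsed:.2f} tok/s")
print(f"peak RAM: {mx.get_peak_memory() / 1e9:.2f} GB")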

Use the Outlier desktop app

outlier.host ships a 9 MB Mac installer. The vision tier loads automatically; no Python setup is required.

What is Outlier?

A Mac-native AI platform: chat, a code agent with 9 tools, projects with .gitignore-aware codebase indexing, artifacts, SQLite memory, and an OpenAI-compatible local API. Everything runs offline, with no subscription.
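
Because the platform exposes an OpenAI-compatible local API, any standard OpenAI client should be able to talk to it. The base URL, port, and model id below are placeholders, not values documented by this card; check the app's settings for the real ones:

from openai import OpenAI

# Placeholder port and model id; substitute the values shown in the app.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="outlier-vision",
    messages=[{"role": "user", "content": "Say hello from the local API."}],
)
print(resp.choices[0].message.content)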

Known limits

  • Intel Macs: MLX requires Apple Silicon, so there is currently no supported path for Intel Macs, Windows, or Linux.
  • Video inference: the architecture supports it, but the mlx_vlm video path is not wired into the current app build.
  • Image inference requires mlx_vlm ≥ 0.4.4 plus torchvision installed in the Python environment (see the check below).
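
A quick environment check for the image path (pip package names as published on PyPI; the version floor comes from the note above):

from importlib.metadata import PackageNotFoundError, version

# mlx-vlm needs >= 0.4.4 for image inference; torchvision just needs to exist.
for pkg, floor in [("mlx-vlm", "0.4.4"), ("torchvision", None)]:
    try:
        installed = version(pkg)
        need = f" (need >= {floor})" if floor else ""
        print(f"{pkg} {installed} installed{need}")
    except PackageNotFoundError:
        print(f"{pkg} missing; run: pip install -U {pkg}")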

Attribution

Base: mlx-community/Qwen3.6-35B-A3B-4bit, which derives from Qwen3.6-35B-A3B by the Qwen team at Alibaba Cloud. Apache 2.0. Outlier contributes the app integration, the /chat/vision inference path, and the VLM serving logic. Capability credit for the base model belongs to upstream.

License

Apache 2.0 throughout.
