Outlier Vision (MLX 4-bit)

Image + text understanding on Apple Silicon. Runs natively on M1/M2/M3/M4 Macs via MLX. Upload an image and ask anything — no cloud round-trip, no per-token billing, fully offline.

Bundled in the Outlier desktop app. One-click install. Vision, chat, code agent, projects — all local.

Quick facts

Architecture: Qwen3_5MoeForConditionalGeneration (35B MoE, ~3.6B active params)
Format: MLX 4-bit
Peak RAM: ~14.69 GB
Text speed: 61.28 tok/s on a Mac Studio M1 Ultra 64GB (mlx-lm path)
Context length: 256K tokens
Image support: ✅ via mlx_vlm 0.4.4+
License: Apache 2.0
Compatible with: mlx_lm, mlx_vlm, Outlier desktop app
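
A quick way to sanity-check these facts is to read the repo's config.json without downloading the weights. A minimal sketch, assuming the standard MLX-community config layout (the exact key names may differ for this checkpoint):

import json
from huggingface_hub import hf_hub_download

# Fetch only the config file, not the model weights.
cfg_path = hf_hub_download("Outlier-Ai/Outlier-Vision", "config.json")
with open(cfg_path) as f:
    cfg = json.load(f)

print(cfg.get("architectures"))            # expected: Qwen3_5MoeForConditionalGeneration
print(cfg.get("quantization"))             # 4-bit settings (assumed key name)
print(cfg.get("max_position_embeddings"))  # context length (assumed key name)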

Quickstart — text only (mlx-lm)

pip install -U mlx-lm
python -m mlx_lm.generate \
  --model Outlier-Ai/Outlier-Vision \
  --prompt "Explain mixture-of-experts in one paragraph." \
  --max-tokens 256
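
The same generation can be driven from Python. A minimal sketch using the mlx-lm Python API, with the model and prompt reused from the CLI call above:

from mlx_lm import load, generate

model, tokenizer = load("Outlier-Ai/Outlier-Vision")

# Format the request with the model's chat template before generating.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}],
    tokenize=False,
    add_generation_prompt=True,
)
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)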

Quickstart — image + text (mlx_vlm)

pip install -U mlx-vlm torchvision

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "Outlier-Ai/Outlier-Vision"
model, processor = load(model_path)
config = load_config(model_path)

# REQUIRED: apply_chat_template injects the vision tokens the model expects
prompt = apply_chat_template(
    processor, config,
    "What is in this image?",
    num_images=1,
)

output = generate(
    model, processor,
    prompt,
    image=["path/to/image.jpg"],
    max_tokens=512,
    verbose=True,
)
print(output)

Important: always apply apply_chat_template(..., num_images=1) before passing an image. Skipping this step raises ValueError: Image features and image tokens do not match.
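
The same rule extends to multiple images: num_images must equal the number of images handed to generate. Whether this particular checkpoint accepts more than one image per prompt is not documented here, so treat this as a sketch (reusing model, processor, and config from the quickstart above):

images = ["left.jpg", "right.jpg"]
prompt = apply_chat_template(
    processor, config,
    "What differs between these two images?",
    num_images=len(images),  # must match len(images)
)
output = generate(model, processor, prompt, image=images, max_tokens=512)
print(output)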

Benchmarks (Mac Studio M1 Ultra 64GB)

Metric             Value          Date         Source
Text throughput    61.28 tok/s    2026-04-25   evidence/track_f/vision_tier_text_speed.json
Peak RAM           14.69 GB       2026-04-25   swebench sprint track_g
Image inference    ✅ working     2026-04-23   mlx_vlm 0.4.4 unblock
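
A minimal sketch for reproducing the throughput and peak-RAM rows locally. The memory calls assume a recent MLX; older versions expose mx.metal.get_peak_memory() / mx.metal.reset_peak_memory() instead:

import time
import mlx.core as mx
from mlx_lm import load, generate

model, tokenizer = load("Outlier-Ai/Outlier-Vision")
mx.reset_peak_memory()

start = time.perf_counter()
text = generate(
    model, tokenizer,
    prompt="Explain mixture-of-experts in one paragraph.",
    max_tokens=256,
)
elapsed = time.perf_counter() - start

# Includes prompt processing, so this reads slightly below the
# decode-only figure in the table.
print(f"{len(tokenizer.encode(text)) / elapsed:.2f} tok/s")
print(f"peak RAM: {mx.get_peak_memory() / 1e9:.2f} GB")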

Use the Outlier desktop app

outlier.host ships a 9 MB Mac installer. The vision tier loads automatically; no Python setup is required.

What is Outlier?

A Mac-native AI platform: chat, a code agent with 9 tools, projects with .gitignore-aware codebase indexing, artifacts, SQLite memory, and an OpenAI-compatible local API. Everything runs offline, with no subscription.
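
Because the platform exposes an OpenAI-compatible local API, any standard OpenAI client should be able to talk to it. The base URL, port, and model id below are placeholders, not values documented by this card; check the app's settings for the real ones:

from openai import OpenAI

# Placeholder port and model id; substitute the values shown in the app.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="outlier-vision",
    messages=[{"role": "user", "content": "Say hello from the local API."}],
)
print(resp.choices[0].message.content)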

Known limits

  • Intel Macs: MLX requires Apple Silicon, so there is currently no supported path for Intel Macs, Windows, or Linux.
  • Video inference: the architecture supports it, but the mlx_vlm video path is not wired into the current app build.
  • Image inference requires mlx_vlm ≥ 0.4.4 plus torchvision installed in the Python environment (see the check below).
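
A quick environment check for the image path (pip package names as published on PyPI; the version floor comes from the note above):

from importlib.metadata import PackageNotFoundError, version

# mlx-vlm needs >= 0.4.4 for image inference; torchvision just needs to exist.
for pkg, floor in [("mlx-vlm", "0.4.4"), ("torchvision", None)]:
    try:
        installed = version(pkg)
        need = f" (need >= {floor})" if floor else ""
        print(f"{pkg} {installed} installed{need}")
    except PackageNotFoundError:
        print(f"{pkg} missing; run: pip install -U {pkg}")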

Attribution

Base: mlx-community/Qwen3.6-35B-A3B-4bit, which derives from Qwen3.6-35B-A3B by the Qwen team at Alibaba Cloud. Apache 2.0. Outlier contributes the app integration, the /chat/vision inference path, and the VLM serving logic. Capability credit for the base model belongs to upstream.

License

Apache 2.0 throughout.
