Wan2.1-T2V-1.3B — Ternary Quantized (tritplane3)

First publicly available ternary-quantized Wan video model on HuggingFace.

A ternary-quantized version of Wan-AI/Wan2.1-T2V-1.3B-Diffusers, Alibaba's text-to-video DiT model. Produced by applying ternary-quant to the WanTransformer3DModel only; the text encoder and VAE are left in FP16.

Specifications

Property	Value
Base Model	Wan-AI/Wan2.1-T2V-1.3B-Diffusers
Architecture	WanTransformer3DModel (DiT)
Transformer Params	1.42B
Quantization	tritplane3 (306 linear layers)
Text Encoder (UMT5-XXL)	FP16 (preserved)
VAE (WanVAE)	FP16 (preserved)
License	Apache 2.0

Size

Method	Transformer Size
FP16 (original)	2.84 GB
Ternary tritplane3 (theoretical packed)	~1.42 GB
In this repo (dequantized FP16)	2.7 GB
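
The figures in the table can be sanity-checked with a little arithmetic. The sketch below assumes decimal gigabytes (1 GB = 10^9 bytes) and counts weights only, ignoring scales and file metadata; `size_gb` is a helper introduced here for illustration, not part of any library.

```python
PARAMS = 1.42e9  # transformer parameter count, from the Specifications table


def size_gb(params: float, bits_per_weight: float) -> float:
    """Storage for `params` weights at the given bit width, in decimal GB."""
    return params * bits_per_weight / 8 / 1e9


# FP16 uses 16 bits per weight.
fp16_gb = size_gb(PARAMS, 16)

# Bits per weight implied by the ~1.42 GB "theoretical packed" figure.
implied_bits = 1.42 * 1e9 * 8 / PARAMS

print(f"FP16 transformer: {fp16_gb:.2f} GB")          # 2.84 GB
print(f"Packed bits per weight: {implied_bits:.1f}")  # 8.0
```

Under these assumptions the packed figure works out to about 8 bits per weight, half the FP16 footprint.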

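The weights have ternary precision but are stored as FP16 for drop-in diffusers compatibility, so the files in this repo are about the same size as the original rather than the packed size.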
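
To illustrate what "ternary precision stored as FP16" means, here is a minimal sketch of generic ternary quantization followed by dequantization. It uses a simple absmean rule chosen for illustration; it is not the tritplane3 scheme, whose details are not described in this card, and the function names are hypothetical.

```python
import math
import random


def ternarize(weights):
    """Absmean ternary quantization (illustrative, NOT tritplane3).

    Returns (trits, scale) with every trit in {-1, 0, +1}.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    trits = [max(-1, min(1, round(w / scale))) for w in weights]
    return trits, scale


def dequantize(trits, scale):
    """Expand trits back to ordinary floats, as a drop-in checkpoint would."""
    return [t * scale for t in trits]


random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(1024)]  # stand-in weight row

trits, scale = ternarize(row)
restored = dequantize(trits, scale)

# The restored row holds at most three distinct values (-scale, 0, +scale),
# yet each one still occupies a full 16-bit slot when saved as FP16.
print("distinct values:", len(set(restored)))
print("bits of information per trit:", round(math.log2(3), 3))
```

The dequantized values load like any other weights, which is why no custom kernels or loaders are needed, and also why the storage saving only appears once the weights are packed.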

Usage

import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

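# Load the pipeline; only the transformer weights differ from the base model.
pipe = WanPipeline.from_pretrained(
    "AsadIsmail/Wan2.1-T2V-1.3B-ternary",
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # Apple Silicon; use "cuda" on an NVIDIA GPU

# 81 frames at 16 fps is roughly a 5-second clip.
output = pipe(
    prompt="a cat walking on green grass",
    num_frames=81,
    num_inference_steps=30,
).frames[0]
export_to_video(output, "output.mp4", fps=16)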
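
When changing the clip length, it may help to pick `num_frames` deliberately. The sketch below assumes the Wan VAE compresses time by a factor of 4, so frame counts of the form 4k + 1 (such as the 81 used above) are the expected ones; that stride and the helper name `wan_num_frames` are assumptions for illustration and should be checked against the diffusers WanPipeline documentation.

```python
def wan_num_frames(seconds: float, fps: int = 16, temporal_stride: int = 4) -> int:
    """Nearest frame count of the form temporal_stride * k + 1.

    temporal_stride=4 is an assumption about the Wan VAE, not read from the model.
    """
    target = seconds * fps
    k = max(0, round((target - 1) / temporal_stride))
    return temporal_stride * k + 1


frames = wan_num_frames(5)
print(frames, "frames,", frames / 16, "seconds at 16 fps")
```

The result can be passed as `num_frames=` in the pipeline call, with the same `fps` given to `export_to_video`.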

Collection

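Part of the ternary-models collection (VLMs, Multimodal & Audio): ternary-quantized models for architectures that GGUF can't handle, using the tritplane3 scheme.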

GitHub: github.com/Asad-Ismail/ternary-models
