Wan2.1-T2V-1.3B: Ternary Quantized (tritplane3)

First publicly available ternary-quantized Wan video model on HuggingFace.

A ternary-quantized version of Wan-AI/Wan2.1-T2V-1.3B-Diffusers, Alibaba's text-to-video DiT model, produced by applying ternary-quant to the WanTransformer3DModel.

Specifications

Property | Value
Base Model | Wan-AI/Wan2.1-T2V-1.3B-Diffusers
Architecture | WanTransformer3DModel (DiT)
Transformer Params | 1.42B
Quantization | tritplane3 (306 linear layers)
Text Encoder (UMT5-XXL) | FP16 (preserved)
VAE (WanVAE) | FP16 (preserved)
License | Apache 2.0

Size

Method | Transformer Size
FP16 (original) | 2.84 GB
Ternary tritplane3 (theoretical packed) | ~1.42 GB
In this repo (dequantized FP16) | 2.7 GB
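
The first two rows follow from the parameter count. As a rough check, the sketch below computes storage from a bits-per-weight figure. The helper name `packed_size_gb` is hypothetical, and the 8 bits per weight used for the packed row is only what the ~1.42 GB figure implies arithmetically; the card does not state the actual tritplane3 packing layout.

```python
def packed_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage in decimal gigabytes (1 GB = 1e9 bytes) for num_params weights."""
    return num_params * bits_per_weight / 8 / 1e9


TRANSFORMER_PARAMS = 1.42e9  # "Transformer Params" from the Specifications table

# FP16 stores every weight in 16 bits.
fp16_gb = packed_size_gb(TRANSFORMER_PARAMS, 16)

# Assumption: ~1.42 GB for 1.42B weights works out to about 8 bits per weight
# on average. This is inferred from the table, not from ternary-quant itself.
packed_gb = packed_size_gb(TRANSFORMER_PARAMS, 8)

print(f"FP16:           {fp16_gb:.2f} GB")
print(f"8 bits/weight:  {packed_gb:.2f} GB")
```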

The weights have ternary precision but are stored as FP16 for drop-in diffusers compatibility.
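
To illustrate what "ternary precision stored as FP16" can mean, here is a minimal sketch of a generic single-scale ternary round trip. It is not the tritplane3 algorithm, whose details this card does not describe. The function names, the threshold heuristic (0.7 times the mean absolute weight), and the sample values are all assumptions chosen for illustration.

```python
def ternarize(weights, threshold_ratio=0.7):
    """Map each weight to a code in {-1, 0, +1} and return (codes, scale).

    Assumed heuristic: weights whose magnitude is at most
    threshold_ratio * mean(|w|) become 0; the rest keep only their sign.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    # One shared scale: the mean magnitude of the weights that were not zeroed.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def dequantize(codes, scale):
    """Expand ternary codes back to ordinary floats (what gets saved as FP16)."""
    return [c * scale for c in codes]


row = [0.42, -0.03, -0.51, 0.07, 0.38, -0.44]  # made-up example weights
codes, scale = ternarize(row)
restored = dequantize(codes, scale)

print(codes)     # ternary codes, one per weight
print(restored)  # floats that take at most three distinct values
```

In this simplified scheme the restored row is a normal float tensor, so it loads like any other checkpoint, but it occupies the same space as the original, which is why a dequantized file is no smaller than the FP16 one.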

Usage

import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "AsadIsmail/Wan2.1-T2V-1.3B-ternary",
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"

output = pipe(
    prompt="a cat walking on green grass",
    num_frames=81,
    num_inference_steps=30,
).frames[0]
export_to_video(output, "output.mp4", fps=16)

Collection

Part of the ternary-models collection.

GitHub: github.com/Asad-Ismail/ternary-models
