MobileCLIP-2
Collection: OpenCLIP / timm ports of Apple's MobileCLIP-2 multi-modal and image encoders (12 items)
How to use timm/fastvit_mci2.apple_mclip2_dfndr2b with timm:
import timm
model = timm.create_model("hf_hub:timm/fastvit_mci2.apple_mclip2_dfndr2b", pretrained=True)

How to use timm/fastvit_mci2.apple_mclip2_dfndr2b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("image-feature-extraction", model="timm/fastvit_mci2.apple_mclip2_dfndr2b")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("timm/fastvit_mci2.apple_mclip2_dfndr2b", dtype="auto")

A MobileCLIP v2 image encoder (image tower only) for timm. Equivalent to the image tower of https://huggingface.co/timm/MobileCLIP2-S2-OpenCLIP.
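Since this checkpoint is an image tower, a typical use is to embed images and compare them. The following is a minimal sketch, not an official recipe: `embed_image` and `cosine_similarity` are hypothetical helper names, the image paths are placeholders, and it assumes the default model output (with the pretrained head left in place) is the embedding you want to compare. The preprocessing calls (`timm.data.resolve_model_data_config`, `timm.data.create_transform`) are standard timm utilities.

```python
from math import sqrt


def embed_image(image_path, model_name="hf_hub:timm/fastvit_mci2.apple_mclip2_dfndr2b"):
    """Return the model output for one image as a list of floats (hypothetical helper)."""
    # Heavy dependencies are imported here so cosine_similarity below
    # stays usable without them installed.
    import timm
    import torch
    from PIL import Image

    # Load the pretrained encoder exactly as in the usage snippet, in eval mode.
    model = timm.create_model(model_name, pretrained=True).eval()

    # Build the preprocessing pipeline from the model's pretrained config.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)

    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        output = model(transform(image).unsqueeze(0))  # batch of one image
    return output[0].tolist()


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors with non-zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Example usage (placeholder file names, requires network access for the weights):
# score = cosine_similarity(embed_image("first.jpg"), embed_image("second.jpg"))
```

For more than a handful of images, it would be better to load the model and transform once and reuse them, rather than reloading inside every call as this sketch does.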
@article{faghri2025mobileclip2,
title={MobileCLIP2: Improving Multi-Modal Reinforced Training},
author={Faghri, Fartash and Vasu, Pavan Kumar Anasosalu and Koc, Cem and Shankar, Vaishaal and Toshev, Alexander and Tuzel, Oncel and Pouransari, Hadi},
journal={arXiv preprint arXiv:2508.20691},
year={2025}
}