Fails to run on two A10Gs
#25
by yNilay - opened
Hi! The model failed to run on two A10Gs. Is there any way to run it on this setup? Thanks!
Error:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 522.00 MiB. GPU 0 has a total capacity of 22.18 GiB of which 328.69 MiB is free. Process 293155 has 21.86 GiB memory in use. Of the allocated memory 20.74 GiB is allocated by PyTorch, and 858.09 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Code:
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16)
# Enable memory savings
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
frames = pipe(prompt, num_frames=84).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
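The figures in the error message can be sanity-checked with some quick arithmetic. The sketch below assumes a transformer of roughly 10 billion parameters (Genmo's published figure for Mochi 1, not something stated in this thread) stored in bf16; every other number is copied from the traceback. As far as I understand, enable_model_cpu_offload() moves one whole component at a time onto a single GPU, so the second A10G would sit idle with this script.

```python
GIB = 1024 ** 3

def weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the raw weights alone, in GiB."""
    return num_params * bytes_per_param / GIB

# Assumption (not from this thread): about 10 billion transformer parameters.
ASSUMED_PARAMS = 10e9
BF16_BYTES = 2

# Figures copied from the error message.
GPU_CAPACITY_GIB = 22.18
FREE_MIB = 328.69
RESERVED_UNALLOCATED_MIB = 858.09
REQUESTED_MIB = 522.00

transformer_gib = weights_gib(ASSUMED_PARAMS, BF16_BYTES)
headroom_gib = GPU_CAPACITY_GIB - transformer_gib

# Memory the allocator could hand out if fragmentation were not an issue.
reclaimable_mib = FREE_MIB + RESERVED_UNALLOCATED_MIB

print(f"bf16 transformer weights: {transformer_gib:.2f} GiB")
print(f"headroom on one A10G:     {headroom_gib:.2f} GiB")
print(f"free + reserved-unused:   {reclaimable_mib:.2f} MiB "
      f"(request was {REQUESTED_MIB:.2f} MiB)")
```

Under that assumption the weights alone take most of one card, leaving only a few GiB for activations. The failed 522 MiB request is smaller than the free plus reserved-but-unallocated memory, so the allocator hint in the message might get past this particular allocation, but the margin is thin.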
Can you try the official Mochi API instead of the diffusers API? https://github.com/genmoai/mochi
The demos directory has a cli.py script that should automatically shard the model across multiple GPUs.
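For anyone who wants to stay within diffusers, one thing that may be worth trying is a lower-memory variant of the script from the original post. This is only a sketch under assumptions: it swaps enable_model_cpu_offload() for enable_sequential_cpu_offload(), which keeps far less on the GPU at once but is much slower, and it sets the allocator option that the error message itself suggests. It still uses a single GPU, and whether it fits on an A10G has not been verified here. The function name generate_low_memory is hypothetical.

```python
import os

# Allocator hint taken from the error message; must be set before torch is imported.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")


def generate_low_memory(prompt, num_frames=84, out_path="mochi.mp4"):
    """Hypothetical helper: same pipeline as the original post, with sequential offload."""
    # Heavy imports are kept inside the function so the env var above takes effect first.
    import torch
    from diffusers import MochiPipeline
    from diffusers.utils import export_to_video

    pipe = MochiPipeline.from_pretrained(
        "genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16
    )
    # Offload at the submodule level instead of whole components (slower, lower peak memory).
    pipe.enable_sequential_cpu_offload()
    pipe.enable_vae_tiling()

    frames = pipe(prompt, num_frames=num_frames).frames[0]
    export_to_video(frames, out_path, fps=30)
    return out_path


if __name__ == "__main__":
    generate_low_memory(
        "Close-up of a chameleon's eye, with its scaly skin changing color. "
        "Ultra high resolution 4k."
    )
```

Reducing num_frames is another knob that should lower peak memory, at the cost of a shorter clip.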