# GarmentGPT Models
This repository contains all the necessary model components for the GarmentGPT project.
## Models Included
This repository hosts three key components:
- **Vision-Language Model (VLM)**: A fine-tuned multi-modal model that generates discrete garment tokens from an input image.
- **Edge Codec**: A VQ-VAE-based model that decodes edge indices into high-fidelity geometric curves. The configuration is in `codec_config.yaml` and the weights are in `codec_model.pth`.
- **RT Codec**: A VQ-VAE-based model that decodes location indices into 3D panel rotations and translations. The configuration is in `rt_config.yaml` and the weights are in `rt_model.pth`.
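Each codec ships as a YAML config plus a PyTorch checkpoint. As a minimal sketch of how such a pair could be read, assuming the `.pth` files are standard `torch.save` checkpoints (the actual model classes and config schema live in the main Garment-GPT repository, and `load_codec_checkpoint` is a hypothetical helper, not part of this release):

```python
# Hypothetical sketch: read a codec's YAML config and its checkpoint.
# File names come from this card; config keys and checkpoint layout are
# assumptions -- consult the Garment-GPT code for the real definitions.
import torch
import yaml


def load_codec_checkpoint(config_path: str, weights_path: str):
    """Return the parsed config dict and the raw state dict.

    Instantiating the VQ-VAE itself requires the model classes from the
    main repository, so this only loads the two files from disk.
    """
    with open(config_path) as f:
        config = yaml.safe_load(f)
    state_dict = torch.load(weights_path, map_location="cpu")
    return config, state_dict
```

For example, `load_codec_checkpoint("rt_config.yaml", "rt_model.pth")` would return the RT Codec's configuration and weights, ready to be passed to whatever model constructor the main repository defines.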
## Usage
These models are designed to be used with the main application code available at https://github.com/ChimerAI-MMLab/Garment-GPT. The inference script will automatically download these files.
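If you want the files outside of the inference script, they can also be fetched manually with `huggingface_hub`. A sketch, assuming the files are hosted in a Hugging Face model repository (the `repo_id` below is a placeholder you would replace with this repository's actual id, which is not stated in this card):

```python
# Hypothetical sketch: download the codec files directly with huggingface_hub.
# The repo_id is a placeholder assumption; the file names come from this card.
from huggingface_hub import hf_hub_download

# The four codec files listed in this model card.
CODEC_FILES = [
    "codec_config.yaml",
    "codec_model.pth",
    "rt_config.yaml",
    "rt_model.pth",
]


def fetch_garmentgpt_files(repo_id: str) -> dict:
    """Download each file into the local HF cache and return name -> local path."""
    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in CODEC_FILES}
```

Calling `fetch_garmentgpt_files("your-org/your-repo")` returns local cached paths that can then be fed to the loading code in the main repository.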