ACE-Step 1.5 XL - Turbo (4B DiT) BF16

Project | Hugging Face | ModelScope | Space Demo | Discord | Tech Report

Model Details

This is the BF16 version of ACE-Step/acestep-v15-xl-turbo, the XL (4B) Turbo variant of ACE-Step 1.5. The BF16 conversion reduces memory usage while maintaining near-identical quality to the original model. The model is distillation-accelerated and generates high-quality audio in just 8 steps, combining the speed of Turbo with the quality of the 4B architecture.

XL Architecture

Parameter | Value
DiT Decoder hidden_size | 2560
DiT Decoder layers | 32
DiT Decoder attention heads | 32
Encoder hidden_size | 2048
Encoder layers | 8
Total params | ~4B
Weights size (bf16) | ~7.5 GB
Inference steps | 8 (no CFG, distilled)
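
As a rough sanity check on the figures above, the weight size follows from the parameter count, since BF16 stores each parameter in 2 bytes. The sketch below assumes exactly 4e9 parameters (the table only says ~4B) and that the listed "GB" is close to GiB (2^30 bytes). Both are assumptions, not numbers published by the model authors.

```python
# Back-of-envelope weight size for a ~4B-parameter model.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}


def weights_size_gib(num_params: float, dtype: str) -> float:
    """Return the raw weight size in GiB for the given parameter count."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


NUM_PARAMS = 4e9  # assumption: the card only states "~4B"

for dtype in ("fp32", "bf16"):
    print(f"{dtype}: {weights_size_gib(NUM_PARAMS, dtype):.2f} GiB")
```

Under these assumptions BF16 comes out at roughly 7.45 GiB, in line with the ~7.5 GB listed, and half of what the same weights would occupy in FP32.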

GPU Requirements

VRAM | Support
≥8 GB | With CPU offload + INT8 quantization
≥12 GB | With CPU offload
≥16 GB | Without offload (recommended)
≥20 GB | Full quality (XL + 4B LM)

All LM models (0.6B / 1.7B / 4B) are fully compatible with XL.
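
The VRAM tiers can be read as a simple lookup. The helper below is a hypothetical illustration, not part of ACE-Step: the function name and return strings are made up for this sketch, and only the thresholds come from the GPU requirements listed above.

```python
def recommend_setup(vram_gb: float) -> str:
    """Map available VRAM (in GB) to the tier listed in the GPU requirements."""
    # Thresholds are checked from the largest tier down.
    if vram_gb >= 20:
        return "Full quality (XL + 4B LM)"
    if vram_gb >= 16:
        return "Without offload (recommended)"
    if vram_gb >= 12:
        return "With CPU offload"
    if vram_gb >= 8:
        return "With CPU offload + INT8 quantization"
    return "Below the listed minimum of 8 GB"


for vram in (6, 8, 12, 16, 24):
    print(f"{vram:>2} GB -> {recommend_setup(vram)}")
```

With PyTorch installed, the input value could be taken from `torch.cuda.get_device_properties(0).total_memory / 2**30`. Note that this reports GiB and is usually slightly below the card's marketed capacity.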

Key Features

  • 💰 Commercial-Ready: Trained on legally compliant datasets. Generated music can be used for commercial purposes.
  • 📚 Safe Training Data: Licensed music, royalty-free/public domain, and synthetic (MIDI-to-Audio) data.
  • ⚡ Fast: 8-step inference, the fastest XL variant.
  • 🔮 Higher Quality: 4B parameters provide richer audio quality than the 2B Turbo model.
  • 🧠 BF16 Precision: Converted to BF16 for reduced VRAM usage and faster inference, with negligible quality loss.
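
To give a feel for why the BF16 conversion loses so little, the sketch below rounds a few float32 values to BF16 using only the standard library. BF16 keeps the float32 sign and exponent and shortens the mantissa to 7 stored bits, so the relative rounding error of a normal value is at most 2^-8. This is an illustration of the number format only, not the conversion procedure used for this repository, and it ignores NaN and overflow edge cases.

```python
import struct


def to_bf16_bits(x: float) -> int:
    """Round a value to the nearest BF16 and return its 16-bit pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    # Round to nearest, ties to even, then keep the upper 16 bits.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def from_bf16_bits(b: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def bf16_round_trip(x: float) -> float:
    return from_bf16_bits(to_bf16_bits(x))


for w in (1.0, 0.1, -3.14159):
    approx = bf16_round_trip(w)
    rel_err = abs(approx - w) / abs(w)
    print(f"{w:>9} -> {approx:<14} relative error {rel_err:.2e}")
```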

Quick Start

    # Install ACE-Step
    git clone https://github.com/ace-step/ACE-Step-1.5.git
    cd ACE-Step-1.5
    pip install -e .

    # Download this model
    huggingface-cli download marcorez8/acestep-v15-xl-turbo-bf16 --local-dir ./checkpoints/acestep-v15-xl-turbo-bf16

    # Run with Gradio UI
    python acestep --config-path acestep-v15-xl-turbo-bf16
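
The download step can also be done from Python. The sketch below uses `snapshot_download` from the `huggingface_hub` package with the same repository ID and target directory as the CLI command above. The wrapper function `download_checkpoint` is a hypothetical convenience for this example, not part of ACE-Step.

```python
REPO_ID = "marcorez8/acestep-v15-xl-turbo-bf16"
LOCAL_DIR = "./checkpoints/acestep-v15-xl-turbo-bf16"


def download_checkpoint(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download the full repository snapshot and return the local path."""
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_checkpoint()` fetches roughly 7.5 GB of weights, so it needs network access and enough free disk space.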
          

Model Zoo

XL (4B) DiT Models

DiT Model | CFG | Steps | Quality | Diversity | Tasks | Hugging Face | ModelScope
acestep-v15-xl-base | ✅ | 50 | High | High | All (extract, lego, complete) | Link | Link
acestep-v15-xl-sft | ✅ | 50 | Very High | Medium | Standard | Link | Link
acestep-v15-xl-turbo | ❌ | 8 | Very High | Medium | Standard | Link | Link
acestep-v15-xl-turbo-bf16 | ❌ | 8 | Very High | Medium | Standard | This repo | -
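
The Steps and CFG columns together give a rough idea of relative cost. The sketch below assumes that classifier-free guidance runs two DiT forward passes per step (one conditional, one unconditional), which is the common formulation but is not confirmed here for ACE-Step. The dictionary and helper are illustrative only.

```python
# Steps and CFG usage for each XL DiT model, as listed in the Model Zoo.
MODELS = {
    "acestep-v15-xl-base": {"steps": 50, "cfg": True},
    "acestep-v15-xl-sft": {"steps": 50, "cfg": True},
    "acestep-v15-xl-turbo": {"steps": 8, "cfg": False},
    "acestep-v15-xl-turbo-bf16": {"steps": 8, "cfg": False},
}


def forward_passes(steps: int, cfg: bool) -> int:
    """Estimate DiT forward passes, assuming CFG doubles the passes per step."""
    return steps * (2 if cfg else 1)


for name, spec in MODELS.items():
    print(f"{name}: {forward_passes(**spec)} forward passes")
```

Under that assumption the Turbo variants need 8 forward passes where the base and SFT models need 100, which is where most of the speed advantage would come from.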

LM Models (all compatible with XL)

LM Model | Params | Audio Understanding | Composition | Hugging Face | ModelScope
acestep-5Hz-lm-0.6B | 0.6B | Medium | Medium | Link | Link
acestep-5Hz-lm-1.7B | 1.7B | Medium | Medium | Included in main | Included in main
acestep-5Hz-lm-4B | 4B | Strong | Strong | Link | Link

Acknowledgements

This project is co-led by ACE Studio and StepFun. The BF16 conversion was done by marcorez8 to make the model more accessible to the community.

Citation

    @misc{gong2026acestep,
      title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
      author={Junmin Gong and Yulin Song and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
      howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
      year={2026},
      note={GitHub repository}
    }
          
