Indic Parler TTS — Bhojpuri LoRA (MLX)
The first MLX implementation of Parler TTS, fine-tuned for the Bhojpuri language on Apple Silicon.
This is a LoRA adapter on top of ai4bharat/indic-parler-tts.
The base model supports 18 Indian languages; this adapter teaches it Bhojpuri phoneme patterns.
Highlights
- Runs natively on Apple Silicon (M1/M2/M3/M4) via MLX
- First Parler TTS port to MLX — no PyTorch required
- Fine-tuned on the IISc SYSPIN Bhojpuri corpus (CC-BY-4.0)
- LoRA rank-8 adapter — only 4.3M trainable params out of 920M total (0.47%)
- All 69 base model speakers can now speak Bhojpuri
Model Details
| Property | Value |
|---|---|
| Base model | ai4bharat/indic-parler-tts |
| Training data | IISc SYSPIN Bhojpuri Female (5,537 clips) |
| LoRA rank | 8, alpha 16 |
| LoRA targets | Decoder self-attn, cross-attn Q/V, FFN |
| Trained on | MacBook Pro (Apple Silicon) |
| Framework | MLX |
| Training steps | ~1,400 (2 epochs) |
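The adapter size quoted above can be sanity-checked with a little arithmetic. A rank-r LoRA adapter on a linear layer of shape d_in x d_out adds r * (d_in + d_out) parameters. The sketch below applies that to the listed targets. The decoder dimensions (hidden size 1024, FFN size 4096, 24 layers) are assumptions that are not stated in this card; they were chosen because they reproduce the reported ~4.3M figure. The `alpha / rank` scale is the standard LoRA convention, and this card does not confirm that mlx-audio-train uses it.

```python
# Back-of-envelope LoRA parameter count for the targets listed in the table.
# ASSUMPTION: HIDDEN, FFN and NUM_LAYERS are not given in this card.
RANK = 8
ALPHA = 16.0
HIDDEN = 1024      # assumed decoder hidden size
FFN = 4096         # assumed feed-forward size
NUM_LAYERS = 24    # assumed number of decoder layers
TOTAL_MODEL_PARAMS = 920_000_000  # "920M total" from the Highlights


def lora_params(d_in, d_out, rank=RANK):
    """A rank-r adapter adds A (d_in x r) and B (r x d_out)."""
    return rank * (d_in + d_out)


per_layer = (
    4 * lora_params(HIDDEN, HIDDEN)    # self-attn q, k, v, out
    + 2 * lora_params(HIDDEN, HIDDEN)  # cross-attn q, v
    + lora_params(HIDDEN, FFN)         # fc1
    + lora_params(FFN, HIDDEN)         # fc2
)
total = per_layer * NUM_LAYERS
fraction = total / TOTAL_MODEL_PARAMS
scale = ALPHA / RANK  # standard LoRA scaling factor (assumed convention)

print(f"trainable LoRA params: {total:,}")
print(f"share of model: {fraction:.2%}")
print(f"LoRA scale (alpha / rank): {scale}")
```

Under these assumed dimensions the count comes to about 4.3M parameters, or roughly 0.47% of 920M, which is consistent with the figures in the Highlights.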
Audio Samples
"ई बहुत नीमन बा। हम कल जाइब।" — Bhojpuri (short)
| Speaker | Base model | Fine-tuned |
|---|---|---|
| Divya | (audio) | (audio) |
| Generic | (audio) | (audio) |
"रउरा के राम राम। आज हम एही गाँव में रहीला। का हाल बा रउरा के?" — Bhojpuri (long)
| Speaker | Base model | Fine-tuned |
|---|---|---|
| Divya | (audio) | (audio) |
| Generic | (audio) | (audio) |
"नमस्ते, आप कैसे हैं? मैं बहुत अच्छा हूँ।" — Hindi sanity check (should sound similar on both)
| Speaker | Base model | Fine-tuned |
|---|---|---|
| Divya | (audio) | (audio) |
| Generic | (audio) | (audio) |
Usage
pip install mlx mlx-lm soundfile huggingface_hub
import sys
sys.path.insert(0, "path/to/mlx-audio-train")
from models.indic_parler_tts.generate import load_model, generate
from train.lora import apply_lora, load_adapters, LoRAConfig
import soundfile as sf
import mlx.core as mx
from huggingface_hub import snapshot_download
# 1. Load base model
model, tokenizers = load_model("ai4bharat/indic-parler-tts")
# 2. Apply LoRA and load adapter
adapter_dir = snapshot_download("akashicmarga/indic-parler-tts-bhojpuri-lora")
lora_config = LoRAConfig(
    rank=8, alpha=16.0, dropout=0.0,
    target_modules=[
        "decoder.layers.*.self_attn.q", "decoder.layers.*.self_attn.k",
        "decoder.layers.*.self_attn.v", "decoder.layers.*.self_attn.out",
        "decoder.layers.*.cross_attn.q", "decoder.layers.*.cross_attn.v",
        "decoder.layers.*.fc1", "decoder.layers.*.fc2",
    ],
    model_type="indic_parler_tts",
)
apply_lora(model, lora_config)
load_adapters(model, f"{adapter_dir}/adapters.safetensors")
mx.eval(model.parameters())
# 3. Generate Bhojpuri speech
audio = generate(
    model, tokenizers,
    description="A female speaker delivers speech at a moderate pace. The recording is of very high quality.",
    text="रउरा के राम राम। आज हम एही गाँव में रहीला।",
)
sf.write("bhojpuri.wav", audio, 44100)
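The `target_modules` entries above use `*` as a wildcard over the decoder layer index. How `apply_lora` resolves these patterns is not shown in this card. The sketch below assumes shell-style glob matching and illustrates it with Python's `fnmatch`. The helper `is_lora_target` and the candidate module names are hypothetical, written only to show which kinds of names the patterns would select.

```python
from fnmatch import fnmatchcase

# Patterns copied from the LoRAConfig above.
TARGET_PATTERNS = [
    "decoder.layers.*.self_attn.q", "decoder.layers.*.self_attn.k",
    "decoder.layers.*.self_attn.v", "decoder.layers.*.self_attn.out",
    "decoder.layers.*.cross_attn.q", "decoder.layers.*.cross_attn.v",
    "decoder.layers.*.fc1", "decoder.layers.*.fc2",
]


def is_lora_target(module_name, patterns=TARGET_PATTERNS):
    """Hypothetical helper: True if the name matches any glob pattern.

    ASSUMPTION: glob semantics; the real matching logic may differ.
    """
    return any(fnmatchcase(module_name, p) for p in patterns)


# Hypothetical module names, for illustration only.
candidates = [
    "decoder.layers.0.self_attn.q",
    "decoder.layers.23.cross_attn.v",
    "decoder.layers.5.cross_attn.k",      # cross-attn K is not targeted
    "decoder.layers.5.fc1",
    "text_encoder.layers.0.self_attn.q",  # outside the decoder
]
matched = [name for name in candidates if is_lora_target(name)]
print(matched)
```

Under this reading, cross-attention K and output projections and everything outside the decoder stay frozen, which matches the "cross-attn Q/V" entry in the Model Details table.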
Any speaker description supported by the base model works:
# Divya speaking Bhojpuri
audio = generate(
    model, tokenizers,
    description="Divya's voice is slightly expressive and very animated. She speaks at a moderate pace.",
    text="ई बहुत नीमन बा। हम कल जाइब।",
)
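The Divya example stops before saving the result. The first example saves with `sf.write(..., 44100)`, and that remains the method this card uses. As an alternative for environments without `soundfile`, here is a minimal sketch that writes 16-bit PCM with the standard-library `wave` module. It assumes `generate` returns a one-dimensional sequence of float samples in [-1.0, 1.0], which this card does not state; the helper `write_wav_pcm16` is hypothetical, and a synthetic tone stands in for real model output.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # same rate as the sf.write call in the first example


def write_wav_pcm16(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))


# Stand-in for model output: 0.1 s of a 440 Hz tone.
tone = [
    0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 10)
]
write_wav_pcm16("divya_bhojpuri_demo.wav", tone)
```

With real output, pass the generated samples in place of `tone`. If `generate` returns an MLX array, it may first need converting to a NumPy array or a Python list.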
Training
Trained using mlx-audio-train on Apple Silicon.
python scripts/train.py --config configs/indic_parler_bhojpuri.yaml
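The contents of the training config are not shown here, but the figures in Model Details (5,537 clips, 2 epochs, ~1,400 steps) allow a rough consistency check. The sketch below estimates the step count for a few candidate batch sizes. The batch size is an inference rather than a documented setting, and the calculation assumes every clip is used for training with no held-out split.

```python
import math

NUM_CLIPS = 5537       # from Model Details
EPOCHS = 2             # from Model Details
REPORTED_STEPS = 1400  # "~1,400" from Model Details


def total_steps(num_clips, batch_size, epochs):
    """Optimizer steps if the last partial batch of each epoch is kept."""
    return math.ceil(num_clips / batch_size) * epochs


# ASSUMPTION: candidate effective batch sizes; the real value is not stated.
for batch_size in (4, 8, 16):
    print(batch_size, total_steps(NUM_CLIPS, batch_size, EPOCHS))

implied_batch = NUM_CLIPS * EPOCHS / REPORTED_STEPS
print(f"implied effective batch size: {implied_batch:.2f}")
```

An effective batch size of about 8 lands closest to the reported step count. Treat that as an estimate rather than the actual setting.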
Dataset
IISc SYSPIN Corpus — Bhojpuri Female Speaker
License: CC-BY-4.0
License
Adapter weights: Apache 2.0
Base model: Apache 2.0 (ai4bharat/indic-parler-tts)
Training data: CC-BY-4.0 (IISc SYSPIN)