Indic Parler TTS — Bhojpuri LoRA (MLX)

First MLX implementation of Parler TTS, fine-tuned for the Bhojpuri language on Apple Silicon.

This is a LoRA adapter on top of ai4bharat/indic-parler-tts.
The base model supports 18 Indian languages — this adapter teaches it Bhojpuri phoneme patterns.

Highlights

Runs natively on Apple Silicon (M1/M2/M3/M4) via MLX
First Parler TTS port to MLX — no PyTorch required
Fine-tuned on the IISc SYSPIN Bhojpuri corpus (CC-BY-4.0)
LoRA rank-8 adapter — only 4.3M trainable params out of 920M total (0.47%)
All 69 base model speakers can now speak Bhojpuri
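
The parameter figures above can be checked with a few lines of arithmetic. The sketch below recomputes the trainable fraction from the stated counts and shows the standard LoRA parameter count for one adapted linear layer, rank × (d_in + d_out). The helper name `lora_param_count` and the 1024-wide layer are illustrative assumptions, not values taken from the model.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Counts as stated in the highlights (both are rounded figures).
TRAINABLE_PARAMS = 4.3e6
TOTAL_PARAMS = 920e6

fraction = TRAINABLE_PARAMS / TOTAL_PARAMS
print(f"Trainable fraction: {fraction:.2%}")

# Hypothetical layer size, for illustration only.
example = lora_param_count(d_in=1024, d_out=1024, rank=8)
print(f"LoRA params for a hypothetical 1024x1024 layer: {example}")
```

Because the added parameters grow linearly with the rank, doubling the rank would roughly double the adapter size.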

Model Details

Property	Value
Base model	`ai4bharat/indic-parler-tts`
Training data	IISc SYSPIN Bhojpuri Female (5,537 clips)
LoRA rank	8, alpha 16
LoRA targets	Decoder self-attn (Q/K/V/out), cross-attn (Q/V), FFN (fc1/fc2)
Trained on	MacBook Pro (Apple Silicon)
Framework	MLX
Training steps	~1,400 (2 epochs)
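
To make the rank and alpha settings concrete, here is a minimal NumPy sketch of a LoRA-adapted linear layer. It assumes the conventional scaling of alpha / rank (16 / 8 = 2 here). Some LoRA implementations apply the scale differently, so the effective factor for this adapter may not match. The function name `lora_forward` and the 16-wide layer are hypothetical and chosen only for illustration.

```python
import numpy as np


def lora_forward(x, W, A, B, rank=8, alpha=16.0):
    """Return W x plus the scaled low-rank update (alpha / rank) * B A x."""
    scale = alpha / rank
    return x @ W.T + scale * (x @ A.T) @ B.T


rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 16, 8             # hypothetical layer sizes

W = rng.normal(size=(d_out, d_in))        # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))               # trainable up-projection, zero at init
x = rng.normal(size=(2, d_in))            # batch of two input vectors

# With B at zero the adapter is a no-op, so y equals the base layer output.
y = lora_forward(x, W, A, B)
```

Only `A` and `B` are trained. The base weight `W` stays frozen, which is why the adapter file is small compared with the base model.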

Audio Samples

"ई बहुत नीमन बा। हम कल जाइब।" — Bhojpuri (short)

[Audio players omitted: Divya and Generic voices, each rendered by the base model and the fine-tuned model.]

"रउरा के राम राम। आज हम एही गाँव में रहीला। का हाल बा रउरा के?" — Bhojpuri (long)

[Audio players omitted: Divya and Generic voices, each rendered by the base model and the fine-tuned model.]

"नमस्ते, आप कैसे हैं? मैं बहुत अच्छा हूँ।" — Hindi sanity check (should sound similar on both)

[Audio players omitted: Divya and Generic voices, each rendered by the base model and the fine-tuned model.]

Usage

pip install mlx mlx-lm soundfile huggingface_hub

import sys
sys.path.insert(0, "path/to/mlx-audio-train")

from models.indic_parler_tts.generate import load_model, generate
from train.lora import apply_lora, load_adapters, LoRAConfig
import soundfile as sf
import mlx.core as mx
from huggingface_hub import snapshot_download

# 1. Load base model
model, tokenizers = load_model("ai4bharat/indic-parler-tts")

# 2. Apply LoRA and load adapter
adapter_dir = snapshot_download("akashicmarga/indic-parler-tts-bhojpuri-lora")
lora_config = LoRAConfig(
    rank=8, alpha=16.0, dropout=0.0,
    target_modules=[
        "decoder.layers.*.self_attn.q", "decoder.layers.*.self_attn.k",
        "decoder.layers.*.self_attn.v", "decoder.layers.*.self_attn.out",
        "decoder.layers.*.cross_attn.q", "decoder.layers.*.cross_attn.v",
        "decoder.layers.*.fc1", "decoder.layers.*.fc2",
    ],
    model_type="indic_parler_tts",
)
apply_lora(model, lora_config)
load_adapters(model, f"{adapter_dir}/adapters.safetensors")
mx.eval(model.parameters())

# 3. Generate Bhojpuri speech
audio = generate(
    model, tokenizers,
    description="A female speaker delivers speech at a moderate pace. The recording is of very high quality.",
    text="रउरा के राम राम। आज हम एही गाँव में रहीला।",
)
sf.write("bhojpuri.wav", audio, 44100)
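
The `target_modules` entries above use `*` as a wildcard for the layer index. As a rough illustration of which modules such patterns would select, the sketch below matches them glob-style with the standard library's `fnmatch`. This is an assumption about the matching behavior, not the actual logic in `train.lora`, and the helper `is_lora_target` and the sample module names are hypothetical.

```python
from fnmatch import fnmatchcase

# Same patterns as in the LoRAConfig above.
TARGET_PATTERNS = [
    "decoder.layers.*.self_attn.q", "decoder.layers.*.self_attn.k",
    "decoder.layers.*.self_attn.v", "decoder.layers.*.self_attn.out",
    "decoder.layers.*.cross_attn.q", "decoder.layers.*.cross_attn.v",
    "decoder.layers.*.fc1", "decoder.layers.*.fc2",
]


def is_lora_target(module_name: str, patterns=TARGET_PATTERNS) -> bool:
    """Return True if the module name matches any glob-style target pattern."""
    return any(fnmatchcase(module_name, p) for p in patterns)


# Hypothetical module names, for illustration only.
for name in [
    "decoder.layers.0.self_attn.q",
    "decoder.layers.3.cross_attn.k",
    "decoder.layers.11.fc1",
]:
    print(name, is_lora_target(name))
```

Under this assumption, cross-attention K and output projections are left unadapted, since only the Q and V patterns are listed for `cross_attn`.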

Any speaker description supported by the base model works:

# Divya speaking Bhojpuri
audio = generate(
    model, tokenizers,
    description="Divya's voice is slightly expressive and very animated. She speaks at a moderate pace.",
    text="ई बहुत नीमन बा। हम कल जाइब।",
)

Training

Trained using mlx-audio-train on Apple Silicon.

python scripts/train.py --config configs/indic_parler_bhojpuri.yaml

Dataset

IISc SYSPIN Corpus — Bhojpuri Female Speaker
License: CC-BY-4.0

License

Adapter weights: Apache 2.0
Base model: Apache 2.0 (ai4bharat/indic-parler-tts)
Training data: CC-BY-4.0 (IISc SYSPIN)
