Paper: F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching (arXiv:2410.06885)
A German Text-to-Speech system capable of cloning voices from a few seconds of reference audio, built on the F5-TTS architecture.
Download checkpoints from the directories F5TTS_Base (vocos) or F5TTS_Base_bigvgan (bigvgan).
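One way to fetch only one of the two checkpoint directories is sketched below. It is a minimal example, not the authors' documented procedure: it assumes the checkpoints are hosted in a Hugging Face Hub repository and that the `huggingface_hub` package is installed. The helper names `checkpoint_patterns` and `download_checkpoints` are hypothetical, and the repository ID must be supplied by the caller because it is not stated here. Only the directory names `F5TTS_Base` and `F5TTS_Base_bigvgan` come from the text above.

```python
# Directory names are taken from the model card; all other names are illustrative.
CHECKPOINT_DIRS = {
    "vocos": "F5TTS_Base",
    "bigvgan": "F5TTS_Base_bigvgan",
}


def checkpoint_patterns(vocoder: str) -> list[str]:
    """Return glob patterns that select the checkpoint directory for a vocoder."""
    try:
        directory = CHECKPOINT_DIRS[vocoder.lower()]
    except KeyError:
        raise ValueError(
            f"unknown vocoder {vocoder!r}; expected one of {sorted(CHECKPOINT_DIRS)}"
        ) from None
    # The trailing "/" keeps "F5TTS_Base/*" from also matching "F5TTS_Base_bigvgan/".
    return [f"{directory}/*"]


def download_checkpoints(repo_id: str, vocoder: str = "vocos") -> str:
    """Download only the chosen checkpoint directory and return the local path.

    `repo_id` is the Hub repository that hosts the checkpoints (an assumption:
    it is not given in the model card text and must be passed in).
    """
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(vocoder),
    )
```

Calling `download_checkpoints("<repo-id>", vocoder="bigvgan")` would then restrict the download to the BigVGAN variant, which avoids pulling both sets of weights when only one vocoder is needed.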
The authors acknowledge financial support from the German Federal Ministry of Education and Research (BMBF) through the project «KI-Servicezentrum Berlin Brandenburg» (01IS22092).
Base model: SWivid/F5-TTS