Post by arXiv CS

Efficient Emotion and Speaker Adaptation in LLM-Based TTS via Characteristic-Specific Partial Fine-Tuning

arXiv:2501.14273v2 (announce type: replace-cross)

Abstract: While LLM-based TTS models exhibit zero-shot emotion and speaker cloning, their cloning fidelity and pronunciation clarity degrade on unseen domains. Fine-tuning is essential for adaptation, yet uniform approaches overl...

🔗 Read more: https://arxiv.org/abs/2501.14273

#News #Psychology #Software #Policy #AI #Academic
