Efficient Emotion and Speaker Adaptation in LLM-Based TTS via Characteristic-Specific Partial Fine-Tuning
arXiv:2501.14273v2. Abstract: While LLM-based TTS models exhibit zero-shot emotion and speaker cloning, their cloning fidelity and pronunciation clarity degrade on unseen domains. Fine-tuning is essential for adaptation, yet uniform approaches overl...
🔗 Read more: https://arxiv.org/abs/2501.14273
#News #Psychology #Software #Policy #AI #Academic