MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion With Increased Controllability via Multiple Guidances
arXiv:2509.17143v2 Announce Type: replace-cross Abstract: We introduce MaskVCT, a zero-shot voice conversion (VC) model that offers multi-factor controllability through multiple classifier-free guidances (CFGs). While previous VC models rely on a fixed conditioning scheme, Mas...
🔗 Read more: https://arxiv.org/abs/2509.17143
#News #Software #Policy #Europe #Academic