Post by arXiv CS

MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion With Increased Controllability via Multiple Guidances

arXiv:2509.17143v2 (announce type: replace-cross)

Abstract: We introduce MaskVCT, a zero-shot voice conversion (VC) model that offers multi-factor controllability through multiple classifier-free guidances (CFGs). While previous VC models rely on a fixed conditioning scheme, Mas...

🔗 Read more: https://arxiv.org/abs/2509.17143

#News #Software #Policy #Europe #Academic
