MegaTTS 3 Voice Cloning
MegaTTS 3 is a text-to-speech model trained by ByteDance with exceptional voice cloning capabilities. The original authors did not release the WavVAE encoder, so voice cloning was not publicly available. Thanks to @ACoderPassBy's unofficial WavVAE encoder, we can now clone voices with MegaTTS 3!
This is by no means the best voice cloning solution, but it works pretty well for some specific use cases. Try out several voice cloning solutions and see which one works best for you.
Please use this Space responsibly and do not abuse it! This demo is for research and educational purposes only.
h/t to MysteryShack on Discord for the info about the unofficial WavVAE encoder!
Upload a reference audio clip and enter text to generate speech with the cloned voice.