kNN-TTS: kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech

Input

Target Voice
Voice Morphing (λ): 0 to 2
Top-k Retrieval: 1 to 50

Weight neighbors by similarity distance
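
The controls above (λ, top-k, and the neighbor-weighting option) can be illustrated with a minimal sketch of frame-level kNN retrieval. This is not the demo's actual implementation: the function name `knn_retrieve`, the use of cosine similarity, the softmax weighting, and the linear interpolation with λ are all assumptions made for illustration, and the inputs are stand-in feature matrices rather than real speech features.

```python
import numpy as np

def knn_retrieve(source, target, k=4, lam=1.0, weighted=False):
    """Hypothetical sketch: replace each source frame with an average of its
    k most similar target frames, then blend with the source using lam.

    source: (T, D) array of source-utterance frame features
    target: (N, D) array of target-voice frame features
    """
    eps = 1e-8  # guards against division by zero for all-zero frames
    src_n = source / (np.linalg.norm(source, axis=1, keepdims=True) + eps)
    tgt_n = target / (np.linalg.norm(target, axis=1, keepdims=True) + eps)
    sim = src_n @ tgt_n.T  # (T, N) cosine similarities (assumed metric)

    k = min(k, target.shape[0])
    # Indices of the k most similar target frames for every source frame.
    idx = np.argpartition(-sim, k - 1, axis=1)[:, :k]  # (T, k)
    neighbors = target[idx]                            # (T, k, D)

    if weighted:
        # Assumed weighting scheme: softmax over the neighbors' similarities.
        w = np.take_along_axis(sim, idx, axis=1)
        w = np.exp(w - w.max(axis=1, keepdims=True))
        w = w / w.sum(axis=1, keepdims=True)
    else:
        # Uniform average over the k neighbors.
        w = np.full(idx.shape, 1.0 / k)

    retrieved = (w[..., None] * neighbors).sum(axis=1)  # (T, D)
    # Assumed morphing rule: lam = 0 keeps the source, lam = 1 uses only
    # the retrieved target frames.
    return (1.0 - lam) * source + lam * retrieved
```

Under these assumptions, calling `knn_retrieve(src, tgt, k=4, lam=0.5, weighted=True)` would return features halfway between the source frames and their similarity-weighted target neighbors.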

Generated Audio

Examples
Text | Target Voice | Voice Morphing (λ) | Top-k Retrieval | Use Weighted Averaging