Related Tools
VibeVoice
Open-source voice AI framework from Microsoft that includes both ASR and TTS capabilities. ASR supports 60-minute long-form audio with speaker diarization and 50+ languages; TTS generates natural long-form multi-speaker or real-time streaming speech.
Free
so-vits-svc
SoftVC VITS Singing Voice Conversion.
Free
lala.ai
Extract vocals, accompaniment, and individual instruments from any audio or video.
Free/Paid