Related Tools
VibeVoice
Open-source voice AI framework from Microsoft that includes both ASR and TTS capabilities. ASR supports 60-minute long-form audio with speaker diarization and 50+ languages; TTS generates natural long-form multi-speaker or real-time streaming speech.
Free
vocalremover
Separates vocals from the music in an audio track
Free
lala.ai
Extracts vocals, accompaniment, and individual instruments from any audio or video file
Free/Paid