Summary
OpenVoice does voice cloning/synthesis. FunASR provides the complementary ASR front-end:
- 170x faster transcription (RTF 0.006-0.007, i.e. roughly 140-170 seconds of audio processed per second of compute)
- 50+ languages (SenseVoice)
- Speaker diarization (CAM++) — useful for multi-speaker voice cloning
- OpenAI-compatible API — easy to chain ASR → voice cloning
In a voice pipeline: FunASR transcribes → OpenVoice clones voice → TTS generates. Both are Apache 2.0.
pip install funasr
funasr speaker_audio.wav --spk -f json # Identify speakers
# → feed speaker audio to OpenVoice for cloning
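To make the hand-off to cloning concrete, here is a minimal sketch of how diarized output might be turned into per-speaker reference clips. The JSON shape used here (a list of segments with `spk`, `start`, and `end` in milliseconds) is an assumption for illustration, not a confirmed FunASR output schema, and the helper names are hypothetical. Adjust the field names to whatever the tool actually emits.

```python
import json
from collections import defaultdict


def segments_by_speaker(diarization_json: str, min_ms: int = 1000) -> dict:
    """Group diarized segments by speaker label, dropping very short ones.

    Assumed (hypothetical) schema: a JSON list of objects with
    "spk" (speaker id), "start" and "end" (milliseconds).
    """
    segments = json.loads(diarization_json)
    grouped = defaultdict(list)
    for seg in segments:
        duration = seg["end"] - seg["start"]
        if duration >= min_ms:  # very short clips make poor cloning references
            grouped[seg["spk"]].append((seg["start"], seg["end"]))
    return dict(grouped)


def longest_segment(grouped: dict) -> dict:
    """Pick the longest remaining segment per speaker as a candidate reference clip."""
    return {
        spk: max(spans, key=lambda span: span[1] - span[0])
        for spk, spans in grouped.items()
    }


# Made-up sample data in the assumed schema.
sample = json.dumps([
    {"spk": 0, "start": 0, "end": 4200, "text": "hello there"},
    {"spk": 1, "start": 4300, "end": 4900, "text": "hi"},
    {"spk": 0, "start": 5000, "end": 7000, "text": "how are you"},
    {"spk": 1, "start": 7100, "end": 12000, "text": "fine, thanks for asking"},
])

grouped = segments_by_speaker(sample)
reference = longest_segment(grouped)
print(reference)
```

The time ranges in `reference` could then be cut from the source audio and passed to OpenVoice as one reference clip per speaker.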
GitHub: https://github.com/modelscope/FunASR (17.8K+ stars)