gabrieldemarmiesse/moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Stars: 0Language: Python
Give AlbumentationsX a star on GitHub — it powers this leaderboard
Star on GitHubMoshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.