$20

VibeVoice Large Modal


We’re bringing long-form, multi-speaker conversational speech synthesis to everyone: fully open-source, easy to run, and now live on Modal.


🚀 What’s New

  • Bigger Model (7B): Generate high-fidelity speech for extended conversations.
  • Multi-Speaker Dialogue: Up to 4 unique voices for podcasts, audiobooks, or roleplay.
  • Ultra-Long Context: Handle conversations up to 45 minutes long with natural flow.
  • Next-Token Diffusion Framework: Combines LLM-style context with diffusion-based acoustic detail for expressive realism.
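
The four-voice limit above can be checked before any text reaches the model. Here is a minimal sketch, assuming the dialogue is passed as plain text with one `Speaker N:` line per turn. That script format and the helper name `format_dialogue` are illustrative assumptions, not something this listing specifies.

```python
MAX_SPEAKERS = 4  # limit stated in the feature list above


def format_dialogue(turns):
    """Build a plain-text script from (speaker_number, text) pairs.

    The "Speaker N: text" line format is an assumption for illustration;
    check the notebook you receive for the format it actually expects.
    """
    speakers = {speaker for speaker, _ in turns}
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"{len(speakers)} speakers given, at most {MAX_SPEAKERS} supported"
        )

    lines = []
    for speaker, text in turns:
        # Collapse runs of whitespace so each turn stays on one line.
        cleaned = " ".join(text.split())
        if not cleaned:
            raise ValueError(f"empty turn for speaker {speaker}")
        lines.append(f"Speaker {speaker}: {cleaned}")
    return "\n".join(lines)


script = format_dialogue([
    (1, "Welcome back to the show."),
    (2, "Thanks for having me."),
])
```

Validating up front means an oversized cast fails immediately instead of after a long generation run.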

✅ What You’ll Receive

A link to a ready-to-run Google Colab notebook that includes:

  • All code + configs pre-set (no local setup needed)
  • Automatic download of VibeVoice-7B weights into your environment
  • A Gradio interface to test multi-speaker dialogue instantly
  • Auto-deployment to your own Modal endpoint

Once deployed, your model will be live at:

https://<your-modal-space>/vibevoice
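
To illustrate how a client might address that endpoint, here is a minimal sketch using only the Python standard library. The request schema is not documented in this listing, so the JSON field name `script`, the POST method, and the helper name `build_tts_request` are all assumptions. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request


def build_tts_request(base_url, script):
    """Prepare (but do not send) a request to the deployed endpoint.

    Assumptions: the endpoint accepts a JSON POST body, and the
    dialogue text goes in a field named "script". Neither is
    confirmed by the listing.
    """
    url = base_url.rstrip("/") + "/vibevoice"
    body = json.dumps({"script": script}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Replace the placeholder host with your own deployment's base URL.
request = build_tts_request("https://example.invalid", "Speaker 1: Hello.")
```

Passing the prepared object to `urllib.request.urlopen` would send it; the response format depends on how the notebook configures the deployment.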