Dia is a text-to-speech model from Nari Labs that generates realistic multi-speaker conversations directly from a transcript.

It can convey emotion and tone and can produce expressive nonverbal sounds such as laughing or coughing, which makes dialogue sound more natural than typical TTS output.
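To make the idea of a multi-speaker transcript concrete, here is a minimal sketch of how such an input could be assembled. It assumes the convention of marking speaker turns with `[S1]` and `[S2]` tags and writing nonverbal cues in parentheses, such as `(laughs)`. That convention is not stated on this page, so check the official upstream repository for the exact format. The helper name `build_transcript` is hypothetical and exists only for illustration.

```python
from typing import Iterable, Tuple


def build_transcript(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, line) pairs into one tagged transcript string.

    Assumption: speakers are tagged [S1] and [S2], and nonverbal cues are
    written inline in parentheses. Verify this against the upstream docs.
    """
    parts = []
    for speaker, line in turns:
        if speaker not in (1, 2):
            # This sketch assumes a two-speaker dialogue.
            raise ValueError(f"unsupported speaker number: {speaker}")
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)


turns = [
    (1, "Did you hear the new demo?"),
    (2, "I did. (laughs) It caught me off guard."),
    (1, "(coughs) Sorry. Same here."),
]
transcript = build_transcript(turns)
print(transcript)

# The resulting string would then be handed to the model. The calls below are
# an assumption about the upstream Python package and are left commented out;
# confirm the names in the official repository before relying on them.
#   from dia.model import Dia
#   model = Dia.from_pretrained("nari-labs/Dia-1.6B")
#   audio = model.generate(transcript)
```

Keeping the turns as structured pairs and tagging them in one place makes it easy to reorder lines or swap speakers without hand-editing tags in a long string.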

Capabilities

Voice Cloning Note

The repository claims basic voice cloning from a short audio sample. In practice, this feature is still very rough and experimental.

Even so, there is plenty here to experiment with, and the project is evolving quickly.

Availability

Source and Hosting Notice

Get Going Fast provides community setup guidance, documentation, tutorials, troubleshooting support, and member services. Get Going Fast does not sell, host, store, mirror, or redistribute AI model files, model weights, training datasets, or third-party project files.

When a setup guide references third-party dependencies, repositories, or model files, it points users to official upstream public sources such as GitHub, Hugging Face, package managers, or original project repositories, subject to those sources' own licenses, terms, and availability.

Get Going Fast is a general-audience AI education and workflow site, not an adult-content site or hosted AI generation service. Do not use Get Going Fast materials, support, guidance, or referenced third-party tools for unlawful, abusive, non-consensual, sexually explicit, exploitative, harassing, deceptive, or privacy-violating content, including misuse of another person's likeness, voice, identity, intellectual property, privacy, or publicity rights. See our Acceptable Use Policy.

If you are a rights holder, platform reviewer, payment processor, or hosting provider with a concern about a listed tool, guide reference, or upstream source, please contact us. We will review the concern promptly and remove or revise references when appropriate.

Dia 1.6B – Multi-Speaker Text-to-Speech
