Show HN: Dia2, open-weights TTS model for realtime speech to speech | Hacker News

Show HN: Dia2, open-weights TTS model for realtime speech to speech (github.com/nari-labs)
3 points by toebee 26 days ago | 2 comments
Dia2 is an open-weights, streaming dialogue TTS model. It can start generating speech without waiting for a full sentence, which makes it suitable for low-latency speech-to-speech systems. It can generate up to 2 minutes of English audio and supports audio prefixing. The inference code and weights (1B / 2B variants) are available on GitHub and Hugging Face under the Apache 2.0 license, to accelerate research. This work was heavily influenced by KyutaiTTS, Mimi, and Sesame. We thank the TPU Research Cloud for providing computational resources.

gac3 26 days ago

Was this trained on the same data as Dia 1?

gac3 26 days ago

Would be interesting to know which improvements come from the architecture, the data, and the different tokenizer.
