Miso One

Miso One is an 8B open-weights English text-to-speech generator for expressive conversational speech, voice continuation, and low-latency voice-agent research.

Voice

What does Miso One do?

Miso One is the product-facing name for Miso TTS 8B, Miso Labs’ 8B open-weights text-to-speech system for expressive English conversational speech. It’s designed for creator and voice-agent workflows, including voice continuation from prompt audio.

Use it to turn scripts into expressive audio with a real-time preview flow. The hosted demo streams output for quick evaluation and includes options such as a live translation draft and streaming transcripts to help you iterate faster.

Miso One is also built for developers who want to inspect and run the model locally. The model weights and inference code are public, but because the checkpoint is 8B-sized, local use requires substantial GPU resources and careful benchmarking against your latency and response-length needs.

What is Miso One?

Miso One is an 8B open-weights English text-to-speech model (Miso TTS 8B) aimed at expressive, conversational speech and voice continuation from prompt audio.

Is Miso One multilingual?

The current public model is focused on English generation rather than broad multilingual support.

Can I run Miso TTS 8B locally?

Yes. The model weights and inference code are public, so you can run inference locally. Plan for substantial GPU requirements, and benchmark latency and memory usage in your own environment.
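As a rough starting point for memory planning, the arithmetic below estimates how much memory the weights alone would occupy at a few common numeric precisions. It assumes "8B" means about 8 billion parameters, which the listing does not state explicitly, and it does not imply that the public checkpoint ships in any of these precisions. Activations, caches, and any audio codec add to these figures, so treat them as a lower bound rather than a requirement.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights only, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


# Assumption: "8B" is read as roughly 8 billion parameters.
PARAMS = 8e9

# Bytes per parameter for common precisions (illustrative only; the
# listing does not say which precisions the checkpoint supports).
PRECISIONS = [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]

for name, nbytes in PRECISIONS:
    print(f"{name}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB for weights")
```

At 2 bytes per parameter this comes to about 14.9 GiB for the weights, before any runtime overhead.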

Does Miso One support voice continuation or one-shot voice cloning?

It supports generation conditioned on prompt audio, so voice continuation is a core capability to evaluate. Voice cloning features exist in the workflow, but use only audio you have consent to use when testing or deploying.

How much text can I generate on the free plan?

The free plan limits each conversion to 120 characters. Upgraded plans raise the limit to 1,000 characters per conversion.
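For scripts longer than the per-conversion limit, one option is to split the text into chunks before submitting it. The sketch below is a minimal illustration and not part of Miso One: `split_for_conversion` is a hypothetical helper, and it assumes the limit counts every character including spaces, which the listing does not specify. It breaks on whitespace where possible and collapses repeated whitespace into single spaces.

```python
FREE_LIMIT = 120       # characters per conversion on the free plan (from the listing)
UPGRADED_LIMIT = 1000  # characters per conversion after upgrading (from the listing)


def split_for_conversion(text: str, limit: int = FREE_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking on spaces."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut mid-word.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The word does not fit; close the chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


script = "Thanks for calling. " * 12
for i, chunk in enumerate(split_for_conversion(script), start=1):
    print(i, len(chunk), chunk)
```

Splitting on sentence boundaries instead of spaces would likely give more natural prosody per chunk; the word-level split here only keeps the example short.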

Is Miso One ready for production voice-agent latency requirements?

It’s designed for low-latency, interactive-style workflows, but real latency depends on hardware, serving setup, and prompt length. Validate with your own benchmarks before production.
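A benchmark along these lines can be sketched as a small harness that records time to first audio chunk and total generation time over repeated runs. Everything here is illustrative: `fake_synthesize` is a hypothetical stand-in that only sleeps and yields dummy bytes, and it should be replaced with a call into whatever serving setup you are testing. The harness assumes the call returns an iterator of audio chunks.

```python
import statistics
import time
from typing import Callable, Iterable, Iterator


def fake_synthesize(text: str) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call; replace with your own."""
    for _ in range(3):
        time.sleep(0.001)   # simulated generation delay
        yield b"\x00" * 320  # dummy audio bytes


def benchmark(
    synthesize: Callable[[str], Iterable[bytes]],
    prompts: list[str],
    runs: int = 5,
) -> dict[str, float]:
    """Measure time to first chunk and total time for each prompt and run."""
    first_chunk_times: list[float] = []
    total_times: list[float] = []
    for prompt in prompts:
        for _ in range(runs):
            start = time.perf_counter()
            first = None
            for _chunk in synthesize(prompt):
                if first is None:
                    first = time.perf_counter() - start
            total = time.perf_counter() - start
            # If nothing was yielded, fall back to the total time.
            first_chunk_times.append(total if first is None else first)
            total_times.append(total)
    return {
        "first_chunk_median_s": statistics.median(first_chunk_times),
        "total_median_s": statistics.median(total_times),
        "total_max_s": max(total_times),
    }


if __name__ == "__main__":
    # Vary prompt length, since the listing notes that latency depends on it.
    prompts = ["Hi there.", "Thanks for calling, how can I help you today?"]
    print(benchmark(fake_synthesize, prompts, runs=3))
```

Running the same prompts on each candidate hardware and serving configuration makes the numbers comparable across setups.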

Last modified

Jun 12, 2026

Date listed

Jun 11, 2026