★ Persona-text casting (A/B)
A technique demo. Each host is defined only by a free-text persona paragraph, which is resolved to a base voice plus an accent and delivery brief. A bickering two-host conversation is then rendered in a single generation. Compare it against the control clip: the identical script and voices, rendered with the persona-derived briefs and tags stripped.
text sent to the model
HANK's persona text: "Raspy older Texan man... dry, deadpan humor, slow unhurried drawl. Mildly cynical but secretly fond of his co-host." -> voice Algenib + West-Texas-drawl brief. MORAG's persona text: "Sharp, fast-talking young Glaswegian woman. Warm, quick to laugh, impatient. Teases Hank relentlessly but clearly adores him." -> voice Kore + Glaswegian brief. Both hosts are rendered in one 2-speaker generation, with sparse inline tags placed at real beats.
text sent to the model
Identical script and voices, but rendered WITHOUT the persona-derived accent/delivery briefs and tags — the control clip.
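The A/B setup above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the demo's actual pipeline: the `Casting` fields, the `build_prompt` helper, the wording of the two briefs (paraphrased from the persona texts), and the two script lines are all hypothetical placeholders. How a persona paragraph gets resolved to a voice and brief is not shown in the demo, so the sketch starts from an already-resolved cast.

```python
import re
from dataclasses import dataclass

# Matches an inline tag such as [sighs], plus any whitespace that follows it.
TAG = re.compile(r"\[[^\[\]]+\]\s*")


@dataclass(frozen=True)
class Casting:
    """One resolved persona (field names are hypothetical)."""
    speaker: str  # label used in the script
    voice: str    # prebuilt voice name
    brief: str    # accent + delivery brief derived from the persona text


def build_prompt(script, cast, treatment=True):
    """Build the text for one 2-speaker generation.

    treatment=True  -> briefs prepended, inline tags kept
    treatment=False -> control clip: same script, briefs and tags stripped
    """
    if treatment:
        briefs = "\n".join(f"{c.speaker}: {c.brief}" for c in cast)
        body = "\n".join(f"{s}: {t}" for s, t in script)
        return f"{briefs}\n\n{body}"
    return "\n".join(f"{s}: {TAG.sub('', t).strip()}" for s, t in script)


# Brief wording and script lines below are placeholders for illustration only.
cast = [
    Casting("HANK", "Algenib", "West-Texas drawl, dry and deadpan, slow and unhurried"),
    Casting("MORAG", "Kore", "Glaswegian accent, fast, warm, quick to laugh"),
]
script = [
    ("HANK", "[sighs] Morning."),
    ("MORAG", "[laughs] You sound thrilled."),
]

clip_a = build_prompt(script, cast, treatment=True)
clip_b = build_prompt(script, cast, treatment=False)
```

Both clips would be sent with the same voice assignments (`Algenib` for HANK, `Kore` for MORAG), so the only variable between them is the steering text.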
Prebuilt voices
30 named voices ship with the model, each with its own character. Here are six of them reading the same line, with no steering, just the voice.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
text sent to the model
Welcome back to the show. Today we are talking about the strangest discovery of the decade.
Emotion — director's notes
The same sentence, re-performed from a one-line natural-language direction prefixed to the text. The first clip is the undirected baseline; in the others, nothing else changes: same voice (Kore), same words.
text sent to the model
I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this bursting with excitement and joy, almost out of breath: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this quietly, devastated, on the verge of tears, slowly: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this seething with barely-controlled anger, clipped and cold: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this panicked and terrified, voice shaking, out of breath: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this softly and warmly, like a gentle bedtime story: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Say this dripping with sarcasm, thoroughly unimpressed: I just got the results back from the lab. You are not going to believe what they found.
Non-verbal sounds
Tags like [laughs], [cough], [sniffs], [sighs], [gasp], [crying] render as actual vocal events in the speaker's voice — the words around them stay intact, and the tag itself is never read aloud.
text sent to the model
[laughs] Okay, okay — [giggles] I'm sorry, I can't read this with a straight face. [chuckles] Give me a second. Okay. I'm good. I'm good.
text sent to the model
[sniffs] I'm fine, really. [cough] Okay — maybe I'm not fine. [sighs] I should have stayed in bed. [sniffs] Is there any soup left?
text sent to the model
[gasp] No. That can't be right. [crying] He was standing right there... [sighs] and then he was gone.
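Because the tags are performed rather than read aloud, it can help to separate them from the spoken words on the client side, for example to produce a caption or to check which tags a script uses. The sketch below is a local pre-processing aid only. It assumes tags are always written in square brackets, as in the clips above, and the helper name `split_tags` is hypothetical.

```python
import re

# A tag is any bracketed run with no nested brackets, e.g. [sighs].
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def split_tags(text):
    """Return (tags, spoken): the tag names in order, and the text with tags removed."""
    tags = TAG_RE.findall(text)
    without_tags = TAG_RE.sub("", text)
    spoken = re.sub(r"\s{2,}", " ", without_tags).strip()
    return tags, spoken


tags, spoken = split_tags(
    "[gasp] No. That can't be right. [crying] He was standing right there... "
    "[sighs] and then he was gone."
)
```

Here `tags` holds the three tag names in order and `spoken` holds the sentence with the words intact, which mirrors what the listener hears as speech.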
Pacing & delivery
Whispering, shouting, speed, and dramatic pauses — all tag-driven.
text sent to the model
[whispers] Everyone's asleep. If we're going to get to the kitchen, we move now — and we do not wake the dog.
text sent to the model
[shouting] GOAL! GOAL! I do not believe what we have just witnessed here in the hundredth minute!
text sent to the model
[very fast] Terms and conditions apply, offer not valid in all regions, consult your physician before starting any new exercise program, batteries not included.
text sent to the model
[very slow] Some things... cannot be rushed. Good barbecue. Old whiskey. And this sentence.
text sent to the model
You're asking if I can do it. Let me think. [long pause] Yes. [short pause] The answer is yes.
Accents on demand
Same voice (Kore), same sentence, six accents — steered entirely by a one-line direction. No separate voice models.
text sent to the model
Speak with a thick Glaswegian Scottish accent: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Speak with a slow Texas Southern drawl: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Speak with a Nigerian English accent, Lagos: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Speak with an Indian English accent, Mumbai: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Speak with a broad Australian accent: I just got the results back from the lab. You are not going to believe what they found.
text sent to the model
Speak in English with a strong Parisian French accent: I just got the results back from the lab. You are not going to believe what they found.
Creative & character styles
The steering is freeform: character voices, personas, and formats the docs never enumerate.
text sent to the model
[like dracula] Good evening. I have been expecting you. Please — come in. Leave the garlic bread outside.
text sent to the model
Say like an over-caffeinated sports commentator calling the final seconds of a championship: Three seconds left — she takes the shot from half court — IT'S GOOD! IT'S GOOD! THE CROWD IS ON THEIR FEET!
text sent to the model
Say like a deep, gravelly movie-trailer narrator: In a world where every podcast sounds the same... one model dared to clear its throat.
Multilingual (~80 languages)
The input language is auto-detected — no language parameter at all.
text sent to the model
Bienvenidos de nuevo al programa. Hoy hablamos del descubrimiento más extraño de la década.
text sent to the model
Bienvenue dans l'émission. Aujourd'hui, nous parlons de la découverte la plus étrange de la décennie.
text sent to the model
番組へようこそ。今日は、この十年で最も奇妙な発見についてお話しします。
Multi-speaker dialogue — one generation
Two voices rendered in a single pass, so turn-taking prosody emerges: the speakers react to each other. Inline tags work inside the conversation. (API cap: 2 speakers per generation.)
text sent to the model
Read this as a natural, warm podcast conversation - real banter, the hosts reacting to each other. TTS the following conversation: S1: [excited] Okay I have to tell you about this study before I explode. S2: [laughs] You said that last week about the octopus thing. S1: [mock offended] The octopus thing was incredible! [sighs] Fine. This one's better. S2: [skeptical] Go on then, convince me.
text sent to the model
Read this as a tense dramatic scene - quiet intensity building to a breaking point. TTS the following conversation: S1: [calm] You knew. This whole time, you knew. S2: [nervous] I was going to tell you. [sighs] I just needed the right moment. S1: [furious] The right moment was three years ago! S2: [quietly, defeated] ...I know. [long pause] I know.
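Both prompts above share a layout: a scene direction, the phrase "TTS the following conversation:", then speaker-labelled turns. A small sketch of assembling that layout and enforcing the 2-speaker cap before sending anything. The function name `build_conversation` and the choice to join turns with spaces are assumptions made for illustration.

```python
def build_conversation(direction, turns, max_speakers=2):
    """Assemble a single-generation dialogue prompt from (speaker, text) turns.

    Raises ValueError if the turns use more distinct speakers than max_speakers.
    """
    speakers = []
    for speaker, _ in turns:
        if speaker not in speakers:
            speakers.append(speaker)
    if len(speakers) > max_speakers:
        raise ValueError(
            f"{len(speakers)} speakers found, but the cap is {max_speakers} per generation"
        )
    body = " ".join(f"{speaker}: {text}" for speaker, text in turns)
    return f"{direction} TTS the following conversation: {body}"


prompt = build_conversation(
    "Read this as a natural, warm podcast conversation - real banter, "
    "the hosts reacting to each other.",
    [
        ("S1", "[excited] Okay I have to tell you about this study before I explode."),
        ("S2", "[laughs] You said that last week about the octopus thing."),
        ("S1", "[mock offended] The octopus thing was incredible! [sighs] Fine. This one's better."),
        ("S2", "[skeptical] Go on then, convince me."),
    ],
)
```

Checking the speaker count locally gives a clear error early instead of a rejected request.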