Scenema Audio

Zero-shot expressive voice cloning and speech generation. Describe how a voice sounds and feels, write what it should say, and the model generates a full vocal performance.

Built by Scenema AI, the AI filmmaking platform. GitHub | Demos & Samples

Language
The model has been tested on only a limited set of languages. The language tag selected here is used for Whisper validation.
Shot Mode
Preset Prompts