An open-vocabulary sound event detection model
State-of-the-art target speech extractor
Stylized TTS: design voice, accent, and emotion your way
Generate audio from text and reference audio
Generate or edit realistic audio from text prompts
Separate sounds from audio mixtures using text prompts