🦜 parakeet-tdt-0.6b-v3: Multilingual Speech-to-Text Model
Description:
parakeet-tdt-0.6b-v3 is a 600-million-parameter multilingual automatic speech recognition (ASR) model designed for high-throughput speech-to-text transcription. It extends the parakeet-tdt-0.6b-v2 model by expanding language support from English to 25 European languages. The model automatically detects the language of the audio and transcribes it without requiring additional prompting. It is part of a series of models that leverage the Granary [1, 2] multilingual corpus as their primary training dataset.
🗣️ Try the demo here: https://huggingface.co/spaces/nvidia/parakeet-tdt-0.6b-v3
Supported Languages:
Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), Ukrainian (uk)
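When the language of a recording is already known, a caller may want to check it against the list above before sending audio for transcription. The snippet below is a minimal sketch of such a check; the names `SUPPORTED_LANGUAGES` and `is_supported` are hypothetical helpers for illustration and are not part of the model or any library.

```python
# Language codes and names copied from the "Supported Languages" list.
# SUPPORTED_LANGUAGES and is_supported are illustrative names, not a real API.
SUPPORTED_LANGUAGES = {
    "bg": "Bulgarian", "hr": "Croatian", "cs": "Czech", "da": "Danish",
    "nl": "Dutch", "en": "English", "et": "Estonian", "fi": "Finnish",
    "fr": "French", "de": "German", "el": "Greek", "hu": "Hungarian",
    "it": "Italian", "lv": "Latvian", "lt": "Lithuanian", "mt": "Maltese",
    "pl": "Polish", "pt": "Portuguese", "ro": "Romanian", "sk": "Slovak",
    "sl": "Slovenian", "es": "Spanish", "sv": "Swedish", "ru": "Russian",
    "uk": "Ukrainian",
}


def is_supported(code: str) -> bool:
    """Return True if the two-letter language code is in the supported list."""
    return code.strip().lower() in SUPPORTED_LANGUAGES


if __name__ == "__main__":
    for code in ("de", "MT", "ja"):
        print(code, is_supported(code))
```

Because the model detects the language on its own, a check like this is optional; it is only useful for rejecting unsupported input early.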
This model is ready for commercial/non-commercial use.
Evaluation results
- Test WER on AMI (Meetings test), test set (self-reported): 11.310
- Test WER on Earnings-22, test set (self-reported): 11.420
- Test WER on GigaSpeech, test set (self-reported): 9.590
- Test WER on LibriSpeech (clean), test set (self-reported): 1.930
- Test WER on LibriSpeech (other), test set (self-reported): 3.590
- Test WER on SPGI Speech, test set (self-reported): 3.970
- Test WER on tedlium-v3, test set (self-reported): 2.750
- Test WER on Vox Populi, test set (self-reported): 6.140
- Test WER (Bg) on FLEURS, test set (self-reported): 12.640
- Test WER (Cs) on FLEURS, test set (self-reported): 11.010
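
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model output (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below illustrates the standard definition and assumes the figures above are percentages. It is not the evaluation script behind these results: the text normalization applied before scoring (casing, punctuation) is not stated here, so this example only splits on whitespace. The function name `word_error_rate` is chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance.

    Assumption: words are separated by whitespace and compared exactly;
    no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("the" -> "a") over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition WER can exceed 100 when the hypothesis contains many inserted words, which is why it is described as an error rate rather than an accuracy.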