Compiled TensorRT-LLM engines for running Whisper with much faster inference.
models 670
baseten/embedding-smol_llama-101M-GQA
76.6M
baseten/qwen3-engine-30A3-repro
baseten/whisper_trt_large_v2_251013_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_21_0
baseten/whisper_trt_large_v3_turbo_251013_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_21_0
baseten/whisper_trt_large_v2_251013_NVIDIA_H100_80GB_HBM3_0_21_0
baseten/whisper_trt_large_v3_251013_NVIDIA_L4_0_21_0
baseten/whisper_trt_large_v2_251013_NVIDIA_L4_0_21_0
baseten/whisper_trt_large_v3_251013_NVIDIA_H100_80GB_HBM3_0_21_0
baseten/whisper_trt_large_v3_turbo_251013_NVIDIA_L4_0_21_0
baseten/whisper_trt_large_v3_turbo_251013_NVIDIA_H100_80GB_HBM3_0_21_0