faster-whisper
4,554 downloads · 10 installs · 10 stars
v1.5.1
cmdop · Development · diarization, multilingual, speech-to-text, subtitles, transcription · 3/2/2026
Overview
Local speech-to-text using faster-whisper, a CTranslate2 reimplementation of OpenAI’s Whisper, for fast and accurate transcription with GPU acceleration.
Key Features
- Transcribe audio/video files
- Generate subtitles (SRT, VTT, ASS, LRC, TTML)
- Identify speakers (diarization labels)
- Transcribe from URLs (YouTube links and direct audio URLs)
- Batch process files (glob patterns, directories, skip-existing support)
- Convert speech to text locally (no API costs, works offline)
- Translate to English
- Transcribe in 99+ languages with automatic language detection
- Transcribe a batch of files in different languages
- Transcribe multilingual audio
- Transcribe audio with specific terms
- Preprocess noisy audio (before transcription)
- Stream output
- Clip time ranges
- Search the transcript
- Detect chapters
- Export speaker audio
- Export the transcript as a spreadsheet (CSV)
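To illustrate the subtitle feature listed above, here is a minimal sketch of how timed segments could be rendered as SRT cues. The `Segment` type and both function names are hypothetical, chosen for this example; the listing does not show how the skill itself produces subtitles.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """Hypothetical timed segment: start/end in seconds plus the spoken text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

The other subtitle formats (VTT, ASS, LRC, TTML) would differ mainly in the timestamp syntax and the wrapper around each cue, so the same segment list could feed each of them.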
How It Works
Use the faster-whisper skill to transcribe audio/video files, generate subtitles, and more. The skill runs on the faster-whisper library, which is reported to be 4-6x faster than OpenAI Whisper at comparable accuracy. With GPU acceleration, expect roughly 20x realtime transcription, which works out to about three minutes of processing per hour of audio.
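For orientation, a minimal sketch of what a transcription call against the underlying faster-whisper Python library might look like, together with a way to measure the realtime factor mentioned above. The function names, the default model size, and the returned tuple are assumptions made for this example, and the skill's own wrapper may differ. It assumes the `faster-whisper` package is installed.

```python
import time


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


def transcribe(path: str, model_size: str = "small", device: str = "auto"):
    """Transcribe one file and return (language, segments, realtime factor)."""
    # Imported lazily so realtime_factor() works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type="default")
    started = time.perf_counter()
    # transcribe() returns a lazy generator; decoding happens during iteration.
    segments, info = model.transcribe(path, beam_size=5)
    results = [(seg.start, seg.end, seg.text) for seg in segments]
    elapsed = time.perf_counter() - started
    return info.language, results, realtime_factor(info.duration, elapsed)
```

Because the segments come back as a generator, timing has to wrap the iteration rather than the `transcribe()` call alone. The measured factor also depends on model size, hardware, and audio content, so it will not always match the ~20x figure.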
Use Cases
- Transcribe a meeting or interview
- Generate subtitles for a YouTube video
- Identify speakers in a podcast
- Transcribe a batch of files in different languages
- Transcribe multilingual audio
- Preprocess noisy audio before transcription
- Stream output for real-time transcription
- Clip time ranges for specific sections
- Search the transcript for specific terms
- Detect chapters for a table of contents
- Export each speaker's audio as a separate WAV file
- Export the transcript as CSV for use in a spreadsheet
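As a rough illustration of the spreadsheet use case, the sketch below writes speaker-labelled segments to CSV using only the standard library. The column layout and the `segments_to_csv` name are assumptions for this example, since the listing does not document the skill's actual CSV schema.

```python
import csv
import io
from typing import Iterable, Tuple

# Hypothetical row shape: start (s), end (s), speaker label, text.
Row = Tuple[float, float, str, str]


def segments_to_csv(rows: Iterable[Row]) -> str:
    """Serialize segments to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "speaker", "text"])
    for start, end, speaker, text in rows:
        # Fixed millisecond precision keeps the columns easy to sort and filter.
        writer.writerow([f"{start:.3f}", f"{end:.3f}", speaker, text.strip()])
    return buffer.getvalue()
```

The `csv` module quotes fields that contain commas or quotes, so transcript text with punctuation stays in a single column when opened in a spreadsheet.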