Faster-Whisper - Translate video and audio files

Found this recently and didn’t see any topics on it so thought I’d share.

It’s a large improvement over the OpenAI Whisper tool I used previously: it’s faster and uses fewer resources, so you can run a larger model on the same hardware.

For installation/usage follow the readme files.

GPU execution requires the cuBLAS and cuDNN libraries.

I preferred the standalone .exe instead of dealing with a python installation.
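If you do go the Python route instead, here is a minimal sketch of what a translate call looks like with the faster-whisper package. The model size ("large-v3"), the helper names `load_model` and `translate_file`, and the choice to turn on `vad_filter` are my own assumptions for illustration, not something taken from the readme, so check the readme for what suits your setup.

```python
def load_model(size="large-v3", device="auto", compute_type="default"):
    # Imported lazily so the rest of this sketch can be read or tested
    # without the faster-whisper package installed.
    from faster_whisper import WhisperModel
    return WhisperModel(size, device=device, compute_type=compute_type)


def translate_file(model, path):
    """Return a list of (start, end, text) tuples translated to English."""
    # task="translate" asks Whisper for English output instead of a
    # transcript in the source language.
    # vad_filter=True (an assumption here) skips stretches without speech.
    segments, _info = model.transcribe(path, task="translate", vad_filter=True)
    # segments is a generator; iterating it is what actually runs the model.
    return [(seg.start, seg.end, seg.text.strip()) for seg in segments]
```

Usage would be something like `translate_file(load_model(), "scene.mp4")`, where the file name is just a placeholder.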


That’s awesome, thank you! Works pretty well so far.

Thanks! Will give it a shot. I have to run Whisper on the CPU (I have an AMD GPU) and long VR scenes were taking forever.

Edit: This is amazing. I wouldn’t use Whisper because it would take over my computer for days at a time on some files. Now I can get decent subs in a matter of hours.

Some things I noticed:

  • Weird hallucinations when there is little or no speaking. They show up as “Please subscribe”, “Thank you for watching” or something random like that.
  • I got a zero division error when using the batch processing option on a set that included two or three files without any dialog.
  • Translation results can change between runs. If something seems really off, try rerunning to see if it picks up something different.
  • There’s also the option to transcribe only and then use something like DeepL for the translation. Not sure if it’s more or less accurate, but it’s an option.
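A rough workaround sketch for the first two points, in Python. It assumes you already have the subtitles as a list of (start, end, text) tuples and some function of your own that processes one file. The names `drop_hallucinations`, `run_batch` and `process` are made up for illustration and are not part of Faster-Whisper.

```python
import re

# Phrases reported in this thread as hallucinations; extend the set as needed.
KNOWN_HALLUCINATIONS = {"please subscribe", "thank you for watching"}


def normalize(text):
    # Drop punctuation, trim whitespace and lowercase,
    # so "Please subscribe!" matches "please subscribe".
    return re.sub(r"[^\w\s]", "", text).strip().lower()


def drop_hallucinations(segments, phrases=KNOWN_HALLUCINATIONS):
    """Keep only segments whose whole text is not a known filler phrase."""
    return [seg for seg in segments if normalize(seg[2]) not in phrases]


def run_batch(paths, process):
    """Run process(path) for each file; a ZeroDivisionError on one file
    (e.g. one with no dialog) is recorded instead of stopping the batch."""
    results, failed = {}, {}
    for path in paths:
        try:
            results[path] = process(path)
        except ZeroDivisionError as exc:
            failed[path] = str(exc)
    return results, failed
```

This only removes segments that consist entirely of a listed phrase, so real dialog that happens to contain those words is left alone.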

For anyone into RJ audio files:
If you use .lrc as the output format, you can use MiniLyrics to display the text in a floating box.

Makes them appear almost like a visual novel or game

from: RJ240262
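In case you end up with timestamps in some other form and want .lrc for the setup above, the format is simple enough to write yourself: one line per subtitle, prefixed with [mm:ss.xx]. A minimal Python sketch, assuming the segments are (start_seconds, end_seconds, text) tuples; the function names are my own.

```python
def to_lrc_timestamp(seconds):
    # LRC timestamps are [mm:ss.xx], where xx is hundredths of a second.
    total_cs = int(round(seconds * 100))
    minutes, rest = divmod(total_cs, 6000)
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def segments_to_lrc(segments):
    """Build .lrc text from (start, end, text) tuples.
    Only the start time is used; LRC lines have no end time."""
    return "\n".join(
        f"{to_lrc_timestamp(start)}{text.strip()}"
        for start, _end, text in segments
    )
```

Write the returned string to a file with the same base name as the audio file so the player can find it.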

