Faster-Whisper - Translate video and audio files

resel95074 · August 14, 2023, 6:35am

Found this recently and didn’t see any topics on it so thought I’d share.

It’s a big improvement over the OpenAI Whisper tool I used previously: it’s faster and uses fewer resources, so you can run a larger model.

https://github.com/guillaumekln/faster-whisper

For installation/usage follow the readme files.

GPU execution requires the NVIDIA cuBLAS and cuDNN libraries.

I preferred the standalone .exe instead of dealing with a Python installation:
https://github.com/Purfview/whisper-standalone-win
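For anyone going the Python route instead of the .exe, here is a rough sketch of a translate call, based on the usage example in the faster-whisper readme. The function names `format_segment` and `translate_file` are made up for illustration, and the model size and GPU settings are assumptions you will likely need to change for your hardware.

```python
def format_segment(start, end, text):
    """Render one segment as '[0.00s -> 2.50s] text'."""
    return "[%.2fs -> %.2fs] %s" % (start, end, text.strip())


def translate_file(path, model_size="large-v2"):
    """Translate the speech in `path` to English and return printable lines.

    Hypothetical helper: the defaults here are illustrative assumptions.
    """
    # Imported lazily so format_segment works without the package installed.
    from faster_whisper import WhisperModel

    # GPU settings as in the readme example; for CPU the readme uses
    # device="cpu" with compute_type="int8".
    model = WhisperModel(model_size, device="cuda", compute_type="float16")

    # task="translate" asks for English output rather than a
    # same-language transcript.
    segments, info = model.transcribe(path, task="translate", beam_size=5)

    # `segments` is a generator, so the actual work happens while iterating.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `translate_file("scene.mp4")` (file name is a placeholder) should give one line per detected segment.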

Ambi · August 14, 2023, 8:12am

That’s awesome, thank you! Works pretty well so far.

Ufungus · August 14, 2023, 11:51pm

Thanks! Will give it a shot. I have to run Whisper on CPU (AMD GPU), and long VR scenes were taking forever.

Edit: This is amazing. I wouldn’t use Whisper because it would take over my computer for days at a time on some files. Now I can get decent subs in a matter of hours.

resel95074 · August 15, 2023, 2:56am

Some things I noticed:

Weird hallucinations when there is little or no speech. They show up as “Please subscribe”, “Thank you for watching”, or something similarly random.
I saw a zero-division error when using the batch processing option on a set where two or three of the files had no dialog.
Translation results can change between runs. If something seems really off, try rerunning to see if it picks up something different.
There’s also the option to transcribe only and then use something like DeepL for the translation. Not sure if it’s more or less accurate, but it’s an option.
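One way to deal with the hallucinated lines is to filter them out after the fact. This is only a sketch of that idea, not something built into the tool: the phrase list holds just the two examples from this post, the `(start, end, text)` tuple layout is an assumption, and `drop_hallucinations` is a made-up name.

```python
import re

# The two example phrases from this thread; extend the set as you find more.
KNOWN_HALLUCINATIONS = {"please subscribe", "thank you for watching"}


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())


def drop_hallucinations(segments, phrases=KNOWN_HALLUCINATIONS):
    """Return the segments whose whole text is not a known filler phrase.

    `segments` is assumed to be a list of (start, end, text) tuples.
    An empty list (a file with no dialog) simply returns an empty list.
    """
    return [seg for seg in segments if normalize(seg[2]) not in phrases]
```

It only drops a segment when the entire line matches a phrase, so a real line that happens to say "thank you for watching" would be removed too. Worth a quick look at the result before trusting it.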

resel95074 · August 15, 2023, 3:04am

For anyone into RJ audio files:
If you use .lrc as the output format, you can use MiniLyrics (https://www.crintsoft.com/minilyrics/) to display the text in a floating box.

Makes them appear almost like a visual novel or game.

from: RJ240262
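If your tool of choice doesn’t write .lrc directly, converting segments by hand is not much work. A minimal sketch, assuming the common `[mm:ss.xx]text` line layout and segments stored as `(start, end, text)` tuples; `lrc_timestamp` and `to_lrc` are made-up names, and I have not checked how MiniLyrics handles files longer than 99 minutes.

```python
def lrc_timestamp(seconds):
    """Format a time in seconds as an [mm:ss.xx] tag."""
    total_cs = int(round(seconds * 100))      # whole centiseconds
    minutes, rest = divmod(total_cs, 6000)    # 6000 centiseconds per minute
    secs, cs = divmod(rest, 100)
    return "[%02d:%02d.%02d]" % (minutes, secs, cs)


def to_lrc(segments):
    """Build .lrc text with one timestamped line per segment.

    Only the start time is used; each line stays on screen until the
    next one begins.
    """
    return "\n".join(
        lrc_timestamp(start) + text.strip() for start, _end, text in segments
    )
```

Write the returned string to a file with the same base name as the audio file and an .lrc extension.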