How to create subtitles for a scene (2024)

I’m assuming that it worked for full and mergedvad transcription?

I’m surprised that it would work for those but not for singlevad, unless there is a weird .wav file that is created during the process, maybe an empty file or something.

Can you try running whisper-faster directly and tell me if there is an error or something?

In short, run this command in a command prompt, replacing PATHTOPURVIEW, PATHTOVIDEO, and DATETIME with the values for your machine/context:

"PATHTOPURVIEW\Purfview-Whisper-Faster\whisper-faster.exe" --model Large-V2 --language ja --task transcribe --batch_recursive --print_progress --beep_off --output_format json "PATHTOVIDEO_backup\DATETIME-singlevad-*.wav"

You might want to try the command with DATETIME-mergedvad-all.wav to be sure that everything is OK for that one on the command line.
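To test the "empty .wav file" theory before running whisper, something like the following Python sketch could help. It is only a rough check, not part of the tool: the function name `find_suspect_wavs` and the 0.1 second threshold are made up for illustration, and it assumes the files are plain PCM .wav files (Python's `wave` module reports other encodings as unreadable, which may be a false alarm).

```python
import glob
import os
import wave


def find_suspect_wavs(pattern, min_seconds=0.1):
    """Return (path, reason) pairs for .wav files that look empty or unreadable.

    min_seconds is an arbitrary threshold chosen for this sketch.
    """
    suspects = []
    for path in sorted(glob.glob(pattern)):
        # A zero-byte file cannot contain a valid header at all.
        if os.path.getsize(path) == 0:
            suspects.append((path, "zero-byte file"))
            continue
        try:
            with wave.open(path, "rb") as wav:
                duration = wav.getnframes() / float(wav.getframerate())
        except (wave.Error, EOFError, ZeroDivisionError) as exc:
            # Truncated or non-PCM files end up here.
            suspects.append((path, "unreadable: %s" % exc))
            continue
        if duration < min_seconds:
            suspects.append((path, "only %.3f s of audio" % duration))
    return suspects


if __name__ == "__main__":
    # Same placeholders as in the command above; replace them for your machine.
    pattern = r"PATHTOVIDEO_backup\DATETIME-singlevad-*.wav"
    for path, reason in find_suspect_wavs(pattern):
        print(path, "->", reason)
```

If it prints nothing, none of the matched files look empty, and the problem is probably somewhere else.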

Looks like it was actually processing; the 0/1969 wasn’t updating, so I thought it was looping. All good here! One thing I noticed about the AI prompts is that they explicitly state the man does not speak. I am using this method to translate a normal JAV scene. Do you have any pre-made wording that doesn’t include language about ignoring the man’s speech? If not, I can just edit it myself. Thanks!

Are you on an English OS? I ask because I’m parsing the output of whisper to show the progress. Right now, I’m looking for “Starting …filename…”. If whisper writes something in French, for example “Démarrage …filename…”, I won’t be able to update the progress but, like you say, it will still work in the background.
Also, singlevad takes more time to transcribe than the other types, so I can understand why you thought it was stuck on a batch of 1969 files.
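To illustrate why the counter can stay at 0/1969 while the work still gets done, here is a minimal Python sketch of that kind of output parsing. It is not the tool's actual code: the function names are invented, and it assumes whisper prints one line beginning with "Starting" per file.

```python
import re

# Assumption for this sketch: one line beginning with "Starting" per file.
STARTING = re.compile(r"^\s*Starting\b")


def count_started(lines):
    """Count the output lines that announce the start of a new file."""
    return sum(1 for line in lines if STARTING.match(line))


def progress_label(lines, total):
    """Format the progress as 'started/total', e.g. for a status display."""
    return "%d/%d" % (count_started(lines), total)


if __name__ == "__main__":
    # Hypothetical output lines, only meant to show the matching behaviour.
    english = ["Starting a.wav", "some transcription text", "Starting b.wav"]
    french = ["Démarrage a.wav", "some transcription text", "Démarrage b.wav"]
    print(progress_label(english, 1969))
    print(progress_label(french, 1969))
```

With the localized lines nothing matches, so the label never moves even though the files are being processed.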

As for the VR/Not-VR, if you use sonnet-3.5 or any ‘high quality’ AI, you can simply write it in the context.

Something like this in the first subtitle of the file:
{Context:This is not a POV or VR scene, it's a 2D scene. There are 3 people in the room who will talk. A woman, who's blah blah, her boyfriend, ...}

You might want to add {Talker:Boyfriend} on each subtitle but, with 1969 subtitles, it might be a serious pain in the ass so you can hope that the AI will be able to pick up who’s talking and keep a ‘coherent’ understanding of the scene.

If you have to change the context information later in the file (i.e. if the setup of the scene has changed or something), always start by repeating that part.
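If adding the tags by hand is too tedious, a small script could do it. The Python sketch below is only an illustration under assumptions: it treats the subtitles as a plain list of strings, the helper name `annotate` is made up, and putting the tags at the very start of the subtitle text (context first, then talker) is a guess at a reasonable layout rather than a documented requirement.

```python
def annotate(subtitles, contexts=None, talkers=None):
    """Prefix subtitle texts with {Context:...} and {Talker:...} tags.

    subtitles: list of subtitle texts, in order.
    contexts:  dict mapping a subtitle index to the context text to add there.
    talkers:   dict mapping a subtitle index to the name of who is talking.
    """
    contexts = contexts or {}
    talkers = talkers or {}
    result = []
    for index, text in enumerate(subtitles):
        tags = ""
        if index in contexts:
            tags += "{Context:%s}" % contexts[index]
        if index in talkers:
            tags += "{Talker:%s}" % talkers[index]
        result.append(tags + text)
    return result


if __name__ == "__main__":
    setup = "This is not a POV or VR scene, it's a 2D scene."
    tagged = annotate(
        ["first line", "second line", "third line"],
        # The later context starts by repeating the original setup.
        contexts={0: setup, 2: setup + " They have moved to the kitchen."},
        talkers={1: "Boyfriend"},
    )
    for line in tagged:
        print(line)
```

The second context entry starts with the same setup text as the first, which follows the advice above about repeating that part when the context changes. The "kitchen" sentence is just a placeholder.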

Yes I am using an English OS. It was running in the background so we’re all good.

Thanks for the advice with the context, I will try that, though I’m starting to feel the effort isn’t worth it for the length of the video. I might just stick to shorter VR scenes for now haha.

Made a new version: 1.3.7

The main feature of this one is that it’s easier to get translations when a transcription item overlaps multiple timings (see the Parts argument in the release notes).

I’m curious if anyone has tried out the whisper model v3 turbo. It processes the files incredibly quickly; in my experience, up to 4 times quicker. I don’t know how well it holds up in terms of accuracy, though.

I’m having the same error as the other user, and I’m missing preprocessor_config.json from the folder. How do I change the name of the download you provided? I can’t seem to get rid of the .txt at the end.

I also can’t figure out how to edit the batch file in the other solution you provided

Best regards!

Ignore that, I figured it out, but I am still getting the error.

It seems less accurate than its “source” (Large V3 = 10.3 WER, Large V3 turbo = 12.3 WER, lower is better). reference

You seem to have added a [ at the start of the path in --FSTB-SubtitleGeneratorConfig.json. You have to remove it.

In other news, I created another version with some tweaks to the TrainingData output.

thank you so much, guess I pressed it accidentally :smiley: