Will need to decide between a “realistic scale / depth of stroke” approach and an “emphasized / enhanced depths” one… We can look into this later on, as this is a matter of parameters more than algorithm logic.
Still a couple outliers, but way more manageable than the previous mess.
So far, for my purpose, I like this ‘zoomed in’ view (which I had trouble getting with a ‘flat’ output): -filter_complex "[0:v]v360=input=he:in_stereo=sbs:pitch=-35:v_fov=90:h_fov=90:output=sg:w=2048:h=2048"
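In case it helps, here’s a minimal sketch of how that filter could sit in a complete conversion command; the file names, CRF, and the Python subprocess wrapper are all hypothetical, only the filter string is the one above:

import subprocess

# hypothetical end-to-end wrapper around the v360 filter above
vf = ("v360=input=he:in_stereo=sbs:pitch=-35:"
      "v_fov=90:h_fov=90:output=sg:w=2048:h=2048")
subprocess.run([
    "ffmpeg", "-hide_banner", "-y",
    "-i", "input_vr.mp4",               # hypothetical source file
    "-filter_complex", f"[0:v]{vf}",    # unlabeled output is auto-mapped
    "-c:v", "libx264", "-crf", "24",
    "output_2d.mp4",                    # hypothetical output file
], check=True)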
This is seriously cool and looks super promising! That’s some true dedication there especially with the tagging… Wouldn’t mind contributing when the time comes. Keep up the good work!
Happy that you like it
The main issue is that converting a 3D video to a 2D video requires some camera-angle choices: you need to crop the video, and the action does not consistently happen in the same region (the center of the image). You need to control pitch, yaw, roll, and FoV to get a good picture.
There are a few good programs for 3D-to-2D conversion, but the programmers are quite opinionated about how their programs should be used (e.g. they restrict the max FoV, etc.).
vongoo B9 / VR2Normal · GitLab
This program is made to convert a VR video to 2D. It is MUCH less performant for preview than mpv, but it is designed to generate an ffmpeg script to convert the video. Just mark the video with timestamps and camera angles and it will generate the ffmpeg script for the conversion.
The v360 filter I use looks like this: hequirect:sg:in_stereo=2d:out_stereo=2d
(In VR2Normal, you can copy-paste it into the “v360 custom filter” field.)
Here’s an example, at an exceptional 180° FoV, and I think it still looks quite good.
You can play a little with this and see what yields the best results. This would allow zooming in on the action (e.g. the penis) in order to keep only the relevant zone of action in the video that is fed to the AI script program.
Looking at the video of Katrina Jade, here’s what the result looks like:
00:00:00-00:32:35 => Pitch -50, Yaw 0, Roll 0, FoV 100
00:32:35-00:44:55 => Pitch -20, Yaw 0, Roll 0, FoV 100
It literally took 2 min to configure, preview and mark, then 15 min of video conversion.
Here’s a preview of what it looks like between 00:32:30 and 00:32:40 (the transition from -50 to -20 pitch).
The resulting video would have the penis always zoomed in and centered, which I think can very much help the scripting process.
And the missionary position may be much more manageable like this:
– 6160 actions rendered vs. 6189 in the reference “state of the art” script
– The no-action zones (close-ups) are (almost) respected
– Not good at the initial rubbing, when the cock is not visible
I added a speed limiter option to accommodate devices like the Handy, but now I need to work on the stroke depths, which are way too shallow in many cases.
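For reference, the general idea of such a limiter looks something like this; a sketch of the concept, not my actual code (the limit_speed name and the 400 units/s cap are example values):

# clamp each action so the implied speed between consecutive
# funscript actions ({'at': ms, 'pos': 0-100}) stays under a device limit
MAX_SPEED = 400.0  # position units per second; tune per device

def limit_speed(actions):
    # assumes a non-empty list of actions sorted by time
    out = [dict(actions[0])]
    for a in actions[1:]:
        prev = out[-1]
        dt = (a["at"] - prev["at"]) / 1000.0   # seconds between actions
        max_delta = MAX_SPEED * dt             # max reachable position change
        delta = max(-max_delta, min(max_delta, a["pos"] - prev["pos"]))
        out.append({"at": a["at"], "pos": round(prev["pos"] + delta)})
    return out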
I guess you can use it as an ffmpeg filter. Here’s the command generated by VR2Normal (after I marked the video). You can try the video filter and see if that works.
ffmpeg -hide_banner -loglevel error -stats -y \
-i "SLR_SLR Originals_Vote for me_1920p_51071_FISHEYE190_alpha.mp4"
-map 0:v:0 \
-vf "sendcmd=f='SLR_SLR Originals_Vote for me_1920p_51071_FISHEYE190_alpha.cmd',crop=w=iw/2:h=ih:x=0:y=0,v360=fisheye:sg:iv_fov=190:ih_fov=190:d_fov=100:pitch=-20:yaw=0:roll=0:w=854:h=480:interp=lanczos:reset_rot=1" \
-c:v libx264 -preset fast -crf 24 -tune film -x264-params keyint=600:bframes=7 \
"SLR_SLR Originals_Vote for me_1920p_51071_FISHEYE190_alpha_VR2Normal.mkv"
Where “SLR_SLR Originals_Vote for me_1920p_51071_FISHEYE190_alpha.cmd” is a text file containing the per-timestamp camera commands (sendcmd syntax):
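I won’t paste the whole generated file here, but based on ffmpeg’s sendcmd syntax it contains one entry per marked interval, something like this (timestamps and angles below are made up):

0.0-60.0 [enter] v360 pitch -30;
60.0-125.5 [enter] v360 pitch -20,
           [enter] v360 yaw 10;

Each interval re-sends the camera angles to the v360 filter; I believe this is also why the main command sets reset_rot=1, so the angles are applied as absolute values instead of accumulating.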
(Note: input=he => hequirect, like the app’s UI; output=sg => stereographic, as @jambavant suggested.)
Also, if image quality is important, it’s possible to append this to the filter to get better pixel interpolation, though it takes longer to process: :interp=lanczos
In the UI, I’m pretty sure that FOV is the same as d_fov (i.e. the diagonal FOV).
One of the advantages of the application, from what I can see, besides being able to ‘queue’ different settings for different parts of the video, is that it can automatically compute v_fov & h_fov relative to the width and height you choose (I see 87.2 HFOV, 49 VFOV in one of the images). If, like me, you are fine with a square image that shows most of the action, you can use 90 / 90 and only play with pitch & d_fov to get different levels of ‘zoom’. As long as we use the “sg” output, the image seems to stay relatively ‘unwarped’.
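If you want to approximate that computation yourself, the conversion from diagonal FoV to horizontal/vertical FoV for a flat (rectilinear) output looks like this; just a sketch under that flat assumption (the hv_fov_from_d_fov name is made up), so it won’t be exact for the ‘sg’ output:

import math

def hv_fov_from_d_fov(d_fov_deg, w, h):
    # flat (rectilinear) projection assumption; not exact for 'sg' output
    d = math.hypot(w, h)                          # diagonal in pixels
    half = math.tan(math.radians(d_fov_deg) / 2)
    h_fov = 2 * math.degrees(math.atan(w / d * half))
    v_fov = 2 * math.degrees(math.atan(h / d * half))
    return h_fov, v_fov

print(hv_fov_from_d_fov(100, 854, 480))  # ~ (92.2, 60.6) on a 16:9 frame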
Thanks @SlowTap, your command seems similar to what I was using before: a flat output with some ‘cheating’ on d_fov to try to get a larger view of the scene, which also warps the image a little. With the sg output, we get an unwarped image even when zooming in and out.
Note: the example above is using v360=fisheye:sg:iv_fov=190:ih_fov=190 which wouldn’t work with Ass Zapped.
Looks like we retrieved a penis instance doing so (but lost the pussy box along the way, though…).
I will re-run a full detection on the video, and parse the results so as to recreate the funscript.
In the meantime, I reverted some change I made earlier, and I am getting less shallow (yet still too shallow) actions. Lost some actions too, but upon a couple of checks, it looks like those were unwanted outliers:
I know what you mean now… I tried VR2Normal and it limits d_fov to 110. I’m guessing you compiled your own version without the limitation. I just wanted to see if it could compute v_fov / h_fov for me, but I don’t think it takes the ‘sg’ output into account, so it’s not really useful anyway. Using 90/90 in all cases seems to work pretty well already.
Ideally, you should not ‘mess’ with the input type & FoVs (type=fisheye, iv_fov=190, ih_fov=190, or type=he) since that is how the video was encoded. From my understanding, the input parameters let ffmpeg map the original video onto a sphere (always the case, for every type of projection), and then the output parameters take the image from the sphere and remap it into a different projection.
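A rough way to picture it (the values here are illustrative, not a recommendation):

# the input side is fixed by how the video was encoded;
# only the output side should change depending on what you want to see
input_side = "input=fisheye:iv_fov=190:ih_fov=190"           # video -> sphere
output_side = "output=sg:d_fov=100:pitch=-35:w=1024:h=1024"  # sphere -> 2D view
vf = f"v360={input_side}:{output_side}"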
Which wrapper are you using?
IMO, you should try something like this (if you have those parameters in your ffmpeg wrapper):
– Input: type=fisheye, iv_fov=190, ih_fov=190 (matching how the video was encoded)
– Output: output=sg, h_fov=90, v_fov=90, then play with pitch/yaw to frame the action
Some homemade Python library mimicking OpenCV’s cv2, but relying on an ffmpeg subprocess…
I struggled to read H.265 with OpenCV at some point, so I tried to put together an alternative with my best buddies (ChatGPT and DeepSeek…).
All my code initially relied on OpenCV for frame reading/parsing, so, in order to minimize the rework, I created an ffmpeg library that could swiftly replace it in my use case… (might sound moronic, once again, I am not a coder).
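To give an idea, the pattern boils down to something like this; a stripped-down sketch of the approach, not the actual library (the FFmpegCapture name is made up):

import subprocess
import numpy as np

class FFmpegCapture:
    # minimal cv2.VideoCapture-like reader backed by an ffmpeg subprocess
    def __init__(self, path, width, height, vf=None):
        self.width, self.height = width, height
        filters = (vf + "," if vf else "") + f"scale={width}:{height}"
        self.proc = subprocess.Popen(
            ["ffmpeg", "-hide_banner", "-loglevel", "error", "-i", path,
             "-vf", filters, "-f", "rawvideo", "-pix_fmt", "bgr24", "pipe:1"],
            stdout=subprocess.PIPE)

    def read(self):
        # same contract as cv2.VideoCapture.read(): returns (ok, frame)
        nbytes = self.width * self.height * 3
        buf = self.proc.stdout.read(nbytes)
        if len(buf) < nbytes:
            return False, None
        frame = np.frombuffer(buf, np.uint8).reshape(self.height, self.width, 3)
        return True, frame

    def release(self):
        self.proc.stdout.close()
        self.proc.wait()

Any v360 string (like the ones above) can be passed as vf, so the detection loop never has to know the source was a VR video.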
I initially tried with 190/190, but got weird/unexpected results. Will try again after this round of YOLO detection is done…
In the “Config” tab, you can set the maximum FoV to 180°, which I think is a hard limit. Same thing in VR Reversal: you can edit the Lua script to change the max FoV.
What settings give the best results with detection? Is a zoomed-in view better, or a zoomed-out one?
If the goal is to analyze the whole video without having to set different settings for different parts of it, IMO zooming out is better since it allows grabbing all types of actions. The only ‘price’ to pay would be using a higher resolution, so that the AI has enough pixels to recognize stuff.
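For instance (hypothetical numbers), something like v360=input=he:in_stereo=sbs:output=sg:h_fov=120:v_fov=120:w=3072:h=3072 widens the view while still giving the AI enough pixels to work with, at the cost of longer processing.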