Absolutely, with just one comment from my side: most of the relevant info I need to grab (genitals and a few other key points) sits roughly within the center third of the frame horizontally, and in the lower half of the frame vertically (if that makes sense).
That is the region I want to focus on, since it is where the penis is located most of the time in VR POV videos.
If I may add, the really tricky part is the lower fifteenth of the frame: during missionary, for instance, the penis sits there, and the image in that band is so warped and compressed that image recognition becomes terrible.
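A minimal sketch of that region of interest in pixel terms, assuming a single 1920x1920 eye panel (the thread's source videos are 3840x1920 side-by-side, so the panel size here is just an illustrative assumption):

```python
# Hypothetical ROI for one 1920x1920 eye panel:
# center third horizontally, lower half vertically.
W = H = 1920
roi_x0, roi_x1 = W // 3, 2 * W // 3   # center third: x in [640, 1280)
roi_y0, roi_y1 = H // 2, H            # lower half:   y in [960, 1920)
print(f"ROI: x {roi_x0}-{roi_x1}, y {roi_y0}-{roi_y1}")
```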
Yes, I get it. That's the beauty of the sg output projection: it keeps the proportions intact when changing d_fov, which was not the case with the "flat" projection:
I work with these models, among several of our own, at my day job. Glad to take a look. I have a new studio on the way; if my pro isn't getting beaten up by what I run, I ought to be able to reduce the time with the gear I have. My code ain't pretty either, but it is literally used around the globe in our products.
It looks like it helped improve detection in this shadowy/dark area without altering the recognition confidences on the rest of the frame.
But there could be a better approach; if anyone has a clue, feel free to share.
Once again, I'm trying to keep all that processing at the ffmpeg level.
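As a rough illustration of that kind of ffmpeg-level preprocessing, here is a hedged sketch (not the thread's exact command): crop the left eye of a 3840x1920 SBS clip, unwarp it with v360's sg output tilted toward the lower part of the frame, and slightly lift the shadows. The projection, d_fov, pitch, and eq values are all assumptions to be tuned against the actual source.

```python
import subprocess

SRC = "input_vr_3840x1920.mp4"   # hypothetical file names
DST = "unwarped_1920x1920.mp4"

vf = (
    "crop=1920:1920:0:0,"                    # keep the left-eye 1920x1920 panel only
    "v360=input=hequirect:output=sg:"        # half-equirect in, stereographic out
    "d_fov=120:pitch=-25:w=1920:h=1920,"     # look down toward the lower half of the frame
    "eq=gamma=1.3:brightness=0.03"           # mild lift for the shadowy/dark areas
)

subprocess.run(
    ["ffmpeg", "-y", "-i", SRC, "-vf", vf, "-c:v", "libx264", "-crf", "18", DST],
    check=True,
)
```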
Also, now that I added a pose recognition task to get the hips into the evaluation process, the YOLO stage is wayyyy longer (close to 4 hours…) for this 3840x1920 video (focusing on a 1920x1920 panel only).
I decided to go with the largest ("x") YOLO 11 pose recognition model, as I was getting jittery results for the hip location. That might be mitigated by some moving average or Kalman filtering if we try to proceed with a leaner version of the model…
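For what it's worth, a minimal sketch of that idea (not the actual pipeline): run a leaner Ultralytics pose model and smooth the mid-hip point with a short moving average. The model name, window size, and "first detected person" assumption are placeholders.

```python
from collections import deque
import numpy as np
from ultralytics import YOLO  # pip install ultralytics

model = YOLO("yolo11n-pose.pt")     # leaner than the x-large model
L_HIP, R_HIP = 11, 12               # COCO keypoint indices for the hips
window = deque(maxlen=5)            # ~5-frame moving average

def hip_center(frame):
    """Return the smoothed (x, y) mid-hip point for the first detected person, or None."""
    res = model(frame, verbose=False)[0]
    if res.keypoints is None or res.keypoints.xy.shape[0] == 0:
        return None
    kpts = res.keypoints.xy[0].cpu().numpy()   # (17, 2) keypoints of person 0
    raw = (kpts[L_HIP] + kpts[R_HIP]) / 2.0    # raw mid-hip position this frame
    window.append(raw)
    return np.mean(window, axis=0)             # moving-average smoothed position
```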
OK, so here are the results of today's experiments.
I actually made a mistake and ran all the detections (detection + pose detection) on a regular (not "unwarped") video… Boy, I need to start again.
Anyway, I now have this automated report that compares a reference funscript (in red) with what the algorithm produced (in blue), over 10-second sections randomly picked within the video.
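Something along these lines, as a hedged sketch of that kind of report (file names, section count, and figure layout are assumptions, not the actual reporting code):

```python
import json, random
import matplotlib.pyplot as plt

def load_actions(path):
    """Load a .funscript and return (seconds, pos 0-100) pairs."""
    with open(path) as f:
        return [(a["at"] / 1000.0, a["pos"]) for a in json.load(f)["actions"]]

ref = load_actions("reference.funscript")   # hypothetical paths
gen = load_actions("generated.funscript")
duration = max(ref[-1][0], gen[-1][0])

fig, axes = plt.subplots(4, 1, figsize=(10, 12), sharey=True)
for ax in axes:
    start = random.uniform(0, duration - 10)   # random 10-second window
    for actions, color, label in ((ref, "red", "reference"), (gen, "blue", "generated")):
        pts = [(t, p) for t, p in actions if start <= t <= start + 10]
        if pts:
            ax.plot(*zip(*pts), color=color, label=label)
    ax.set_xlim(start, start + 10)
    ax.set_title(f"Section starting at {start:.1f} s")
    ax.legend(loc="upper right")
plt.tight_layout()
plt.savefig("comparison_report.png")
```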
Quite disappointed that I messed something up in the process; the closeup should show no action. (Now that I think of it, this might have been induced by the hip movement in the pose detection, which I should have filtered out when there is no penetration… will check the debug logs… makes me want to bang my head against the wall sometimes lol.)
I also need to check whether the shallow strokes are being limited by a mistake in the speed limit I set with the Handy in mind; I might have miscomputed it or made another dumb move there.
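A quick sanity check for that suspicion might look like this sketch: flag consecutive funscript actions whose implied speed exceeds a cap. The MAX_SPEED value and file name are placeholders, not a confirmed Handy specification or the script's actual limit.

```python
import json

MAX_SPEED = 400.0   # assumed cap, in pos units per second

with open("generated.funscript") as f:
    actions = json.load(f)["actions"]

for prev, cur in zip(actions, actions[1:]):
    dt = (cur["at"] - prev["at"]) / 1000.0        # seconds between actions
    if dt <= 0:
        continue
    speed = abs(cur["pos"] - prev["pos"]) / dt    # implied stroke speed
    if speed > MAX_SPEED:
        print(f"{prev['at']}ms -> {cur['at']}ms: {speed:.0f} units/s exceeds the cap")
```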
Will try and process it again based on the "image undistorting" strategy we discussed, and will also try with pose detection (first the x-large model, then a smaller one for comparison, and lastly without it).
Thank you very much for the kind words, I really appreciate it.
I did not do that for the money, but more as a personal challenge and to give back to the community.
However, I wouldn't mind a coin or two for the time and energy spent, and to keep the motivation up, but I have no clue how to proceed without making people feel they would have to pay for it.
Anyway, so much still to do… I won't forget you on the multi-axis if we ever get there!
@k00gar I'm joining this topic late, but it looks like you've made a ton of progress on this whole thing. Is the source code on GitHub or anything like that? I bet this could really take off if you let some of us collaborate with you on it.
I saw in an early post of the thread that you are shy about sharing the source code, and that's totally understandable, but trust me, the results you've shown so far are more than enough to show how well you've done with all of this.
I also agree with the notion of a Patreon or something like that if you want to monetize this work. All the more reason to open up collaboration, if you haven't already.
Thank you very much @jcwicked, the code is on GitHub but as a private repo for now, and I would definitely need another device to parallelize some of this activity (training models, detecting, comparing, debugging, troubleshooting), but this is more a dream than anything else.
Anyway, I promised to share the code by the end of January if not earlier, but I was hoping to make more progress than I actually did. Will need help for sure.
Actually, I did some housekeeping and commenting yesterday and might be able to share earlier than expected, even if not at the maturity level I was targeting.
As a results follow-up, this is what I got with the n-detect and n-pose models plus some post-automation adjustment of the funscript:
Messy during section 5, need to debug/troubleshoot.
Weirdly reversed during section 6, investigation needed too.
Every time you lift a new stone, there's more mess under it…
I have a job and a family that might not allow me to keep this effort going much longer; I will do my best to dedicate as much time as I can to reach an acceptable state.
Now running the n-sized detect model only, without the pose model; we will see if we get any significant difference, whether any of it explains the reversed strokes, or whether I need to look deeper into the code to fix the reversal.
Just wanted to say that this is really interesting to follow, even if I don't know much about it. Looking forward to seeing how all this turns out in the long run.