Thanks @sentinel, this is the kind of approach I have been trying to implement: over the last few frames, I track the parts that are touching the penis, sum their absolute movements over time, work out which one is dominant, and transition to it as the lead part. It is still a bit broken, but I will try to refine it.
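To make the idea concrete, here is a minimal sketch of that "dominant part" logic, not the actual implementation. The class name `DominantPartTracker`, the `window` and `switch_margin` parameters, and the part labels are all hypothetical. It keeps a sliding window of absolute per-frame y-movements for each part in contact, and only hands over the lead when another part clearly out-moves the current one (a small hysteresis to avoid flickering).

```python
from collections import defaultdict, deque


class DominantPartTracker:
    """Pick the lead part from summed absolute movement over recent frames."""

    def __init__(self, window=30, switch_margin=1.2):
        # window: number of recent frames to sum over (assumed value)
        # switch_margin: how much a challenger must exceed the lead (assumed value)
        self.switch_margin = switch_margin
        self.deltas = defaultdict(lambda: deque(maxlen=window))
        self.last_y = {}
        self.lead = None

    def update(self, positions):
        """positions: dict of part name -> y for parts in contact this frame."""
        for part, y in positions.items():
            prev = self.last_y.get(part)
            self.deltas[part].append(0.0 if prev is None else abs(y - prev))
            self.last_y[part] = y

        # Parts that were seen before but are absent now contribute no movement,
        # so their score fades out as the window slides.
        for part in list(self.deltas):
            if part not in positions:
                self.deltas[part].append(0.0)
                self.last_y.pop(part, None)

        scores = {part: sum(d) for part, d in self.deltas.items()}
        if not scores:
            return self.lead

        best = max(scores, key=scores.get)
        lead_score = scores.get(self.lead, 0.0)
        if self.lead is None or scores[best] > self.switch_margin * lead_score:
            self.lead = best
        return self.lead
```

Calling `update()` once per frame with the y-positions of the parts currently in contact returns the name of the current lead part. The margin is what would need tuning: too low and the lead flickers, too high and transitions lag.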
For this one, we would need some kind of flagging + raw track, and post-processing. Adding to the todo list.
Unfortunately, for this one, I would need a model trained to detect lips/mouth; I only have face for now. I do have glans, among others, but the glans could be occluded by a hand rather than being in the mouth… Need to think about it.
This could be a game changer!
Let me explain: until now, I have mainly focused on the y-axis (height), combining the difference in height between parts (base of penis, hand, etc.) with the absolute y-position of the part itself. I worked with Euclidean distance at some point, but was not happy with the result.
I also thought that looking at the x-axis would only matter once we wanted to generate multi-axis scripts, but I now understand that x-axis movement is key even in a single-axis funscript approach.
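One possible way to fold x-axis movement into a single-axis signal is to project each tracked (x, y) position onto the main direction of motion instead of reading y alone. This is only a sketch under my own assumptions, not the recipe that was shared: the function name `project_to_single_axis` is hypothetical, and using the first principal component as the "main direction" is a choice made purely for illustration.

```python
import numpy as np


def project_to_single_axis(xy):
    """Project a 2D track onto its main direction of motion, scaled to 0-100."""
    pts = np.asarray(xy, dtype=float)
    centered = pts - pts.mean(axis=0)

    # First right-singular vector = direction of largest variance in the track.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]

    # The sign of a singular vector is arbitrary; fix it so the largest
    # component is positive and the output does not flip between runs.
    if axis[np.argmax(np.abs(axis))] < 0:
        axis = -axis

    signal = centered @ axis
    span = signal.max() - signal.min()
    if span == 0:
        # No movement at all: return a flat mid-range signal.
        return np.full(len(pts), 50.0)
    return 100.0 * (signal - signal.min()) / span
```

With this, a purely diagonal or even purely horizontal stroke still produces a full-range signal, whereas a y-only reading would flatten it. In practice the projection would probably have to be computed over a sliding window, since the main direction changes with the scene.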
Wow, thank you for sharing part of your recipe, and thank you all for the valuable feedback!
Side note: as I am currently training an Oriented Bounding Box model, we could get the orientation of detected boxes, which could help with a first stage of multi-axis.
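As a rough illustration of what an oriented box gives us, here is a small sketch that derives an orientation angle from the four corners of a box. It assumes the corners come as (x, y) points listed in order around the box, which is an assumption about the output format, and the function name `obb_angle_degrees` is hypothetical.

```python
import math


def obb_angle_degrees(corners):
    """Angle of the box's long side in degrees, in [0, 180).

    corners: four (x, y) points listed in order around the box (assumed format).
    """
    (x0, y0), (x1, y1), (x2, y2), _ = corners

    # Two adjacent edges; the longer one defines the box orientation.
    edge_a = (x1 - x0, y1 - y0)
    edge_b = (x2 - x1, y2 - y1)
    long_edge = edge_a if math.hypot(*edge_a) >= math.hypot(*edge_b) else edge_b

    angle = math.degrees(math.atan2(long_edge[1], long_edge[0]))
    # A box has no "front", so angles 180 degrees apart are the same orientation.
    return angle % 180.0
```

Tracking how this angle changes from frame to frame is the kind of signal that could feed a secondary axis later on.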
Quick preview; please disregard mistakes, as this is very early in training (epoch 30 or so out of 200+). So we would move from this:
To something like this (which would let us analyze angles and such, if it ends well…):