Even though I am not a professional programmer, I have put a tremendous amount of effort into automating the generation of multi-axis Funscripts.
I invested heavily in AI models such as Gemini and Claude Opus, relying on “vibe coding” (AI-assisted coding), and I researched and applied recent deep learning and open-source tools. I went as far as analyzing human skeletal structure, predicted pelvic movement, and anatomical positioning to correlate them with Funscript parameters, and I even used LLMs to make frame-by-frame corrections.
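As a rough illustration of the pose-to-script step (a simplified sketch, not the actual workflow), the snippet below maps a per-frame pelvis height track to a single-axis script. It assumes a pose estimator has already produced one pelvis y-coordinate per frame, and it assumes the commonly used Funscript layout of an `actions` list with `at` in milliseconds and `pos` from 0 to 100. The function name `track_to_funscript` and the `min_delta` threshold are hypothetical.

```python
import json

def track_to_funscript(pelvis_y, fps, min_delta=2):
    """Convert a per-frame pelvis y-coordinate track into Funscript-style actions."""
    lo, hi = min(pelvis_y), max(pelvis_y)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat track
    actions = []
    for frame, y in enumerate(pelvis_y):
        # Image y grows downward, so invert: higher on screen gives a larger pos.
        pos = round(100 * (hi - y) / span)
        at = round(frame * 1000 / fps)  # timestamp in milliseconds
        # Skip points that barely move, to keep the script sparse.
        if actions and abs(pos - actions[-1]["pos"]) < min_delta:
            continue
        actions.append({"at": at, "pos": pos})
    # Assumed minimal layout; real scripts may carry extra metadata fields.
    return {"version": "1.0", "actions": actions}

# Hypothetical track: pelvis y in pixels for 5 frames at 30 fps.
script = track_to_funscript([400, 380, 360, 380, 400], fps=30)
print(json.dumps(script))
```

A multi-axis version would presumably repeat this per axis with a different input signal for each, which is where the variety of camera angles and positions makes a simple normalization like this break down.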
Through all this trial and error, I managed to build a workflow that generates scripts with about 60% accuracy when I input a target video. However, the sheer variety and complexity of video content make it incredibly challenging to improve beyond that point.
I honestly wonder how much further technology needs to advance before AI can automatically generate multi-axis Funscripts that truly match the quality of hand-crafted ones.