F.A.P.S (Funscript - Audio Processing System) by CaptainHarlock

Hi!

I’ve been working on an application for a few weeks to generate Funscripts based on the music in videos. It’s specially designed for PMV videos, Cockheros, etc., but it can also be used for shorter videos with a single song, as well as hypno/hfo/edging videos.
Honestly, I have no idea about programming, lol, but I had a pretty clear idea of what I wanted, so I used the most advanced paid AI programming models to help me. It’s been a tough battle: what started as a simple idea kept getting more complicated, improved, and extended… But I think that in the end, it’s turning out to be an app that works pretty well, has cool features, and can be quite useful.
The idea is to be able to create Funscripts quickly and adjust various parameters, while staying true to the music and ensuring that the movement follows it correctly. It doesn’t aim to create a Funscript that is perfect at every single point, which is really difficult, but it does take a lot of work out of the process. You can always edit the points that don’t look right or add your own blocks (patterns), for example at the end or at intermediate “climax” moments.
Here’s an example: It can analyze a 90-minute video with AI in about 10 minutes, and once analyzed, the funscript is created in seconds.
In other words, you analyze once (this is cached, it will remember what you have already analyzed), and then you can choose the “mood” (type of analysis/use of audio to create the points), as well as various configurable parameters, and generate that funscript in seconds. If you don’t like it, adjust some parameters again, wait a few seconds, and you’ll have a new one!
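To illustrate the analyze-once, generate-many workflow, here is a minimal sketch of how such an analysis cache could be keyed. This is hypothetical code, not the app’s actual implementation; `CACHE_DIR` and the function names are made up:

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("analysis_cache")  # hypothetical cache location

def cache_key(audio_path: str, analysis_type: str) -> str:
    """Key the cache on file contents + analysis type, so re-running the
    same analysis on the same audio becomes a cheap lookup."""
    digest = hashlib.sha256(Path(audio_path).read_bytes()).hexdigest()
    return f"{digest}_{analysis_type}"

def load_or_analyze(audio_path: str, analysis_type: str, analyze_fn):
    """Return a cached analysis if present, otherwise run the (slow)
    analysis once and store the result as JSON."""
    CACHE_DIR.mkdir(exist_ok=True)
    entry = CACHE_DIR / (cache_key(audio_path, analysis_type) + ".json")
    if entry.exists():
        return json.loads(entry.read_text())   # seconds: cache hit
    result = analyze_fn(audio_path)             # minutes: full AI analysis
    entry.write_text(json.dumps(result))
    return result
```

The expensive step (the ~10-minute AI pass) runs once per file; every mood/parameter tweak afterwards only re-reads the cached analysis.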

Here are some screenshots:

Generator:


Moods:

AI Analysis:

Edge & Cum Advanced Editor:


Pattern Editor:

Queue System:

Compare:

History:

Point Editor:

Combinator:

Preview Card:

Recently created moods:
“Fap Mixer”


“SixthSense”

“Pulse”

Here’s a summary of the main features:

F.A.P.S - Funscript Audio Processing System
Automatic Funscript generator based on audio analysis


What is F.A.P.S?
F.A.P.S (Funscript - Audio Processing System) is a desktop application that automatically generates Funscript files by analyzing the audio of your videos. It uses audio processing algorithms and artificial intelligence to create movements synchronized with the music, rhythm, and voice of the content.
Designed for devices like The Handy, F.A.P.S allows generating complete scripts without manual intervention, while offering full control over all parameters for those who want to fine-tune the results.
Developed by CaptainHarlock


Features

:performing_arts: Generation Modes (Moods)
16 different movement styles:

Music-Based (:musical_note:):

  • AUTO - Analyzes each segment and automatically applies the most appropriate mood
  • Normal - Beat-based movement, for music with defined rhythm
  • Relaxed - Slow and smooth waves
  • Hypno - Voice-reactive patterns, designed for ASMR and JOI
  • Edging - Teasing patterns with variations
  • Simple - Direct beat-to-movement conversion
  • Crazy - Intense and fast movement
  • Crazy+Voice - Crazy mode with voice vibrations
  • Slow - Slow and deliberate movements

Voice-Based (:microphone:):

  • Hypno - Voice-reactive patterns with customizable keyword detection via AI transcription, designed for ASMR and JOI
  • Edging - Teasing patterns with edge/denial/release keyword detection and build-up cycles

Advanced AI (:shuffle_tracks_button:/:brain:):

  • Fap Mixer :test_tube: - Granular stem-based mixing with role assignment (primary rhythm, secondary complement, texture, vibration) + GPU voice classification detecting moans, screams, breath and gagging with configurable per-type actions
  • SixthSense :brain: - 7-layer adaptive generation using 6-stem + 5-drum-substem AI separation with section-aware profiles, sub-beat precision and guitar solo detection

JOI Modes (:robot:):

  • JOI Director :test_tube: - Phrase-by-phrase AI director using Whisper transcription + LLM to assign actions (stroke, tease, edge, pulse, vibrate, release…) with 26+ pattern types
  • JOI Patterns :test_tube: - LLM classifies each spoken phrase into pattern categories with intensity and tempo
  • JOI LLM :test_tube::construction: - LLM outputs continuous parameters per phrase generating parametric waveforms
  • JOI LLM_A / LLM_B :test_tube::construction: - Parametric variants with explicit or auto-derived waveform shapes (sine, sawtooth, pulse, vibrate, staircase, shake)

:test_tube: = Experimental :construction: = Work in Progress


:brain: AI Analysis

Integrates several machine learning models (requires additional installation):

  • All-In-One - Detects musical structure (intro, verse, chorus, bridge, solo, outro), beats and song boundaries
  • Beat This! - Complements All-in-One beat detection, recovering missed beats at audio edges (intro/outro fade regions)
  • BS Roformer - Separates audio into 6 stems (vocals, drums, bass, guitar, piano, other) — used by SixthSense and Fap Mixer
  • DrumSep - Decomposes drums into 5 sub-stems (kick, snare, toms, hi-hat, cymbals) — used by SixthSense
  • Wav2Vec2 - GPU-accelerated voice classification with gender detection and vocalization type identification — used by Fap Mixer
  • Whisper - Voice transcription with timestamps — used by JOI modes and Hypno/Edging
  • Shazamio - Identifies songs in DJ mixes and compilations with dual parallel queries and backward verification
  • MusicBrainz / Deezer / iTunes - Song duration lookup chain for boundary refinement in mixes

:level_slider: Adjustable Parameters

Each mood has its own dedicated parameter panel. Common parameters include:

  • Minimum/maximum position (movement range)
  • Detection sensitivity
  • Maximum device speed
  • Minimum interval between points
  • Intensity normalization between songs
  • Double pass mode for more detail (some moods)
  • Inverted mode (flips movement direction)

Advanced moods add specialized controls:

  • Fap Mixer: Stem role assignment, beats per cycle, voice action mapping (per vocalization type), gender-specific responses
  • SixthSense: Per-drum-substem weights (kick, snare, toms, bass), accent boost, fill detection sensitivity
  • Hypno/Edging: Custom keyword lists, detection thresholds, pattern triggers
  • JOI modes: LLM model selection, prompt customization, pattern mapping

All parameters include tooltips with explanations. Default values are pre-tuned for each mood.
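As a rough illustration of what the common parameters above do, here is a sketch of a post-processing pass over generated points. This is a hypothetical helper, not the app’s actual code, and `max_speed` here is in position units per second rather than mm/s:

```python
def apply_common_params(actions, pos_min=10, pos_max=90,
                        min_interval_ms=50, max_speed=400):
    """Apply the common mood parameters to a list of funscript actions
    ({"at": ms, "pos": 0-100}): clamp to the movement range, drop points
    closer than the minimum interval, and limit travel to the max speed."""
    out = []
    for a in sorted(actions, key=lambda p: p["at"]):
        pos = max(pos_min, min(pos_max, a["pos"]))     # movement range
        if out:
            dt = a["at"] - out[-1]["at"]
            if dt < min_interval_ms:                   # too close: drop point
                continue
            speed = abs(pos - out[-1]["pos"]) / (dt / 1000)
            if speed > max_speed:                      # cap travel, keep timing
                allowed = max_speed * dt / 1000
                direction = 1 if pos > out[-1]["pos"] else -1
                pos = out[-1]["pos"] + direction * allowed
        out.append({"at": a["at"], "pos": round(pos)})
    return out
```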


:bullseye: Edge & Cum Editor
Visual timeline editor for creating climax sequences:

  • Multiple block types: Edge Base, Fast Vibration, Post-Climax, Silence and Custom
  • 9 available patterns in “Edge Base”: Crescendo, Decrescendo, Normal, Random, Build-up, Denial, Tease, Pulse, Wave Train
  • 8 available patterns in “Fast Vibration”: Stepped, Prog. Stepped, Tip Edge, Base Edge, Tip/Base Edge, Earthquake, Shutter, Surge
  • 8 available patterns in “Post-Climax”: Slow Up/Down Curve, Slow Up/Down Vibro, Tip Care, Base Care, Milk, Torture, Cool Down, Aftershock
  • 5 intensity levels; “Edge Base” patterns also support vibration with different strength and frequency levels
  • Drag and drop blocks
  • Pattern preview

:bar_chart: Metrics on Completion
After each generation, the app displays:

  • Points generated and duration
  • Speed statistics (minimum, maximum, average)
  • Distribution by speed ranges
  • Compatibility estimate for The Handy
  • Option to save report
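For reference, these speed statistics can be computed directly from the funscript’s action list. A sketch, assuming The Handy’s full stroke is about 110 mm (a commonly cited figure, not something the app confirms):

```python
STROKE_MM = 110  # assumed full-stroke length for The Handy; adjust per device

def speed_stats(actions):
    """Per-segment speeds in mm/s between consecutive funscript points.
    actions: list of {"at": ms, "pos": 0-100}, sorted by time."""
    speeds = []
    for prev, cur in zip(actions, actions[1:]):
        dt_s = (cur["at"] - prev["at"]) / 1000
        if dt_s <= 0:
            continue
        mm = abs(cur["pos"] - prev["pos"]) / 100 * STROKE_MM
        speeds.append(mm / dt_s)
    if not speeds:
        return None
    return {"min": min(speeds), "max": max(speeds),
            "avg": sum(speeds) / len(speeds)}
```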

:globe_showing_europe_africa: Languages

  • English
  • Español

:card_index_dividers: Tabs

  • Generator - Main tab where you load a video/audio file, select a mood, configure parameters, run AI analysis and generate the funscript.

  • Queue - Batch processing queue. Add multiple files with their own configurations and process them sequentially. Supports pause, cancel, time estimation and per-item status tracking.

  • Pattern Editor - Visual editor for creating custom movement patterns. Draw points on a canvas to define waveforms (sawtooth, step, zigzag, pulse) with configurable cycles, duration and position range. Patterns can be Repetitive (looped) or Adaptive (scaled to music). Supports undo/redo. Created patterns become available in mood configurations.

  • Compare - Comparison of two funscripts. Load two .funscript files and visualize them overlaid on a graph with zoom and scroll navigation. Shows statistics like point count, speed differences and similarity metrics.

  • History - Log of all past generations in the current session. Shows filename, mood used, point count, duration and timestamp. Allows re-applying a previous configuration with one click (“Use Config” button).

  • Combinator - Combines parts from funscripts generated with different moods for the same video. Load a video, auto-discovers all mood variants generated for it, then select time ranges from each source and insert them into a combined timeline. Features GPU-accelerated timeline rendering, video playback (VLC), drag markers for range selection and undo/redo. Export the combined result as a single funscript.

  • Point Editor - Full-featured funscript editor optimized for large files (20k+ points). GPU-accelerated timeline with real-time video playback sync. Color-coded velocity indicators. Supports point selection, multi-selection, drag editing, undo/redo (100 levels).


Technical Features

GPU Acceleration

  • CUDA support for NVIDIA GPUs
  • Processing with PyTorch and torchaudio
  • Automatic CPU fallback when no GPU is available (using librosa; not verified at this time, so GPU acceleration is the recommended option)

Cache System

  • Saves AI analysis to avoid repeated processing
  • Harmonic/percussive separation cache
  • “Cache Manager”

Project Management

  • Save/load projects with all parameters
  • Reusable templates
  • Generation history

VLC Integration

  • Synchronized video preview
  • Heatmap visualization

Device Safety

  • Handy Safe mode to limit speeds
  • Real-time speed verification
  • Alerts for problematic segments

Formats

  • Input: MP4, MKV, AVI, WebM, MOV, MP3, WAV, FLAC, OGG, M4A
  • Output: .funscript, .hc (chapters), .fproj (projects)
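A .funscript file is just JSON with an “actions” array of `{"at": milliseconds, "pos": 0-100}` entries, so writing one is straightforward. A minimal writer sketch (the metadata fields shown are the common community ones; the app may write more):

```python
import json

def write_funscript(path, actions, inverted=False, range_=90):
    """Write a minimal .funscript: JSON metadata plus a time-sorted
    "actions" array of {"at": ms, "pos": 0-100} points."""
    script = {
        "version": "1.0",
        "inverted": inverted,
        "range": range_,
        "actions": sorted(actions, key=lambda a: a["at"]),
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(script, f)
```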

Requirements

Minimum

  • Windows 10/11
  • 8 GB RAM (16 GB RAM recommended)
  • 10 GB for the application. Please note that generated caches can take up a lot of additional space.
  • NVIDIA GPU with 4+ GB VRAM (8+ GB VRAM recommended); the 10xx through 50xx series work

NOTE: @Alexcreeds has taken the time and effort to port the app to ROCm so it works with certain AMD GPUs. You can find the link below if you want to try it out.


Installation & First Run

Prerequisites

  • None

Installation

Running the Application

  1. Double-click run.bat
  2. The application window will open — you’re ready to generate!

F.A.P.S v0.8 changelog:

  • Added new “Pulse” mood, designed as an improved version of “Normal.”
  • The application now uses “embedded Python”… No installation required, just extract and run!
  • Added logo for the application, hope you like it, lol
  • Added a new “Preview Card” tab to generate a Card with grid of images, heatmap, song list, and other details.
  • Review and improvement of the “Normal” mood, added new “Spacing” control.
  • Review and improvement of the “Crazy” mood.
  • Review and improvement of the “Crazy+Voice” mood.
  • Review and improvement of the “Relaxed” mood, added new “Onset detection” drop-down menu and “Fill gaps” option.
  • Review and improvement of “AUTO” and its 3 methods of use. The BPM+RMS method has been completely rewritten and now also uses Sections.
  • Removal/Hiding of unused parameters in moods.
  • New song detection system, better and faster
  • New display method for “Edge & Cum” patterns, now correctly represented graphically in the timeline.
  • Ability to disable songs, useful for avoiding processing intros or interludes that are not songs.
  • Option to rename the song block or redo the search through Shazam.
  • Added the option to do “AI Analysis Only” in Queue, with selection of the type of analysis. Very useful for running several AI Analyses automatically, which is what takes the most time, and then having the cache ready to start generating the Funscripts
  • Fixed bug in non-stereo or non-mono audio; now any audio is converted to stereo or mono as needed.
  • Spanish translation for “SixthSense” controls
  • Several missing translations for English
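As an aside, the stereo/mono normalization mentioned in the changelog above can be expressed in a few lines of NumPy. This is a generic sketch of the idea, not the app’s actual code:

```python
import numpy as np

def to_channels(audio: np.ndarray, channels: int) -> np.ndarray:
    """Convert an audio array of shape (n_channels, n_samples) to mono
    or stereo, so downstream processing always sees a known layout."""
    if audio.ndim == 1:                          # treat 1-D input as mono
        audio = audio[np.newaxis, :]
    if channels == 1:
        return audio.mean(axis=0, keepdims=True)  # downmix to mono
    if channels == 2:
        if audio.shape[0] == 2:
            return audio
        mono = audio.mean(axis=0)                 # downmix, then duplicate
        return np.stack([mono, mono])
    raise ValueError("channels must be 1 or 2")
```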

F.A.P.S v0.8.1 changelog:

  • Added missing dependencies (Pillow)
  • Added the \wheels folder and install.bat again

DOWNLOAD LINKS
F.A.P.S v0.8.1:


DONATIONS:

The application is and always will be free, but it has taken a LOT of time and some money to create it. If you like it and think it deserves it, donations are welcome :hugs:

ENJOY !!! :grin: :smiling_face_with_horns: :sign_of_the_horns:

EXTRA: User @Alexcreeds has taken the time and effort to port the version to ROCm to make it work with certain AMD GPUs. Here is his port if you want to try it out:


Sounds good and looks good. Will you make the program open source?


This looks promising!

As Fireblade said, will you make the program open? That would be good.
I am Spanish, so I would try the program in that language when it’s released.


Yep, it will be free and modifiable if anyone wants to give themselves a headache, lol. It’s written in Python.
It’s my small contribution to this wonderful community that has given me so many great funscripts to enjoy!

Donations are welcome, but always OPTIONAL, in case anyone wants to show their appreciation for the work done and cover some of the costs of using advanced AI models for programming.

For the moment, I don’t plan to upload it to GitHub; I’m going to use MEGA to upload it. It only takes up about 300MB.
When installing, it uses a “venv” environment so as not to affect the system. The space used, including the venv environment, is a total of 6.45GB, plus the Whisper AI model (I think that’s about 4-6GB more).

If anyone wants to be an early beta tester, send me a private message to talk about it.
I’m not far from giving it the green light. I need to review one of the tabs and do a final overview of the different moods using some test videos. Maybe in 2-3 weeks I’ll be able to release it to the public.


Have you checked out @k00gar’s work? Looks like you could help their team.

Are you referring to this?

It’s a completely different approach; that program is based on motion detection in the video. My application doesn’t do that; it’s based on the audio in the video, and it’s specially designed to quickly process and create Funscripts for long PMV videos, thanks to automatic song detection and adjustment, song part detection, sound analysis, different moods, etc.

Yeah, that’s what I am referring to, and I am aware of how FunGen works; I’m just saying that you could also help out that project if you so choose. They are always looking for more volunteers.

Ooo, sounds awesome. Will be subscribing to hear more :slight_smile:


Looks good so far, but it is hard to tell unless we can try it. Single-axis only, or multi-axis ready? Other audio-based approaches have lots of noise. Curious how far an audio-based approach can reach. Happy to give it a try. Great job. Thanks :slight_smile:


I would absolutely love to get my hands on this. You’re doing gods work brother


Thank you!
It’s single-axis, and designed for use with The Handy, i.e., taking into account its movement limitations (maximum speed of 500mm/s, minimum speed of about 32mm/s, minimum interval above 35ms, 50ms recommended), but in reality it shouldn’t be difficult to adapt it to other devices, including multi-axis ones. The tricky part would be deciding how to use the axes, lol. In other words, you have to be clear about how to program it, and there are many possibilities… I don’t know, you could say that the rotation speed should be proportional to the intensity of the voice, for example, or that it should rotate every X beats, or that X should be lower or higher depending on the BPM, the energy, or even the type of section (intro, verse, chorus, etc.). There are many possibilities, but I haven’t been lucky enough to try a multi-axis device yet, lol

As for noise, I haven’t tried other programs that do this, so I can’t judge or make comparisons, but I suppose that if the intention was to use the entire audio spectrum, it’s more likely that noise will get in and cause the movements to not follow the rhythm of the music properly. The application has several “moods,” each with a different approach, but there are several systems that help make audio detection as accurate as possible:

  • Detects different songs (allowing for easy manual adjustments), so each song is processed separately, avoiding problems with, for example, songs that have volume differences. (Shazamio)
  • Extracts the different components of the audio into their parts: Drums, Bass, Other, and Vocals, avoiding mixing all the audio and allowing you to use, depending on the mood, what best follows the rhythm of the music, or assigning different reactions according to this separation (Demucs).
  • AI analysis of beats, BPM, and other parameters (All-in-One, or BeatsNet)
  • Voice analysis to detect certain words and have the funscript generate a specific type of movement/reaction depending on the type of word; especially designed for non-PMV videos, such as hypno, joi, edging, hfo (Whisper)
  • Then, apart from all that, it has quite a few parameters to adjust it to your liking, or to each video.

Honestly, it’s impossible for me to test on The Handy every funscript I generate with every change I try, but I’ve been able to test some, and my sincere opinion is that while it won’t be perfect everywhere (there will always be some point or moment where you’ll think, “This could be better here”), there are also parts where I’ve thought, “Wow, this is cool here!” lmao
(And the “Edge & Cum - Advanced Editor” is sooo cool!)
I like that the funscript reacts to the rhythm of the music, and in most funscripts that’s how it will be. In any case, it comes with a point editor and real-time previewer (like ScriptPlayer) that allows you to edit the funscript and make quick changes, even quickly adding patterns in parts where, instead of following the music, you might prefer it to behave differently.
Also, since everyone has their own tastes, I created several “moods,” so those who like it fast and with vibrations can use “Crazy” or “Crazy+Voice,” those who like to follow the rhythm of the music more can use “Normal” or “Simple,” and those who like it slow can use “Relaxed” or “Slow.” For non-PMV videos with voice and teasing, there are the “Hypno” and “Edging” moods. In addition, there is the “AUTO” mood, which has three types of detection to use various parameters and different moods for each song or section.

The truth is that the options are overwhelming, and it would be helpful to have some help testing the generation of funscripts with various music files and getting feedback on them.


I really need this!!! plz let me!!! :heart_eyes:

Sure, no problemo.
I’m trying to get a small group of beta testers to iron out some final details before releasing it to the public.
If you want to participate, send me a message. There are already a few others interested.

Looks interesting. For a project this size a git instance seems unavoidable. Especially for possible future contributions.
With everything written in Python, shouldn’t the program be OS agnostic? Minimal requirements only list Windows 10/11 atm.

The minimum requirements mention that OS because it’s the one the app has been tested on, but if the AI models and libraries it uses can run on other systems (such as Linux), there shouldn’t be a problem. The only Windows-specific part is the installation, which is programmed to use the wheels that will be included in the application files; those are built for Windows, Python 3.12, and CUDA 12.4 (PyTorch with CUDA 12.4). Everything is installed in a venv environment so it doesn’t conflict with the system.

If you will still need testers, I’d like to throw my hat in the ring once you’re ready.

For what it’s worth, I have been using a program that takes audio and creates a stepmania/dance dance revolution beatmap automatically. I then run that through Beats4Fun to turn that into a funscript. I love PMVs, but I don’t think I’d be any good at scripting them. This sounds like it would be a whole lot easier, and I’d probably get a pretty solid result this way too.

This project looks very impressive. Although I haven’t used it yet (I’m preparing to install it), I’d like to ask first whether this project supports analyzing pure audio files?

If you like PMVs, I’m sure you’ll like the app. I think what makes it stand out from other programs that create funscripts from audio is the individual use of songs, the separation of stems (instruments), and above all, the variety of “moods” that can be used. I’m sure there will be one to your liking. In addition, within each mood there are options (sliders, checkboxes) that allow you to adjust it to your liking.
But as I mentioned earlier, it’s impossible for 100% of the points created to be perfect and to your liking, and in this case there are two things about the app that are very useful:

  • “Point Editor”: Real-time preview of the funscript alongside the video (similar to ScriptPlayer), and the ability to edit (move, create, delete points), insert preconfigured “patterns” as well as create your own, and a system for detecting potential problems (speed above 500mm/s, speed below 32mm/s, or intervals below 35ms) with auto-correction.
  • “Edge & Cum - Advanced Editor”: Probably in the climax parts, whether at the end or in the middle, you will want it to behave in a certain way to emphasize that climax instead of acting like the rest of the funscript following the music; in this case, you can add pattern blocks, both preconfigured and your own. Some are “adaptive” and adjust to the duration of the pattern, which is fully configurable, and others are “repetitive.”
    In summary, it may not be 100% perfect automatically, but with some simple and quick editing, you can improve those points that you want to be different.
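The problem-detection rules the Point Editor applies (speed above 500mm/s, speed below 32mm/s, intervals below 35ms) can be sketched in a few lines. This assumes a ~110 mm full stroke for The Handy, which is my assumption rather than something taken from the app:

```python
STROKE_MM = 110  # assumed Handy full-stroke length (not from the app)

def find_problems(actions, max_speed=500, min_speed=32, min_interval_ms=35):
    """Flag segments that break the device limits described above.
    Returns (index, issue) pairs, where index is the segment's end point."""
    problems = []
    for i in range(1, len(actions)):
        dt = actions[i]["at"] - actions[i - 1]["at"]
        if dt < min_interval_ms:
            problems.append((i, "interval"))
            continue
        mm = abs(actions[i]["pos"] - actions[i - 1]["pos"]) / 100 * STROKE_MM
        speed = mm / (dt / 1000)
        if speed > max_speed:
            problems.append((i, "too_fast"))
        elif 0 < speed < min_speed:
            problems.append((i, "too_slow"))
    return problems
```

An auto-corrector would then nudge the flagged points (widen intervals, shorten travel) until this list comes back empty.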

If you are interested in becoming a beta tester, send me a message.

GENERAL NOTE FOR THOSE WHO WISH TO BE BETA TESTERS: Although the application was initially created using the “librosa” library, which is pretty good and uses the CPU, as soon as I wanted to do more and better (AI models, GPU acceleration), I had to use CUDA. So beta testers must have an Nvidia GPU; I think 8GB of VRAM is enough. Perhaps, in the future, a version for AMD GPUs can be created, I’m not sure, we’ll see. For Intel (Arc) GPUs, I don’t think it can be easily adapted, so I’m ruling out this option.
The intention is that, for people who don’t have an Nvidia GPU, there will be a “fallback” and librosa will be used, so the application can also be used simply with a CPU, with some limitations or requiring more calculation time.
The problem is that all recent testing of changes and additions, including the “moods,” has been done with CUDA, so right now I can’t guarantee that the fallback to librosa will work, or work similarly to how it does with CUDA. Honestly, it isn’t worth doing that now while things are still being adjusted and changed; it’s something I’ll review at the end.

Yep, of course. Although I haven’t had the chance to try it, the application is capable of using audio files alone. In fact, the video is only used for previews, nothing else. The first step of the application is to extract the audio, as that is what it will work with.
I’ll have to try it out, and there may be a small problem with the parts that require loading the video preview, but that’s easy to fix.


New feature in the application called “Combinator,” which allows you to mix the parts you want from the different funscripts that have been created using various moods, and combine them to your liking!
This is especially useful if you want to create a final funscript with clearly differentiated parts, allowing you to start with slow parts and gradually increase the intensity to end with much more dynamic and fast parts. Or you can have several parts within the same song to give a different touch to the solo part, for example.

Really easy to use:

  • Choose the video and it automatically searches for any funscripts generated by the application for that video.
  • Choose the one you want to use as a base.
  • Select the part you want from any of the available funscripts and click on “Insert Part.”
  • It allows you to save your session so you can resume your work later.