F.A.P.S (Funscript - Audio Processing System) by CaptainHarlock

F.A.P.S (Funscript - Audio Processing System) is a desktop application that automatically generates Funscript files by analyzing the audio of your videos. Using audio processing algorithms and AI, it creates device movements synchronized with the music, rhythm, and energy of the content.

Designed for devices like The Handy, F.A.P.S can generate complete scripts without manual intervention, while offering full control over every parameter for those who want to fine-tune the results. Multiple generation moods are available, each tailored to different content types and styles.

Developed by CaptainHarlock

Here are some screenshots:

Generator:


Moods:

AI Analysis:

Edge & Cum Advanced Editor:


Pattern Editor:

Queue System:

Compare:

History:

Point Editor:

Combinator:

Preview Card:

Recently created moods:
“Fap Mixer”


“SixthSense”

“Pulse”

“Beat Rider”

Here’s a summary of the main features:

Features

:performing_arts: Generation Modes (Moods)

18 different movement styles:

Music-Based (:musical_note:):

  • AUTO - Analyzes each segment and automatically applies the most appropriate mood
  • Normal - Beat-based movement for music with defined rhythm
  • Relaxed - Slow and smooth waves
  • Simple - Direct beat-to-movement conversion
  • Crazy - Intense and fast movement
  • Crazy+Voice - Crazy mode with voice vibrations
  • Slow - Slow and deliberate movements
  • Fap Mixer :test_tube: - Demucs stem separation with role assignment (primary rhythm, secondary complement, texture, vibration) + optional GPU voice classification detecting moans, screams, breath and gagging with configurable per-type actions

Voice-Based (:microphone:):

  • Hypno - Voice-reactive patterns with customizable keyword detection via AI transcription, designed for HFO/ASMR/JOI content
  • Edging - Teasing patterns with edge/denial/release keyword detection and build-up cycles

Advanced AI (:brain:):

  • Pulse :brain: - 3-layer generation using BS Roformer 6-stem + DrumSep 5-drum-substem separation: rhythm base from beats + drum sub-beat injection, dynamic amplitude from weighted stem RMS, and movement zone shifting by stem dominance
  • SixthSense :brain: - 7-layer adaptive generation using 6-stem + 5-drum-substem AI separation with section-aware profiles, sub-beat precision and guitar solo detection

Video-Based (:bullseye:):

  • Beat Rider :bullseye: - Visual pattern tracking using HSV color detection and line-crossing analysis on video frames, generating movements from detected visual beats in a user-defined region of interest

JOI Modes (:robot:):

  • JOI Director :test_tube: - Phrase-by-phrase AI director using Whisper transcription + LLM to assign actions (stroke, tease, edge, pulse, vibrate, release…) with 26+ pattern types
  • JOI Patterns :test_tube: - LLM classifies each spoken phrase into pattern categories with intensity and tempo
  • JOI LLM :test_tube::construction: - LLM outputs continuous parameters per phrase generating parametric waveforms
  • JOI LLM_A / LLM_B :test_tube::construction: - Parametric variants with explicit or auto-derived waveform shapes (sine, sawtooth, pulse, vibrate, staircase, shake)

:brain: AI Analysis

Uses several AI models and services running locally on your GPU:

• All-In-One — Music structure analysis (BPM, beats, downbeats, sections, key, energy)
• Beat This! + BeatNet — Neural beat and downbeat tracking
• BS Roformer (6 stems) — Vocal, drums, bass, guitar, piano, other separation
• DrumSep (5 substems) — Kick, snare, toms, hi-hat, cymbals separation
• Demucs — 4-stem separation (vocals, drums, bass, other)
• Wav2Vec2 — Voice feature extraction and nonverbal vocalization classification (moaning, screaming, etc.)
• Whisper + Silero VAD — Speech transcription with voice activity detection
• LLM Models (local, quantized) — Text analysis for JOI modes.
• Librosa — Audio feature extraction (RMS energy, onsets, spectral analysis, HPSS)
• Shazamio — Music fingerprint identification
• MusicBrainz / Deezer / iTunes — Song duration and metadata lookup
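To make the energy side of the analysis concrete, here is a minimal, illustrative sketch of windowed RMS extraction in plain Python. The app itself uses librosa and GPU models for this; the function name and windowing below are assumptions, not its actual code.

```python
import math

def rms_windows(samples, sr, window_s=0.5):
    """Split a mono signal into fixed windows and return the RMS of each.
    Illustrative stand-in for the librosa-based feature extraction."""
    n = int(sr * window_s)
    out = []
    for start in range(0, len(samples) - n + 1, n):
        chunk = samples[start:start + n]
        out.append(math.sqrt(sum(x * x for x in chunk) / n))
    return out

# A quiet half followed by a loud half: the energy curve should rise.
sr = 1000
quiet = [0.1 * math.sin(2 * math.pi * 5 * t / sr) for t in range(sr)]
loud = [0.9 * math.sin(2 * math.pi * 5 * t / sr) for t in range(sr)]
energies = rms_windows(quiet + loud, sr, window_s=0.5)
```

Per-window energies like these are what an energy-aware mood would map to movement amplitude.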


:level_slider: Adjustable Parameters

Each mood has its own dedicated parameter panel. Common parameters include:

  • Minimum/maximum position (movement range)
  • Detection sensitivity
  • Maximum device speed
  • Minimum interval between points
  • Intensity normalization between songs
  • Inverted mode (flips movement direction)
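As a rough illustration of how common parameters like these could be applied to raw generated points (hypothetical names, not the app's actual API):

```python
def apply_common_params(points, pos_min=10, pos_max=90,
                        min_interval_ms=50, inverted=False):
    """Post-process raw (at_ms, pos 0-100) actions with the common mood
    parameters described above. Names are illustrative."""
    out = []
    last_at = None
    for at, pos in sorted(points):
        if last_at is not None and at - last_at < min_interval_ms:
            continue                       # enforce minimum interval
        pos = pos_min + (pos / 100) * (pos_max - pos_min)  # rescale to range
        if inverted:
            pos = pos_min + pos_max - pos  # flip movement direction
        out.append((at, round(pos)))
        last_at = at
    return out

actions = apply_common_params([(0, 0), (30, 100), (100, 100)],
                              pos_min=20, pos_max=80, min_interval_ms=50)
```

With a 50 ms minimum interval, the point arriving 30 ms after its predecessor is dropped, and the remaining positions are rescaled into the 20-80 band.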

:bullseye: Edge & Cum Editor

Visual timeline editor for creating climax sequences:

  • Multiple block types: Edge Base, Fast Vibration, Post-Climax, Silence and Custom
  • 9 available patterns in “Edge Base”: Crescendo, Decrescendo, Normal, Random, Build-up, Denial, Tease, Pulse, Wave Train
  • 8 available patterns in “Fast Vibration”: Stepped, Prog. Stepped, Tip Edge, Base Edge, Tip/Base Edge, Earthquake, Shutter, Surge
  • 8 available patterns in “Post-Climax”: Slow Up/Down Curve, Slow Up/Down Vibro, Tip Care, Base Care, Milk, Torture, Cool Down, Aftershock
  • 5 intensity levels; in “Edge Base” patterns, vibration is available with different strength and frequency levels
  • Drag and drop blocks
  • Pattern preview
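For intuition, a “Crescendo”-style block could be sketched like this (purely illustrative; the real patterns and their parameters live in the editor):

```python
def crescendo_block(start_ms, duration_ms, cycles=6, floor=70, peak=5):
    """Sketch of an "Edge Base / Crescendo" style block: oscillations whose
    depth grows over the block. Positions: 100 = fully up, 0 = base.
    Parameter names are illustrative, not the app's."""
    pts = []
    step = duration_ms / (cycles * 2)           # half-cycle per point
    for i in range(cycles * 2 + 1):
        t = int(start_ms + i * step)
        grow = (i // 2 + 1) / cycles            # amplitude ramps up
        if i % 2 == 0:
            pts.append((t, floor))              # rest level between strokes
        else:
            pos = floor - (floor - peak) * min(grow, 1.0)
            pts.append((t, round(pos)))         # progressively deeper strokes
    return pts

block = crescendo_block(0, 6000, cycles=6)
```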

:bar_chart: Metrics on Completion

After each generation, the app displays:

  • Points generated and duration
  • Speed statistics (minimum, maximum, average)
  • Distribution by speed ranges
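The listed metrics can be derived directly from the action list; a hedged sketch (speeds here are in position units per second, not millimetres):

```python
def speed_stats(actions):
    """Compute completion metrics like those listed above from a
    funscript's (at_ms, pos) actions. Illustrative only."""
    speeds = []
    for (t0, p0), (t1, p1) in zip(actions, actions[1:]):
        dt = (t1 - t0) / 1000.0
        if dt > 0:
            speeds.append(abs(p1 - p0) / dt)  # position change per second
    return {
        "points": len(actions),
        "duration_s": (actions[-1][0] - actions[0][0]) / 1000.0,
        "min": min(speeds),
        "max": max(speeds),
        "avg": sum(speeds) / len(speeds),
    }

stats = speed_stats([(0, 0), (500, 100), (1500, 50)])
```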

:globe_showing_europe_africa: Languages

  • English
  • Español

:card_index_dividers: Tabs

  • Generator - Main tab where you load a video/audio file, select a mood, configure parameters, run AI analysis and generate the funscript.

  • Queue - Batch processing queue. Add multiple files with their own configurations and process them sequentially. Supports pause, cancel, time estimation and per-item status tracking.

  • Pattern Editor - Visual editor for creating custom movement patterns. Draw points on a canvas to define waveforms (sawtooth, step, zigzag, pulse) with configurable cycles, duration and position range. Patterns can be Repetitive (looped) or Adaptive (scaled to music). Supports undo/redo. Created patterns become available in mood configurations.

  • Compare - Comparison of two funscripts. Load two .funscript files and visualize them overlaid on a graph with zoom and scroll navigation. Shows statistics like point count, speed differences and similarity metrics.

  • History - Log of all past generations in the current session. Shows filename, mood used, point count, duration and timestamp. Allows re-applying a previous configuration with one click (“Use Config” button).

  • Combinator - Combines parts from funscripts generated with different moods for the same video. Load a video, auto-discovers all mood variants generated for it, then select time ranges from each source and insert them into a combined timeline. Features GPU-accelerated timeline rendering, video playback (VLC), drag markers for range selection and undo/redo. Export the combined result as a single funscript.

  • Point Editor - Full-featured funscript editor optimized for large files (20k+ points). GPU-accelerated timeline with real-time video playback sync. Color-coded velocity indicators. Supports point selection, multi-selection, drag editing, undo/redo (100 levels).

  • Preview Card - Generates visual preview/info cards for sharing funscripts.
    Extracts a grid of video frames (configurable rows/columns with frame offset control)
    and composites them with optional song list, speed heatmap and statistics. Supports vertical
    (9:16) and horizontal (16:9) layouts, dark/light themes, and exports as JPG.
    Also includes a Multi Funscript Card mode that auto-discovers all mood variants
    generated for the same video and creates a combined comparison card.


Technical Features

GPU Acceleration

  • CUDA support for NVIDIA GPUs
  • Support for modern AMD GPUs (Experimental, thanks to @Alexcreeds )
  • Processing with PyTorch and torchaudio
  • Automatic CPU fallback if no GPU is available (using librosa; not fully verified at this time, so GPU acceleration remains the recommended option)
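The device-selection logic presumably boils down to something like this sketch, assuming PyTorch is used for GPU detection as the feature list states:

```python
def pick_device():
    """Choose the processing backend: CUDA if PyTorch sees a GPU,
    otherwise fall back to CPU (where the app would use its librosa paths).
    A sketch, not the app's actual code."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"

device = pick_device()
```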

Cache System

  • Saves AI analysis to avoid repeated processing
  • Harmonic/percussive separation cache
  • “Cache Manager”
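The cache idea, sketched minimally: key on a hash of the audio file and reuse the stored result on a hit. The app's real cache format is not shown here; this is only the principle.

```python
import hashlib
import json
import os
import tempfile

def cached_analysis(audio_path, analyze, cache_dir):
    """Reuse a saved analysis result when the same file is processed
    again. Minimal sketch of the caching idea, not the app's format."""
    with open(audio_path, "rb") as f:
        key = hashlib.sha256(f.read()).hexdigest()
    cache_file = os.path.join(cache_dir, key + ".json")
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return json.load(f), True           # cache hit
    result = analyze(audio_path)
    with open(cache_file, "w") as f:
        json.dump(result, f)
    return result, False                        # computed and stored

# Demo with a dummy "analysis" that records how often it runs.
calls = []
def fake_analyze(path):
    calls.append(path)
    return {"bpm": 128}

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "song.wav")
    with open(src, "wb") as f:
        f.write(b"fake audio bytes")
    r1, hit1 = cached_analysis(src, fake_analyze, d)
    r2, hit2 = cached_analysis(src, fake_analyze, d)
```

The second call never reruns the expensive analysis, which is exactly why the Queue's "AI Analysis Only" option pays off.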

Project Management

  • Save/load projects with all parameters
  • Reusable templates
  • Generation history

Device Safety

  • Handy Safe mode to limit speeds
  • Real-time speed verification
  • Alerts for problematic segments

Formats

  • Input: MP4, MKV, AVI, WebM, MOV, MP3, WAV, FLAC, OGG, M4A
  • Output: .funscript, .hc (chapters), .fproj (projects)
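The .funscript output is simple JSON; a minimal writer, assuming the widely used layout with an "actions" list of {"at", "pos"} entries (header fields shown are the common ones, not necessarily exactly what F.A.P.S emits):

```python
import json
import os
import tempfile

def write_funscript(actions, path, inverted=False):
    """Serialize (at_ms, pos) pairs in the common .funscript JSON layout."""
    doc = {
        "version": "1.0",
        "inverted": inverted,
        "range": 100,
        "actions": [{"at": int(at), "pos": int(pos)} for at, pos in actions],
    }
    with open(path, "w") as f:
        json.dump(doc, f)
    return doc

path = os.path.join(tempfile.gettempdir(), "example.funscript")
doc = write_funscript([(0, 10), (500, 90), (1000, 10)], path)
```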

Requirements

Minimum

  • Windows 10/11
  • 8 GB RAM (16 GB RAM recommended)
  • 10 GB of disk space for the application. Please note that generated caches can take up a lot of additional space.
  • Nvidia GPU with 4 GB+ VRAM (8 GB+ VRAM recommended); the GTX 10xx through RTX 50xx series work. (Experimental support for AMD GPUs added in v0.9.)

NOTE: @Alexcreeds has taken the time and effort to port v0.8 to ROCm to make it work with certain AMD GPUs. You can also find the link below if you want to try it out.


Installation & First Run

Prerequisites

  • None
  • IMPORTANT: Don’t use spaces in the installation path

Installation

Running the Application

  1. Double-click run.bat
  2. The application window will open — you’re ready to generate!

Changelog v0.8
  • Added new “Pulse” mood, designed as an improved version of “Normal.”
  • The application now uses “embedded Python”… No installation required, just extract and run!
  • Added logo for the application, hope you like it, lol
  • Added a new “Preview Card” tab to generate a Card with grid of images, heatmap, song list, and other details.
  • Review and improvement of the “Normal” mood, added new “Spacing” control.
  • Review and improvement of the “Crazy” mood.
  • Review and improvement of the “Crazy+Voice” mood.
  • Review and improvement of the “Relaxed” mood, added new “Onset detection” drop-down menu and “Fill gaps” option.
  • Review and improvement of “AUTO” and its 3 methods of use. Completely rewritten the BPM+RMS method, which now also uses Sections.
  • Removal/Hiding of unused parameters in moods.
  • New song detection system, better and faster
  • New display method for “Edge & Cum” patterns, now correctly represented graphically in the timeline.
  • Ability to disable songs, useful for avoiding processing intros or interludes that are not songs.
  • Option to rename the song block or redo the search through Shazam.
  • Added the option to do “AI Analysis Only” in Queue, with selection of the type of analysis. Very useful for running several AI Analyses automatically (the part that takes the most time) and then having the cache ready to start generating the Funscripts
  • Fixed bug in non-stereo or non-mono audio; now any audio is converted to stereo or mono as needed.
  • Spanish translation for “SixthSense” controls
  • Several missing translations for English
Changelog v0.8.1


  • Added missing dependencies (Pillow)
  • Added the \wheels folder and install.bat again
Changelog v0.9
  • New mood: “Beat Rider,” which takes a radically different approach from the rest; it doesn’t analyze audio but rather visual data, and is designed to quickly generate funscripts for videos that use a beat indicator (such as Cockheros and similar)

  • “Edge & Cum”: Added “Generate & Preview” inside “Edge & Cum”: Allows fine-tuning of Edge & Cum patterns, and the edits can be saved to apply them to any generated Funscript (baked points); these edits are also saved when the project is saved

  • “Edge & Cum”: New method in “Analyze & Suggest” for automatically adding blocks, based on visual analysis (note: the video must contain relevant text such as “cum,” “stop,” etc. This is configurable)

  • “Point Editor”: Added option to move forward/backward by song

  • “Point Editor”: Increased the maximum zoom in

  • “Point Editor”: Added volume control for the preview

  • “Point Editor”: When jumping between problem points (error detection in “Advanced”), the sliders and video time are now updated in real time

  • “Preview Card”: Added an option to generate animated GIFs, with various quality profiles, including one designed to create GIFs compatible with the EroScripts preview

  • “Fap Mixer”: Added a dropdown menu for selecting stems per block directly in the main UI (previously this had to be done in Configure AI)

  • “Fap Mixer”: Added a new, improved method for assigning stems to blocks

  • “Fap Mixer”: Added a “Simplif” slider to prevent the excessive creation of useless points (RDP filter)

  • “Normal”: Added “By Section” checkbox that automatically adjusts the “Spacing” value based on section type (starting from the current value in Spacing)

  • “Simple”: Added “Beats by Section” checkbox that automatically adjusts the “Beats” value based on section type (starting from the current value in Beats)

  • “Simple”: Added “RMS Windows by Section” checkbox that automatically adjusts the “RMS Window” value based on section type (starting from the current value in RMS Window)

  • “Slow”: Fixed an issue in the formula that caused flat or very slow-moving parts of the funscript

  • “Slow”: Added “By Section” checkbox that automatically adjusts the intensity of the movement based on the section type

  • “Slow”: Added “Min RMS Energy” slider to set a minimum movement

  • “Slow”: Fixed an issue with the Ease-In that generated many consecutive useless points

  • “Relaxed”: Added a dropdown menu to add extra movement to slower sections (4 patterns to choose from; can be applied only on ups, only on downs, or both, with a threshold control for applying it)

  • “Crazy”: No longer generates movement in parts of songs that the user has disabled (intro, between songs, ending)

  • “Crazy”: Added an RDP filter to prevent the creation of unnecessary redundant points

  • “Crazy”: Added a filter to prevent the generation of “flat zones” that caused short stutters

  • “Crazy+Voice”: Added an RDP filter to prevent the creation of unnecessary redundant points

  • “Crazy+Voice”: Added a filter to prevent the generation of “flat zones” that caused short stutters

  • “Hypno”: Critical, the controls for “keywords” weren’t actually working; they’ve been reviewed and improved

  • “Hypno”: Priority system for “keywords”, higher positions have higher priority

  • “Hypno”: Now, if “Detect Voice” is disabled, the generated cache (without voice detection) is saved, and when the same video is loaded again, the funscript can be generated directly without having to rerun “AI Analysis” without voice detection

  • “Hypno”: Added a new “Threshold” slider that controls the sensitivity for zone changes (low/medium/high)

  • “Hypno”: Added additional filters such as RDP, “flat zones”, and others to reduce the number of points while maintaining the movement.

  • “Hypno”: Fixed an issue with the installer that installs an incorrect version of Whisper (this could also affect “Edging” or the JOIs)

  • “Heatmap”: Resolution enhanced

  • Fixed MANY minor issues

  • Added more missing translations

  • Added VLC libraries: Avoids known compatibility issues with the user’s VLC versions

  • Added CPU fallback for All-in-One: Avoids a known issue with Nvidia 10xx Series GPUs (and likely other models not compatible with All-in-One via GPU)

  • Installation: There are now two installation files. “Install_Nvidia.bat,” which is the official installer currently in use. And “Install_AMD.bat,” which is experimental and based on the port created by user @Alexcreeds
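Several v0.9 entries mention an “RDP filter” for removing redundant points. For reference, here is a compact Ramer-Douglas-Peucker implementation over (time, position) points (illustrative, not the app's code):

```python
def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification: drop points that deviate
    less than `epsilon` from the chord joining the span's endpoints."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # find the interior point farthest from the chord
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x0) - dx * (py - y0)) / norm
        if d > dmax:
            dmax, idx = d, i
    if dmax > epsilon:
        left = rdp(points[:idx + 1], epsilon)
        right = rdp(points[idx:], epsilon)
        return left[:-1] + right     # avoid duplicating the split point
    return [points[0], points[-1]]

# Collinear middle points get dropped; a real corner survives.
simplified = rdp([(0, 0), (100, 10), (200, 20), (300, 90)], epsilon=2.0)
```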

DOWNLOAD LINKS

Old Versions

F.A.P.S v0.8.1:

8.18 GB file on MEGA

F.A.P.S v0.9:


DONATIONS:

The application is and always will be free, but it has taken a LOT of time and some money to create it. If you like it and think it deserves it, donations are welcome :hugs:

ENJOY !!! :grin: :smiling_face_with_horns: :sign_of_the_horns:


EXTRA: User @Alexcreeds has taken the time and effort to port the version to ROCm to make it work with certain AMD GPUs. Here is his v0.9 port if you want to try it out:


Here’s a demo video showing how to use the app.
Just over 1 hour (real time) showing the AI video analysis process, song adjustment, use of musical moods (Pulse, SixthSense, Normal, Relaxed, Slow, Simple, Crazy, Crazy+Voice, Fap Mixer, AUTO), using “Edge & Cum” to add patterns, using “Point Editor” to preview the generated funscript and make adjustments, and using “Preview Card” to generate Cards or animated GIFs.


Sounds good and looks good. Will you make the program open source?


This looks promising!

As Fireblade said, will you make the program open? That would be good.
I am Spanish, so I would try the program in that language when it’s released.


Yep, it will be free and modifiable if anyone wants to give themselves a headache, lol. It’s written in Python.
It’s my small contribution to this wonderful community that has given me so many great funscripts to enjoy!

Donations are welcome, but always OPTIONAL, in case anyone wants to show their appreciation for the work done and cover some of the costs of using advanced AI models for programming.

For the moment, I don’t plan to upload it to GitHub; I’m going to use MEGA to upload it. It only takes up about 300MB.
When installing, it uses the “venv” environment so as not to affect the system. The space used, taking into account the venv environment, is a total of 6.45GB + the Whisper AI model (I think that’s about 4-6GB more).

If anyone wants to be an early beta tester, send me a private message to talk about it.
I’m not far from giving it the green light. I need to review one of the tabs and do a final overview of the different moods using some test videos. Maybe in 2-3 weeks I’ll be able to release it to the public.


Have you checked out @k00gar’s work? Looks like you could help their team.

Are you referring to this?

It’s a completely different approach; that program is based on motion detection in the video. My application doesn’t do that; it’s based on the audio in the video, and it’s specially designed to quickly process and create Funscripts for long PMV videos, thanks to automatic song detection and adjustment, song part detection, sound analysis, different moods, etc.

Yeah, that’s what I am referring to, and I am aware of how FunGen works; I’m just saying that you can also help out that project if you so choose. They are always looking for more volunteers.

Ooo, sounds awesome. Will be subscribing to hear more :slight_smile:


Looks good so far, but it is hard to tell unless we can try it. Single-axis only, or multi-axis ready? Other audio-based approaches have lots of noise. Curious how far an audio-based approach can reach. Happy to give it a try. Great job. Thanks :slight_smile:


I would absolutely love to get my hands on this. You’re doing gods work brother


Thank you!
It’s single-axis, and designed for use with The Handy, i.e., taking into account its movement limitations (maximum speed of 500 mm/s, minimum speed of about 32 mm/s, minimum interval above 35 ms, 50 ms recommended). In reality, it shouldn’t be difficult to adapt it to other devices, including multi-axis ones; the question is how to use the axes, lol. In other words, you have to be clear about how to program it, and there are many possibilities… you could say that the rotation speed should be proportional to the intensity of the voice, for example, or that it should rotate every X beats, or that those X beats should be a lower or higher number depending on the BPM, the energy, or even the type of section (intro, verse, chorus, etc.). There are many possibilities, but I haven’t been lucky enough to try a multi-axis device yet, lol
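The Handy limits quoted above can be checked mechanically. Here is a sketch of such a validator; note that the 110 mm stroke length used to convert 0-100 positions into millimetres is an assumption for illustration, not a figure from the app.

```python
HANDY_MAX_SPEED = 500   # mm/s, figure quoted above
HANDY_MIN_SPEED = 32    # mm/s
MIN_INTERVAL_MS = 35
STROKE_MM = 110         # assumed full-stroke length for positions 0-100

def handy_problems(actions):
    """Flag segments of (at_ms, pos) actions that exceed the Handy
    limits described above. Illustrative, not the app's checker."""
    issues = []
    for i, ((t0, p0), (t1, p1)) in enumerate(zip(actions, actions[1:])):
        dt_ms = t1 - t0
        if dt_ms < MIN_INTERVAL_MS:
            issues.append((i, "interval below 35 ms"))
            continue
        mm = abs(p1 - p0) / 100 * STROKE_MM
        speed = mm / (dt_ms / 1000)
        if speed > HANDY_MAX_SPEED:
            issues.append((i, "too fast"))
        elif 0 < mm and speed < HANDY_MIN_SPEED:
            issues.append((i, "too slow"))
    return issues

problems = handy_problems([(0, 0), (100, 100), (120, 90), (2120, 80)])
```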

As for noise, I haven’t tried other programs that do this, so I can’t judge or make comparisons, but I suppose that if the intention was to use the entire audio spectrum, it’s more likely that noise will get in and cause the movements to not follow the rhythm of the music properly. The application has several “moods,” each with a different approach, but there are several systems that help make audio detection as accurate as possible:

  • Detects different songs (allowing for easy manual adjustments), so each song is processed separately, avoiding problems with, for example, songs that have volume differences. (Shazamio)
  • Extracts the different components of the audio into their parts: Drums, Bass, Other, and Vocals, avoiding mixing all the audio and allowing you to use, depending on the mood, what best follows the rhythm of the music, or assigning different reactions according to this separation (Demucs).
  • AI analysis of beats, BPM, and other parameters (All-in-One, or BeatNet)
  • Voice analysis to detect certain words and have the funscript generate a specific type of movement/reaction depending on the type of word; especially designed for non-PMV videos, such as hypno, joi, edging, hfo (Whisper)
  • Then, apart from all that, it has quite a few parameters to adjust it to your liking, or to each video.

Honestly, it’s impossible for me to test every funscript the application generates on The Handy with every change I try, but I’ve been able to test some, and my sincere opinion is that, although it may not be perfect and there will always be a specific point or moment where you’ll think, “This could be better here,” there are also parts where I’ve thought, “Wow, this is cool here!” lmao
(And the "Edge & Cum - Advanced Editor is sooo cool!)
I like that the funscript reacts to the rhythm of the music, and in most funscripts that’s how it will be. In any case, it comes with a point editor and real-time previewer (like ScriptPlayer) that allows you to edit the funscript and make quick changes, even quickly adding patterns in parts where, instead of following the music, you might prefer it to behave differently.
Also, since everyone has their own tastes, I created several “moods,” so those who like it fast and with vibrations can use “Crazy” or “Crazy+Voice,” those who like to follow the rhythm of the music more can use “Normal” or “Simple,” and those who like it slow can use “Relaxed” or “Slow.” For non-PMV videos with voice and teasing, there are the “Hypno” and “Edging” moods. In addition, there is the “AUTO” mood, which has three types of detection to use various parameters and different moods for each song or section.

The truth is that the options are overwhelming, and it would be helpful to have some help testing the generation of funscripts with various music files and getting feedback on them.


I really need this!!! plz let me!!! :heart_eyes:

Sure, no problemo.
I’m trying to get a small group of beta testers to iron out some final details before releasing it to the public.
If you want to participate, send me a message. There are already a few others interested.

Looks interesting. For a project this size a git instance seems unavoidable. Especially for possible future contributions.
With everything written in Python, shouldn’t the program be OS agnostic? Minimal requirements only list Windows 10/11 atm.

The minimum requirements mention that OS because it’s the one the app has been tested on, but as long as the AI models and libraries it uses can run on other systems (such as Linux), there shouldn’t be a problem. The only Windows-specific part is the installer, which is programmed to use the wheels included in the application files; those wheels are specific to Windows, Python 3.12, and CUDA 12.4 (the CUDA 12.4 build of PyTorch). Everything is installed in a venv environment so it doesn’t conflict with the system.

If you will still need testers, I’d like to throw my hat in the ring once you’re ready.

For what it’s worth, I have been using a program that takes audio and creates a stepmania/dance dance revolution beatmap automatically. I then run that through Beats4Fun to turn that into a funscript. I love PMVs, but I don’t think I’d be any good at scripting them. This sounds like it would be a whole lot easier, and I’d probably get a pretty solid result this way too.

This project looks very impressive. Although I haven’t used it yet (I’m preparing to install it), I’d like to ask first whether this project supports analyzing pure audio files?

If you like PMVs, I’m sure you’ll like the app. I think what makes it stand out from other programs that create funscripts from audio is the individual use of songs, the separation of stems (instruments), and above all, the variety of “moods” that can be used. I’m sure there will be one to your liking. In addition, within each mood there are options (sliders, checkboxes) that allow you to adjust it to your liking.
But as I mentioned earlier, it’s impossible for 100% of the points created to be perfect and to your liking, and in this case there are two things about the app that are very useful:

  • “Point Editor”: Real-time preview of the funscript alongside the video (similar to ScriptPlayer), and the ability to edit (move, create, delete points), insert preconfigured “patterns” as well as create your own, and a system for detecting potential problems (speed above 500mm/s, speed below 32mm/s, or intervals below 35ms) with auto-correction.
  • “Edge & Cum - Advanced Editor”: Probably in the climax parts, whether at the end or in the middle, you will want it to behave in a certain way to emphasize that climax instead of acting like the rest of the funscript following the music; in this case, you can add pattern blocks, both preconfigured and your own. Some are “adaptive” and adjust to the duration of the pattern, which is fully configurable, and others are “repetitive.”
    In summary, it may not be 100% perfect automatically, but with some simple and quick editing, you can improve those points that you want to be different.
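One of the auto-corrections described for the Point Editor (fixing intervals below 35 ms) could look like this sketch; the app's actual correction strategy may differ.

```python
def autocorrect_min_interval(actions, min_ms=35):
    """Push points forward in time so no two consecutive (at_ms, pos)
    actions are closer than `min_ms`. Hypothetical sketch."""
    fixed = []
    for at, pos in sorted(actions):
        if fixed and at - fixed[-1][0] < min_ms:
            at = fixed[-1][0] + min_ms     # shift the offending point later
        fixed.append((at, pos))
    return fixed

repaired = autocorrect_min_interval([(0, 0), (10, 100), (20, 0)])
```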

If you are interested in becoming a beta tester, send me a message.

GENERAL NOTE FOR THOSE WHO WISH TO BE BETA TESTERS: Although the application was initially created using the “librosa” library, which is pretty good and uses the CPU, as soon as I wanted to do more and better (AI models, GPU acceleration), I had to use CUDA. So beta testers must have an Nvidia GPU; I think 8GB of VRAM is enough. Perhaps, in the future, a version for AMD GPUs can be created, I’m not sure, we’ll see. For Intel (Arc) GPUs, I don’t think it can be easily adapted, so I’m ruling out this option.
The intention is that, for people who don’t have an Nvidia GPU, there will be a “fallback” and librosa will be used, so the application can also be used simply with a CPU, with some limitations or requiring more calculation time.
The problem is that all recent testing of changes and additions, including “moods,” has been done using CUDA, so right now I can’t guarantee that the fallback to librosa will work, or work similarly to how it would with CUDA. Honestly, it isn’t worth doing that now while things are still being adjusted and changed; it’s something I’ll review at the end.

Yep, of course. Although I haven’t had the chance to try it, the application is capable of using audio files alone. In fact, the video is only used for previews, nothing else. The first step of the application is to extract the audio, as that is what it will work with.
I’ll have to try it out, and there may be a small problem with the parts that require loading the video preview, but that’s easy to fix.


New feature in the application called “Combinator,” which allows you to mix the parts you want from the different funscripts that have been created using various moods, and combine them to your liking!
This is especially useful if you want to create a final funscript with clearly differentiated parts, allowing you to start with slow parts and gradually increase the intensity to end with much more dynamic and fast parts. Or you can have several parts within the same song to give a different touch to the solo part, for example.

Really easy to use:

  • Choose the video and it automatically searches for any funscripts generated by the application for that video.
  • Choose the one you want to use as a base.
  • Select the part you want from any of the available funscripts and click on “Insert Part.”
  • It allows you to save your session so you can resume your work later.
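The “Insert Part” step above could be sketched as follows (illustrative only; the real Combinator works on its GPU-accelerated timeline):

```python
def insert_part(base, source, start_ms, end_ms):
    """Replace the base funscript's (at_ms, pos) actions inside
    [start_ms, end_ms] with the same range from another mood's
    funscript. Sketch of the idea, not the app's code."""
    kept = [(t, p) for (t, p) in base if t < start_ms or t > end_ms]
    part = [(t, p) for (t, p) in source if start_ms <= t <= end_ms]
    return sorted(kept + part)

base = [(0, 10), (1000, 90), (2000, 10), (3000, 90)]
other = [(900, 50), (1500, 0), (2100, 100)]
combined = insert_part(base, other, 1000, 2000)
```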