Sorry for the full-on AI writing, but it’s late and I’m waiting on 2 hrs of debugging right now.
Hey eroscripts. Been lurking here a while pulling scripts for the handy (Thank you to all you script makers paid or unpaid, I have spent way too much money on patreons haha)
I’ve been building Flowstate for the past few months — a self-hosted AI companion that runs on your own hardware and ties together voice-cloned personas, image-aware conversation, and AI-driven Handy control. It’s not vibes — there’s a real funscript pattern library indexer feeding the JOI Director, and a hardware velocity safety cap so the AI can never accidentally push past 400 mm/s.
Tonight I’m running a 6-agent verification squad over the codebase (basically 6 instances of Claude tearing through 63 tests across the app) and posting this while waves complete. Figured it’s a good moment to share where it’s at.
What’s working right now (personally tested)
- Voice-cloned personas: drop a .wav, get a unique voice per character. Real-time streaming TTS, switches voice mid-conversation when you swap persona.
- Roleplay flow: the persona stays in character, responds to dirty talk requests, picks up scenario hints. Honest take: she’s a bit on the coy/teasing side right now (see image 3, “let’s just enjoy the tease” instead of going full filth on demand). Tuning the brain temperature + system prompts to be filthier is on my list.
- JOI Director chain — Gemma vision model decides the move (sp/dp/rng) per turn, syncs to your Handy via the v1 API. Hardware-capped at 67 sp (= 400 mm/s) so it can’t hurt you.
- Pattern library indexer: I fed script snippets provided by @Slibowitz into a Python indexer that computes peak velocity, position envelope, and action density. Each pattern gets mapped to Handy sp/dp/rng bands so the AI dispatches from real human-authored data, not made-up numbers.
- Vision pipeline — point it at a folder of persona photos, it tags each one (skin level / mood / expression / intensity) AND writes an uncensored prose caption. During roleplay the persona picks the right photo to send based on the current mood. See image 4 — that photo was pulled from her own library because the conversation was matching “tease” mood.
- Live transcript / mic always-on — I can just talk and it transcribes (Faster-Whisper locally). Image 1 shows “Talk dirty to me” being picked up via voice.
- 3-machine LLM cluster — main PC (5090, 32 GB) runs the brain + scenario writer, laptop (8 GB) holds the vision/director, third PC handles only TTS. All federated through one LM Studio endpoint.
- 180 funscripts indexed into mood data: patterns_index.json with sp/dp/rng bands per named pattern. The AI dispatches from named moods like doggy_slow, cowgirl_fast, bj_tongue_tease.
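To give a rough idea of what the pattern indexer in the list above computes, here is a minimal sketch, not the actual Flowstate code. It assumes the standard funscript layout (an "actions" list of {"at": milliseconds, "pos": 0-100} entries). The function name and the output keys are made up for illustration, and velocity is reported in position units per second because converting to mm/s would need a stroke length I’m not assuming here.

```python
from typing import TypedDict


class PatternStats(TypedDict):
    peak_velocity: float   # position units (0-100 scale) per second
    pos_min: int           # bottom of the position envelope
    pos_max: int           # top of the position envelope
    action_density: float  # actions per second over the snippet


def analyse_funscript(script: dict) -> PatternStats:
    """Compute simple per-pattern metrics from a parsed funscript dict."""
    actions = sorted(script["actions"], key=lambda a: a["at"])
    if len(actions) < 2:
        raise ValueError("need at least two actions to measure velocity")

    peak = 0.0
    for prev, cur in zip(actions, actions[1:]):
        dt = (cur["at"] - prev["at"]) / 1000.0  # ms -> s
        if dt <= 0:
            continue  # skip duplicate timestamps
        peak = max(peak, abs(cur["pos"] - prev["pos"]) / dt)

    positions = [a["pos"] for a in actions]
    duration = (actions[-1]["at"] - actions[0]["at"]) / 1000.0
    density = len(actions) / duration if duration > 0 else 0.0

    return {
        "peak_velocity": peak,
        "pos_min": min(positions),
        "pos_max": max(positions),
        "action_density": density,
    }


# Hypothetical four-action snippet, just to show the shape of the data.
sample = {
    "actions": [
        {"at": 0, "pos": 0},
        {"at": 500, "pos": 100},
        {"at": 1500, "pos": 20},
        {"at": 2000, "pos": 80},
    ]
}
stats = analyse_funscript(sample)
```

Mapping these numbers into sp/dp/rng bands is a separate step and depends on how you want to bucket them, so it’s left out of this sketch.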
What needs work (honest list)
- TTS latency — works (image 1 shows the live transcript path firing), but the engine swap when the TTS server unloads/reloads between Chatterbox/IndexTTS2/Fish is 3-5 seconds. Hoping the verification agents help me trim that.
- Persona aggressiveness ceiling — the brain (qwen2.5-7b-uncensored) is willing but still hedges into “let’s tease first” mode a lot. Need to tune the system prompts to drop the safety hedging completely.
- JOI scenario re-injection bug — the scenario string isn’t auto-clearing per turn (verification just caught this — gets re-injected every reply until you toggle off). Real bug, fixing it.
- “Stop” voice command doesn’t fully halt the device — currently maps to “afterglow” mood (sp 8-20) instead of true sp=0. Only the red Stop button does true zero. Considering whether to harden voice “stop” to force sp=0.
- No persona creator UI yet — currently you edit JSON to make a new persona. Wizard is in the queue.
- Mobile build is rough — there’s a portable build path but no installer yet.
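For the voice “stop” item in the list above, this is roughly what hardening it could look like. It’s a sketch under assumptions, not what’s in the codebase today: the function names and the stop-word set are hypothetical, and the only number taken from the post is the 67 sp cap.

```python
import math

MAX_SP = 67  # speed cap from the post (stated there as 400 mm/s)
STOP_WORDS = {"stop", "halt"}  # hypothetical list of hard-stop words


def clamp_sp(requested: float, cap: int = MAX_SP) -> int:
    """Clamp a requested speed into [0, cap]; NaN falls back to 0."""
    if math.isnan(requested):
        return 0
    return int(round(max(0.0, min(float(cap), requested))))


def resolve_speed(transcript: str, requested_sp: float) -> int:
    """Return the speed to send: 0 if a stop word was heard, else the clamped request."""
    words = (w.strip(".,!?") for w in transcript.lower().split())
    if any(w in STOP_WORDS for w in words):
        return 0  # true zero, never a low-speed mood
    return clamp_sp(requested_sp)
```

The plain word match means something like “don’t stop” would also halt the device. For a safety path I’d rather it fail toward stopping, but that is a design call.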
Tech stack
- Backend: FastAPI + WebSocket, Python 3.11
- Frontend: React + Vite + Zustand + Tailwind
- LLM serving: LM Studio cluster mode (free, GUI). Works with any OpenAI-compatible endpoint.
- TTS: I strongly recommend Pinokio for one-click TTS installs — Chatterbox / IndexTTS2 / F5-TTS / Fish Speech / VibeVoice all install through it without Python env wrangling. Flowstate talks to the Ultimate-TTS-Studio Gradio server.
- Hardware: Handy v2 API (HAMP mode), 400 mm/s hardware safety cap baked into the controller.
- STT: Faster-Whisper local, en-AU/en-US, always-on mic with adjustable noise gate.
- Vision: any vision-capable LLM (I use amoral-gemma3-12b-vision).
Hardware reality
What I’m running it on:
- Main PC: RTX 5090 (32 GB VRAM)
- Laptop: ~8 GB VRAM
- Third PC: ~12 GB VRAM (TTS only)
You don’t need three machines. A single 16-24 GB GPU running smaller quants of each model works fine; you’d just trade off some speed. Even a 12 GB card with a 7B brain + a 4B vision model + a small embedding model handles the whole stack on one PC.
Patreon interest gauge
I’ve been building this solo and I’m close to a state where I’d feel comfortable letting other people use it. If anyone here is interested in supporting development and getting access to a fully-tested WIP, I’m considering opening a small Patreon ($5-15/month range depending on tier). This is not a done deal. As you’ll see further down, I’m still not sure what I should do, mainly because I’ve spent days and days and days on this, plus a lot of IRL money on Gemini and Claude API keys/subscriptions.
What patrons would get — the skeleton of Flowstate:
- The full Flowstate framework (FastAPI backend + React UI)
- The funscript pattern indexer (drop your own scripts, get them mapped into AI-dispatchable moods)
- The JOI Director module + Handy connection layer with safety cap
- The vision + caption pipeline
- A few starter personas tuned to different VRAM budgets (low / mid / high) with template system prompts
- Setup guide for LM Studio + Pinokio TTS + persona creation
What’s NOT included:
- My specific model picks (you choose your own from LM Studio’s catalog)
- My voice clones (you source or record your own .wav)
- My personas (you build your own from the templates)
Not committing to anything yet, just gauging interest. Drop a reply if this scratches an itch, especially if you’ve got specific features you’d want or specific funscript pattern types you’d want supported. The pattern library is the easiest thing to extend if you can send me some scripts to analyse and add to the database. (The script database would be sold as a .zip. I’m contemplating uploading the skeleton and tutorial to GitHub but keeping the actual script analysis, the blood and muscles I guess, behind a paywall? Not sure yet, and I’m still not at a stage to upload it.)
Cheers, and thanks to this community for the pattern inspiration. @Slibowitz gave me script snippets for this project: short scripted sequences taken from actual scripts. Thank you! The rest I took from OFS script packs and other free software add-ons.