There are already several AI services online that can control your toy. They are almost always behind a paywall, which for me personally is an instant turn-off and something I will never use.
That is why I’ve been considering installing a local AI (LLM) on my laptop for a while now.
I just discovered ‘Ollama’, which is an easy tool. There will certainly be other and better ones, but I still have to investigate that. (You can definitely help me.)
But now I’d like to find a good unlimited/uncensored AI model.
First of all, I want to be able to have nice erotic & naughty conversations locally, and to connect it (via other software like OSRChat) to my OSR2+ so that my OSR can be controlled during that conversation.
Even better would be if that AI model could also edit my own photos or videos, and generate pornographic photos and possibly videos.
And the ultimate would, of course, be a naked AI avatar/girlfriend on your screen/VR headset in real time, with which you can speak and which controls your toy in real time.
But I’m afraid we’ll have to wait a while for that…
Are there other people working on this who have already achieved something?
Please share your setups and results.
Tips and more information are very welcome.
Which AI models are good for this?
You won’t be able to do anything quickly with laptop hardware. Even LLM chat won’t be fast; we are talking probably 30 seconds or more for a single response. The other things you are talking about are a pipe dream to run at any reasonable speed on a laptop.
To answer your actual question: certain models are better than others, and I think you want to look for quantized models. A good place to find open-source models is the Models page on Hugging Face; there are many uncensored models available there. Generally, the bigger the base size of the model, the slower it runs, but the better the quality.
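To make the size trade-off concrete, here is a rough back-of-the-envelope sketch of how much memory a quantized model needs. The function names, the bits-per-weight figures, and the 1.5 GiB overhead allowance are my own assumptions for illustration; real usage also depends on context length and the runtime, so treat the result as a ballpark only.

```python
def estimate_model_gib(params_billion: float, bits_per_weight: float,
                       overhead_gib: float = 1.5) -> float:
    """Rough memory needed to load a quantized model, in GiB.

    params_billion:  model size in billions of parameters (e.g. 12 for a 12B model)
    bits_per_weight: average bits per weight after quantization (assumed value;
                     5-bit quantizations land somewhere around 5.5)
    overhead_gib:    assumed allowance for context cache and runtime buffers
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30 + overhead_gib


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if the estimated footprint fits entirely in the given VRAM."""
    return estimate_model_gib(params_billion, bits_per_weight, overhead_gib) <= vram_gib


# Example: compare a 12B model at ~5.5 bits with a 70B model at ~4.5 bits
# against a 24 GiB card. The bit widths are illustrative assumptions.
for name, size_b, bits in [("12B, 5-bit", 12, 5.5), ("70B, 4-bit", 70, 4.5)]:
    needed = estimate_model_gib(size_b, bits)
    print(f"{name}: ~{needed:.1f} GiB, fits in 24 GiB: {fits_in_vram(size_b, bits, 24)}")
```

If the estimate is larger than your VRAM, the model either will not load or will spill over to system RAM, and that is where the speed drops off badly.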
This site: https://eu.daimonia.app/ provides a good framework for an LLM to control toys and lets you set up everything using local resources if you want, but it’s for vibe toys/estim. OSRChat seems like a better choice for your use case.
Without knowing your hardware it’s hard to make a recommendation, and I’m not up to date on all the latest developments. I can say that even with a desktop 3090 with 24 GB of VRAM, running an LLM locally is not exactly good. While certain models run quickly on local hardware (Mistral-Nemo-Instruct-2407.Q5_K_M is one example that runs quickly on my 3090), they struggle with things like context in longer conversations and repetitiveness. Granted, my hardware is showing its age, but when you use the online services, your inputs are processed by clusters of 5090s or even more powerful enterprise-level hardware to get responses to you that fast.
TBH, from the way this post is worded it sounds like you don’t really understand how local LLMs work. You’ll have to figure this out on your own; no one is going to do your research for you and tell you what works best for your particular setup. There are guides out there that can get you started (there are some decent ones on eu.daimonia), but knowledge is earned, and you’ll have to experiment a lot.
My advice is to Google which models can run on your GPU. VRAM is probably the biggest factor. Try them out in Ollama and see for yourself how the performance is.
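If you want a number rather than a gut feeling, a minimal sketch like the one below can measure generation speed through Ollama’s local REST API. It assumes Ollama is running on its default port (11434) and that the model has already been pulled; the helper names `tokens_per_second` and `benchmark` are made up for this example.

```python
import json
import urllib.request


def tokens_per_second(response: dict) -> float:
    """Compute generation speed from a non-streaming /api/generate response.

    Ollama reports eval_count (tokens generated) and eval_duration
    (time spent generating, in nanoseconds).
    """
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def benchmark(model: str, prompt: str, host: str = "http://localhost:11434") -> float:
    """Send one prompt to a local Ollama server and return tokens per second."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # Generous timeout: slow hardware can take a long time for a single reply.
    with urllib.request.urlopen(request, timeout=600) as reply:
        return tokens_per_second(json.load(reply))
```

Calling something like `benchmark("mistral-nemo", "Write two sentences about the weather.")` should then give a rough tokens-per-second figure. The model tag is just an example; use whatever you have pulled. Run it a few times, since the first call also includes the time to load the model into memory.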
I have dabbled in this a bit and found it’s not really worth it. Even if you do get it working, LLMs are too stupid to be very interesting for long IMO, especially the open-source uncensored models that can actually run on local hardware that doesn’t cost over 10k.
Thanks for bringing me back down to earth. Secretly, I was afraid of this.
Indeed, I still have TOO little knowledge about this and I was TOO enthusiastic about the ideas.
Of course, I have to do my own research, but my intention was more to get some help along the way, as you have already given me, and to talk about this topic with people who have already tried it.
Anyway thank you so much for all the information you’ve already given me and for taking the time to respond in detail, and also give me a reality check.
In the meantime I have found that it does indeed run too slowly locally.
My hardware is also a few years old.
CPU: AMD Ryzen 9 5900HX
GPU: RTX 3080 16GB
So it does seem like it might not be worth it (yet) with our current hardware.
I’m still going to do some further research and try things out.
So tips and/or experiences are still welcome here.