OSRChat makes your OSR device smarter. It can connect to any LLM that follows the OpenAI API specification, and you can also connect to Ollama for local deployment, ensuring 100% privacy. Additionally, you can customize prompts to build your own conversational scenarios, similar to SillyTavern.
If you plan to use Ollama, the huihui_ai/glm-4.7-flash-abliterated:q4_K_S model is recommended (it requires 22 GB of VRAM). Also, make sure to load the model once on the Ollama side before sending a request; otherwise the first conversation may hang or appear unresponsive while the model is still loading.
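Ollama's documented way to preload a model is to send /api/generate a request with no prompt; the keep_alive field controls how long the model stays resident (-1 means indefinitely). A minimal standard-library sketch, assuming Ollama's default port 11434:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "huihui_ai/glm-4.7-flash-abliterated:q4_K_S"

# A generate request with no prompt only loads the model into VRAM;
# keep_alive=-1 keeps it resident until the Ollama server is stopped.
payload = json.dumps({"model": MODEL, "keep_alive": -1}).encode()

def preload(url: str = OLLAMA_URL) -> None:
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()  # blocks until the model has finished loading
```

Calling preload() once after starting the Ollama server means the first OSRChat conversation does not have to wait for the model to load.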
The current prompts are not perfect, so you may need to test and adjust them yourself.
OSRChat is essentially a server, and you can call its API from other projects. However, please do not expose it to the public internet; use it only within a secure, controlled local network, since I did not address network security at all when developing it.
I don’t trust any cloud service providers, even though they repeatedly promise privacy and security. By using OSRChat together with Ollama or vLLM, all data can be processed entirely within your local network, ensuring 100% privacy. OSRChat also does not record any chat history; if you really need to keep it, I’ve provided an export button that lets you save the current conversation.
You may have noticed a row of buttons at the top of the Chat page. The upward arrow button allows you to export the full context of the current conversation, along with the system prompt, as a JSON file. You can modify the exported JSON file (for example, by editing the LLM’s responses), and then use the downward arrow button to import the modified file. This will allow you to achieve the functionality you described.
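Programmatically editing an exported conversation might look like the sketch below. Note that the structure shown is an assumption based on the standard OpenAI messages format; inspect your own exported file for OSRChat's actual schema before scripting against it.

```python
import json

# Hypothetical exported conversation, assuming an OpenAI-style
# "messages" layout; OSRChat's real export may differ.
exported = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello there."},
    ]
})

data = json.loads(exported)

# Rewrite the most recent assistant reply before re-importing the file.
for msg in reversed(data["messages"]):
    if msg["role"] == "assistant":
        msg["content"] = "Hello! (edited)"
        break

edited = json.dumps(data, ensure_ascii=False, indent=2)
```

The edited string can then be written back to a file and loaded through the downward-arrow import button.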
Being able to modify the context more quickly would indeed be a great improvement. I will consider implementing this in a future version.
I actually tried developing it based on SillyTavern. However, SillyTavern is simply too bloated: so many settings are piled together that it's hard to know where to start, and I don't want to add even more complexity to it.
My product design philosophy has always been “simple is better,” which is why OSRChat exists. The OSRChat installation package is only 19.3 MB.
To support more LLM APIs, I used the OpenAI library. Since the release of GPT-3.5, many providers have adopted or made their APIs compatible with the OpenAI API specification so that they can be easily swapped with one another. This includes commonly used local deployment tools such as Ollama and vLLM.
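Concretely, the request body is identical across these backends; only the base URL changes. A standard-library sketch of the shared chat-completions shape (the model name is a placeholder, and the ports listed are the common defaults for each tool, so verify them for your setup):

```python
import json

def build_chat_request(model: str, system_prompt: str, user_message: str) -> str:
    """Build an OpenAI-style chat-completions request body."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }
    return json.dumps(payload)

# The same body can be POSTed to any compatible backend's
# /chat/completions endpoint; only the base URL differs:
#   Ollama:    http://localhost:11434/v1
#   vLLM:      http://localhost:8000/v1
#   KoboldCpp: http://localhost:5001/v1
body = build_chat_request("llama3", "You are a helpful assistant.", "Hello!")
```

This interchangeability is exactly why the OpenAI library works as a single client for all of them.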
As far as I know, KoboldCpp is compatible with the OpenAI-style API (the base URL might be http://localhost:5001/v1), so it should be able to connect directly.
I’m glad you like it.
Regarding prompts, since everyone has their own preferences and prompts may need slight adjustments depending on the model being used, I can only provide the most basic and widely compatible prompts as examples.
You can open the prompts directory from the tray icon’s right-click menu and modify them according to your preferences.