Merging of the Hordes. The AI Horde is live!

A while back (gosh, it occurs to me this project is half a year old by now!) I took significant steps to join the two forks I had made of the AI Horde (one for Stable Diffusion and one for Kobold AI), as their diverging code was too difficult to maintain and keep at parity with the features and bug fixes I kept adding.

Then later on, I realized that my code just could not scale anymore, so I undertook a massive refactoring of the code-base to switch to an ORM approach. Due to the time criticality of that refactor (at the time, the stable horde was practically unusable due to the sheer load), I focused on getting the stable horde API up and running and disregarded the KoboldAI API, as that was running stably on a different machine and didn’t have nearly as much traffic to be affected.

Once that was deployed, a number of other fires had to be constantly put out and new features on-boarded, as Stable Diffusion is growing by leaps and bounds. That meant I never really had the time to onboard KoboldAI to the ORM as well, especially since the code required a refactor to allow two types of workers to exist.

Later on, I added Image Interrogation capabilities as well, which incidentally required that I set up the horde to handle multiple types of workers. This led me to figuring out how to do ORM class inheritance (which required me figuring out polymorphic tables and other fun stuff), but it also meant that a big part of the groundwork was laid to allow me to add the text workers (which is the kind of thing that does wonders to get my ADHD brain over its executive dysfunction).
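To make the polymorphic-table idea a bit more concrete, here is a minimal sketch of single-table inheritance using only Python's built-in sqlite3 module. It is not the horde's actual ORM code: the table layout, the `worker_type` discriminator column, the `max_pixels` and `max_length` fields and all class names are assumptions made up for illustration. The point is just that one shared table plus a discriminator lets a single query return every kind of worker as the right class.

```python
import sqlite3


class Worker:
    """Base class; subclasses only differ by their discriminator value here."""
    worker_type = "worker"

    def __init__(self, name, **extra):
        self.name = name
        self.extra = extra


class ImageWorker(Worker):
    worker_type = "image"


class TextWorker(Worker):
    worker_type = "text"


# Map the discriminator stored in each row back to a Python class.
WORKER_CLASSES = {cls.worker_type: cls for cls in (ImageWorker, TextWorker)}


def create_schema(conn):
    conn.execute(
        "CREATE TABLE workers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "worker_type TEXT NOT NULL, "  # discriminator column (hypothetical)
        "max_pixels INTEGER, "         # only meaningful for image workers
        "max_length INTEGER)"          # only meaningful for text workers
    )


def save(conn, worker):
    conn.execute(
        "INSERT INTO workers (name, worker_type, max_pixels, max_length) "
        "VALUES (?, ?, ?, ?)",
        (worker.name, worker.worker_type,
         worker.extra.get("max_pixels"), worker.extra.get("max_length")),
    )


def load_all(conn):
    """Fetch every row and build the class named by its discriminator."""
    rows = conn.execute(
        "SELECT name, worker_type, max_pixels, max_length "
        "FROM workers ORDER BY id"
    )
    return [
        WORKER_CLASSES[worker_type](
            name, max_pixels=max_pixels, max_length=max_length
        )
        for name, worker_type, max_pixels, max_length in rows
    ]


conn = sqlite3.connect(":memory:")
create_schema(conn)
save(conn, ImageWorker("gpu-box", max_pixels=512 * 512))
save(conn, TextWorker("kai-box", max_length=80))
loaded = load_all(conn)
print([type(w).__name__ for w in loaded])
```

A real ORM does this mapping for you, but the underlying trick is the same: the row says which subclass it belongs to.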

Since then, it’s been constantly on the back of my mind that I need to finally do the last part and merge the two hordes into a single code base. I had kept the KAI horde in a single lonely branch called KAI_DO_NOT_DELETE (because I deleted the other branch once during branch cleanup :D) and kept the single-core horde node running. But requests for improvements and bug fixes on the KAI horde kept coming, and the code base had diverged so much by now that it was quite a mess to even remember how to update things properly.

The final straw was when I noticed the traffic to the KAI Horde had also increased significantly, probably due to the ease of using it through KoboldAI Lite. It was getting closer and closer to the point where the old code base would collapse under its own weight.

So it was time. I blocked my weekend off and started the 4th large refactoring of the AI horde base. The one which would allow me to use the two horde types which were mutually exclusive in the past, at the same time.

This one meant a whole new endpoint, new table polymorphism and going through all my database functions to ensure that all the data is fetched from all types of polymorphic classes.

I also wanted to make my endpoints flexible as well, so it occurred to me it would be better to have, say, api/v2/workers?type=text instead of maintaining api/v2/workers/image and api/v2/workers/text independently. This in turn ran into caching issues, as my cache did not recognize the query part to store independently (and I am still not sure how to do it), so I had to turn to the redis cache.
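For illustration, the general shape of the problem is that the cache key has to include the query string, otherwise ?type=text and ?type=image end up sharing one cached response. The sketch below is a generic, standard-library-only example of building such a key. It is not a fix for any particular caching library, and the `cache_key` and `cached_get` helpers are hypothetical names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url):
    """Build a key from the path plus the query parameters in sorted order."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return f"{parts.path}?{query}" if query else parts.path


_cache = {}


def cached_get(url, fetch):
    """Return the cached response for this exact path and query, else fetch it."""
    key = cache_key(url)
    if key not in _cache:
        _cache[key] = fetch(url)
    return _cache[key]


print(cache_key("/api/v2/workers?type=text"))
print(cache_key("/api/v2/workers?type=image"))
```

Sorting the parameters means that the same query written in a different order still maps to the same cache entry.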

That in turn caused the bandwidth to my redis cache to skyrocket, so now I needed to implement a local redis cache on each node server as well, which required rework for my code to handle two caches at the same time. It was a cascading effect of refactoring 😀
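A two-cache setup like this usually boils down to "ask the node-local cache first, and only go over the network to the shared one on a miss". Here is a rough sketch of that lookup order. The `TTLCache` class is a tiny in-memory stand-in, not a real redis client, and the class names and TTL values are assumptions chosen for the example.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server; counts reads for the demo."""

    def __init__(self):
        self._data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


class TwoTierCache:
    """Check the local cache first, then the shared one, then compute."""

    def __init__(self, local, shared, local_ttl=5, shared_ttl=30):
        self.local = local
        self.shared = shared
        self.local_ttl = local_ttl
        self.shared_ttl = shared_ttl

    def get(self, key, compute):
        value = self.local.get(key)
        if value is not None:
            return value
        value = self.shared.get(key)
        if value is None:
            value = compute()
            self.shared.set(key, value, self.shared_ttl)
        # Keep a short-lived local copy so repeat reads skip the network.
        self.local.set(key, value, self.local_ttl)
        return value


shared = TTLCache()
node = TwoTierCache(local=TTLCache(), shared=shared)
for _ in range(3):
    workers = node.get("workers:text", lambda: ["kai-box"])
print(workers, shared.reads)
```

In this sketch a cached value of None is indistinguishable from a miss, which is fine for a demo but something a real implementation would need to handle.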

Fortunately I managed to get it all to work, and also updated the code for the KoboldAI Client and its bridge to use the new and improved version 2 of the API, and just yesterday those changes were merged.

That in turn brought me to the next question. Now that the hordes were running together, it was no longer accurate to call it “stable horde”, or “koboldai horde”. I had already foreseen this a while ago and I had renamed my main repo to the AI Horde. But I now found the need to also serve all sorts of generative AI content from the main server. So I made the decision to deploy a new domain name. And the AI Horde was born!

I haven’t flipped all the switches needed yet, so at the moment the old https://stablehorde.net is still working, but the eventual plan is to make it a simple redirect to https://aihorde.net instead.

The KAI community is happy, I’m no longer afraid they’re going to crash and burn from a random DB corruption, and they can scale along with the rest of the Horde.

Now onward to more features!

The Kudos-based economy for the KoboldAI Horde

Work continues on the KoboldAI Horde apace, and in the past week I’ve added oauth authentication but kept the anonymous access live as well. Let me tell you, figuring out how to use the flask.dance module for discord was a complete PITA.

But I got it done, and even added github authentication (and hopefully google soon, if they stop asking for silly things)

The authentication was added to allow me to build the next step, which is to be able to balance usage from the community.

The thing I expect to happen is that as the service becomes more popular, we might have way more clients trying to use the horde than we have actual contributing servers, and therefore we would run into performance bottlenecks. This could become unworkable especially if things like bots start using this service anonymously etc.

To provide some automatic balance, I have now introduced the concept of Kudos into the mix. A Kudos is a reward gained when someone contributes to the well-being of the horde in some way. There are two built-in methods to gain Kudos at the moment:

  1. Any time your KoboldAI server generates text for someone, you receive tokens based on the amount of text generated and the power of your model.
  2. KoboldAI servers get “uptime” rewards every 10 minutes they stay online and check in looking for more requests. This incentivizes people to leave their instance running even in non-busy times.

On the flip-side, people using the Horde as clients are likewise consuming Kudos to do so. Every prompt you send will reduce your balance by the same amount of Kudos rewarded to the server which fulfilled your request. Effectively you’re “thanking” the people running the instance which helped you.

But unlike typical currencies, you can still use this service even if you don’t have any Kudos. You simply go negative, which represents your “debt” to the community. This allows people to use the Horde anonymously as well (through the common Anonymous account) which simply keeps going more and more in negative Kudos.
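As a rough illustration of the bookkeeping described above, here is a small ledger sketch. It is not the horde's actual code: the `KudosLedger` class and the kudos amounts are made up, and how a generation's kudos value is derived from text length and model power is left out because the formula is not given here. What it does show is that the same amount moves from client to server, and that a balance is simply allowed to go below zero.

```python
from collections import defaultdict


class KudosLedger:
    """Hypothetical in-memory ledger; unknown accounts start at zero."""

    def __init__(self):
        self.balances = defaultdict(int)

    def settle_generation(self, client, worker_owner, kudos):
        """The client pays exactly what the fulfilling server earns."""
        self.balances[worker_owner] += kudos
        self.balances[client] -= kudos  # no floor: debt is allowed

    def reward_uptime(self, worker_owner, kudos):
        """Periodic reward for staying online and checking in."""
        self.balances[worker_owner] += kudos


ledger = KudosLedger()
# Amounts below are arbitrary example values.
ledger.settle_generation(client="anonymous", worker_owner="alice", kudos=10)
ledger.reward_uptime("alice", 5)
print(dict(ledger.balances))
```

Generations are zero-sum between the two parties, while uptime rewards add new kudos to the system.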

But to avoid abusing this service by people who want to take and give nothing back, I needed to find a way to disincentivize negative kudos balances. And I just implemented the solution to do that which is a fairly simple one.

Your kudos balance corresponds to your priority in the queue of requests

This means that if there’s plenty of servers around and not a lot of requests, nothing much changes, even for anonymous clients and they get to enjoy the full speed of the horde.

However, in times of congestion, the requests are sorted in kudos order. The people with the most kudos get to have their requests fulfilled first, while those with the least, such as the Anonymous account, will get theirs generated last.
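The ordering itself can be sketched in a few lines. This is a minimal example under assumed data shapes (requests as dicts with hypothetical `id` and `user` fields, balances as a plain dict), not the horde's real queue code. Because Python's sort is stable, requests from users with equal kudos stay in arrival order.

```python
def order_queue(requests, balances):
    """Sort waiting requests so that higher kudos balances are served first."""
    return sorted(requests, key=lambda req: -balances.get(req["user"], 0))


# Example values only; listed in arrival order.
waiting = [
    {"id": 1, "user": "anonymous"},
    {"id": 2, "user": "alice"},
    {"id": 3, "user": "bob"},
    {"id": 4, "user": "alice"},
]
balances = {"anonymous": -5000, "alice": 120, "bob": 0}

ordered = order_queue(waiting, balances)
print([req["id"] for req in ordered])
```

When there are more servers than requests this ordering barely matters, since everything gets picked up almost immediately anyway.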

With some other tools provided such as bridge-based priorities, and server-owner-preference, this should allow owners of servers to make sure that not only “good” actors are the first to use their resources, but also to find their own ways to provide a service to selected members of the community. For example someone might set their family members to have priority on their server gens, even if those members are deep in negative kudos.

Along with that, I’ve deployed a kudos transfer service where peers can send kudos directly to others. So someone who has accumulated a large amount, might use that to reward good behavior in the community, such as writing documentation, or helping others. The kudos transfer can also happen via REST API, which can allow even automated tools to participate, such as discord bots which might trigger on special smileys assigned to posts and give a small kudos reward to the recipient.

There are of course many other things to consider, such as actions by malicious actors who might try to game the system. I really hope I won’t have to deploy countermeasures for people abusing community goodwill.

But until then, I really enjoy seeing the excitement from the KAI community, especially as heavy models like the fresh Erebus 13B are deployed into the horde. With the kudos infrastructure in place, there is now potential for new dynamics and reward systems to emerge organically from the community which will serve as an excellent guide on how to best proceed with this project.

The KoboldAI Horde: AI writing for everyone through mutual aid

Those who’ve followed my progress in developing Hypnagonia would know by now that I’ve been working on the integration of KoboldAI with Godot for the automatic generation of stories. This solution, while working, never quite satisfied me, as it required quite a lot of investment on behalf of a player who wanted to generate stories.

As it happened, I had an idea to use the new KoboldAI API to create a distributed cluster based on consumer GPUs in people’s PCs. The concept being that one would connect their PC in some fashion to a server software and whenever someone sent a request for generation, an available PC would pick it up, generate it, and send it back to the requester.

The requester themselves would therefore not need a powerful GPU (or skill to use Google Colab) to use KoboldAI, while people whose PC is otherwise sitting idle would be supporting the other members of the community.

So after some discussions with members of the community, I decided this idea had legs and I wanted to build it before I continued with Hypnagonia. I’ve been working non-stop on this for the past week and I now feel it’s at a pretty good state.

Introducing the KoboldAI Horde!

This is a python server which you run on a server somewhere and it provides an interface with which people can request GPT writing generations. The second part is the bridge, which is what people who have their own KAI instances run in order to connect their KAI instance to the horde server. The last piece of software, of course, is the KoboldAI client, which runs the models generating the text.
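To show how the three pieces fit together, here is a toy in-memory sketch of the flow: a requester submits a prompt, a bridge polls for work, hands the prompt to its local model, and posts the result back. Everything here is hypothetical (the `HordeServer` class, its method names, and the fake `generate` function standing in for a KoboldAI instance), and the real thing talks over HTTP rather than through direct method calls.

```python
import queue


class HordeServer:
    """Toy stand-in for the central server: holds pending jobs and results."""

    def __init__(self):
        self.pending = queue.Queue()
        self.results = {}
        self._next_id = 0

    def submit(self, prompt):
        """Called by a requester; returns an id to look the result up with."""
        self._next_id += 1
        self.pending.put((self._next_id, prompt))
        return self._next_id

    def pop_job(self):
        """Called by a bridge looking for work; None if the queue is empty."""
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return None

    def submit_result(self, job_id, text):
        self.results[job_id] = text


def bridge_poll_once(server, generate):
    """One bridge iteration: fetch a job, run the local model, send it back."""
    job = server.pop_job()
    if job is None:
        return False
    job_id, prompt = job
    server.submit_result(job_id, generate(prompt))
    return True


def fake_generate(prompt):
    # Placeholder for the local KoboldAI instance actually generating text.
    return prompt + " [generated text]"


server = HordeServer()
job_id = server.submit("Once upon a time")
did_work = bridge_poll_once(server, fake_generate)
print(did_work, server.results[job_id])
```

The requester never talks to the machine doing the generation directly, which is why it does not need a powerful GPU of its own.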

Until now, the only option for people without such powerful hardware would typically be to use a service like NovelAI or AI Dungeon, where one has to pay for the service. They also provided a fancy interface and tools to help people write their individual stories. Aside from the cost however, there were also limitations. AI Dungeon recently imploded due to their own actions, and a lot of people were left without a home.

And of course, for a free software video game, such services are prohibitive, as I can’t pay out of pocket for the text generations of dozens of players (or imagine if my game ever became actually popular).

After a bit of hacking of the KoboldAI client itself, I also added functionality to allow someone to use their own KoboldAI client to talk to the Horde directly. Which means someone with a potato PC, can still use KoboldAI directly.

Now there are some obvious limitations as you might expect. First of all, the availability of the models in the horde is dependent on who is supporting it at the moment. If only people with weak GPUs are in, the best you’ll find is a Nerys 2.7B. If however some of the big chonkers are in, you might be able to use KoboldAI as if you had a 20B model behind it.

The other problem is of course questions of privacy. Even though the text is encrypted during transfer, and even though the horde does not store any prompts or generations, at the end destination, the owner of the server can see all your prompts if they so wish, even if they don’t know who you are. So for those wanting the really steamy stuff, or perhaps those who write stories using real people and locations, this might be a deal breaker.

However for more normal stories, or for more generic generations such as for utilities and apps, this is hardly a concern.

Finally we’re still new, so currently everyone is playing nice. But there’s always the potential of bad actors, who might flood the horde with garbage generations just to shut it down. There’s nothing I can do about someone dedicated to doing that except go into a whitelist mode, and have the horde become an invite-only service, much like a private bittorrent tracker. In fact, anyone could set up their own private instance for their own community, for those situations where they want only trusted people to use it.

Eventually my plan is to follow the bittorrent approach for usage and contributions as well, where people will only be able to use the horde if they contribute somehow to the KAI community. Whether by adding their own GPU to the horde, or by writing documentation, etc.

But ultimately I want this to be a mutual aid based service. Where those who have the capacity help those who do not, and the latter find some other ways to make it up, for the uplifting of everyone in the KoboldAI community.

And before anyone suggests to tie this to some web3 token: Fuck that shit! I will not integrate with any cryptocurrencies! That is poison for our communities AND for our ecosystem.

The owner of the KAI client repo was also kind enough to redirect their domain to my KAI Horde instance, so now we have a very fancy url as well.

Official KoboldAI Horde

Already this is working amazingly well and I’m working on improvements daily. I’m excited to see what kind of doors this will open.