The AI Horde Worker Moves to a Completely New Inference Backend

Close to a month and a half ago, our last remaining maintainer for the nataili library dropped out and we were left functional but “rudderless” as far as inference goes. We could continue operations, but we couldn’t onboard new features anymore, as neither I nor any of the remaining regulars have ML knowledge.

In desperation, I asked one of our regulars, Jug, who had been helping out with some python work on the worker, whether he thought it would be possible to switch to the ComfyUI software as a backend, as it had some good ideas and was modular enough to be of use to us.

To my surprise, Jug not only thought it was a good idea, but jumped in at the deep end with both feet and started hacking around to make it work. Not only that, but we managed to pull in another regular developer, Tazlin, who also started helping us with design best practices. As a result, the new library we started developing was built from the ground up to have extensive test coverage, which will make discovering regression bugs that much easier.

The first steps were to reach feature parity, and that required not only wrangling the comfyUI pipelines to be called from nataili, but also porting features which we were using in the AI Horde Worker, such as clip, over to comfyUI.

This early phase was where I could still provide some help, as I’m pretty good at porting features and writing tests for them, and then integrating stuff into the AI Horde Worker, but still the lion’s share of the work on hordelib was being done by Jug, with Tazlin making the code much more reliable and maintainable.

A couple of weeks in, we had almost all the features we needed, but this is where the tricky business started. The first thing we noticed was that comfyUI was not handling multi-threading well, which makes sense as it’s meant to be used by a single user on a single PC. That added massive amounts of instability, because our AI Horde Worker is using threads for everything, to nullify latency delays.

So the next phase for about two more weeks was stabilizing the thing, which required a much deeper dig into the comfyUI internals to wrangle individual processes into a multi-threaded paradigm.

Finally that was done, about 1 month after I inquired about moving to comfy. Then we discovered the next problem: due to all the mutex locks added to prevent multi-threaded instability, the whole thing was now much slower than nataili was. Like significantly so!
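The trade-off described here can be illustrated with a small sketch. It is not hordelib code: `FakePipeline` is a hypothetical stand-in for any backend call that is not safe to run from several threads at once, and the lock simply forces the threads to take turns. That is what makes things stable, and also why throughput drops, since only one thread can be inside the pipeline at a time.

```python
import threading


class FakePipeline:
    """Hypothetical stand-in for an inference call that is not thread-safe."""

    def __init__(self):
        self.active = 0      # threads currently inside run()
        self.max_active = 0  # highest concurrency ever observed

    def run(self, job):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        result = f"image-for-{job}"
        self.active -= 1
        return result


pipeline = FakePipeline()
pipeline_lock = threading.Lock()  # the "mutex" guarding the pipeline
results = {}


def worker(job):
    # Only one thread at a time may enter the pipeline.
    with pipeline_lock:
        results[job] = pipeline.run(job)


threads = [threading.Thread(target=worker, args=(job,)) for job in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

With the lock in place, `pipeline.max_active` never exceeds 1, no matter how many worker threads are started.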

So another two weeks were spent figuring out where the slowdowns occurred in our implementation and tweaking things to work more optimally, and even trying to figure out if there was indeed a slowdown in the first place, as comparisons with nataili were difficult to achieve.

We even built a whole benchmark suite to see overall speeds in inference, without getting confused by HTTP and model loading latency.

But beta testers were still informing us of a seemingly lower kudos reward, so then we suspected the old way of calculating kudos was not applying well to the hordelib inference, due to it working differently. For example it has no slowdown for weights, but control-net types gave out different speeds than we expected, even different speeds per control type.

To track this down, Jug trained a new Neural Network for figuring out how much time a generation is expected to take, rather than trying to time each individual feature. The new model was so successful, at 96% accuracy, that we decided to onboard it onto the AI Horde itself, as a way to calculate kudos more accurately.

This investigation did point us to some things that worked unexpectedly within comfyUI. For example, prompts longer than 77 tokens tended to be noticeably slower, which, after speaking with the comfyUI devs, turned out to be done for quality reasons. We did discover a workaround for the AI Horde, but it’s these sorts of things that are introducing unexpected slowdowns compared to before. We’re going to continue looking for and tweaking things as we discover them.

The good news is that the overall quality of images using the comfyUI branch has increased across the board. Not only that, but weights no longer add extra slowdown (so the extra kudos cost is removed), and they can also exceed 1.3 without causing the image to distort, which is how most other UIs are using them anyway.

The big change is that images with the same payload and the same seed, will look different in comfyUI compared to nataili. This is simply due to the way inference works and something we’ll have to live with.

1.0.0

So now we have the three pillars built: Parity, Stability and Speed; it’s time to go live!

The hordelib has been bumped to 1.0.0 and the AI Horde Worker to 21.0.0. When you run update-runtime next time, you’ll automatically be switched to the new inference backend but you may need to update your bridgeData.yaml file ahead of time.

Very briefly:

  1. Set the vram_to_leave_free and ram_to_leave_free to values that work for you.
  2. Rename nataili_cache_home to cache_home
  3. You can delete any unused keys (like disable_voodoo)
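As a rough illustration of steps 2 and 3, here is a sketch that applies the rename and the cleanup to settings that have already been loaded into a plain Python dict. The function name, the sample values and `some_other_key` are made up for the example, and loading or saving the actual bridgeData.yaml file is left out. The two `*_to_leave_free` values from step 1 still have to be chosen by hand for your own machine.

```python
def migrate_bridge_data(old_settings):
    """Return a copy of the settings with the rename and cleanup applied.

    Hypothetical helper: it only covers the two keys named in the list above.
    """
    new_settings = dict(old_settings)
    if "nataili_cache_home" in new_settings:
        # Keep an existing cache_home if there is one, otherwise carry the old value over.
        new_settings.setdefault("cache_home", new_settings.pop("nataili_cache_home"))
    # disable_voodoo is no longer used, so it can simply be dropped.
    new_settings.pop("disable_voodoo", None)
    return new_settings


# Placeholder values, for illustration only.
old_settings = {"nataili_cache_home": "./", "disable_voodoo": False, "some_other_key": 1}
new_settings = migrate_bridge_data(old_settings)
```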

Also, as a user of the AI Horde, keep in mind that the new Workers do not yet support tiling and pix2pix.

But the new inference is not only available to the AI Horde, it is available to everyone else too. Due to the generic way we’ve built it, any python project which needs access to image generation can now import hordelib from pypi, and get access to all the multi-threaded text2img and img2img functionality we provide!

What’s next

With the move to hordelib, we are now effectively outsourcing our inference development upstream, which allows us to use new developments in stable diffusion as they get on-boarded into their software. Hopefully development of ComfyUI will continue for the foreseeable future as I am really not looking forward to changing libraries again any time soon >_<

This also means that we now finally have the capability to onboard LoRas and textual inversion as well which have been requested for a long time, but we never had the capability in our backend. Likewise with new Stable Diffusion models and all the exciting new developments happening practically weekly.

It’s been a lot of hard work, but we’re coming out of it stronger than ever, thanks to the invaluable help of Jug, Tazlin and the rest of the AI Horde community!

Key Sharing

The AI Horde is built around the concept of Mutual Aid: allowing those who have to aid those who have not. It’s just that it is about aiding for one specific purpose, that of using generative AI.

A lot of the design decisions of the AI Horde have been added to facilitate this purpose, such as kudos transfers, which have in turn been turned into things like discord emojis etc. And I’m always looking for ways to reinforce this behavior.

To this end, I am excited to announce a new feature on the AI Horde: Shared Keys

What are shared keys?

In short, they are API keys which can only be used to generate images and text, and are not valid for any other operations, such as transferring kudos or rating images. The idea here is that someone can create a shared key to give to friends and family, to allow them to use their account priority and to spare them the on-boarding requirement of registering their own accounts, without worrying that they might leak it.

Whenever a shared key is used, the kudos is consumed from the origin account and the priority used for that generation is the same as the owner’s. The generation also shares concurrency with that account so if you are planning to share with a lot of people they might end up getting in each other’s way.

Shared Keys can also be given an optional kudos limit, and an expiry date, after which they stop working. A kudos limit doesn’t affect their priority, it just prevents the shared key from being used once that limit has been reached.

How do I create a Shared Key?

Until UIs add the option to create them, the simplest way is to use the API web interface directly: http://aihorde.net

Alternatively you can open a console terminal and send a CURL call like so:

curl -X 'PUT' \
  'https://aihorde.net/api/v2/sharedkeys' \
  -H 'accept: application/json' \
  -H 'apikey: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "kudos": 10000,
  "name": "Mutual Aid"
}'

Just add your own API key and change the kudos limit and name as you wish. You can also set kudos to -1 to allow unlimited sharing with that key.
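For those who prefer python over curl, the same call can be sketched with nothing but the standard library. The URL, headers and payload are copied from the curl example above; the function name is made up for illustration, and the sketch only builds the request so you can inspect it before sending anything.

```python
import json
import urllib.request


def build_shared_key_request(api_key, name, kudos):
    """Build (but do not send) the PUT request from the curl example."""
    body = json.dumps({"kudos": kudos, "name": name}).encode("utf-8")
    return urllib.request.Request(
        "https://aihorde.net/api/v2/sharedkeys",
        data=body,
        method="PUT",
        headers={
            "accept": "application/json",
            "apikey": api_key,
            "Content-Type": "application/json",
        },
    )


request = build_shared_key_request("YOUR_API_KEY", "Mutual Aid", 10000)
# To actually send it, uncomment the next line after filling in your own key:
# response = urllib.request.urlopen(request)
```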

We also provide an endpoint to check how much a key has been utilized until now. Its response looks like this:

{
  "id": "4cb776de-31f0-4895-9fc3-b2e1d17a64f0",
  "username": "db0#1",
  "kudos": 2320,
  "utilized": 7684
}

How do I use a Shared Key?

Simply use it in place of a normal API key in the UI of your choice.

Can I modify a Shared Key?

Yes, you can both “top-up” existing keys, add/remove expiry, or delete them altogether.

What’s next?

The Shared Keys are designed to be pretty open in their usage. I expect use cases around “service accounts” for communities where people are pooling their kudos somehow, but I am also curious what other emergent uses people will come up with around this system.

And if you have a use case which requires tweaking of this functionality, do let me know!

What About Paid Services on Top of the AI Horde?

While the AI Horde will always be free for all, anyone can develop a frontend for it and ask their users to pay for its use. This blogpost explains why this is OK so long as they give back as much as they take, and how this is enforced.

Recently, a paid service built on top of the AI Horde was announced on reddit’s /r/stablediffusion and a big discussion opened on the ethics of charging people money for access to the free compute provided by the AI Horde. I’ve talked about this in my discord with some users who were concerned, but I foresee it’s a subject that will keep coming up. So it’s a good time to clarify my position on this subject, “officially” as they say.

When I initially envisioned the AI Horde, this sort of question was foremost on my mind. “How do I prevent abuse of a crowdsourced system with unrestricted access for everyone?” My answer to this question was the Kudos system, which is baked-in on every usage of the AI Horde.

Due to the “protection” of the kudos system, we can offer the AI Horde service as an open API for everyone, for any purpose, knowing that whatever they do, they’ll have to either support the health of the service, or go back to the end of the queue. This allows us to not worry about who is using the service or how, because the kudos requirements are inescapable. This bears reiterating:

WE DO NOT CONTROL HOW OTHERS INTEGRATE WITH THE AI HORDE

Because we cannot control people, I am cognizant that people might try to charge money for their services based on the horde (which again, we cannot stop!) or even integrate other technologies we wholly reject (like blockchain). But it doesn’t matter how someone uses the AI Horde; so long as they remain within the limits of the Kudos system, they will have to provide more to the AI Horde than they take out, which balances things out for everyone.

This is the practice of all open paradigms out there. They all rely on volunteer effort but allow people to find business models which can make them money, so long as they respect the open paradigm.

For example, the AI Horde is modeled after BitTorrent. It would be just as absurd to claim that the BitTorrent protocol itself is flawed because a Torrent client is charging money to their users, adding malware or integrating blockchain. Those users still have to play by the BitTorrent protocol and by whichever tracker rules they’re based on.

Likewise, even the most hardcore copyleft licenses like GPL explicitly allow commercial use of the software. Because people need to eat! It would likewise be absurd to say that the Linux kernel is unethical, because companies are making money selling stuff built on top of the Linux Kernel!

So knowing that open systems cannot control how others use them, and that the actions of integrators do not represent flaws of the open system itself, we instead ask people to act in good faith. We request people to give back to the AI Horde as much as, or more than, they take. This means that everyone benefits. We likewise block registrations outside of the AI Horde and inform anyone registering that they can always use the AI Horde for free. This ensures that the owner of each service competes with every other free AI Horde UI out there. If their users still want to give them money after that, then they are obviously bringing something valuable to the table for those users. And again, that is OK with us, so long as they give back to the AI Horde according to their usage.

Finally, whatever one does, remember, they cannot escape the kudos system. A super popular front-end to the AI Horde which does not have at least a net zero consumption will quickly find itself with queue times so high that they will drive everyone off their service.

The AI Horde is absolutely built to combat corporate influences and enshittification, however it is still an open service, and therefore it cannot control who uses it, without sacrificing that openness it is built on, or adding moderation overheads so massive that it would shutter the service.

Does that mean that everything goes? No of course not! As with the anti-CSAM filter, there’s a few rules that are of existential importance to the health of the AI Horde. For example another one is how one treats kudos themselves: I routinely remind people to not consider them a currency and to not assign any monetary value to them. The reason being that the exchange of kudos for money would introduce such immense perverse incentives into the equation, that it would cause the AI Horde development and moderators to switch full-time to countering scams and exploits instead of trying to improve the service. This is such a thick red line that I’m prepared to go to extremes to enforce it, even up to disabling kudos transfers altogether!

Fortunately, until now people have been following these directives. But what if tomorrow a service appeared whose business model relied on selling kudos they generated to their users, or which allowed people to bypass the anti-CSAM filter somehow? That would force me to take active measures to counter such a service explicitly, which would easily escalate into an endless cat&mouse game to the detriment of the service. But it would be a necessary course of action. The existence of a generic paid service which does not violate those rules, however, does not necessitate it, precisely because it’s not an existential concern which would warrant the massive amount of resources that would have to be assigned to counter it.

All that said, I know people are still going to oppose the mere existence of integrations which found a way to make money using the AI Horde as a backend, even if those give back more than they take. Even if they help pay for the development and infrastructure of the AI Horde for the benefit of all. That is OK. Everyone should follow their conscience and values. I have even provided tools and controls for Workers to limit their exposure to practices they do not support, but even if those are not enough, then it is OK to not be part of the AI Horde.

This is also a reason why the AI Horde is Free/Libre Software. If someone else has a different ethical system on how crowdsourced compute resources like these should be handled, they are always welcome to host their own version of the AI Horde, in the same sense that anyone can host their own BitTorrent tracker, with any rules they want! I do honestly believe the current approach of the AI Horde, with unrestricted access is the way to go to democratize AI, but maybe I’m just wrong. It remains to be seen.

However, I do want to ask that people do not spread FUD about who we are affiliated with and what practices we support. The exact stance that we have is what I have explained above.

At the end of the day, thousands of people are getting free Generative AI output currently and we do not plan to stop this access, ever. No matter who integrates with us, or how. The AI Horde will always have a way to use it for free without restrictions!

State of the AI Horde – 26/03/2023

Things are progressing very rapidly in this dawn of the AI and likewise for the AI Horde. I thought it would be a good idea to post about all the things that changed and improved in recent days for our service.

More Requests. More statistics.

I’ve deployed endpoints to measure the usage of the AI Horde. Now that one month has passed, we can take a look.

  • Per day, we are averaging 356,378 images (3.7 terapixelsteps) and 45,248 texts (4 megatokens)
  • In the past month, we produced 11,475,183 images, generating a staggering 127.6 terapixelsteps. Text has also picked up significant speed since merging the hordes with 1,241,895 generated texts for a total of 112.8 megatokens!

Top 10 Stable Diffusion models

The AI Horde offers close to 200 models at the same time. Our statistics allow us to see how the popularity of the various models changes day to day and month to month. The below are just the top 10 models being used.

  • Deliberate 22.2% (2550591)
  • stable_diffusion 15.1% (1730426)
  • Anything Diffusion 11.0% (1257688)
  • Hentai Diffusion 4.1% (468473)
  • Realistic Vision 3.0% (338742)
  • Counterfeit 2.7% (310337)
  • URPM 2.6% (297853)
  • Project Unreal Engine 5 2.5% (289006)
  • waifu_diffusion 1.8% (211572)
  • Abyss OrangeMix 1.8% (205268)

For the longest time SD 1.5 (stable_diffusion above) was king, but in the past month, Deliberate has confidently taken the lead and has been leading the pack with a staggering 22% of all image requests passing through the AI Horde! This speaks very highly for the popularity of the model.

Top 10 Text models

Almost as many text models exist for the AI Horde, but they’re more varied. However, the past months saw two big milestones: the release of the Pygmalion models for chat-like generation, which happened after the gimping of the Character AI models, and the release of the new Llama model, which brought unparalleled miniaturization of the model size, allowing consumer GPUs far more coherence.

  1. PygmalionAI/pygmalion-6b 52.4% (651566)
  2. KoboldAI/OPT-13B-Erebus 14.0% (174393)
  3. KoboldAI/OPT-6.7B-Erebus 6.7% (83249)
  4. KoboldAI/OPT-6.7B-Nerybus-Mix 3.8% (46747)
  5. KoboldAI/OPT-13B-Nerybus-Mix 2.8% (35110)
  6. KoboldAI/OPT-13B-Nerys-v2 2.7% (33667)
  7. Facebook/LLaMA-13b 1.9% (23367)
  8. KoboldAI/OPT-6B-nerys-v2 1.9% (23232)
  9. OPT-6.7B-Nerybus-Mix 1.6% (19268)
  10. KoboldAI/OPT-2.7B-Erebus 1.0% (12464)

We can see Pygmalion has immediately dominated text generation, with Mr.Seeker’s storytelling models mopping up the rest, but the Llama Ascendancy is just beginning!

Ratings, botting and counter-measures

A few months ago we started collecting ratings for the LAION non-profit to help improve the models existing in the commons, as the success of midjourney has a lot to do with them training their models with the best images their previous generation created.

The initial design was very simple, to allow integrators to onboard it fast, and gave good kudos rewards to those helping us. Unfortunately people almost immediately started abusing this by creating bots to rate randomly, therefore poisoning our collection’s accuracy.

I always knew this was a possibility but I was hoping I wouldn’t be forced to add countermeasures quite so soon. So I spent quite a few days adding a captcha mechanism (along with other things) to block at least the low hanging fruit.

It immediately led to a drop in ratings per day, which shows just how much damage botted ratings were doing.

New Features

We are fortunate enough to have gathered some great collaborators for the inference aspect of the AI Horde. So I wanted to give them a big shout-out.

  • ResidentChief has stepped up strongly to help add new features and squash bugs in the nataili library. As a result the AI Horde now supports inpainting on many more models, a lot more post-processors, such as more upscalers and background removers, controlnet improvements, and so much other stuff too numerous to mention. They’re a beast!
  • Jug has been working on improving the AI Horde worker practically non-stop, giving us a great terminal control and improving the webui, plus a lot of bugfixes and improvements in the bridge part of things.
  • Tazlin who’s been doing a great deal of tech support in the channels as well as helping me detect and figure out malicious ratings. And also sending some code improvements as well!
  • Aes Sedai who’s been putting a ton of work on improving the moderation capabilities of the AI Horde with a custom frontend.

And of course all the frontend integrators like rockbandit, aqualxx, sgt.chaos and concedo, who’ve been keeping the frontends up to date, with a lot of features smartly using the capabilities of the AI Horde in ways even I had not expected!

CI/CD and pypi

I finally got around to adding CI/CD pipelines for AI Horde Worker and nataili. Now they will be automatically versioned when the right tag is applied to a PR. The Nataili package has also been republished to pypi and will also automatically receive new versions whenever we publish a new release on GitHub.

The pipelines also automatically publish a notification on discord, so people can be aware when something new is up.

Alchemists

Using the new post-processing improvements from ResidentChief, I’ve expanded the interrogation worker so that it can now perform post-processing on images, as well as img2text operations. Unfortunately the previous name didn’t fit so well, so now I’ve renamed it to “Alchemist”, to signify its capability to convert images to something else.

Likewise, the official name for the image worker is now “Dreamer” and for the text worker “Scribe”. Why not 🙂

Final Word

The pace of progress in this space is mind-blowing. I can’t wait to see what we achieve together in the coming days!

Merging of the Hordes. The AI Horde is live!

A while back (gosh, it occurs to me this project is half a year old by now!) I took significant steps to join the two forks I had made of the AI Horde (one for Stable Diffusion and one for Kobold AI) as their diverging code was too difficult to maintain and keep up to parity with the features and bug fixes I kept adding.

Then later on, I realized that my code just could not scale anymore, so I undertook a massive refactoring of the code-base to switch to an ORM approach. Due to the time criticality of that refactor (at the time, the stable horde was practically unusable due to the sheer load), I focused on getting the stable horde API up and running and disregarded KoboldAI API, as that was running stable on a different machine and didn’t have nearly as much traffic to be affected.

Once that was deployed, a number of other fires had to constantly be put out and new features on-boarded, as Stable Diffusion is growing by leaps and bounds. That meant I never really had the time to onboard the KoboldAI horde to the ORM as well, especially since the code required a refactor to allow two types of workers to exist.

Later on, I added Image Interrogation capabilities as well, which incidentally required that I set up the horde to handle multiple types of workers. This led me to figuring out how to do ORM class inheritance (which required me figuring out polymorphic tables and other fun stuff) but it also meant that a big part of the groundwork was laid to allow me to add the text workers (which is the kind of thing that does wonders to get my ADHD brain to get over its executive dysfunction).

Since then, it’s been constantly on the back of my mind that I need to finally do the last part and merge the two hordes into a single code base. I had kept the KAI horde in a single lonely branch called KAI_DO_NOT_DELETE (because I deleted the other branch once during branch cleanup :D) and the single-core horde node running. But requests for improvements and bug fixes on the KAI horde kept coming, and the code base had diverged so much by now that it was quite a mess to even remember how to update things properly.

The final straw was when I noticed the traffic to the KAI Horde had also increased significantly, probably due to the ease of using it through KoboldAI Lite. It was getting closer and closer to the point where the old code base would collapse under its own weight.

So it was time. I blocked my weekend off and started the 4th large refactoring of the AI horde base. The one which would allow me to use the two horde types which were mutually exclusive in the past, at the same time.

This one meant a whole new endpoint, new table polymorphism and going through all my database functions to ensure that all the data is fetched from all types of polymorphic classes.

I also wanted to make my endpoints flexible, so it occurred to me it would be better to have, say, api/v2/workers?type=text instead of maintaining api/v2/workers/image and api/v2/workers/text independently. This in turn ran into caching issues, as my cache did not recognize the query part to store results independently (and I am still not sure how to do it), so I had to turn to the redis cache.

That in turn caused the bandwidth to my redis cache to skyrocket, so now I needed to implement a local redis cache on each node server as well, which required reworking my code to handle two caches at the same time. It was a cascading effect of refactoring 😀

Fortunately I managed to get it all to work, and also updated the code for the KoboldAI Client and its bridge to use the new and improved version2 of the API and just yesterday, those changes were merged.

That in turn brought me to the next question. Now that the hordes were running together, it was no longer accurate to call it “stable horde”, or “koboldai horde”. I had already foreseen this a while ago and I had renamed my main repo to the AI Horde. But I now found the need to also serve all sorts of generative AI content from the main server. So I made the decision to deploy a new domain name. And the AI Horde was born!

I haven’t flipped all the switches needed yet, so at the moment the old https://stablehorde.net is still working, but the eventual plan is to make it a simple redirect to https://aihorde.net instead.

The KAI community is happy, I’m no longer afraid they’re going to crash and burn from a random DB corruption, and they can scale along with the rest of the Horde.

Now onward to more features!

The 150Mbit/s problem

Recently my provider sent me a nastygram about my Database VPS using too much bandwidth, 150Mbit/s or more, over 10 days, and how they had already throttled it to 100Mbit/s to avoid affecting other customers.

This caught me by surprise as I know that my Database is the central location where all my nodes converge to pull data, but the transfer between them should be just text. 150 Mbit/s would be insane quantities of text.

Fortunately, my provider had also crashed my DB just a day before, on the weekend, and their lack of response outside working hours had forced me to urgently deploy a new VM with a new postgres DB until they recovered, so I had already switched all my nodes to use that. Nevertheless, on checking the new DB, I discovered that it too was using the same incredible amount of bandwidth constantly. This meant that my new DB VM was also on a timer, as Contabo throttles you if your VM takes too much bandwidth for 10 days in a row. I had to resolve this fast.

First order of business was to swap the code so that all source images and source masks used for img2img are also stored in my cloudflare r2 CDN. The img2img requests are about 1/6 of the total stable horde traffic, but until now they were stored as base64 strings inside the database, which means that whenever those requests were retrieved, say for a worker to pick one, they transferred all that data back and forth.

This made a small dent in the traffic, but not nearly enough. I was still at 150Mbit/s outside rush hours and 200Mbit/s during peak demand.

I started getting a bit desperate as I expected that was a big part of this. At this point I decided to open a discord thread to ask my community for help in debugging this as I was getting out of my depth. The best suggestion came from someone who told me to enable pg_stat_statements.

That, along with the query to retrieve the most used queries, led me to find a few areas where I could make some improvements through caching. One was the worker retrieving the models from the DB all the time, for example, which I instead cached on redis.

Unfortunately none of my tweaks seemed to make much difference in the bandwidth. I was starting to lose my mind. Fortunately someone mentioned the amount of rows retrieved, so I decided to sort my postgres statements by amount of rows retrieved and that finally got me the thing I was looking for. There was one statement which retrieved a number of rows two whole orders of magnitude greater than every other statement!

That statement was an innocent looking query to retrieve all the performance statistics for all workers, which I then created an average for. Given hundreds of workers and 20 rows per worker, and this statement being retrieved once per second per status check on a request, you can imagine the amount of traffic it generated!

My initial stop was to cache it on redis as well. That just shifted the problem because while there wasn’t a load on the DB, the amount of traffic stayed the same, just on a different VM. So the next step was to try and cache it locally. I initially turned to python cachetools and their TTL function caching. That seemed to work but it was a big mistake. The cachetools are absolutely not thread safe and my code relies on python waitress WSGI server, and that one spawns dozens of threads all over the place.

So one day later, I got reports of random 500 errors and other problems. I looked in to find my logs spewing hundreds of missing key errors on the cache. Whoops!

I did make an attempt to see how I can make cachetools thread safe, but that involved adding python locks which would delay everything too much, on ultimately something that is not needed. So instead I just created my own simple cache using global vars and built-in types which are thread-safe by default. That solved the crashing issue.
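To give an idea of what such a hand-rolled cache can look like, here is a minimal sketch. It is not the horde’s actual code: the names are invented, and it leans on the fact that in CPython a single dict read or a single dict write is atomic, so no explicit lock is used. The price is that two threads may occasionally both recompute the same value, which is harmless for something like a statistic.

```python
import time

_cache = {}  # key -> (expiry_timestamp, value)


def get_cached(key, ttl_seconds, compute, now=time.monotonic):
    """Return a cached value, recomputing it once it is older than ttl_seconds."""
    entry = _cache.get(key)  # one atomic dict read
    current = now()
    if entry is not None and entry[0] > current:
        return entry[1]
    value = compute()
    _cache[key] = (current + ttl_seconds, value)  # one atomic dict write
    return value


# Illustration with a fake clock, so the expiry can be shown without waiting.
calls = []


def expensive_average():
    calls.append(1)  # count how often the "expensive" work really runs
    return 42.0


fake_clock = [0.0]
first = get_cached("avg", 30, expensive_average, now=lambda: fake_clock[0])
second = get_cached("avg", 30, expensive_average, now=lambda: fake_clock[0])
fake_clock[0] = 31.0  # pretend 31 seconds have passed
third = get_cached("avg", 30, expensive_average, now=lambda: fake_clock[0])
```

The second call is served from the cache, while the third one recomputes because the entry has expired, so `expensive_average` runs twice in total.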

But then I remembered that I’m stupid and instead of pulling thousands of rows of integers so I can make an average using python, I can just ask PostgreSQL to calculate the average and simply return that final number to me. Duh! This is what happens when my head is not yet tuned into using Databases correctly.
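The difference between the two approaches is easy to show in miniature. The sketch below uses an in-memory SQLite database purely as a stand-in for PostgreSQL, and the table and column names are invented for the example. The point is only that the first query ships every row to python, while the second ships a single number.

```python
import sqlite3

# In-memory SQLite standing in for the real PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker_performance (worker_id INTEGER, performance REAL)")
sample_rows = [(worker, float(perf)) for worker in range(3) for perf in (1, 2, 3, 4)]
conn.executemany("INSERT INTO worker_performance VALUES (?, ?)", sample_rows)

# Approach 1: pull every row over the wire and average in python.
all_values = [row[0] for row in conn.execute("SELECT performance FROM worker_performance")]
python_average = sum(all_values) / len(all_values)

# Approach 2: let the database do the maths and return one number.
(db_average,) = conn.execute("SELECT AVG(performance) FROM worker_performance").fetchone()
```

Both approaches give the same average here, but the first one transferred twelve rows to get it and the second transferred one. With hundreds of workers and a check every second, that gap is where the bandwidth went.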

So finally, I wiped my internal cache and switched to that approach. And fortunately that was it! The bandwidth on my database server dropped from 150 Mbit/s on average to 15! A 10x reduction!

New Discord Bot for the Stable Horde

For a few months now the Stable Horde has had its own Discord bot, developed by JamDon. One of the most important aspects I wanted for the bot (and the reason for its original creation), was the ability to be able to gift kudos to people via emojis, which would serve as a way to promote good behavior and mutual aid.

In the process, the bot received more and more features, such as being able to generate images from the Stable Horde, or getting information about the linked horde account etc.

Unfortunately development eventually slowed and then, about 2 months ago, JamDon informed me that they do not have time anymore to continue development. Further complicating things was the fact that the bot was written in JavaScript which I do not speak, which made it impossible for me to continue its development on my own. So it languished unmaintained, as the horde got more and more features and other things started changing. It was the reason why I couldn’t make the “r2” payload parameter true by default for example.

The final straw was when our own bot got IP banned by the horde, because it was a public bot and had been added to a lot of servers which we do not control. Apparently people there attempted to generate unethical images, which the horde promptly blocked. Unfortunately that meant that the bot’s image generation also stopped working everywhere, every time this happened.

At the same time, another discord regular had not only developed their own discord bot based on the stable horde, but a whole JavaScript SDK! The bot was in fact very well developed and had most of the features of the previous stable horde bot, plus a lot of new stuff like image ratings. The only really important thing missing was the ability to gift kudos via emojis, which was the original reason to get a discord bot in the first place 🙂

Fortunately, with some convincing and plenty of kudos, zelda_fan agreed to onboard this functionality, as well as a few other small things that I wished for (like automated roles), and the Stable Horde Bot was reborn!

Unfortunately this did mean that all existing users were logged out and had to log in once more to be able to use the functionality, and its commands did change quite significantly, but those were fairly minor things.

Soon after the new bot was deployed, it was also added to the official LAION discord, so that their community could use it to rate images too. I also checked and the bot has already been added to 365 different servers by now. Fortunately the demand for it is not quite as massive, as it’s not prepared to scale quite as well as the stable horde itself.

BTW, if you want to add the bot to your own discord server, you can do so by visiting this link. If you want to be able to transfer kudos, you’ll need to contact me so I can onboard your emojis though. But other functionality should work.

Stable Horde receives stability.ai processing power!

A week ago I mentioned that we had begun a collaboration with LAION to provide them with ratings on images. The amount of ratings we have received since then has blown away all our expectations! In just a week, you’ve all rated close to 130,000 individual images! As a comparison, the LAION-aesthetics v2, which was instrumental for training Stable Diffusion v1.x, used fewer than 600K rated images. We’ve reached 1/4 of that amount in a week!

Needless to say, these amounts seemed to turn some heads to the power of mutual aid provided by the stable horde, and some gears were set in motion.

LAION spoke with stability.ai directly, and it was agreed that supporting the health of the stable horde itself would likewise benefit them. Since stability.ai is set to be the most direct beneficiary of a better-trained laion-aesthetics v3, it makes perfect sense.

I was not privy to the discussions that happened, but I was happy to learn that Tom, the CTO of stability.ai, arranged to provide us with some sponsored resources in the form of 4 VMs with Nvidia RTX4000 GPUs!

Quite surprisingly I had to deploy the VMs myself, so I crafted the most optimal setup for taking advantage of those 8GB of VRAM, based on my experience with my own RTX2070. Each of them has been loaded with standard stable_diffusion 1.5 and 2.1, plus 8-10 other finetuned models to help cover the versatility provided by the Stable Horde. Granted, we are serving close to 100 different models currently, but the fact that those workers will remain running consistently 24/7 should help provide cover and allow other workers to switch to less supported models as well.

I hope this is the start of a fruitful collaboration between stability.ai and the Stable Horde. The way I see it, the current scenario is a win-win for everyone. We get a more consistent service, which allows more people to use it and makes them more likely to rate images to give back, and those ratings are then fed back to LAION and by extension stability.ai.

The Stable Horde has its first chrome extension!

About a week ago I deployed image interrogation to the stable horde, allowing low-powered GPUs and high-powered CPUs to also become productive contributors on the horde and generate kudos for their owners.

A few days ago, the extension I talked about was finally released once more, relying on the Stable Horde this time: GenAlt

GenAlt is an extension that allows visually impaired people to generate alt-text for any image they encounter on the internet, giving them freer access to an area they were previously excluded from. The extension’s description goes into more detail about its stated purpose, so I urge you to share it so that people who need it can find it.

The first release of the extension was set up to automatically pick up every image displayed in the webpage and send it over to the horde for captioning. That meant that a simple scroll through twitter would lead to hundreds of images being sent to the horde for captioning per person!

That in turn led to the stable horde ending up with 2000-4000 images to interrogate in its queue. Even with my own worker handling 20 threads at a time, it was just impossible to clear them all, which effectively meant the interrogation service became unusable. To top it off, as the stable horde started deleting expired interrogations, the extension received 404 responses, but unfortunately didn’t take that as a sign to abort polling for them.
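For client developers, the polling behaviour the horde would prefer looks roughly like the sketch below. This is only an illustration in Python, not GenAlt’s actual code (which is a browser extension). `fetch_status` is a hypothetical callable returning an HTTP status code and a parsed payload, and the `"state": "done"` check is an assumption about the response shape. The important part is that a 404 means the interrogation has expired on the server, so the client should stop asking about it.

```python
import time

def poll_interrogation(fetch_status, interrogation_id,
                       interval=1.0, max_attempts=30, sleep=time.sleep):
    """Poll until the interrogation is done; give up on a 404.

    `fetch_status` is a hypothetical callable that takes the id and
    returns a tuple of (http_status, payload_dict).
    """
    for _ in range(max_attempts):
        status, payload = fetch_status(interrogation_id)
        if status == 404:
            # The server no longer knows this id: abort instead of retrying.
            return None
        if status == 200 and payload.get("state") == "done":
            return payload
        sleep(interval)
    # Bounded retries, so an abandoned request cannot poll forever.
    return None
```

Capping the number of attempts also helps here, since a client that never gives up keeps a connection slot busy for a result nobody will see.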

At one point we had almost maxed out our available connections to each stable horde backend. But fortunately we kept chugging without much impact. It was one hell of a stress test though!

So I asked the developer to switch it to be triggered with a button or an image-hover action, which, while not as user friendly, certainly wouldn’t completely flood the horde. That change (along with fixing the 404s) was finally deployed yesterday, and that took care of the flooding issue.

An example of the GenAlt new trigger context menu

Now finally the horde is easily handling the captions as they trickle in at a manageable rate. The developer is planning some more updates, such as triggering it on mouse-hover instead of a specific context menu button, which is not as easy to access, and possibly we can onboard translating the captions before we send them back.