Embedded QR Codes via the AI Horde

Around the same time last year, the first controlnet for generating QR codes with Stable Diffusion was released I was immediately enamored with the idea and wanted to have it ASAP as an option on the AI Horde. Unfortunately due to a lot of extenuating circumstances [gesticulates wildly] I had neither the time, nor the skills to do it myself, nor the people who could help us onboard it. So this fell on the wayside while way more pressing things were being developed.

Today I’m very excited to announce that I have finally achieved and deployed it to production! QR code generation via the AI Horde is here!

To use is fairly simply, assuming your front-end of choice supports it. You simply provide the text that you want represented as a QR code and the AI Horde will generate a QR code, and then using controlnet, will generate an image where the QR code is embedded into it, as if it’s part of the drawing. You can scan the examples below to see it in action.

You’ll notice that unlike some of the examples you’ll find online elsewhere, the QR code we generate is still fairly noticeable as a QR code, especially when zoomed out, or at a distance. The reason for this is that the more fitting you make to the image, the less likely it is that the QR code is scannable. The implementation I followed to achieve this result is specifically tailored to sacrifice “embedding” for the purpose of scannability.

So when you want to generate QR codes, you need to keep in mind that this is a very finicky workflow. The diffusion process can easily “eat” or modify some components of the QR code so that the final image is not readable anymore. The subject matter and model used matters surprisingly much. Subjects which are somewhat noisy (such as the brain prompt in the featured image above) tend to give enough to the model to work with to reshape that area in a way that creates a QR code. Wheres no matter how hard I tried, I couldn’t get it to generate a QR code with an anime model and an anime woman in the subject.

Along with the basic option to provide the QR Code text, you can also customize some more areas from it. For example you can choose where the QR code will be placed in the image. By default we’ll always display it in the center, but sometimes the composition might be easier if you choose to place it on the side, or to the bottom. You can choose a different prompt for the anchor squares, increase or decrease the border thickness, and more. Your front-end should hopefully be explaining these options to you.

If you want to try and make some yourselves right now, I’ve added the necessary functionality to my Lucid Creations front-end already, so feel free to give it a try right now.

Continue reading further to get some development details.

The road leading to me making this feature available was fairly long. Other than all the other priorities I had for the horde, we also had the misfortune that one of our core contributors on the backend/comfyUI side, went suddenly missing at the end of summer. As I am still more focused the middleware/api and infrastructure (plus so much more, halp!) and Tazlin is focused on efficiency, and code maintenance & quality, we didn’t have the necessary skills to add something as complex as QR code generation.

Once it was clear that our contributor wasn’t coming back and nobody else was stepping up to help, I finally accepted that if I want it done, I have to learn to do that part myself as well. So in the past few months I embarked on a journey to start adding more and more complex comfyUI workflows. First came Stable Cascade which required me to build code which can load 2 different model files at the same time. Then Stable Cascade Remix which required that I wrangle up to 5 source images together.

Note that I’m mostly re-using existing fairly straightforward ComfyUI workflows which do these tasks. I don’t have the bandwidth to learn ComfyUI itself that much. But the work of making said workflows function within the horde-engine with payloads that are send via the AI Horde REST API is quite a complex amount of work on top of those. As I hadn’t built this “translation layer”, I was avoiding that area of the code until now, and this work helped me build up enough knowledge and confidence to be able to pull of translating a much much more complex ComfyUI workflow like the QR codes.

So after many months, I decided it was finally the time to tackle this problem. The first issue is getting an actually good QR Code ComfyUI workflow. Unlike the previous workflows I used, it’s surprisingly difficult to find something that works immediately. Most simple QR Code workflows both required that one generates the QR image externally and generated mostly unscannable images.

I was fortunate enough to run into this excellent devlog by Corey Hanson who not only provided instructions on what works and what doesn’t for QR codes, but even provided a whole repository with prebuilt ComfyUI workflows and a custom node which would also generate a QR code as part of the workflow. Perfect!

Well, almost perfect. Turns out the provided ComfyUI workflows were fairly old, and at the rate GenerativeAI progresses even a couple of months means something can easily be too stale to use. On top of that they were using a lot of extra custom nodes in their examples that didn’t parse, which a ComfyUI newbie like me had to untangle. Finally those workflows were great, especially for local use, but a bit overkill for the horde usage.

So first order of business was to understand, then simplify the workflow to just do the bare needed to get a QR code. Honestly it took me a bit of time to simply get the workflow running in ComfyUI itself and half-way understand what all the nodes were doing. After that I had to translate it to the horde-engine format, which by itself required me to refactor how I parse all comfyUI workflows to make it more maintainable in the future.

Finally QR codes require a lot more potential text inputs, which I didn’t want to start explicitly storing in the DB as new columns as they’re used only for this specific purpose. So I had to come up with a new protocol for sending an open ended amount of extra text values. Fortunately I had already the extra_source_images code deployed so I just copied part of the same logic to speed things up.

And then it was time for unit tests and the public beta and all the potential bugs to fix. Which is when I realized that the results on SD 1.5 models were a bit…sucky, so I went back to ComfyUI itself and actually figured out how to make the workflow work with SDXL as well. The results were way more promising.

Unfortunately while the SDXL QR Codes are way nicer, the requirements to generate them are almost tripled compared to SD 1.5. Not only does one need to run SDXL models, but SDXL controlnets are almost as big as the models themselves. The QR code controlnet is 5G on its own, and all that needs to be loaded in VRAM at the same time as the mode. All this means that even middle-range GPUs struggle to generate SDXL QR codes in a reasonable amount of time. This meant that I also had to adjust the worker to give the option for people serving SDXL models to skip SDXL controlnet, and also properly route this switch via the AI Horde.

Nevertheless, this an areas that makes the AI Horde shine, as those with the necessary power, can support those who need it. Most people will find it really hard or frustrating to generate even a single QR code, never-mind an SDXL one, only to discover that it’s unscannable, but through the horde they can easily generate dozens with very little expertise needed and find the one that works for them.

So It’s been a long journey, but it’s finally here, and the expertise I gained by achieving it also means that I now have enough knowledge to start adding more features via ComfyUI. So stay tuned to see more awesome workflows on the AI Horde!

Image Remix on the AI Horde

The initial deployment of the Stable Cascade (SC) on the AI Horde supported just text2image workflows, but that was just a subset of what this model can do. We still needed to onboard the rest of its capabilities.

One such capability was the “image variations” option, which allows you to send an image to the model, and get a variation of that image, perhaps with extra stuff added in, using the unClip technology. This required quite a bit of work on hordelib so that it uses a completely different ComfyUI workflow but ultimately this was not so much harder than just adding the img2img capabilities to SC.

The larger difficulty came when I wanted to add the feature to remix multiple images together. The problem being that until now the AI Horde only supported sending a single source image and a single source mask, so a varying amount of images was not possible at all.

So to support this, I needed to touch all areas of the AI Horde. The AI Horde had to accept and upload each of them on my R2 bucket and provide individual download links. The SDK had to know to expect and provide methods to download those images in parallel to avoid delays, to the reGen worker had to be able to receive those images and send them to hordelib which should know how to dynamically adjust a comfyUI pipeline on-the-fly to add as many extra nodes as required.

So after 2 weeks of developing and testing, we finally have this feature available. If your Horde front-end supports the “remix” feature. You can send up to 1-6 images to this workflow along with a prompt, and it will try its best to “squash” them all together into one composition. Note that the more images you send, and the larger the prompt, the harder it will be for the model to “retain” all of them in the composition. But it will try its best.

As an example, here’s how the model remixes my own avatar. You’ll notice that the result can understand the general concepts of the image, but can’t follow it exactly as it’s not doing img2img. The blur is probably caused by the need to upscale my original image, which is something I’d like to fix on the next pass.

Likewise, this is the Haidra logo

And finally, here’s a remix of both logo and avatar together

Pretty neat, huh?

This ability to send extra source images also lays the groundwork for the Horde to support things like InstantID, which I hope I’ll be able to work on supporting soon enough.

Stable Cascade on the AI Horde!

A while ago Stability.ai released a new model on a different architecture, that seems to provide very promising results and very fast training: Stable Cascade. I really wished to offer it on the AI Horde so after getting explicit permission from Emad in Reddit PMs (due to its more restrictive license for APIs), I set out to implement it.

Unfortunately the Stable Cascade model and ComfyUI workflow require the use of two different checkpoints, which went against the AI Horde worker paradigm at the time, which expected one file per model, so I had to make multiple changes in a lot of packages which expected this paradigm. The Worker, hordelib, the model reference and its SDK, all of them required tweaking to avoid crashing.

Fortunately, while the changes were complicated, I managed to implement them without much debugging. I did initially run into some troubles with the image quality being garbage, which turned out required ComfyAnon tweaking the implementation on ComfyUI a bit, but once that was done, everything fell in place and now you can use the AI Horde to request Stable Cascade images and therefore check the capability of this model, even if you don’t have 20G VRAM to spare.

You can try it out on Artbot

Alongside Stable Cascade, I thought it’s high time we start expanding our SDXL model selection, so the following models have also been onboarded.

  • Juggernaut XL
  • Anime Illust Diffusion XL
  • Pony Diffusion XL
  • Animagine XL
  • DreamShaper XL (Lightning version)

We quickly realized that we also need to expand our model reference to better inform people of the requirements for some of these models. For example Pony Diffusion XL doesn’t work unless you set clip_skip to 2, and DreamShaper requires low steps, cfg and specific samplers. If you know to set those settings correctly, you’ll get amazing images, else you get hot garbage. Soon the horde will be warning you when trying to use a model outside its specifications.

Other than that, we haven’t been completely idle. Some other notable achievements in the previous weeks are:

Firstly, the AI Horde now supports an educator role for accounts. If you are an education institution and you want to use one of the AI Horde free tools for the classroom, you can request your account to be set as an educator, which will force all your requests to be SFW and increase your account’s concurrency.

I also spent some time improving the AI Generation of the Mastodon bot @dungeons, so that it gets nicer images for each campaign protagonist. Will admit I had a lot more fun than I should improving the versatility and variability of the generations and tweaking then results for each model. You can see (or follow) the results in the dedicated account replying with those images.

On the worker side, Tazlin has also been very busy improving the efficiency of our generations. We have added now some improvements such as downloading the loras for the next job, while performing the inference for the previous one, or adding more efficiency for those people with more powerful machines.

I’m now hard at work trying to onboard more Stable Cascade capabilities as they are added to ComfyUI and to add support for more advanced workflow capabilities.

Error Codes and Styling

Does the above image look scary? If so, you might just just be a software developer!

The above is the result of a long-time coming, but massive pull request to standardize the formatting of the AI Horde code. I’ve been meaning to do this ever since I discovered the black and ruff tools, but I’ve been procrastinating for almost as long. Well, I finally somehow got my ass in motion to do it. Including writing tests, and doing some careful regression testing, It took me like a week in total. And I still didn’t apply all of the ruff checks either.

What this means is that from now on, anyone sending a change, can simply run ruff . --fix && black . and it will automatically format all changes to match our standards. Making the code predictable to read and reducing some bad programming practices and potential tech debt.

Also, as a software dev, finally doing this kind of operation is so satisfying. Not much fun to do, but you’re very happy to have this done. What’s a good analogy for this? a peeling session (post your best analogies in the comments)?

Soon after, I also deployed another change that might be useful for AI Horde integrators out there. I have now added unique error return codes to each error message from the horde. This should make it easier to parse the various errors the horde might spit out with code, instead of having to parse an error message which might potentially change in the future. It also allows you to do things like error code translations (although I think it might be useful to allow people to send translations for the various RCs to the horde as PRs, so that we don’t force every frontend to reinvent them)

I also wrote a README page detailing all the existing RCs.

There’s also been the various bugfixes and improvements on the worker, sdk and hordelib code. Remember to update your reGen worker regularly!

Once again, many thanks to NLNet for providing the funding for such “necessary chore” tasks like these. These kind of things are not a ton of fun to do, as they don’t add any new functionality to the project, but they massively help future development by reducing tech debt.

Webhooks on the AI Horde

Today I am excited to announce that I have deployed a new feature which allows you to specify a webhook when requesting a generation on the AI Horde. If you do that, once each generation is completed on the AI Horde, it will send POST request to the specified url, with a payload matching the request type.

Apropos, it’s a good time to announce I have started writing some integration information for the AI Horde, which contains information about the available API, and SDKs, and of course, the new webhooks. Feel free to send PRs to improve it!

This new functionality can allow a few more efficient ways of using the AI Horde. For example you could avoid polling the AI Horde every second or so, and rely on webhooks, and only do a manual poll every 30 seconds or so, if the requests have not webhooked over to you yet. The AI Horde will retry a webhook 3 times before giving up, so in case of network issues etc, you can always check the status manually as usual. This approach would reduce the load on the AI horde, while at the same time giving you faster results. It’s what I call a win-win!

Of course, not all clients can support webhooks, so for those who can’t, the existing functionality will continue working as usual.

Ludicrous Speed!

One very useful feature I’ve been meaning to support for the AI Horde for a while has been request batching. Request batching is the function to generate multiple Stable Diffusion images in parallel, by using internal mechanism to the ML libraries, instead of splitting them into multiple processes. Due to the re-using common parts of the request, it allows the GPU to generate each extra image with just 20% slowdown, instead of 100%, so long as you stay within your GPU’s power.

Soon after we finished adding LCM support, I turned my view to making this a possibility, as between these two features, it could massively increase the overall speed at which the AI Horde completes requests. The only problem is the overall complexity to handling this in the Inference.

Today I’m proud to announce that the AI Horde natively supports smartly batching multiple images in the same request when possible which can result in massive improvements in overall speed! Read on for more details of how we achieved it.

By relying on ComfyUI, the most difficult part was done, and earlier work done by Tazlin and Jug had already prepared the ground to use our hordelib library to handle sending such batched requests to the comfy engine, but I still had a lot of work to do to not only allow the AI Horde accept and queue such loads properly, but also for the worker to be able to understand payloads for multiple images.

Fortunately due to the new setup of the reGen worker, being able to adjust it to accept one job for multiple images and then submit multiple image results at the end was easier than I expected. Of course doing multiple image submissions was the hardest part and I had to basically refactor that whole area of the code.

The AI Horde queuing part was not as code intensive. Making a worker pick up multiple requests when possible was not particularly hard, but not giving the worker more than it can “chew through” was. You see your worker might be able to do 1 image at 2048×2048, and it might be able to do 20 images at 512×512. However give it 20x2048x2048 and it will fall down and die! So this required a bit of fancy footwork. The way I solved this is that the worker declares how many batched images they can do along with its max resolution. The horde then assumes that the worker can safely achieve their max batching at 1/3rd of their max resolution. After this part, as the requested resolution of a job increases, the horde will smartly reduce the amount of batches from a job it will give that worker.

Practically this means that when I declare I can do 20 batches and my max resolution for one image is 2048×2048, then I will pick my full 20 images at 512×512 but will only pick 7 images at my full resolution.

Therefore the AI Horde will continue smartly slicing a request for multiple images into a number of jobs. Only this time instead of each job being 1 image, it can be multiple. Effectively this means that the horde is able to way more efficiently utilize the maximum processing power of each worker and therefore the overall performance improves!

There were a few hiccups along this development as well. For one I realized that the hordelib code did not handle batching for img2img requests at all, so I had to pull up my sleeves and jump into the way hordelib translates requests to comfyUI nodes and figure it all out. It took me a while but now that I understand this better, it will make it easier for me to add even more fancy additions to our comfy workflows!

Another somewhat important problem is that the seed returned by batched requests in comfy is not accurate. The explanation of this is a bit too technical, but at the end of the day, there is extra variable when trying to replicate an image generated via a batch, on top of the generation seed. Currently the horde will return the relevant “batch_id” in the generation metadata, which I hope in the future to use so I can add a way to replicate images from batched requests as well.

For now, if you need to ensure you can always replicate your images via a seed, the best way to do it is to request them using the new disable_batching keyword on your request. Setting this to true will make your request always split to 1 image per job, which is the way the horde used to work until now. However since disable_batching is significantly less optimal than batching, it is only available to trusted users (i.e. those who’ve been running workers for a while) and patreon supporters.

Of course you can continue manually splitting your requests to 1 image per request, but that already has increased kudos costs, and in the future this might get disincentivized further for the health of the AI Horde.

Between batching and LCM proliferation, we’re already starting to see significantly improved generation times on the AI Horde. To the point that with enough priority, you can receive 20x1024x1024 images in less than a minute! A small problem is that currently one of our most popular frontends, Artbot, defaults to manually splitting each request to 1 per image. Nevertheless, Its developer Rockbandit is already hard at work making their requests batching-compatible and once that happens, I expect the overall speed with massively improve!

LCM and multiple versions of LoRas on the AI Horde

Finally 2024 is here and this allowed me a bit of free time to work on some of my NLNet tasks. The first thing on my list to tackle was adding LCM support on the AI Horde as it provides massively reduced steps, which for a crowdsourced service like ours, it makes all the difference in how much we can deliver.

For those who don’t know, LCM is a new breakthrough in Stable Diffusion that allows to “finetune” the model in such a way where an image can be generated using 10% of the steps previously required. So an image which would require 30 steps to converge, now needs just 3! That is a massive boost for lower-range GPUs. For high range GPUs, it starts avenues such as video generation as an image can happen at millisecond speeds!

Given the benefits, I wanted to work on this as soon as possible, and given the flexibility of the FOSS GenAI technology enthusiasts, we already had a great way to use LCMs, by using LoRas to turn any SD model into an LCM version.

However there was a snag. You see while the AI Horde already supports all LoRas on CivitAI, we never supported different versions of each, as we never expected anyone would want to use more than the latest. Unfortunately people on CivitAI started using the versioning system as “alternative” versions. And the LCM LoRa was using the same approach, where there was a version for each different sampler.

So the first order of business had to be to allow the AI horde to understand and support all LoRa versions of each LoRa! This took the better part of a full work-week of development and debugging, and then another week of troubleshooting and fixing in beta.

The good news is that this lead to us also identifying and squashing a very frustrating long-running bug where workers would rarely return previous images they’d generated instead of the ones requested. Getting someone else’s image is something we definitely don’t want to ever happen so we’re very happy we figured it out.

With that out of the way, I simply had to update the AI Horde itself to be able to handle the payload for specific LoRa versions, and then add support for the LCM sampler and then some ways to urge users to switch to it.

If you’re an AI Horde integrator, we strongly suggest you change your default settings to utilize LCM LoRas in your generations. You can get them from the same API you receive the model details, under the modelVersions key. To use them, you need to send the exact version ID as a string (found in modelVersions[#]['id]) this won’t accept a version name. You will also need to set is_version: true for the LoRa payload. This will tell the worker to look for a version instead of a LoRa ID.

Sending the LoRa name or ID will continue working as usual, grabbing the latest version (modelVersions[0]) from that list, so you existing implementations should continue working as usual.

Also we recently added AlbedoXL in our model list, to provide a better baseline for SDXL generations than basic SDXL 1.0 which requires a refiner to work. Using Albedo you can get generations that do not require a refiner in your workflow at all and get much less “fuzzy” generations in the process!

AI Horde to receive NLNet grant!

Back in July I first discovered NLNet and decided to apply for their NGI Zero Core grant to help me help me continue developing the AI Horde. Today I’m excited to announce that the AI Horde has officially been greenlit as one of the projects which will receive the August 2023 grant!

You can see the entry for AI Horde on NLNet here: https://nlnet.nl/project/AI-Horde/

AI Horde has been a passion project from the start but it’s difficult to maintain the level of intensity I had for such a length of time. With my patreon funding significantly dwindling month-to-month, multiple of our backend developers dropping, and the need for me to also keep up with some of my real life duties as well as my other FOSS projects, development of the AI Horde has sadly slowed in recent months. While It rmains extraordinarily stable, it has had little to announce, and some new features are slower to “cook” and reliant in the work of valuable backend volunteers like Tazlin.

I am hoping that the addition of NLnet funding should help reintroduce some of that momentum as the release of the grant is contingent on specific milestones being reached.

I have plans for 5 roughly grouped areas of development, for which I have thought of various tasks to receive “bounties” for in the scope of this grant: AI Horde, Dreamer, Scribe, Alchemist and Godot Engine

The AI Horde tasks will focus on improving the middleware itself. Such as extending the shared key functionalities, improving the API documentation etc.

The Dreamer, Scribes and Alchemists tasks will focus on adding new functionalities to the the official workers, such as more generation workflows, batch processing etc.

Finally the Godot Engine tasks will improve the existing toolset of the AI Horde to support game development, such as improving my AI Horde Client, migrating it and Lucid Creations it to Godot Engine 4, etc.

The good news is that due to the way NLNet works, I had to submit stuff to work on, but I couldn’t work on them before the AI Horde was officially accepted (or rather, I could, but I couldn’t receive a “bounty” for them). Now that this is locked-in, I can start working on some of these with the added incentive of getting a reward at the end which can alleviate some of my ADHD executive disfunction. So if all goes well, you should start seeing more activity from me soon.

As always, if you want to support the AI Horde and my work in FOSS and the open commons, please do consider funding me at:

These funds go towards paying for the existing infrastructure first, and motivation continuous development second.

PS: Interestingly enough, it’s my birthday today too 😀

Year One of the AI Horde!

The AI Horde has turned one year old. I take a look back in all that’s happened since.

Can you believe that I published the first version of the AI Horde exactly 1 year ago? The first version published was called the KoboldAI Horde as it was built around KoboldAI Client specifically, and it had very little traction. But almost as soon as I built it, Stable Diffusion came out, I forked the project to handle SD, and my life was never the same again!

Since the start of its life the AI Horde has generated ~84M images and ~30M texts, free for everyone! We’ve been making and will continue making hundreds of thousands of images and text per day. We now have literally dozens of third party UI and bots integrated directly into the AI Horde.

A quick recap of all that’s happened this year

  • Sept 2022 – KoboldAI Horde is launched. KoboldAI gets integrated Horde support. Stable Horde is later forked from KoboldAI horde and launched. Lucid Creations is published.
  • Oct 2022 – We get our first raid. First countermeasures are developed. Artbot is published. Stable UI is published 1 hour later. Our first official discord bot is published. Img2Img is developed. Stable Horde fulfills 1 Million requests.
  • Nov 2022 – Teams are added. Post-processing is developed, Mastodon bot is launched.
  • Dec 2022 – Stable Horde reaches its limits and is massively refactored to scale better. Reddit bot is launched. KoboldAI lite is launched. Stable Horde is in the news for the first time
  • Jan 2023 – Image interrogation is developed. First third party mobile app is launched. Worker UI is developed. We start collaborating with LAION to gather Aesthetic ratings. The first third party chrome extension is launched. We start collaborating with Stability.ai. We replace the discord bot with different codebase.
  • Feb 2023 – Stable Horde and KoboldAI Horde are merged into the single unified AI Horde.
  • Mar 2023 – AI CSAM filter is developed. First “State of the AI Horde” is published. Ratings receive bespoke countermeasures.
  • Apr 2023 – AI Horde breaks away from the nataili inference developer due to toxicity.
  • May 2023 – AI Horde switches from the nataili backend to using comfyUI as a backend. Shared keys are developed. Massive documentation update. Ratings are published onto Huggingface. LoRa support is added.
  • Jun 2023 – AI Horde starts supporting all LoRas in CivitAI. hordelib receives a DMCA from the previous nataili developer
  • Jul 2023 – hordelib is recovered from DCMA limbo. SDXL Beta is added on the AI Horde with collaboration from stability.ai. Haidra is announced and we become part of Nivenly
  • Aug 2023 – Textual Inversion support is added. AI starts supporting all TIs on CivitAI

Writing this down, just makes me realize how much stuff has been happening constantly! Not a single month where nothing much happened. The most relaxed would be Dec-Jun where I spent basically 1 month in bed sick, but I still managed some big results!

The birthday celebration event is still ongoing at the time of publishing this in the AI Horde discord where I’m staying online in a voice chat and answer questions or just chat with people. We have plenty of threads for generating and sharing art etc.

I also asked the AI Horde community to tell me what the AI Horde means to them, and I wanted to quote some of them here:

The AI Horde’s (completely free and open) Image and Text generation services rely on volunteers donating their computing resources, and those volunteers do so without profit motive or selfish reasons. The result is huge number of people – tens of thousands, at least – have been able to get access to a technology that may be otherwise of their reach. I find this to be a hugely inspiring display of altruism and goodwill on the part of everyone involved, and I am deeply humbled and honored to be part of that ecosystem of people

Tazlin – AI Horde backend developer

AI Horde took KoboldAI from personal software to an online platform and allowed us to build a version people don’t have to install. Now its not just used for personal use, but also as a testing platform by model developers for their latest models before public releases.

Henky – KoboldAI Lead

Thanks to Stable Horde, I can generate new storyboard images for existing film concepts making the process of hashing out scenes and sequences for film projects easier. I’m thankful for everyone involved, the contributors and software developers for the tools, and to db0 for making all of this possible.

Mr.Happy

I think horde is a great way to make AI accessible to the general public without the need for complicated installs or expensive hardware. It’s certainly a great and low friction way to get people introduced to generative AI without the limitations and restrictions of closed off/ corporatized AI services like chatgpt… it’s been a vital backbone of the FOSS AI ecosystem over the past year.

Concedo – KoboldAI Lite developer

AI Horde is a powerful tool for breaking the monopoly of large corporations and providing AIGC with an independent platform to avoid the risk of being controlled by these corporations. This platform offers more affordable, free, and diverse AI services. Many AI drawing service providers now operate their own GPU cloud services and charge users high fees while controlling, castrating, and modifying AI drawing models, which severely infringes on user rights. The emergence of AI Horde allows AIGC to retain all potential while providing new ideas and solutions for expanding AI application scenarios, making it easier for ordinary people to access various AI technologies.

Pawkygame – AIpainter developer

For me, the whole community that’s been built around the Horde highlights some of the best about free and open source software. There are so many enthusiastic and knowledgeable people who are wiling to help each other out or discuss the latest in the world of generative AI. It’s been an awesome community to be a part of and one of my favorite little corners of the internet.

Rockbandit – Artbot developer

Stable Horde is an excellent tool for generating images, useful for creating reference material for traditional or digital art, creating art itself through the platform, or using it as a starting point to learn about AI. And the fact that it’s a free platform for image generation is a significant blow to companies that seek to monetize AI access.

It’s great for those who lack artistic skills like myself, and more skilled artists can use it for inspiration, or take a hybrid approach by making something manually and using image to image generation.

The various distributed workers available means that the barrier for entry for joining the horde is a device with an Internet connection, and I firmly believe the collaborative nature of the Horde would make Tim Burners-Lee proud.

It’s an excellent first step towards a bright future, full of opportunity made possible by free and open access to advanced technology. Now we just have to hope that the rest of humankind doesn’t trip.

CrabMan314

The AI Horde is the best way to burn us worker GPU for other’s happiness

Namplop

Here’s to another year of AI Horde!

The AI Horde now seamlessly provides all CivitAI Textual Inversions

Almost immediately after the AI Horde received LoRa support, people started clamoring for Textual Inversions which is one of the earliest techniques to fine-tune Stable Diffusion outputs.

While I was planning to re-use much of the code that had to do with the automatic downloading of LoRas, this quickly run into unexpected problems in the form of pickles.

Pickles!@\

In the language of python, pickles is effectively objects in memory stored into disk as they are. The problem with them is, that they’re terribly insecure. Anything stored into the pickle will be executed as soon as you load it back into RAM, which is a problem when you get a file from a stranger. There is a solution for that, safetensors, which ensure that only the data is loaded and nothing harmful.

However, while most LoRas are a recent development and were released as safetensors from the start, textual inversions (TIs) were developed much earlier, and most of them are still out there as pickles.

This caused a big problem for us, as we wanted to blanket support all TIs from CivitAI, but that opened the gate for someone to upload a malicious TI to CivitAI and then request it themselves through the horde and pwn all our workers in one stroke! Even though technically CivitAI scans uploaded pickles, automated scans are never perfect. All it would take is someone to discover one way to sneak by an exploit through these scans. The risk was way to high for our tastes.

But if I were to allow only safetensors, only a small minority of TIs would be available, making the feature worthless to develop. The most popular TIs were all still as pickles.

So that meant I had to find a way to automatically convert pickles into safetensors before the worker could use them. I couldn’t do it on the worker side, as the pickle has to be loaded first. We had to do it in a secure location of some sort. So I built a whole new microservice: the AI Hordeling.

All the Hordeling does is provide a REST API where a worker can send a CivitAI ID to download, and the Hordeling will check if it’s a safetensor or not, and if not, download it, convert it to safetensor and then give a download link to the safetensor to the worker.

This means that if someone were to get through the CivitAI scans, all they would be able to exploit is the Hordeling itself which is not connected to the AI Horde in any way and can be rebuilt from scratch very easily. Likewise, the worker ensure that they will only download safetensor files which ensure they can’t be exploited.

All this to say, it’s been, a lot more work than expected to set up Textual Inversions on the horde! But I did it!

So I’m excited to announce that All textual inversions on CivitAI are now available through the AI Horde!

The way to use them is the very similar to LoRa. You have to specify the ID or a unique part of its name so that the worker can find them, in the “tis” field. The tricky part is that TIs require that their filename in the prompt, and the location of the TI matters. This complicates matters because the filename is not easy to figure out by the user, especially because some model names have non-Latin characters which one can’t know how they will be changes when saving on disk.

So the way we handle it instead is that one needs to put the CivitAI model ID in the prompt, in the form of “embedding:12345”. If the strength needs to be modified, then it should be put as “(embedding:12345:0.5)”. On the side of the worker, they will always save the TIs using their model ID, which should allow ComfyUI to know what to use.

I also understand this can be quite a bother for both users and UX developers, so another option exists where you allow the AI Horde to inject the relevant strings into the prompt for you. You can specify in the “tis” key, that you want the prompt to be injected, where, and with how much strength.

This will in turn be injected in the start of the prompt, or at the end of the negative prompt with the corresponding strength (default to 1.0). Of course you might not always want that, in case you want to the TI to be places in a specific part of the prompt, but you at least have the option to do it quickly this way, if it’s not important. I expect UX designers will want to let the users handle it both ways as needed.

You are also limited to a maximum of 20 TIs per request, and there’s an extra kudos cost if you request any number of TIs.

So there you have it. Now the AI Horde provides access to thousands of Textual Inversions along with the thousands of LoRa we were providing until now.

Maximum customization power without even a need for a GPU, and we’re not even finished!