Image Interrogation Progress

For the past 2 weeks I’ve been trying to build the new horde feature which will allow even people with a low-powered GPU, or just a CPU, to join the horde and provide a service to gather kudos.

It is called “Interrogation” as it will “interrogate” source images to discover aspects about them, such as an image caption, or whether they are displaying NSFW content etc. This feature can then be on-boarded into new or existing tools, such as perhaps automatically captioning images for micro-blogging services as an accessibility feature, or a browser plugin for parental controls etc.

However, what I thought would be a bit tricky but doable soon keeps running into various snags. First, I lost one complete week of work from my vacation by getting the nastiest cold I’ve had for the past 10 years at least, which flattened me for almost 9 days. Then it was the holiday period, where I had to give more attention to family and friends.

Now that I’m finally able to concentrate more on it, I find it’s actually an order of magnitude more complex than I initially expected. It requires me to update my existing database tables (always a risky proposition) and also to redesign my approach to use such things as “polymorphic tables”, so that I don’t end up duplicating hundreds of lines of code between similar classes.
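To give a rough idea of what I mean by polymorphic tables, here is a minimal sketch using only Python's built-in sqlite3. This is not the horde's actual schema or code: the `jobs` table, the `job_type` column and all the class names are made up for illustration. The idea is that similar kinds of jobs share one table, and a discriminator column decides which class each row is loaded as, so the shared logic lives in one base class.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Job:
    """Shared columns and shared behaviour for every kind of job."""
    id: int
    job_type: str
    source: str

    def describe(self) -> str:
        return f"{self.job_type} job {self.id}"


class GenerationJob(Job):
    def describe(self) -> str:
        return f"generate an image from prompt {self.source!r}"


class InterrogationJob(Job):
    def describe(self) -> str:
        return f"interrogate the image at {self.source!r}"


# Hypothetical discriminator values mapped to the class that handles them.
JOB_CLASSES = {
    "generation": GenerationJob,
    "interrogation": InterrogationJob,
}


def load_jobs(conn: sqlite3.Connection) -> list:
    """Load every row as the subclass named by its job_type column."""
    rows = conn.execute("SELECT id, job_type, source FROM jobs ORDER BY id")
    return [JOB_CLASSES[job_type](id_, job_type, source)
            for id_, job_type, source in rows]


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, job_type TEXT NOT NULL, source TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO jobs (job_type, source) VALUES (?, ?)",
        [("generation", "a clown"), ("interrogation", "avatar.png")],
    )
    return load_jobs(conn)


if __name__ == "__main__":
    for job in demo():
        print(type(job).__name__, "->", job.describe())
```

An ORM can do this mapping for you, but the principle is the same: one table, one discriminator column, many classes.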

And while I’m doing this, the horde ML backend has been receiving improvements at a surprisingly increased pace, recently implementing depth2img, adding diffusers to voodoo-ray, CodeFormer, and a ton of urgent bug-fixes and other improvements.

To say my attention has been split is an understatement.

But I’m slowly but surely making more progress. I hope to have something out soon-ish. I do wonder if it will require a complete horde downtime this time for the DB upgrades. Never done that before. Kinda scary, not gonna lie…

depth2img now available on the Stable Horde!

Through the great work of @ResidentChiefNZ, the Stable Horde now supports depth2img, a new method of doing img2img which better understands the source image you provide, and the results are downright amazing. This article I think explains it better than I could.

See below the transformation of one of my avatars into a clown, a zombie and an orangutan respectively.

To use depth2img you need to explicitly use the Stable Diffusion 2 Depth model. The rest will work the same as img2img. 

Warning, depth2img does not support a mask! So if your client allows you to send one, it will just be ignored.  

If you are running a Worker, you can simply update your bridge code, and you must also run update-runtime as it uses quite a few new packages. Afterwards, add the model to your list as usual.

We recently also enabled diffusers to be loaded into voodoo ray, so this will allow you to keep not only the depth2img model in RAM along with other models, but also the older inpainting model! Please direct all your kudos to @cogentdev for this! I am already running inpainting, depth2img, sd2.1 and 15 other 1.5 models on my 2070 with no issues!

If you have built your own integration with the Stable Horde, such as clients or bots, please update your tools to take depth2img into account. I would suggest adding a new tab for it, which forces Stable Diffusion 2 Depth to be used and prevents sending an image mask. This is to avoid confusion. It will also give you the opportunity to provide some more information about the differences between img2img and depth2img.
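As a rough sketch of that suggestion, a client could build its depth2img request through one small helper which pins the model and silently drops any mask. The field names here (`models`, `source_image`, `source_processing`, `source_mask`) are my assumption of what a typical request payload looks like, and `build_depth2img_payload` is a made-up name, so check them against the API documentation before relying on this.

```python
from typing import Optional

# Model name as given in this post; depth2img only works with this model.
DEPTH_MODEL = "Stable Diffusion 2 Depth"


def build_depth2img_payload(prompt: str,
                            source_image_b64: str,
                            mask_b64: Optional[str] = None) -> dict:
    """Build a request body for a hypothetical depth2img tab.

    The mask argument is accepted so callers can share code with the
    inpainting tab, but it is never sent: depth2img would ignore it anyway.
    """
    payload = {
        "prompt": prompt,
        "models": [DEPTH_MODEL],           # force the depth model
        "source_image": source_image_b64,  # base64-encoded source image
        "source_processing": "img2img",    # the rest works like img2img
    }
    # Deliberately no "source_mask" key, even when mask_b64 was provided.
    return payload


if __name__ == "__main__":
    body = build_depth2img_payload("a clown", "AAAA", mask_b64="BBBB")
    print(sorted(body))
```

Dropping the mask in the client rather than relying on the horde to ignore it makes it obvious to the user that the mask has no effect.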

Enjoy and please shower the people behind the new updates with Kudos where you see them!

The Stable Horde is in the news!

A new article has been published about the Stable Horde!

Overall a very well researched article. I can’t find any issues with it. Personally I would liken the AI Horde technology to a mix between BitTorrent and Folding@Home, but the former has some negative connotations for many people.

Some things I could address from the article

It’s not entirely clear whether every fork of Stable Diffusion should work, but you can try.

There are no “forks” of Stable Diffusion. There are checkpoints and multiple models, and the horde supports every .ckpt model and some diffusers models. I suspect the author confused Stable Diffusion the model with clients and frontends using it, like automatic1111.

There is a tiny bit of a catch: the kudos system. To prevent abuse of the system, the developer implemented a system where every request “costs” some amount of kudos. Kudos mean nothing except in terms of priority: each request subtracts kudos from your balance, putting you in “debt.” Those with the most debts get placed lowest in the queue. But if there are many clients contributing AI art, that really doesn’t matter, as even users with enormous kudos debts will see their requests fulfilled in seconds.

Indeed, each request consumes kudos to fulfill, but you don’t actually go into debt. While we do record the historical amount of kudos you’ve consumed for statistics, your actual total as a registered user never goes below 0. This means that as a registered user, you will always have more priority than an anonymous user (who typically remains at -50 kudos). Your kudos minimum also allows you to generate with slightly higher resolution and steps than an anonymous user.
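In very simplified terms, the rule above looks something like the sketch below. This is not the horde's actual queue code, which takes more into account than this. The function names and the request dictionaries are made up for illustration; the only numbers taken from the real system are the floor of 0 for registered users and the -50 for anonymous ones.

```python
ANONYMOUS_KUDOS = -50  # where anonymous users typically sit


def effective_kudos(balance: float, registered: bool) -> float:
    """Kudos value used for priority: registered users never drop below 0."""
    if not registered:
        return ANONYMOUS_KUDOS
    return max(0, balance)


def order_queue(requests: list) -> list:
    """Sort waiting requests so that the most kudos is served first."""
    return sorted(
        requests,
        key=lambda r: effective_kudos(r["kudos"], r["registered"]),
        reverse=True,
    )


if __name__ == "__main__":
    waiting = [
        {"user": "anonymous", "kudos": 0, "registered": False},
        {"user": "newcomer", "kudos": -300, "registered": True},
        {"user": "worker_owner", "kudos": 120, "registered": True},
    ]
    for request in order_queue(waiting):
        print(request["user"])
```

Here the hypothetical "newcomer" has consumed far more kudos than they earned, but is treated as having 0 and so is still served before the anonymous request.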

Images won’t automatically download, but you can go to the Images tab and then manually download them.

That is totally dependent on the client. It works this way for Artbot, but Lucid Creations, for example, is a local application, so the images are saved with a button click. Other clients might save automatically.

Other than that, great article!

To be honest, I’ve actually been quite surprised that nobody has written about the SH until now. The SH went live in early September, soon after Stable Diffusion came out, and we’ve generated 13 million images until now (or approximately $50K of value), but none of the big AI and AI Art focused news outlets has given it a single mention! Now, I am not one for conspiracy theories, but it sounds extraordinarily unlikely that absolutely nobody in the scene has noticed us until now or felt we are newsworthy, especially since many people have directly tweeted to some of the big AI and Stable Diffusion players about it.

Oh well, a PC-magazine is the first to report on the Stable Horde. So be it! I wonder how many people will discover the Stable Horde from it.

Some napkin math

Stable Horde has generated ~180 Terapixelsteps of images. Assuming each image is 512x512x30, that is roughly 22 million images (higher resolutions have an exponential difficulty).

Using the current cost of, the Stable Horde has generated for free a value of close to $45,000! Using the old costs (Stable Horde has been up almost as long), this is closer to $230,000. All this value has been given out voluntarily, with no ads or fine print.

Taking into account the post-processors allowed and the exponential difficulty of higher resolution images (Stable Horde allows up to 3072×2048), these numbers can easily be doubled.
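For anyone who wants to check the napkin math, here it is as a few lines of Python. The inputs are the figures quoted above (180 Terapixelsteps, 512x512 images at 30 steps, roughly $45,000 of value); the per-image value is simply derived from those two numbers and is not a quoted price. The variable names are only for illustration.

```python
# Figures quoted in this post.
TOTAL_PIXELSTEPS = 180 * 10**12      # ~180 Terapixelsteps
WIDTH, HEIGHT, STEPS = 512, 512, 30  # the assumed "typical" image
VALUE_USD = 45_000                   # value at current costs

# Pixelsteps needed for one typical image: 512 * 512 * 30 = 7,864,320.
pixelsteps_per_image = WIDTH * HEIGHT * STEPS

# How many typical images the total corresponds to.
equivalent_images = TOTAL_PIXELSTEPS // pixelsteps_per_image

# Implied value per image, derived only from the two figures above.
value_per_image = VALUE_USD / equivalent_images

print(f"{equivalent_images:,} equivalent images")
print(f"${value_per_image:.4f} implied value per image")
```

The image count comes out a bit under 23 million, which is where the rounded figure of about 22 million comes from.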

For reference, in its Stable Horde lifetime, my patreon account has made $500, most of which has gone to infrastructure costs.

Codeformer and Reddit Bot for the Stable Horde

I haven’t been able to improve the Stable Horde a lot lately. I was planning to do a lot of work during the week leading to Christmas, but unfortunately the universe had another idea and not only infected me with the nastiest cold I’ve had for decades, but my whole family as well, including the visiting Grandma!

So instead of adding necessary new features, I’ve been flattened in bed, trying to muster enough concentration to do some basic updates and answer questions.

Nevertheless, there are a few improvements added, mostly through the work of some members of the community.

First is the addition of the CodeFormer face-fixing post-processor, which seems massively better than the GFPGAN model. Now all clients can request that an image be passed through CodeFormer for an immediate improvement in faces. Soon I plan to allow this to run in isolation as well.

The other new thing is improvements on the workers themselves, allowing them to pick up and perform jobs more efficiently.

The other big news I have is that I wrote and unleashed the first Reddit bot for Stable Diffusion. It was initially created as an entry for the Ben’s Bites Hackathon, since I couldn’t submit the Stable Horde itself (I didn’t win btw), but it was quite an eventful release. My initial release got caught by the automated Reddit anti-spam filter, shadow-banning my account and banning my subreddit. Then I refactored the bot to use my own R2 CDN and released it with a new account, while asking for a Reddit review on my original account. Fortunately my bot account and subreddit got unbanned and I finally released it a third time properly, and it’s been up ever since!

The way the bot is created, you can request images from it all over Reddit, and it will post the images in its own subreddit for everyone to see and vote on.

There have also been a lot of new models and styles onboarded, which are also used by my Reddit and Mastodon bots.

The next plan now is to allow image interrogation on the Stable Horde, as well as direct image post-processing (without Stable Diffusion), so as to allow even people with low-powered machines to contribute for kudos.