Last week I wrote about how we started creating a new dataset of Stable Horde images to provide to LAION. Today I am proud to announce that we have further deepened our collaboration by setting up a mechanism which will allow the Stable Horde community to contribute aesthetic ratings for LAION datasets!
hlky from Sygil.dev and I spent last weekend deploying a new service which allows us to aesthetically rate images from LAION’s multiple datasets. We exposed it through an API, so any client can interact with it. You can read the details of how it works in the blog post I linked above, so I’m not going to repeat everything.
This is exciting for me because the Stable Horde has suffered from a distinct lack of visibility. None of the major AI-focused media (newsletters, YouTubers, etc.) have mentioned us to date. The very first coverage we got was from a PC magazine!
All that is to say that it’s been an uphill struggle to get the Stable Horde noticed in a way that will lead to more workers, which in turn will allow us to democratize access to AI for everyone. So I am very happy to point the amazing Stable Horde community toward such positive work, which will bring more attention to what we’re trying to achieve.
We are still hard at work tweaking the information we store for each rating. For example, we store the number of images each user had generated at the time of the rating, which will allow researchers to filter out potentially spammy users.
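As a rough illustration of how a researcher might use that field, here is a minimal sketch. The record layout, the field names and the threshold of 50 images are all made up for this example; they are not the actual export schema or a recommended cutoff.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    # Hypothetical record layout, for illustration only.
    user_id: str
    image_id: str
    rating: int
    images_generated_at_rating: int  # user's generation count when they rated


def filter_established(ratings, min_generated=50):
    """Keep only ratings from users who had generated at least
    `min_generated` images at the time they submitted the rating."""
    return [r for r in ratings if r.images_generated_at_rating >= min_generated]


ratings = [
    Rating("fresh_account", "img1", 7, 0),
    Rating("regular_user", "img1", 5, 120),
]
trusted = filter_established(ratings, min_generated=50)
```

The threshold stays in the researcher's hands: the stored count only makes the filtering possible, it doesn't decide who counts as spammy.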
We are also adding more and more countermeasures, as there’s always the fear that someone will just script random ratings to get kudos. Even though the Stable Horde is free to use without kudos and even though kudos has no value, people do strange things to see “numba go up”. Now, I don’t particularly care if people harvest kudos like this, but I very much do care about our ratings being poisoned by garbage.
So if you’re someone who wants to make an exploit script to harvest kudos via ratings, please just join our Discord instead. The kudos flow like candy when you’re active! And you will also not be harming the AI community itself.
Already our exported dataset has grown to 80K shared images, and we collected 20K ratings on the LAION datasets within 2 days. For comparison, some of the biggest rated datasets have just 175K ratings, which were done by paid workers (and we all know how motivated they are to be accurate). Our kudos incentives and the community’s passion to improve AI are surpassing even my wildest expectations, to be honest!
Here’s to making the best damn dataset that exists!