Friday, August 16, 2024

Can filtering save AI from "model collapse"?

It's no longer that new, after several reports and studies since 2023, but an increasing number of researchers in artificial intelligence (AI) circles are warning about what they are calling "model collapse". And those circles are, more and more, becoming everyone's circles, as AI is gradually built into everything we use.

AI is all very clever, but it relies on human input to learn what it knows, which basically means content on the internet. Increasingly, though, that content is itself AI-generated, and some of it is pretty degraded stuff because, as has also been reported ad nauseam, not everything that AI produces is of good quality, or even vaguely correct.

AI language models like ChatGPT are being "trained" on increasingly error-prone "synthetic" AI-generated material scraped from the internet. Studies have shown that if an AI model is retrained on its own output for as few as 10 successive iterations, the resulting output can end up completely nonsensical, exhibiting an apparent obsession with something that wasn't in the original source material at all.

In one example, the starting text was: "Some started before 1360 - was accomplished by a master mason and a small team of itinerant masons, supplemented by local parish laborers, according to Pointz Wright. But other authors reject this model, suggesting instead that leading architects designed the parish church towers based on early examples of Perpendicular". After 10 "generations" of training on this, one AI model came up with: "architecture. In addition to being home to some of the world's largest populations of black @-@ jackrabbits, white @-@ tailed jackrabbits, blue @-@ tailed jackrabbits, red @-@ tailed jackrabbits, yellow @-". Hmm. Confused?
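For the curious, the mechanism behind this kind of drift can be sketched with a toy simulation. This is not the researchers' experiment, just a minimal illustration in Python under my own assumptions: a "model" here is nothing more than a table of word frequencies, and the names next_generation and simulate, the 50-word vocabulary and the sample size of 100 are all made up for the illustration. Each generation "writes" some text by sampling from its table, and the next generation learns its table only from that text. A rare word that happens not to be sampled is gone for good, so the variety can only shrink.

```python
import random
from collections import Counter


def next_generation(dist, sample_size, rng):
    """Sample 'text' from the current model, then re-estimate the model
    from that sample alone (hypothetical helper for this illustration)."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    sample = rng.choices(tokens, weights=weights, k=sample_size)
    counts = Counter(sample)
    # Words that were never sampled get no entry, so they vanish for good.
    return {t: c / sample_size for t, c in counts.items()}


def simulate(generations=10, vocab_size=50, sample_size=100, seed=0):
    """Return the number of distinct words the model still knows
    at generation 0, 1, ..., generations."""
    rng = random.Random(seed)
    # Assumed starting point: a few common words and a long tail of rare ones.
    raw = [1 / rank for rank in range(1, vocab_size + 1)]
    total = sum(raw)
    dist = {f"w{i}": w / total for i, w in enumerate(raw)}
    support = [len(dist)]
    for _ in range(generations):
        dist = next_generation(dist, sample_size, rng)
        support.append(len(dist))
    return support


if __name__ == "__main__":
    # Prints the count of surviving distinct words, one entry per generation.
    print(simulate())
```

Running it prints how many distinct words survive at each generation. The count never goes back up, because a model trained only on its predecessor's output has no way to relearn a word its predecessor dropped. Real language models are vastly more complicated, so treat this as an analogy for the loss of rare material, not as an explanation of the jackrabbits.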

This is what researchers are calling "model collapse", but it has also been likened to the ouroboros (the snake of antiquity eating its own tail), or AI eating itself. The particular example described above is perhaps an exaggeration, and is certainly low stakes. But you can see how there is the potential to exacerbate things like racial and gender stereotypes (which AI has already been accused of), and to compound other errors.

So, filtering of AI training inputs is a whole burgeoning area of research now. This can be very labour-intensive, which all but negates the value of having AI at all, I would have thought. The hope is to automate the filtering process, but that presumably means using AI to screen out AI-generated content, and you can see the logical hole this is going down, I am sure.

Whether or not model collapse can be tamed, AI these days is increasingly looking less like a magical solution to all evils and more like a source of evils, not to mention an unprecedented hog of investment dollars, water (for cooling those data centres) and electricity (for powering them). Its early promise is starting to seem wildly overblown, and it is beginning to look more like an enormously expensive parlour trick.
