Feeding the Monster - Generative AI and the Rest of Us



Scared of ChatGPT? Don't be. It's only a wee puppy that just wants to play. The hellhound it may grow into is another matter...

Daemon

Have you ever written something and posted it on the Internet? LinkedIn, Facebook, Twitter? A self-hosted blog page? If you are reading this, chances are very good that you have. And if you have, chances are very good that a generative AI model like ChatGPT has assimilated your content.

Anyone born before 2000 (and likely most born after) will have come across the favorite antagonist of Star Trek: The Next Generation - the Borg, a collective of cybernetically augmented alien and human organisms that grows the hive by "assimilating" captured individuals. This assimilation isn't just physical: the knowledge of new hive members is fed into the collective to make it stronger and more "intelligent". Indeed, the initial plans for the Borg in the Star Trek storyline had to be toned down because the creators were afraid the odious concept of an über-intelligent, swarm-like enemy that felt neither fear nor pain would be too much for most fans of the show.

If you've played with a generative AI such as ChatGPT, you may have let your mind trick you into believing that the resulting output, be it text or images, is something the AI has created. The fact of the matter, however, is that an AI algorithm, no matter how advanced, is merely a statistical device acting on a gigantic body of input sourced from content on the Internet, including yours and mine. In that sense, it is not much different from the Borg.

To decide whether this technology is for the better or worse of humanity, we need to understand it well enough to evaluate its potential to change our lives.

Any AI system needs a so-called "training set": a mass of specially organized data the AI uses to make decisions. There is a spectrum of AI technologies, from simpler machine learning systems that need pre-sorted data to function, to complex "Generative Adversarial Networks" (GANs), in which two networks - a generator and a discriminator - train each other by competing: the generator tries to produce data the discriminator cannot tell apart from real examples.

In any of these systems, the training data needs to be categorized. There is no sense in feeding an AI pictures of dogs and cats so that it can "learn" which is which without labelling those pictures accordingly. The same goes for internet text or image content: for the AI to gain anything from adding a text or text snippet to its training set, it needs to "know" what that text is about.
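The role of labels can be made concrete with a toy sketch. Everything below is invented for illustration - the features (weight, ear pointiness) are hand-picked, and real systems learn from millions of examples with learned rather than hand-crafted features:

```python
# Toy labelled training set: each example pairs features with the label
# the learner is supposed to reproduce. Features are (weight_kg, ear_pointiness).
training_set = [
    ((30.0, 0.2), "dog"),
    ((4.0, 0.9), "cat"),
    ((25.0, 0.3), "dog"),
    ((5.0, 0.8), "cat"),
]

def classify(features, examples):
    """1-nearest-neighbour: return the label of the closest training example."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(examples, key=lambda ex: dist(ex[0], features))[1]

print(classify((28.0, 0.25), training_set))  # -> dog
print(classify((3.5, 0.85), training_set))   # -> cat
```

Without the "dog"/"cat" labels in the training set, the algorithm would have nothing to compare a new example against - which is exactly the point made above.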

In the case of ChatGPT, for example, this content is scraped from web pages in much the same way the various search engines do it: start with a well-linked web page and, after ingesting it, follow each link on that page to other pages. This is called "web crawling". Nowadays it is easy to find web pages, of course, because the search engines have already done all of that work - however, using results from a search engine will introduce bias (mainly towards advertising content), so it is important to give appropriate weight to content captured directly.
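The "follow each link" step can be sketched with the Python standard library alone. This is a minimal illustration of link extraction, not a production crawler - real crawlers also honour robots.txt, deduplicate URLs, and throttle their requests; the example page and URLs here are made up:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page's URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

page = '<a href="/about">About</a> <a href="https://example.org/">Elsewhere</a>'
extractor = LinkExtractor("https://example.com/index.html")
extractor.feed(page)
print(extractor.links)
# -> ['https://example.com/about', 'https://example.org/']
```

A crawler would fetch each collected URL in turn and repeat the process, growing its frontier of pages to ingest.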

As with search engines, pages need to carry some sort of content classification, otherwise their usefulness is reduced. And just like modern search engines, AI training crawlers use linguistic modeling to interpret the textual contents of a page (images work completely differently, so we'll not consider generative image AI at this point).
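At its very crudest, classifying a page's text can be imagined as a bag-of-words tally. The keyword lists below are invented for illustration; actual crawlers use trained language models, not hand-written word lists:

```python
# Crude bag-of-words topic tagging: count topic keywords present in the
# text and pick the topic with the most hits. Purely illustrative.
TOPIC_KEYWORDS = {
    "cooking": {"recipe", "oven", "ingredients", "bake"},
    "sport": {"match", "goal", "team", "league"},
}

def tag_topic(text):
    words = set(text.lower().split())
    scores = {topic: len(words & kw) for topic, kw in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(tag_topic("Preheat the oven and mix the ingredients for this recipe"))
# -> cooking
```

The linguistic models used in practice go far beyond counting words, but the goal is the same: attach a machine-usable description of what the text is about.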

The interaction with ChatGPT is quite astounding. Ask it to write some programming code to solve a particular math problem, and it will usually do so with impressive accuracy. Ask it to write a poem in the style of Romanticism about a field of flowers, and it will output a well-made text reminiscent of Keats, Shelley or Blake. There is a reason why the emergence of ChatGPT immediately raised concerns in the academic community. Who is to tell whether the poem handed in by an English Lit student for a grade really is a child of their imagination and not that of an AI?

ChatGPT isn't the monster alluded to in the title. The monster is what comes next: just as the Borg found their strength not in quick reactions or great fighting ability but in their combined knowledge, the next step for AI is combination. Every AI researcher's goal is the ultimate AI, an "artificial general intelligence", and the path towards that goal is to combine or cascade different AI solutions. Since AI runs on computer hardware, the interaction of different AI systems speeds up as the hardware gets faster. Humans are limited in how fast they can exchange information by writing, reading or conversing - AI is not. Data exchange will be nearly instantaneous, and cascaded decisions in this "gestalt intelligence" unstoppable. The only control we will have is whatever filters we place on the data communicated between the AI units. Is it sounding more like the Borg yet?

It is unfortunate that so many articles (and even Wikipedia entries) tend to anthropomorphise AI. It has become common "knowledge" that AI "learns" or has "intelligence". Neither is true. AI is a statistical analysis machine, whatever methodology is in use. It has no emotion, nor does it make "intelligent" decisions. It "reacts" statistically on the basis of the input it has received.

Is there reason for concern? Of course there is. Like any technology that represents a paradigm shift, GenAI carries the risk of being used for malignant purposes. If you look back at previous instances of game-changing technology, the cycle is much the same every time: everyone has an opinion without understanding the basics of the technology, and prognoses of abuse always sell more print - but once the technology becomes commonplace, everyone thinks "oh that, well, yeah, so what?".

The first innovation of this type was certainly the containment and control of fire. We're talking tens of thousands (possibly hundreds of thousands) of years ago. Was humanity scorched out of existence? Quite the contrary. Researchers argue that our current cognitive capabilities probably would not have evolved as quickly, if at all, without the ability to cook food.

Next, the wheel. A boon to anyone who had carried heavy things before, the wheel made the wagon, pulled by oxen, possible. Yes, it was used in war machines (anything from Roman war wagons to huge catapults) and still is. But no one in their right mind would want to ban wheels from their life in this day and age.

One of the most exciting technologies was Gutenberg's invention of the printing press. Among its first customers? The Catholic Church, which produced thousands upon thousands of indulgence certificates to generate cash for the Holy See. Everything from communist leaflets to outright hate literature (think of "Mein Kampf") has been produced and distributed with this technology. Would you outlaw it because of that?

The list goes on - there are two sides to every coin, Yin and Yang. It becomes especially problematic when politicians (who, for the most part, don't understand AI even at a basic level), besieged by "consultants", try to regulate the technology by law.

It is mainly up to us which details of our private lives we make public by publishing them anywhere on the internet (even on this channel). Just be aware that anything you have uploaded, or will upload in the future, will feed the monster.
