Channelling my inner Corporal Jones, I feel it necessary to shout a resounding “Don’t panic!” in response to recent headlines on the front pages of formerly well-respected “newspapers”. For example, consider this one, which appeared in The Times on 31/5/23: “AI Pioneers fear extinction”. What was in view was not their extinction at all but ours, as the sub-headline made clear: “Our creations are as great a threat to humanity as nuclear war or pandemics, say hundreds of experts in call to regulate tech”. This was followed by another front-page story on 6th June: “Two years to save the world, says AI adviser”. The AI adviser in question was Matt Clifford, and the person he advises is the Prime Minister, Rishi Sunak. Mind you, all was not what it seemed. Mr Clifford felt it necessary to take to Twitter to make clear that the headline did not reflect his view and was not a fair summary of the interview he had given (you can find the details in this Twitter thread). What has prompted many of these stories (besides the desire of politicians in various jurisdictions to deflect attention from other current difficulties) is the bursting into public consciousness of “generative AI” systems like “ChatGPT” and “Bard”.
First of all, it’s just worth taking a breath and being clear about what we are discussing. “AI” sounds like a thing, and in the minds of some it has taken on the characteristics of a malevolent personality. But, at least in the form of these “large language models” (LLMs, the family to which ChatGPT and Bard belong), this is far from the truth. Basically we are talking about large machine learning systems which work as pattern recognizers. Using a supercomputer (i.e. a very big and very expensive computer), software is designed which can be “trained” on vast amounts of data (digitised books, information from the internet, and other inputs) to attach statistics to the patterns of output produced in response to specific inputs. Many of the techniques involved are not particularly new, although machine learning has been developing rapidly over the last few decades. The new factor is the vast amount of computing power which can now be deployed. Simpler systems for carrying out specific tasks have been available for a while and have been applied in areas like medical diagnosis (e.g. spotting tumours in mammograms) or security systems (detecting someone walking across a lawn in a video). But analysing language is much more complicated, requiring much more computing power to detect and define the relevant patterns.
The scale of these recently developed LLMs is huge. Think in terms of all the information on the interweb plus the digitised content of several large university libraries (i.e. petabytes). This is boiled down into parameters represented in the software; in the case of ChatGPT we are talking on the order of 200 billion of them! This complexity, and the computing power it requires, allows for the production of quite sophisticated output in response to various language prompts. But it also creates a problem. To really exploit these systems takes a degree of skill in asking them the “right” questions; a whole new discipline of “prompt engineering” has emerged to deal with this. There are only so many times that you can ask Bard for five jokes about John Calvin before it gets boring. But this is actually something worth doing, because it illustrates something else: you’re unlikely to get any real rib-ticklers back. That is because the LLM doesn’t do funny (or knowledge, or insight). It does pattern recognition (i.e. which other words are connected with “john calvin” and/or “joke”) and information processing (which are statistically the most likely combinations and arrangements of words resembling the “jokes” encountered in training).
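To make that last point a little more concrete, here is a deliberately toy sketch in Python. It is nothing like the real thing (no neural network, no 200 billion parameters, and the tiny “corpus” is invented purely for illustration), but it shows the same underlying idea: “training” just gathers statistics about which words tend to follow which, and “generation” simply emits the statistically most likely continuation.

```python
from collections import Counter, defaultdict

def train(text):
    """'Training': count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length=10):
    """'Generation': repeatedly emit the most likely next word seen in training."""
    word, output = start, [start]
    for _ in range(length):
        if word not in counts:
            break
        word = counts[word].most_common(1)[0][0]
        output.append(word)
    return " ".join(output)

# A made-up scrap of text standing in for petabytes of training data.
corpus = "john calvin wrote the institutes and john calvin preached in geneva"
model = train(corpus)
print(generate(model, "john"))
# prints e.g. "john calvin wrote the institutes and john calvin wrote the institutes"
```

Scale that idea up by many orders of magnitude, with far cleverer statistics gathered over far longer stretches of text, and you get something much closer to an LLM; but the point stands that nowhere in the process is there anything that “knows” what a joke, or John Calvin, is.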
You might be inclined to argue that humour is just too hard a test. Even humans have a hard time defining what is and what is not funny. And for many tasks, information processing is just fine. So, generating 5000 words on agriculture in 19th-century Bulgaria might be a more useful task (should you need to write an essay on such a topic). One can see why schools and universities are having to think hard about whether they should routinely allow such systems to be used. But this is nothing new; the issue used to be the use of calculators in maths lessons and exams. And education is not (or should not be) only about producing buckets of information (informational widgets). That is, after all, exactly the kind of thing that computers are good at. It should be about the development of insight and wisdom and the correct selection and application of knowledge within appropriate ethical boundaries. Perhaps that actually requires a combination of classroom education, example, age and experience. We have always employed teachers and lecturers (worth their weight in gold in my view); we have never just sat students down next to piles of books (or iPads), even though some AI proponents think this should be one of AI’s uses. But education is not the only field having to think about how to use (or not use) the new generative AI systems.
The editor of the Financial Times, Roula Khalaf, in a letter published at the end of May (26/5/23) on “…generative AI and the FT”, felt it necessary to say that while generative AI “has obvious and potentially far-reaching implications for journalists and editors in the way we approach our daily work”, there were problems with it. “They [AI systems] can fabricate facts … and make up references and links. If sufficiently manipulated, AI models can produce entirely false images and articles. They also replicate the existing societal perspectives, including historic biases.” This hints at issues like how LLMs are trained. Their inputs are “cleaned” and edited; they have already been shaped by someone, somewhere. But who decides what goes in and what is left out? You and I aren’t told; someone at Google or Microsoft has decided, using criteria that are not public. And how different inputs are weighted (i.e. what’s really important and what is less so) is also determined by others, locking in certain kinds of analysis. All of the information that might be used to train an LLM, even if not cleaned, edited or prepared in some way, already has embedded in it various biases and prejudices, including all those that fall in the collective blind-spot of our particular age. The point is not only that we are not told about these things, or about the values applied and other tweaks performed to ensure that outputs are acceptable. The point is that we don’t know and can’t know what they are; exactly what is being “learned” in machine learning systems is not knowable by us. It is hidden by design. Khalaf continued that “FT journalism in the new AI age will continue to be reported and written by humans”: educated, experienced, mentored and quite possibly aged humans (I assume). Knowledge is more than words, and wisdom is more than knowledge. All of this is only a big problem if you spend your life looking at a screen and your only friends are sentences.
But LLMs are only one kind of AI. Perhaps there are other kinds that are more dangerous. Words matter; otherwise I wouldn’t be wasting my time writing this. And I should point out that it really is me writing, and not an AI system. Bad words and misinformation are dangerous. But compared to climate change, war, pandemics and famine, which cumulatively are already killing millions in today’s world, they are not that dangerous. If you don’t want to be constantly misinformed and outraged, go cold turkey on Twitter, read books and talk to real people. And before you mention Arnie the Terminator, no AI is truly autonomous. Somebody has programmed it, prompted it, directed it. And, by and large, if you pull the plug you can stop it. It’s people who start and stoke trouble. The heart of our problems will always be the fleshy hearts of people, not silicon in whatever form.