Checking AI: the empirical proof?
These are still early days for post-Turing-test interactions, but actually computers have been simulating human writing for many decades. Twenty or more years ago they would have given human-like responses by pulling appropriate answers from some vast compendium of actual writing.
About ten years ago, systems began constructing responses algorithmically, using hand-crafted rules or statistical models that combined words or phrases according to syntactic templates, without true understanding of language or context.
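To make that concrete, here is a hypothetical little Python sketch in the spirit of those template-based systems. The templates and word lists are invented purely for illustration; no actual system of the period is being quoted.

```python
import random

# Purely illustrative: a reply is built by slotting pre-chosen words
# into a hand-crafted syntactic template, with no understanding involved.
TEMPLATES = [
    "I think {topic} is {adjective}.",
    "Could you tell me more about {topic}?",
    "Many people find {topic} rather {adjective}.",
]
WORDS = {
    "topic": ["the weather", "your question", "that idea"],
    "adjective": ["interesting", "complicated", "surprising"],
}

def templated_reply() -> str:
    """Fill a randomly chosen template with randomly chosen words."""
    template = random.choice(TEMPLATES)
    return template.format(**{slot: random.choice(options) for slot, options in WORDS.items()})

print(templated_reply())  # e.g. "Could you tell me more about the weather?"
```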
And now? Now they generate text dynamically using large language models that predict the most likely next word based on context, rather than following fixed rules. Thanks to the immense processing power of modern machines, even my humble laptop can respond uniquely and seemingly effortlessly to complex requests. There’s little doubt that machines have now come remarkably close to mastering natural language dialogue.
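For the technically curious, here is a deliberately tiny Python sketch of the next-word-prediction idea. It is only a caricature: real large language models use neural networks trained on vast amounts of text, whereas this toy merely counts which word follows which in a single invented sentence.

```python
import random
from collections import Counter, defaultdict

# Toy sketch of the core idea only; the "corpus" below is invented.
corpus = "the cat sat on the mat and the cat slept on the sofa".split()

# Count which words tend to follow each word (a simple bigram model).
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def next_word(word: str) -> str:
    """Pick the next word in proportion to how often it followed `word`."""
    candidates = followers[word]
    if not candidates:  # dead end: nothing ever followed this word
        return random.choice(corpus)
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# Generate a short continuation, predicting each next word from the one before.
word, output = "the", ["the"]
for _ in range(6):
    word = next_word(word)
    output.append(word)
print(" ".join(output))  # e.g. "the cat sat on the mat and"
```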
All of which has set alarm bells ringing, particularly in academe. In pre-AI days, the likelihood was that answers lifted from elsewhere would find a match somewhere on the internet, and a huge educational side-business of plagiarism checkers was born. The market leader, ‘Turnitin’, became part and parcel of the university experience for lecturers and students alike from the mid-2000s to the late 2010s. But now they’ve capitulated: their new software is designed to help students who do use AI to do so better and more ethically. Or something like that.
Dedicated detectors such as GPTZero, developed by Princeton student Edward Tian in 2023, have also failed the litmus test, thanks to their tendency (20% of the time, in the case of GPTZero) to flag original, well-formed and articulate human text as AI-generated.
A business professor of my acquaintance here in Prague set his students writing on the topic of the recent decline in Czech GDP. Of the answers that came back, 90% confidently and elegantly delineated the various factors that might have led to such an economic downturn. Just 10% pointed out that GDP had in fact grown, not declined, over that period. A (deliberately) inaccurate prompt had led their AI assistants down the wrong research path.
So is there a foolproof way of testing for AI-generated answers? Well, it’s tough. But my proofreading experience has given me one potential pinprick test that works for me nearly all the time. Want to know what it is? You’ll have to subscribe to find out!