Platforms analyze texts to check whether they were written by AIs; of the seven sites tested, only two showed promising results.
With the success of ChatGPT, one concern has come to the fore: how do you identify texts created by artificial intelligence (AI)? Whether through OpenAI's platform or other tools, AI-generated content has become far easier to produce in recent times. In response, a number of sites now claim to answer this question, but they still can't help much.

That finding comes from a TechCrunch analysis published this Thursday (16). To test the reliability of these systems, the site created eight sample texts in different styles, all generated by Claude, the system developed by the startup Anthropic.

The texts were then run through sites that promise to flag AI-written content: OpenAI's own checker, AI Writing Check, GPTZero, Copyleaks, GPT Radar, CatchGPT, and Originality.ai.
The problem is that none of these systems worked very well.
Few systems identified AI-created text
The first example was an encyclopedia-style entry about Mesoamerica. Only GPTZero and Originality.ai identified it as having been written by an artificial intelligence.

Next, the site submitted a marketing email about shoe polish, full of funny, attention-grabbing passages. This time, no system flagged the text.

The third attempt involved a university essay on the fall of Rome. This example deserves special attention, as it indicates whether teachers will struggle to tell when students are using these tools for their assignments.

Here, CatchGPT put the chance that the text came from an AI at 99.9%, and GPTZero also identified that the essay was not written by a human. The others, once again, found nothing suspicious.

The same went for an essay draft: only OpenAI's tool, GPTZero, and CatchGPT indicated there was a chance the content had been produced by an artificial intelligence.
Websites failed to spot AI-written news
Beyond academic work, there is concern about news coverage, especially after the controversy over AI-written posts published by CNET, as reported by The Verge. So the folks at TechCrunch used Anthropic's system to create a story about the 2020 United States presidential election.
And guess what? Almost every site got it wrong, with the exception of GPTZero.
The results were similar for the remaining examples: a cover letter for a paralegal position, a software engineer's resume, and an essay outline on the merits of gun control. After analyzing all eight examples, these were the systems' scores:
- OpenAI Checker: 1 hit;
- AI Writing Check: no hits;
- GPTZero: 5 hits;
- Copyleaks: no hits;
- GPT Radar: no hits;
- CatchGPT: 4 hits;
- Originality.ai: 1 hit.
But why so many mistakes?
In the end, only GPTZero and CatchGPT did reasonably well, and even their results fall short of expectations: GPTZero caught 5 of the 8 examples (62.5%) and CatchGPT 4 of 8 (50%). In other words, even the best detectors hover around 50% accuracy, so you can't use them with your eyes closed; reliably verifying which texts were in fact generated by AI remains an open problem.

None of this means that AI-generated content is flawless, error-free, or indistinguishable from human writing. But unlike plagiarism checkers, these systems do not look for similarities between texts; rather, they search for statistical patterns.

As TechCrunch explains, these detectors are trained on texts from the internet and other sources to learn the difference between human-written and AI-generated content. And here lies the big problem: chatbots and similar tools are constantly being improved, making it hard for these sites to keep up.
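To make the "search for patterns" idea concrete, here is a minimal sketch in Python of one heuristic often cited in this space: scoring a text's perplexity under a language model, on the assumption that AI-generated text tends to look more predictable to such a model. None of the tested sites disclose their exact methods, so the model choice (GPT-2) and the threshold below are illustrative assumptions, not anyone's real implementation.

```python
# Minimal sketch of a perplexity-based AI-text heuristic.
# Assumption: AI-generated text scores LOWER perplexity (more
# predictable) under a language model than human writing does.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2 (lower = more predictable)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing the ids as labels makes the model return the
        # average cross-entropy loss over the sequence.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

THRESHOLD = 60.0  # illustrative cutoff, not a calibrated value

def looks_ai_generated(text: str) -> bool:
    return perplexity(text) < THRESHOLD
```

The weakness of this kind of approach is visible in the code itself: as generators improve, the statistics of their output drift, and a fixed threshold calibrated against yesterday's models will misclassify tomorrow's text.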
And there is another problem: these verification platforms may have technical limitations of their own. Beyond the accuracy issue, are they ready to check an entire book? What if a teacher wants to check research reports from an entire class? Could they handle it?

Of course, these services may evolve. But the questions remain valid: if even today's plagiarism detectors are not fully accurate, what can be expected of systems that try to verify whether a text came from an AI?
With information from TechCrunch