How to Identify Text Generated by AI?

The recent, meteoric rise of AI writing tools like ChatGPT has produced a large number of articles, essays, and other content generated entirely by AI. For various reasons (plagiarism, inaccuracy, etc.), you may not want to use AI-generated text. This has led to a need for ways to detect it.

How do we know if a text was generated by AI or by a human?

Alongside the boom in AI writing assistants, we are seeing a rise in AI text detection tools and other ways to determine whether text was generated by AI or by a human.

Let’s take a look.

Why You Should Be Aware of AI-generated Text

There are several reasons why you would want to know whether, for example, an article was written by AI or by a human:

  1. Authenticity – false information, fake news, misleading content
  2. Privacy – could be used to impersonate a person
  3. Quality – poor quality, incoherent, lack of creativity
  4. Originality – plagiarism
  5. Compliance – in some industries using AI-generated content is regulated

The above screenshot is from ChatGPT for the prompt: What should the intro of this article be?

How to Detect AI-generated Text

There are a few ways to (attempt to) detect AI-generated text. These fall into two categories: detection by AI and detection by a human.

AI Detectors by AI

Perhaps ironically, AI is being used to build AI detection tools, which rely on machine learning and large language models to spot AI output.

Several AI-powered ‘detectors’ are currently flooding the market, although things are moving very fast. Some of these tools are free, and some require a paid subscription.

Here are some of the AI-detecting tools you can test and use:

1. Originality.ai

According to its developers, Originality.ai uses machine learning and linguistic analysis to evaluate text authenticity. It is a feature-rich plagiarism checker that, its developers claim, can also determine with 94% accuracy whether a piece of content was written by a human or by ChatGPT and other AI tools. It is a paid tool with advanced features such as team management, full website scans, and scan history. It is also available as a Chrome extension.

2. GPTZero

Princeton University student Edward Tian created GPTZero. This is a basic tool that evaluates content based on two measures: perplexity and burstiness.

Perplexity measures how predictable a text is to a language model: the more surprising the word choices, the higher the perplexity. Humans tend to write more complex sentences and often use different words to refer to the same thing, so lower perplexity can indicate the use of AI. AI tools tend to write simpler, shorter sentences and to repeat words or phrases, such as ‘it’, instead of using synonyms or related words.
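To make the idea concrete: in language-model terms, perplexity is usually computed as the exponential of the average negative log-probability that a model assigns to each word. The sketch below is only an illustration under stated assumptions. It uses a toy word-frequency model built from a short reference text, not a large language model, and the function name `unigram_perplexity` and the sample sentences are made up for this example. It is not how GPTZero itself is implemented.

```python
import math
from collections import Counter


def unigram_perplexity(text, reference):
    """Perplexity of `text` under a toy unigram model fit on `reference`.

    Assumption: words are split on whitespace, and add-one smoothing is
    used so that words missing from the reference get a small probability.
    """
    ref_tokens = reference.lower().split()
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one word")

    counts = Counter(ref_tokens)
    # Vocabulary covers the reference plus the text being scored.
    vocab_size = len(set(ref_tokens) | set(tokens))
    total = len(ref_tokens)

    log_prob_sum = 0.0
    for tok in tokens:
        prob = (counts[tok] + 1) / (total + vocab_size)
        log_prob_sum += math.log(prob)

    # exp of the average negative log-probability per word
    return math.exp(-log_prob_sum / len(tokens))


reference = "the cat sat on the mat the cat ate"
predictable = "the cat sat"
surprising = "quantum zebra sings"

print(unigram_perplexity(predictable, reference))
print(unigram_perplexity(surprising, reference))
```

The familiar sentence scores lower than the unusual one. A real detector would replace the toy frequency model with a large language model, but the final calculation follows the same pattern.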

Burstiness refers to unevenness in writing tone and style. Every person has a unique writing style, and human writing tends to be more varied and unpredictable than AI output: the tone shifts, the choice of words changes, and short sentences alternate with long ones. GPTZero looks for these spikes to indicate whether a human or an AI wrote the text.
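One rough way to put a number on this unevenness is to measure how much sentence length varies across a text. The sketch below is a simplified stand-in, not GPTZero's actual method. It assumes sentences end with '.', '!' or '?', and the function name `burstiness_score` is hypothetical. The score is the standard deviation of sentence lengths divided by their mean.

```python
import re
import statistics


def sentence_lengths(text):
    """Word count of each sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness_score(text):
    """Spread of sentence lengths relative to their mean.

    0.0 means every sentence has the same length; larger values mean a
    more uneven mix of short and long sentences.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.stdev(lengths) / statistics.mean(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = (
    "Stop. The storm rolled in over the hills long before anyone "
    "in the village had noticed the sky. We ran."
)

print(burstiness_score(uniform))
print(burstiness_score(varied))
```

Sentence length is only one signal. Tone and word choice also vary in human writing, and this sketch does not try to capture them.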

3. GPT-2 Output Detector

GPT-2 Output Detector is a free AI content detection tool (so far). Being a simple tool, it is best suited to small-scale content publishers and website owners with modest workloads. It nonetheless relies on machine learning and natural language processing to predict the text’s authenticity. It reports, as a percentage, how likely the text is to be AI-generated.

4. Giant Language Model Test Room (GLTR)

GLTR is one of the more advanced AI-generated text detection tools, yet it is simple and free, which makes it suitable for small-scale content publishers. It inspects the text and checks how closely each word matches what a language model would have predicted in that position. GLTR presents the analysis as color-coded text and histograms, which may be challenging to interpret.
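The histogram idea can be sketched in a few lines. GLTR ranks each word by how likely a language model considered it in context, then counts how many words fall into each rank band. The example below is a loose illustration only: it ranks words by plain frequency in a short reference text, which ignores context entirely, and the function name `rank_histogram` and the bucket sizes in the usage example are assumptions made for this sketch.

```python
from collections import Counter


def rank_histogram(text, reference, buckets=(10, 100, 1000)):
    """Count how many words of `text` fall into each rank bucket.

    Assumption: rank 1 is the most frequent word in `reference`. This is
    a toy stand-in for a language model's context-aware prediction rank.
    """
    ref_counts = Counter(reference.lower().split())
    rank_of = {
        word: position + 1
        for position, (word, _) in enumerate(ref_counts.most_common())
    }

    labels = [f"top {b}" for b in buckets] + ["other"]
    histogram = dict.fromkeys(labels, 0)

    for tok in text.lower().split():
        rank = rank_of.get(tok)
        for b in buckets:
            if rank is not None and rank <= b:
                histogram[f"top {b}"] += 1
                break
        else:
            # Word is unknown or ranked below every bucket.
            histogram["other"] += 1
    return histogram


# Tiny buckets so the toy reference text produces a visible spread.
print(rank_histogram("the cat flew", "the cat sat on the mat", buckets=(1, 3)))
```

A text whose words land almost entirely in the top bucket is highly predictable, which is the pattern this kind of tool treats as a hint of machine generation.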

AI Detection by a Human

There are a few limitations with AI detectors by AI – the biggest issue being accuracy. The AI detection tools are NOT always accurate – they can mislabel both AI-generated and human-written text.

Just like plagiarism tools, AI-text detection tools have different levels of accuracy. Some tools may have an accuracy rate as low as 50%. Others may have higher accuracy, but none of them are 100% accurate, even though they may claim otherwise.
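A single accuracy figure can hide which kind of mistake a detector makes. The sketch below works through the arithmetic using invented counts, chosen purely for illustration and not measured from any real tool. The function name `detector_metrics` and its parameter names are likewise hypothetical.

```python
def detector_metrics(ai_flagged_ai, ai_flagged_human,
                     human_flagged_human, human_flagged_ai):
    """Accuracy and error rates from the four possible outcomes.

    ai_flagged_ai:        AI text correctly labeled AI
    ai_flagged_human:     AI text wrongly labeled human (missed)
    human_flagged_human:  human text correctly labeled human
    human_flagged_ai:     human text wrongly labeled AI (false alarm)
    """
    ai_total = ai_flagged_ai + ai_flagged_human
    human_total = human_flagged_human + human_flagged_ai
    if ai_total == 0 or human_total == 0:
        raise ValueError("need at least one AI text and one human text")

    return {
        "accuracy": (ai_flagged_ai + human_flagged_human)
        / (ai_total + human_total),
        # Share of human-written texts wrongly flagged as AI
        "false_positive_rate": human_flagged_ai / human_total,
        # Share of AI-generated texts that slipped through
        "false_negative_rate": ai_flagged_human / ai_total,
    }


# Hypothetical test set: 100 AI-written and 100 human-written texts.
print(detector_metrics(ai_flagged_ai=90, ai_flagged_human=10,
                       human_flagged_human=80, human_flagged_ai=20))

# A detector that is right only half the time on a balanced set
# does no better than a coin flip.
print(detector_metrics(ai_flagged_ai=50, ai_flagged_human=50,
                       human_flagged_human=50, human_flagged_ai=50))
```

In the first hypothetical case the tool is 85% accurate overall, yet one in five human writers would still be wrongly flagged. That is the kind of error to look for when a tool advertises a single headline accuracy number.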

For this reason, keeping a human in the loop, a real person who proofreads and edits the text, is a must. Perhaps the goal shouldn’t be to read the text just to determine whether it was written by an AI or a human. Rather, the purpose of the human editor is to make sure the text is fitting, high quality, and appropriate for its intended use.

Final Thoughts

Detecting text generated by AI may be necessary. However, it can be challenging to determine whether a given text was written by an AI or a human. So, instead of analyzing each word and string of words, it may be much more important to determine whether the text is accurate, of high quality, and meets all the other criteria you require. This, at the moment anyway, is best done by a human editor.

Alie Jules
AI Educator and consultant