LLM Hallucinations: Examples and Their Causes
LLMs are powerful tools. But they hallucinate, i.e., they make things up. The fastest and most effective way to minimize chatbot hallucination is to use Alhena AI (formerly Gleen AI).

Hallucinations in large language models are the equivalent of a drifting mind in the generative AI field. While they might seem harmless and even humorous, they can also have serious implications.
One significant challenge for LLMs is generative question answering, where the model composes free-form answers rather than quoting a source, which often leads to inconsistencies and inaccuracies.
In this article, we’ll explore what an LLM hallucination is. We’ll also discuss why it happens to help you address the problem.
A large language model or LLM is a type of artificial intelligence (AI) algorithm that recognizes, decodes, predicts, and generates content.
While the model derives some knowledge from its training data, it is prone to "hallucinate." A hallucination in an LLM is a response that contains nonsensical or factually inaccurate text.
Introduction to Hallucinations in LLMs
Hallucinations in Large Language Models (LLMs) refer to the phenomenon where these AI systems generate text that is inaccurate, irrelevant, or nonsensical.
This issue is a significant challenge in the development of AI applications built on top of LLMs, as it can lead to the spread of misinformation and undermine the trustworthiness of AI-generated output.
LLM hallucinations can occur due to various factors, including training data issues, model limitations, and, above all, the probabilistic nature of language generation. All generative AI models (LLMs being one type of generative model) are built to generate the next token. As such, hallucination is more of a feature of LLMs than a bug: if the models stopped creating new tokens, we would lose all the magic of AI.
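To make that concrete, here is a minimal, hypothetical sketch of the next-token loop. The tiny vocabulary, the logits, and the `toy_language_model` function are all invented stand-ins for a real model, but the sampling step works the same way at full scale.

```python
import numpy as np

# Toy stand-in for a real LLM: given a prompt, it returns unnormalized
# scores (logits) over a tiny invented vocabulary. A real model does the
# same thing, just over ~100k tokens with billions of parameters.
VOCAB = ["Paris", "London", "Rome", "banana"]

def toy_language_model(prompt: str) -> np.ndarray:
    # Hypothetical logits: "Paris" is the most likely continuation,
    # but every other token still has non-zero probability.
    return np.array([4.0, 2.5, 2.0, 0.1])

def next_token(prompt: str) -> str:
    logits = toy_language_model(prompt)
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax
    # The model samples from this distribution; nothing here checks facts.
    return str(np.random.choice(VOCAB, p=probs))

print(next_token("The capital of France is"))
# Usually "Paris", but occasionally "London" or even "banana";
# the model only knows probabilities, not truth.
```

Every response an LLM produces is just this loop repeated, one token at a time.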
However, what's problematic is when we use these LLMs in mission-critical applications such as customer support, eCommerce, and others, where any wrong output can lead to serious consequences.
As such, understanding the causes and consequences of LLM hallucinations is crucial for developing effective mitigation strategies and improving the reliability of AI systems.
A recent Tidio survey found that 72% of users trust AI to provide factual and reliable information. Yet 75% of those respondents also reported that AI had misled them at least once.
These hallucinations often take the form of made-up content, where the model generates false or inaccurate information due to insufficient data or flawed training.
Types of Hallucinations
There are several types of hallucinations that can occur in LLMs, including factual inaccuracies, nonsensical responses, source conflation, and irrelevant content. These types of hallucinations can have significant consequences, particularly in real-world applications where accuracy and reliability are critical.
1) Source Conflation
Source conflation occurs when the model merges details extracted from different sources into a single response, producing factual contradictions or misattributed information.
Sometimes, an LLM can even make up sources. (See examples below.)
2) Factual Error
Language models cannot differentiate between a truth and a lie. As a result, LLMs can generate content with no factual foundations.
The reliability of AI's outputs is crucial, as inaccuracies can lead to significant real-world consequences.
Factual errors are less common when a model is trained on accurate data.
Remember, however, that pre-trained models like OpenAI's o3 and GPT-4.5 are trained on vast swaths of the internet, and the internet contains many factual errors. As such, it's still good practice to fact-check everything an LLM generates.
3) Nonsensical Information
LLMs simply predict the next most probable word in a sentence.
Most of the time, the content they generate makes sense. However, they can also produce grammatically correct text that doesn’t make sense. Worse yet, LLMs can produce responses that sound convincing and authoritative, when in fact the response has no factual basis whatsoever.
4) Irrelevant Content
Irrelevant content refers to the generation of text that is unrelated to the input prompt or topic.
What Is an Example of an LLM Hallucination?
Generally, examples of hallucinations in LLMs are harmless and even humorous. Yet, in an interview with Datanami, Got It AI co-founder Peter Relan said that ChatGPT “makes up stuff” 20% of the time.
Here are some cases wherein hallucinations in LLMs could have serious implications:
1) Cursor's AI Hallucinates a Security Policy - Causing Cancellations
In one widely reported instance, Cursor's AI chatbot invented a new company policy regarding device logouts when a user was experiencing issues switching between devices. This led to user confusion, dissatisfaction and eventual cancellations.
2) Lawyers Who Cited Fake Cases
According to a Forbes article, two lawyers might face sanctions for citing six non-existent cases. Steven Schwartz, one of the lawyers involved, said that he sourced the fake court cases from ChatGPT.
This case underscores the risks of relying on AI for legal research without proper verification.
3) Falsely Accused Professor
The Washington Post reported that a ChatGPT response accused a law professor of sexual harassment. The LLM even cited a non-existent Washington Post article as the information source.
This example illustrates the dangers of incorrect outputs generated by AI, which can lead to serious reputational damage.
4) Inaccurate Summarization of a Court Case
When asked for a summary of the Second Amendment Foundation v. Ferguson case, ChatGPT responded with factually inaccurate information. The response mentioned that SAF founder Alan Gottlieb sued Georgia radio host Mark Walters.
Moreover, the hallucinated response claimed that Walters had defrauded and embezzled funds from the foundation. Consequently, Walters filed a lawsuit against OpenAI LLC, claiming that every detail in the summary was false.
Impact of Hallucinations on AI Systems
The impact of hallucinations on AI systems can be significant, particularly in applications where accuracy and reliability are critical. Hallucinations can lead to the spread of misinformation, undermine the trustworthiness of AI-generated content, and damage the reputation of organizations that rely on AI systems. Furthermore, hallucinations can also have real-world consequences, such as financial losses, legal issues, and harm to individuals or communities. Therefore, it is essential to develop effective mitigation strategies to reduce the occurrence of hallucinations in LLMs and improve the reliability of AI systems.
Role of Training Data in Hallucinations
Training data plays a critical role in the occurrence of hallucinations in LLMs. If the training data is of poor quality, biased, or incomplete, the model is more likely to generate hallucinations. Training data issues can include factual inaccuracies, outdated information, and irrelevant content. Furthermore, the lack of diversity in training data can also contribute to hallucinations, as the model may not be exposed to a wide range of perspectives and contexts. Therefore, it is essential to ensure that the training data is of high quality, diverse, and relevant to the task at hand.
AI Model Limitations
AI model limitations are another significant factor that contributes to hallucinations in LLMs. LLMs are complex systems that rely on pattern recognition and probabilistic models to generate text. However, these models are not perfect and can be limited by their architecture, training objectives, and inference strategies. For example, LLMs may struggle to understand the context and intent behind a prompt, leading to the generation of irrelevant or nonsensical content. Furthermore, the model’s inability to reason and understand the implications of its outputs can also contribute to hallucinations. Therefore, it is essential to develop more advanced AI models that can better understand the context and intent behind a prompt and generate more accurate and relevant content.
What Causes a Hallucination in LLM?
Research from a new start-up found that ChatGPT hallucinates about 3 percent of the time. That’s not surprising, especially since deep learning models can exhibit unpredictable behavior.
In response to LLM hallucination problems, stakeholders must take the initiative in implementing safe, secure, and trustworthy AI. Moreover, it doesn’t hurt to understand why these instances occur.
So, what causes an LLM to hallucinate?
- Unverified training data – By itself, a large language model cannot distinguish between fact and fiction. So, when you feed it large and diverse training data without properly verifying the sources, it can pick up factual inaccuracies.
- Inadequate/inaccurate prompt context – LLMs may behave erratically when you use inadequate or inaccurate prompts. Moreover, they may generate incorrect or irrelevant responses when your objectives are vague.
- Misaligned objectives – Most public LLMs underwent training for general natural language tasks. These models need extra help when inferring responses for domain-specific subjects like law, medicine, and finance.
- Most importantly, it's just probabilities – LLMs simply predict the next most probable word in a conversation, not the most accurate word. LLMs have no idea whether a generated response is accurate, as the sketch after this list illustrates.
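To illustrate that last point, the sketch below shows how sampling temperature reshapes a next-token distribution. The candidate continuations and their logits are invented for illustration; the point is that the math only ranks tokens by probability and knows nothing about which continuation is true.

```python
import numpy as np

def softmax_with_temperature(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Convert raw logits into sampling probabilities at a given temperature."""
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

# Invented logits for four candidate continuations of a factual question.
candidates = ["correct answer", "plausible-but-wrong", "off-topic", "nonsense"]
logits = np.array([5.0, 3.5, 1.0, 0.2])

for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: " +
          ", ".join(f"{c}={p:.2f}" for c, p in zip(candidates, probs)))

# At temperature 0.2 the model almost always picks its top-scoring token;
# at 1.5 the "plausible-but-wrong" continuation gets sampled far more often.
# Nothing in this calculation knows which continuation is actually true.
```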
Can LLM Hallucination Be Prevented?
You might wonder if you can eliminate hallucinations in LLMs.
Well, the short answer is “no.”
Hallucinations naturally occur in LLMs because they are a critical part of how these models operate. However, you might notice that some LLMs hallucinate more than others.
A 2023 study fed LLMs 1,000 documents, one at a time. The researchers asked LLMs like GPT, Llama, Cohere, Google PaLM, and Mistral to summarize each document. The study found that these LLMs hallucinated 3% to 27% of the time.
So, you can consider hallucination a built-in feature in LLMs.
Is Hallucination a Bad Thing?
While hallucinations in LLMs can have serious implications, they are not always a bad thing. For instance, they can be useful when you want to give more creative freedom to the model.
Rather than simply echoing their training data, hallucinating models can generate unique scenes, characters, and storylines for novels.
Another example where LLM hallucination can be beneficial is when you’re looking for diversity.
The training data will provide the model with the ideas you need. However, allowing it to hallucinate to some extent can let you explore different possibilities.
On the other hand, hallucinations in LLMs can be a bad thing in use cases where accuracy is extremely important. They can propagate false information, amplify societal prejudices, and reduce the credibility of deep-learning models.
Moreover, LLM hallucination can be a problem for fully automated services. For example, chatbot hallucinations can cause poor customer experience.
Pro Tip: Follow our guide on how to select an AI chatbot for customer service.
How to Minimize Hallucinations in Large Language Models
While you cannot prevent hallucinations in LLMs, there are ways to minimize them. One effective technique is prompt tuning, which involves adjusting the model's output style and behavior using smaller datasets. You can also use the following techniques:
- Highly descriptive prompts – You provide the model with additional context by feeding it highly descriptive prompts. The clarifying details help ground the model in truth, making it less likely to hallucinate.
- LLM fine-tuning – You can feed the LLM domain-specific training data. Doing so fine-tunes the model, allowing it to generate more relevant and accurate responses.
- Retrieval-augmented generation (RAG) – This framework focuses on providing the LLM with both the end user's question and context around the question, i.e., the most accurate and relevant information in the knowledge base. The LLM uses both the question and the context to generate a more accurate and relevant response (see the sketch after this list).
- Solve hallucination outside the LLM – Instead of trying to minimize hallucination at the LLM layer, you can deploy a generative AI solution like Alhena AI. Alhena AI (formerly Gleen AI) solves hallucination at the chatbot layer by carefully selecting inputs to the LLM and detecting hallucination in the LLM output.
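As a rough illustration of the RAG bullet above, here is a minimal sketch. The toy knowledge base, the keyword-overlap retriever, and the `call_llm` placeholder are hypothetical; a production system would use embedding search and a real LLM client, but the flow of retrieve, build a grounded prompt, then generate is the same.

```python
# Minimal RAG sketch: retrieve relevant passages, then ground the LLM's
# answer in them. The keyword retriever and `call_llm` placeholder are
# illustrative only, not a real library API.

KNOWLEDGE_BASE = [
    "Orders can be returned within 30 days of delivery for a full refund.",
    "Standard shipping takes 3-5 business days within the US.",
    "Gift cards are non-refundable and never expire.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Score passages by naive keyword overlap and return the best matches."""
    q_words = set(question.lower().split())
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return ("Answer using ONLY the context below. If the context does not "
            "contain the answer, say you don't know.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to whichever LLM provider you use."""
    raise NotImplementedError("wire this up to your LLM API")

question = "What is your return policy?"
print(build_prompt(question, retrieve(question)))
```

The key design choice is the instruction to answer only from the supplied context; it is what lets the model say "I don't know" instead of guessing when the knowledge base has no answer.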
Alhena AI – The Practical and Reliable Solution
Note that fine-tuning an LLM requires consistent retraining. As such, it can come with significant cost and time implications.
Using custom data for fine-tuning can significantly improve the accuracy and reliability of the model's outputs.
Moreover, building RAG and prompt engineering in-house can consume a large share of your resources, and even then, RAG and prompt engineering won't eliminate hallucination.
Fortunately, you can deploy a custom generative AI chatbot like Alhena AI. As a commercially available solution trusted by hundreds of businesses, including Tatcha, Puffy, and Totango, Alhena AI proactively prevents hallucination.
Alhena AI uses agentic RAG techniques to control hallucination. Its hallucination detection agent is independent of the answer-generation RAG agent, which allows for a highly effective zero-hallucination solution.
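Alhena AI's implementation is proprietary, so the sketch below is not its actual code. It only illustrates the general shape of the pattern: one agent drafts an answer from retrieved context, and a separate verifier agent checks the draft against that context before anything reaches the user. The `call_llm` function is a hypothetical placeholder for whatever LLM API is used.

```python
# Generic two-agent pattern: draft an answer, then have an independent
# verifier confirm the draft is supported by the retrieved context.
# `call_llm` is a hypothetical placeholder; this is NOT Alhena AI's code.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM API")

def draft_answer(question: str, context: str) -> str:
    return call_llm(
        "Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def verify_answer(answer: str, context: str) -> bool:
    verdict = call_llm(
        "Does the context below fully support every factual claim in the "
        "answer? Reply with exactly SUPPORTED or UNSUPPORTED.\n\n"
        f"Context:\n{context}\n\nAnswer:\n{answer}"
    )
    return verdict.strip().upper().startswith("SUPPORTED")

def respond(question: str, context: str) -> str:
    answer = draft_answer(question, context)
    if verify_answer(answer, context):
        return answer
    # Fail safe: refuse rather than risk sending an unsupported answer.
    return "I'm not sure about that; let me connect you with a human."
```

Because the verifier only sees the draft answer and the retrieved context, a convincing-but-unsupported answer can still be caught before it is shown to the user.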
In fact, 80% of Alhena AI’s tech stack focuses on preventing hallucinations. Moreover, unlike LLM fine-tuning, this tool can be ready within a few hours.
Here's a video of Alhena AI versus an OpenAI GPT trained on the same knowledge. Alhena AI doesn't hallucinate. The GPT does:
We at Alhena AI have built advanced agentic technology to ensure that the chatbot consistently generates highly relevant and accurate responses with minimal hallucination.
Ready to See a Hallucination-Free AI?
Book a 15-minute demo, and we’ll put an Alhena agent on one of your real tickets. Watch as it magically retrieves, plans, and resolves in front of you - without any hallucinations.
- Learn More: Alhena AI For Customer Support
- Create and Test Your Own AI Agent for Free: Sign Up
- Read Customer Success Stories: Customer Success stories