Generative AI Risks: What the Critics Get Right and Wrong

A 2026 McKinsey survey found that 72% of companies now use generative AI in at least one business function, up from 33% just two years earlier. Yet critics argue the technology causes more harm than good. In a widely cited Tech Policy Press article, researchers Emily Bender and Alex Hanna laid out a case that generative AI hurts consumers, displaces workers, and produces unreliable digital output.

They raised real concerns. But the full picture is more complex than "generative AI is bad." The real generative AI risks depend on how the technology is designed, deployed, and governed. A poorly built chatbot will hallucinate. A well-built one, grounded in verified data, won't.

This post breaks down the biggest problems with generative AI that critics point to, where those arguments hold up, where they fall short, and how purpose-built AI tools like Alhena AI solve the risks that matter most in ecommerce.

What Are the Biggest Generative AI Risks?

Bender and Hanna's article makes three core arguments against generative AI:

Generative AI is bad for consumers because it creates lower quality products and services that are erroneous and expose consumers to implicit racism, sexism, bias, and misinformation.
Inferior generative AI-based products will create a two-tier society, where the "haves" hire actual humans, while the "have nots" are subjected to those inferior generative AI-based products and services.
Generative AI cannot complete tasks set out for it. Output needs to be verified by a human. This will lead to mass layoffs and people being hired as contractors to verify generative AI output.

Their clarion call: investors and companies should resist the generative AI hype, and the government should seek to regulate it.

We at Alhena disagree with the blanket conclusion, though not with every point. Here's our response to each argument.

Generative AI Hallucinations: The Risk That's Solvable

Bender and Hanna use the word "erroneous," but what they really mean is that generative AI chatbots hallucinate. They make things up and can create responses that are factually incorrect. That's because large language models (LLMs) generate text by predicting the next most likely token (roughly, the next word) in a sequence, not by looking up facts.

This is a real problem. A 2024 Stanford study found that LLMs hallucinate between 3% and 27% of the time depending on the task. In ecommerce, even a 3% error rate can mean wrong product specs, made-up return policies, or fictional discount codes reaching thousands of customers per day.
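To make that scale concrete, here is a back-of-the-envelope calculation. Only the 3% and 27% rates come from the study cited above. The daily conversation volume is a hypothetical figure chosen for illustration, and the sketch assumes one answer per conversation.

```python
def expected_bad_answers(daily_conversations: int, hallucination_rate: float) -> int:
    """Expected number of conversations per day that contain a hallucinated answer."""
    if not 0.0 <= hallucination_rate <= 1.0:
        raise ValueError("hallucination_rate must be between 0 and 1")
    return round(daily_conversations * hallucination_rate)


# Hypothetical store handling 100,000 chatbot conversations per day (assumption).
DAILY_CONVERSATIONS = 100_000

low = expected_bad_answers(DAILY_CONVERSATIONS, 0.03)   # low end of the cited range
high = expected_bad_answers(DAILY_CONVERSATIONS, 0.27)  # high end of the cited range
print(f"{low:,} to {high:,} conversations per day with a wrong answer")
```

At that assumed volume, even the low end of the range works out to 3,000 wrong answers a day.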

But here's what the critics get wrong: hallucination is a design problem, not a fundamental flaw. While we agree that hallucination is endemic to raw LLMs, it's important to draw a distinction between LLMs (like GPT-4, Claude, and Llama) and chatbots (like ChatGPT and Alhena AI). A chatbot is an application that can use an LLM to generate responses. A well-designed and well-implemented chatbot can and should be free of hallucination.

Take Alhena AI for example. Alhena AI is a chatbot system designed specifically to prevent hallucination and maximize the accuracy and relevance of every generative AI response. Alhena AI does use LLMs, but LLMs only make up 20% of the Alhena AI tech stack. The remaining 80% is focused on carefully managing the inputs to and outputs from the LLM to avoid hallucination and ensure relevance.
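As a rough illustration of what "managing the outputs" of an LLM can look like, here is a minimal sketch of one narrow output check. It is not Alhena AI's actual implementation, which this post does not describe. The function names and the fallback message are hypothetical, and the check covers only numbers: a drafted answer is released only if every number in it also appears in the source documents.

```python
import re

# Hypothetical fallback message used when a draft fails verification.
FALLBACK = "I'm not sure about that. Let me connect you with a human agent."


def extract_numbers(text: str) -> set[str]:
    """Return every numeric token in the text, with thousands separators removed."""
    return {m.replace(",", "") for m in re.findall(r"\d[\d,]*(?:\.\d+)?", text)}


def verify_answer(draft: str, sources: list[str]) -> str:
    """Release the draft only if every number in it appears in some source."""
    allowed: set[str] = set()
    for source in sources:
        allowed |= extract_numbers(source)
    unsupported = extract_numbers(draft) - allowed
    return FALLBACK if unsupported else draft


sources = ["Returns are accepted within 30 days of delivery."]
grounded = verify_answer("You can return items within 30 days.", sources)
invented = verify_answer("You can return items within 90 days.", sources)
print(grounded)
print(invented)
```

A production system would need far broader checks (names, policies, prices, links), but the shape is the same: the LLM's draft is treated as untrusted until it is compared against the verified data.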

For a deeper dive into how hallucination works and what causes it, see our guide on chatbot hallucination.

AI Bias and Misinformation: Garbage In, Garbage Out

As the old saying goes in AI: garbage in, garbage out. Bender and Hanna argue that because LLMs consume large swathes of content from the public internet, and that content can be biased (sexist, racist, xenophobic, heteronormative), LLMs and generative AI can expose unwitting consumers to bias and misinformation.

Again, we need to draw a distinction between LLMs and well-designed chatbots. Raw generative models may in fact expose people to bias. If the majority of the content consumed by an LLM is sexist, then there is a greater likelihood that the next word predicted by that LLM will be sexist.

A well-designed chatbot model, however, should not use the entire internet as its knowledge base. It should use only a carefully curated knowledge base. As long as that knowledge does not contain biases or misinformation, a well-designed chatbot should have responses that are free from both.

For example, let's say an enterprise wants to build a chatbot to support its products and services. That chatbot should only consume documents about those specific products and services. A well-designed chatbot will only reflect the information contained in those documents, no more, no less. If product documentation says the product can process 1,200 transactions per second, the chatbot should say the same.

Similarly, if the product documents fed to a chatbot are devoid of gender, racial, or religious bias, its answers won't have any bias either. This is one of the key advantages of purpose-built AI concierges over general-purpose LLMs. The knowledge boundary is the safety boundary.
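The idea of a knowledge boundary can be sketched in a few lines. This is a toy example under stated assumptions: the knowledge base contents, the stopword list, the keyword-overlap matching, and the refusal message are all hypothetical, and a real system would use proper retrieval and have an LLM phrase the answer. The point it shows is the refusal path: if nothing in the curated documents matches, the bot declines instead of guessing.

```python
import re

# Hypothetical curated knowledge base: only vetted product documentation.
KNOWLEDGE_BASE = [
    "The product can process 1,200 transactions per second.",
    "Returns are accepted within 30 days of delivery.",
]

OUT_OF_SCOPE = "I don't have verified information on that."

# Small illustrative stopword list so common words don't count as matches.
STOPWORDS = {"a", "an", "the", "is", "are", "can", "it", "of", "on",
             "per", "what", "how", "do", "you", "your"}


def tokens(text: str) -> set[str]:
    """Lowercase word and number tokens, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def answer(question: str, min_overlap: int = 2) -> str:
    """Return the best-matching curated passage, or decline if nothing matches."""
    q = tokens(question)
    best = max(KNOWLEDGE_BASE, key=lambda doc: len(q & tokens(doc)))
    if len(q & tokens(best)) < min_overlap:
        return OUT_OF_SCOPE  # outside the knowledge boundary: do not guess
    return best


print(answer("How many transactions per second can it process?"))
print(answer("What is your opinion on the election?"))
```

The first question is answered from the documentation with the documented figure. The second falls outside the curated content, so the sketch returns the refusal message.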

Will AI Create a Two-Tier Society? Why the Dystopian View Falls Short

The argument that generative AI will create a two-tier society, where the "have nots" must deal with inferior generative AI-based products while the "haves" hire live humans, rests on two assumptions:

Generative AI will lead only to inferior products and services
Humans will always outperform generative AI

First, we at Alhena believe that generative AI will lead to superior, not inferior, products and services.

We've already argued above that generative AI (specifically delivered through chatbots) can be free from error and bias. On top of that, generative AI chatbots can be available 24/7/365, can chat in a hundred different languages, and can perform tasks that humans currently perform, often faster.

Especially in customer support, generative AI can support customers around the clock, in any language. It can dramatically decrease time to first response and, in many cases, decrease mean time to resolution as well. Higher deflection rates free up support teams for more complex inquiries.

Second, humans will not always outperform generative AI. If you hire a live personal assistant, you're subjected to that person's biases (both conscious and unconscious), their error rate, and their speed of execution. No human is perfect, and no human can process information as fast as a machine.

Wait time is everything. In healthcare, time spent waiting for an appointment negatively impacts not only patient satisfaction and perceived quality of care but also actual health outcomes. A 2023 study found that 30% of patients have left before seeing a doctor due to excessive wait times. Everyone hates waiting. In customer support, faster response times lead to more satisfied customers.

The data backs this up. Manawa, a travel and experience marketplace, deployed Alhena AI and cut response times from 40 minutes to under 1 minute while automating 80% of customer inquiries. Their CSAT didn't drop. It held steady because the AI answered accurately, not because customers lowered their expectations.

Generative AI as a Complement, Not a Replacement

Rather than replacing human workers, generative AI can and should be viewed as a tool to augment human capabilities. This is where the "AI kills jobs" argument misses the mark.

Generative AI can improve customer service efficiency, freeing up human workers to engage in more complex, creative, and meaningful tasks.

It can also be deployed in Agent Assist mode, where the AI automatically drafts email or text responses, and the human agent edits and personalizes them. The agent stays in control. The AI handles the repetitive work.
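The workflow described above can be sketched as a simple draft-and-review loop. This is an illustrative sketch, not Alhena AI's Agent Assist code: the class, the function names, and the stubbed drafting step are all hypothetical. What it shows is the control point, where nothing is sent until a human agent has approved or edited the draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    customer_message: str
    ai_text: str
    final_text: Optional[str] = None  # set only once a human agent has reviewed


def ai_draft(customer_message: str) -> Draft:
    """Stand-in for the drafting model; a real system would call an LLM here."""
    return Draft(customer_message, f"Thanks for reaching out about: {customer_message}")


def agent_review(draft: Draft, edited_text: Optional[str] = None) -> Draft:
    """The human agent approves the draft as-is or replaces it with an edit."""
    draft.final_text = edited_text if edited_text is not None else draft.ai_text
    return draft


def send(draft: Draft) -> str:
    """Refuse to send anything a human has not reviewed."""
    if draft.final_text is None:
        raise RuntimeError("Draft has not been reviewed by an agent")
    return draft.final_text


draft = ai_draft("my order arrived late")
reviewed = agent_review(draft, "Sorry your order arrived late. I've refunded the shipping fee.")
print(send(reviewed))
```

Keeping the approval step as an explicit gate in `send` is the design choice that matters here: the AI can draft as much as it likes, but the agent decides what reaches the customer.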

A Gartner forecast predicts that by 2029, AI will handle 80% of routine customer service interactions. But that same report notes that human agents will become more valuable, not less, because they'll focus on the high-stakes conversations that require empathy and judgment.

With appropriate training and upskilling, workers can use generative AI technologies to increase their productivity and job satisfaction. The right framing isn't "AI vs. humans." It's "humans with AI vs. humans without AI."

So, whom would you rather have as a personal assistant:

A generative AI alone?
A live human alone?
A live human who's proficient at using generative AI?

The answer, for most businesses, is clearly the third option.

How Alhena AI Addresses the Real Generative AI Challenges

The critics raise valid concerns. Hallucinations, bias, and job displacement are real risks of generative AI when the technology is deployed carelessly. But Alhena AI was built specifically to address these challenges in ecommerce.

Hallucination-free by design. Alhena grounds every response in verified product data from your catalog, knowledge base, and order management system. The LLM generates natural language, but the facts come from your data, not the open internet. Brands like Tatcha saw a 3x conversion rate and 82% chat deflection because shoppers got accurate answers, not hallucinated guesses.

Bias controlled at the source. Because Alhena only consumes your curated product documentation and policies, the bias risk drops to whatever exists in your own content. No internet-scraped training data. No demographic stereotyping. Just your product info, delivered accurately.

Humans stay in the loop. Alhena's Agent Assist mode keeps human agents on complex cases. The AI handles routine questions (order status, product specs, return policies) while agents focus on the conversations that need a human touch. Crocus hit an 86% deflection rate with 84% CSAT, proof that automation and quality aren't mutually exclusive.

Revenue, not just deflection. Most AI tools treat customer interactions as tickets to close. Alhena treats them as opportunities to sell. With agentic checkout that populates carts and pre-fills checkout, Victoria Beckham saw a 20% increase in average order value.

The Real Problems with Generative AI Are About Implementation, Not the Technology

Bender and Hanna aren't wrong to sound the alarm. Generative AI deployed irresponsibly does carry serious risks: hallucinated outputs, biased responses, eroded consumer trust, and displaced workers left without a path forward.

But the conclusion that businesses should "resist generative AI" throws out the good with the bad. The problems with generative AI are problems of implementation, not of the underlying technology. A chatbot built on raw, unguarded LLM access will hallucinate. One built on a curated knowledge base with output verification won't.

The generative AI considerations and challenges that matter in 2026 aren't theoretical. They're practical: choosing the right architecture, grounding AI in verified data, keeping humans in the loop, and measuring outcomes with real revenue attribution.

Businesses that get these choices right don't just avoid the risks. They gain a competitive edge. Tatcha generates 11.4% of total site revenue through AI-assisted conversations. Puffy resolves 63% of inquiries automatically while maintaining 90% CSAT.

The question isn't whether generative AI is good or bad. It's whether your implementation is.

Ready to see how a well-built AI handles the risks that matter? Book a demo with Alhena AI or start for free with 25 conversations.

Frequently Asked Questions

What are the biggest generative AI risks for businesses?

The top generative AI risks include hallucinations (factually incorrect outputs), bias from training data, security vulnerabilities, and potential workforce displacement. For ecommerce specifically, hallucinations are the most damaging because they can send wrong product information to thousands of customers per day. Purpose-built AI tools that ground responses in verified product data reduce these risks significantly.

How do you prevent AI hallucinations in customer-facing chatbots?

The most effective approach is restricting the AI's knowledge base to verified, curated content rather than the open internet. Alhena AI, for example, uses LLMs for language generation but sources all facts from your product catalog, policies, and order data. The LLM makes up only 20% of the stack, while the other 80% focuses on input/output verification to prevent hallucination.

Is generative AI good or bad for customer service?

Generative AI is neither inherently good nor bad for customer service. The outcome depends on implementation. Poorly deployed AI will hallucinate and frustrate customers. Well-designed AI, grounded in verified data, can cut response times from 40 minutes to under 1 minute (as Manawa achieved) while maintaining high customer satisfaction. The key is choosing tools built for accuracy, not just speed.

What are the disadvantages of generative AI in ecommerce?

The main disadvantages include hallucinated product information, potential bias in recommendations, inability to handle complex emotional situations, and the risk of over-automation without human oversight. These disadvantages are most pronounced with general-purpose AI tools. Ecommerce-specific platforms like Alhena AI address each one by design, using curated knowledge bases and human-in-the-loop workflows.

Will generative AI replace customer service jobs?

Generative AI is more likely to transform customer service roles than eliminate them. Gartner predicts AI will handle 80% of routine interactions by 2029, but human agents will become more valuable for complex, high-empathy conversations. Brands using Alhena AI typically redeploy support staff to higher-value work rather than reduce headcount. The hybrid model (AI plus humans) consistently outperforms either alone.

How does Alhena AI handle the problems with generative AI?

Alhena AI addresses generative AI challenges through three design principles: hallucination prevention (grounding every response in your verified product data), bias control (using only your curated knowledge base, not internet-scraped data), and human-in-the-loop architecture (Agent Assist mode keeps humans on complex cases). Real results include 82% deflection at Tatcha with a 3x conversion rate.

What is the difference between an LLM and a well-designed AI chatbot?

An LLM like GPT-4 or Claude is a language model that predicts the next word in a sequence. A chatbot is an application built on top of an LLM with additional layers for accuracy, safety, and task completion. A raw LLM will hallucinate because it lacks factual grounding. A well-designed chatbot manages inputs and outputs around the LLM to ensure responses are accurate and relevant.

What are the generative AI challenges specific to ecommerce?

Ecommerce faces unique generative AI challenges including hallucinated product specifications, incorrect pricing or availability information, biased product recommendations, inability to process transactions, and lack of integration with order management systems. Alhena AI solves these with purpose-built Product Expert and Order Management agents that connect directly to your ecommerce platform and helpdesk.