Why a Planner‑Led Multi‑Agent System Leaves the One‑Agent Model Behind

Deep-dive into a multi-agent LLM architecture (Planner, Orchestrator, Specialized Agents and ChatState) transforming e-commerce with specialized AI for support, sales, and CX.


Introduction

The modern AI-powered e-commerce shopping experience requires instant answers, personalized product recommendations, seamless order management, and empathetic support. While a single Large Language Model (LLM) can initially seem capable, it quickly becomes an unmanageable tangle of guidelines and tools as complexity increases.

We at Alhena have evolved beyond the one-agent-does-everything design. We've developed a multi-agent LLM architecture that mirrors a highly effective real-world retail team: product specialists, support representatives, back-office experts, and operations managers, all coordinated by an intelligent orchestration layer. This article offers a deep dive into this architecture, the key agents involved, and the transformative results we've observed in a demanding e-commerce environment.

Related reading: Multi-Model, Multi-Agent: Why Modern Ecommerce AI Needs Both


The Growing Pains of a Monolithic "One Big Agent"

The initial appeal of a single, all-encompassing AI is understandable. However, in the dynamic world of e-commerce, this approach quickly encounters significant hurdles:

Instruction Overload

Symptoms:
  • Inconsistent answers to similar questions
  • Occasional contradictions with earlier replies
  • Important instructions forgotten mid-conversation

Why it happens:
  • One massive prompt must cover SKUs, policies, promos, sizing guides, lead capture, and more
  • The LLM struggles to pick the right directive from hundreds of possibilities in real time
  • Dynamically filtering instructions per user intent is not straightforward

Tool Confusion

Symptoms:
  • Wrong tool called (adds an item to the cart while the user just wants more information)
  • Forgets to call a tool and returns a generic answer

Why it happens:
  • A long, unsorted function list is presented to the LLM
  • No scoped permissions; everything looks available even if irrelevant
  • Choosing the best tool becomes a probability guess instead of a clear decision

Conflicting Objectives

Symptoms:
  • An upsell offer appears while the customer is angry about a late shipment
  • Empathy drops when the sales goal interjects
  • Return-policy advice mixed with promotional language

Why it happens:
  • Sales, support, and retention goals live in one brain with no priority framework
  • The LLM cannot switch personas cleanly mid-thread
  • Lack of contextual routing forces everything through the same response funnel

Analogy – The Overstretched All-Rounder

Picture an exceptionally talented employee who is simultaneously expected to:

  • lead product design
  • code the backend checkout pipeline
  • run the warehouse floor
  • answer customer calls
  • close high-value sales

Even a superstar would burn out, miss details, and disappoint customers because no single person can give every role the focus it deserves. A monolithic LLM that has to perform every e-commerce function at once faces the same overload, leading to mistakes and inconsistent service.

One overstretched worker vs. a focused team—why task specialization wins.

Our Multi-Agent Architecture: A Deep Dive

To overcome these challenges, we adopted a system where tasks are planned and delegated to distinct, purpose-built agents.

Core Components

  1. PlannerAgent - The Strategist
    • When a user sends a message, our PlannerAgent is the first stop.
    • It reads the entire conversation plus a registry of available agents, their skills and available tools.
    • Its primary role is to devise a Plan – a structured JSON output that breaks down the user's request into one or more specific tasks, assigning each task to the most appropriate specialized agent.
  2. ChatOrchestrator - The Conductor
    • This central component takes the Plan from the PlannerAgent and iteratively executes the plan.
    • It instantiates each required agent (via an AgentFactory, which knows about all available agent types).
    • Attaches any dynamic hand-offs through AgentHandoffManager (e.g., to a HumanTransferAgent).
    • Streams the plan execution status to the end user.
    • Instantiates and updates the ChatState as per the state of plan execution.
  3. ChatState - The Shared Memory
    • This is critical. ChatState acts as the single source of truth for the entire conversation.
    • It holds the full chat transcript (messages), the current plan, a pointer to the active agent task, and a chronological output events log (PLAN, MESSAGE, TOOL CALL, HANDOFF).
    • It keeps the common context to be used by each agent cached. Example: brand voice guidelines, user information, business hours, and retrieved knowledge so every agent sees the same ground truth.
    • Because ChatState is updated by the ChatOrchestrator after each agent turn, the next specialist always receives an up-to-date state, eliminating context drift.
  4. Specialized Agents - The Actors
    • Specialist agents with concrete roles like: General Support Agent, Shopping Assistant Agent, Human Transfer Agent, Order Management Agent etc.
    • All specialist agents inherit from a common BaseAgent that assembles:
      • A role-specific system prompt (built from templates + brand voice + agent specific instructions).
      • The tools that agent is allowed to call.
      • Handoffs to other agents.
    • Each specialized agent speaks directly to the user and its output is merged back into ChatState.messages and logged as an output event.
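
To make the Plan concrete, here is a minimal sketch of the kind of structured JSON the PlannerAgent produces. The field names (`tasks`, `agent`, `instruction`, `status`) are illustrative assumptions, not our exact production schema:

```python
import json

# Hypothetical Plan emitted by the PlannerAgent: the user's request is broken
# into tasks, each assigned to the most appropriate specialized agent.
plan_json = """
{
  "tasks": [
    {
      "task_id": 1,
      "agent": "OrderManagementAgent",
      "instruction": "Handle this part: 'Where is my order #123?'",
      "status": "pending"
    },
    {
      "task_id": 2,
      "agent": "ShoppingAssistantAgent",
      "instruction": "Handle this part: 'Do you have red sneakers in men's size 9?'",
      "status": "pending"
    }
  ]
}
"""

plan = json.loads(plan_json)
for task in plan["tasks"]:
    print(task["task_id"], task["agent"])
```

Because the plan is plain structured data, the ChatOrchestrator can execute it deterministically instead of letting one model recursively call another.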

The Anatomy of ChatState

Our ChatState object is pivotal for seamless inter-agent collaboration and consistent user experience. It is instantiated once per conversation and then passed—by reference, not by copy—to every component that needs it.

What ChatState stores, and why it matters:

messages (full conversation transcript)
  • Gives every new agent complete historical context
  • Powers PlannerAgent routing and follow-up logic
  • Persisted for auditing and CX analytics

Plan (JSON plan from the PlannerAgent)
  • Tells the Orchestrator which agent to invoke next
  • Visible to specialists so they stay within scope
  • Enables UI streaming of “what’s happening now”

Agent Task (pointer to the task in progress)
  • Lets the active agent read its exact mandate
  • Drives status messages like “Working on sizing…”
  • Prevents duplicate work

Events (PLAN / MESSAGE / TOOL CALL / HANDOFF log)
  • Each event can be surfaced as a system message to keep every agent informed
  • Important for observability and tracing

Contextual Knowledge
  • Single authoritative fact base for every agent
  • Prevents hallucinations and contradictory answers
  • Saves each specialist agent from re-fetching context

Brand Voice
  • Centralised tone and identity policy
  • Every agent reflects one consistent personality
  • Removes duplication across prompts

Collaborating Agent Cards
  • The Planner uses cards to pick the best specialist
  • Included in each agent’s background for teamwork awareness
  • Supports future dynamic agent discovery

User Information
  • Stores preferences, name, email, etc.
  • Enables instant personalisation by any agent

Human Transfer Scenarios
  • A concrete checklist of when to escalate
  • Keeps human agents focused on high-value cases
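
The fields above can be condensed into a minimal ChatState sketch. This is a simplified illustration with assumed field names, not our production class:

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class ChatState:
    """Sketch of the shared conversation state (field names are illustrative)."""
    messages: list[dict] = field(default_factory=list)     # full transcript
    plan: Optional[dict] = None                            # JSON plan from the Planner
    active_task: Optional[dict] = None                     # pointer to the task in progress
    events: list[dict] = field(default_factory=list)       # PLAN / MESSAGE / TOOL_CALL / HANDOFF log
    context: dict[str, Any] = field(default_factory=dict)  # brand voice, user info, retrieved knowledge

    def add_event(self, event_type: str, event_data: dict) -> None:
        # The orchestrator appends here after each agent turn, so the next
        # specialist always receives an up-to-date state.
        self.events.append({"event_type": event_type, "event_data": event_data})

# Instantiated once per conversation and passed by reference to every component:
state = ChatState(context={"brand_voice": "friendly and concise"})
state.add_event(
    "MESSAGE",
    {"role": "assistant", "content": "Hi there!", "agent_name": "GeneralSupportAgent"},
)
print(len(state.events))
```

Passing a single mutable object by reference, rather than copying context into each prompt ad hoc, is what eliminates context drift between agent turns.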

The Anatomy of ChatOrchestrator

A common pitfall in LLM architectures is letting models recursively call other models. That approach feels “autonomous” but quickly becomes non-deterministic, opaque, and hard to govern. Our ChatOrchestrator deliberately sits between every model invocation, acting as a deterministic code-execution layer.

Step 1: Parallel Initialization
  • Uses asyncio.gather to run two tasks at once:
    • KnowledgeRetriever → fetches contextual docs
    • PlannerAgent → builds a JSON Plan
  Why it matters: reduces cold-start latency and hides IO wait behind planning time.

Step 2: Streaming Task Status
  • Sends the task’s status line to the user
  • Runs before actual agent execution
  Why it matters: immediate feedback builds trust and clarifies multi-step workflows in the UI.

Step 3: Agent Execution (Streaming or Standard)
  • Instantiates the agent via AgentFactory
  • Executes each agent sequentially
  Why it matters: answers feel more organized, and the UX is more engaging and chat-like.

Step 4: Add Results to ChatState
  • Takes the output from the agent
  • Pushes new items into messages and output events
  • Captures TOOL CALL and HANDOFF events as well
  Why it matters: keeps global context fresh for subsequent agents and creates an auditable event timeline.
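
The parallel-initialization step can be sketched in a few lines. The coroutine names below are stand-ins for our real KnowledgeRetriever and PlannerAgent components, with simulated latency in place of real IO and LLM calls:

```python
import asyncio

async def retrieve_knowledge(query: str) -> list[str]:
    # Stand-in for the KnowledgeRetriever fetching contextual docs.
    await asyncio.sleep(0.01)  # simulated IO latency
    return [f"doc about {query}"]

async def build_plan(message: str) -> dict:
    # Stand-in for the PlannerAgent producing a JSON plan.
    await asyncio.sleep(0.01)  # simulated LLM latency
    return {"tasks": [{"agent": "GeneralSupportAgent", "instruction": message}]}

async def initialize(message: str):
    # asyncio.gather runs retrieval and planning concurrently, so the
    # knowledge-fetch IO wait is hidden behind planning time.
    docs, plan = await asyncio.gather(
        retrieve_knowledge(message),
        build_plan(message),
    )
    return docs, plan

docs, plan = asyncio.run(initialize("return policy"))
print(plan["tasks"][0]["agent"])
```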

How Agents Understand "Who Said What"

To prevent agents from repeating work or contradicting each other, it’s crucial they know the origin of previous assistant messages if multiple agents have contributed. Our system handles this transparently.

When the ChatState prepares the conversation history as input for the next LLM call (for a subsequent agent or the same agent in a new turn), it injects a system note if the previous assistant message was from a different named agent.
In the prompt history sent to the next agent, this looks like:

User: Do you have blue shirts?
Assistant: Yes, we have several blue t-shirts...
System: [System Note] The assistant message above was authored by ShoppingAssistantAgent.
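
A minimal sketch of this injection step might look like the following; the function name and message shape are assumptions for illustration:

```python
def annotate_history(messages: list[dict], next_agent: str) -> list[dict]:
    """Inject a system note after any assistant message that was authored
    by a different named agent than the one about to run."""
    prepared = []
    for msg in messages:
        prepared.append({"role": msg["role"], "content": msg["content"]})
        if (
            msg["role"] == "assistant"
            and msg.get("agent_name")
            and msg["agent_name"] != next_agent
        ):
            prepared.append({
                "role": "system",
                "content": f"[System Note] The assistant message above was "
                           f"authored by {msg['agent_name']}.",
            })
    return prepared

history = [
    {"role": "user", "content": "Do you have blue shirts?"},
    {"role": "assistant", "content": "Yes, we have several blue t-shirts...",
     "agent_name": "ShoppingAssistantAgent"},
]
prepared = annotate_history(history, next_agent="OrderManagementAgent")
print(prepared[-1]["content"])  # the injected system note
```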

When an agent (e.g., Shopping Assistant Agent) generates a response, it's recorded in ChatState output events as a MESSAGE event, which includes the agent name.
Example snippet from output events:

{
  "event_type": "MESSAGE",
  "event_data": {
    "role": "assistant",
    "content": "Yes, we have several blue t-shirts...",
    "agent_name": "ShoppingAssistantAgent" // Clearly attributed
  }
}

Agents often need to interact with external systems (e.g., your product database, order management system). Example TOOL CALL event in output events:

{
  "event_type": "TOOL_CALL",
  "event_data": {
    "agent_name": "OrderManagementAgent",    // Agent that invoked the tool
    "tool_name": "get_order_status_api",     // Name of the tool called
    "tool_output": "{'status': 'Shipped', 'tracking_id': 'XYZ123'}" // Result from the tool
  }
}

This simple mechanism allows subsequent agents to understand the conversational context fully, respecting prior specialized contributions without complex explicit "handoff messages" between agents in the prompt.


Our Team of Specialized E-commerce Agents

Here's how some of our key e-commerce agents are defined and what they do:

Product Expert Agent
  • Primary goals: recommend products, compare features, answer detailed product questions
  • Typical tools: Semantic Product Search, Read Full Document
  • Example AgentTask: “Handle this part: ‘I need a breathable blue t-shirt under $40.’”

General Support Agent
  • Primary goals: answer policy questions, handle FAQs and general inquiries
  • Typical tools: knowledge-base search tool
  • Example AgentTask: “Handle this part: ‘How long do I have to return an item?’”

Order Management Agent
  • Primary goals: track shipments, update order details
  • Typical tools: Order API client, Shipping API client
  • Example AgentTask: “Handle this part: ‘Can you change my shipping address to 15 Market St?’”

Return Management Agent
  • Primary goals: guide users through returns, check eligibility and process RMAs
  • Typical tools: Returns API client, OMS integration
  • Example AgentTask: “Handle this part: ‘I received the wrong size and want to return it.’”

Product Upsell Agent
  • Primary goals: suggest complementary items, offer higher-value alternatives
  • Typical tools: upsell engine, product association rules
  • Example AgentTask: “Handle this part: ‘(After item added) and find matching socks.’”

Lead Generation Agent
  • Primary goals: capture contact details, handle B2B or out-of-stock alerts
  • Typical tools: email validator, CRM integration
  • Example AgentTask: “Handle this part: ‘Notify me when your XL black hoodie is back in stock.’”

Human Transfer Agent
  • Primary goals: collect details for escalation, create support tickets for human agents
  • Typical tools: ticket-creation tool, user-info form tool
  • Example AgentTask: “Handle this part: ‘Your system double-charged me for this order!’”

Crucially, all agents inherently use the brand voice, refusal messages, and other global settings stored in ChatState, ensuring a consistent personality.

Onboarding a new specialist is frictionless: just define its name, skills, and permitted tools in the database. The Planner automatically picks up the new agent's card and routes relevant requests its way—no extra code required.
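
A hypothetical agent card for a new specialist could be as simple as the record below; the field names and the summary format are illustrative assumptions:

```python
# Hypothetical "agent card": onboarding a new specialist amounts to adding a
# record like this that the Planner reads alongside the existing agents.
warranty_agent_card = {
    "name": "WarrantyClaimAgent",
    "skills": [
        "check warranty eligibility",
        "file warranty claims",
    ],
    "tools": ["warranty_api_client", "ticket_creation_tool"],
}

def format_card_for_planner(card: dict) -> str:
    # The Planner sees each card as a short capability summary it can use
    # when assigning tasks.
    return (f"{card['name']}: {', '.join(card['skills'])} "
            f"(tools: {', '.join(card['tools'])})")

print(format_card_for_planner(warranty_agent_card))
```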


Example Session: "Track My Package and Recommend Shoes"

Let's see how this comes together.

🗣️ Customer: “Where’s my order #123, and by the way, do you have any red sneakers in men's size 9?”
  1. ChatOrchestrator receives the message. Triggers KnowledgeRetriever to fetch general info about sneakers and PlannerAgent to create a plan.
  2. PlannerAgent analyzes: Recognizes two distinct intents and creates a plan.
  3. ChatOrchestrator begins execution:
    • Streams to User: "Checking on the status of your order #123..."
    • Invokes OrderManagementAgent with its task. Agent likely uses a tool to call an Order API. Result: "Order #123 shipped yesterday, tracking ID is TRK456." This is added to ChatState.
    • Streams to User: "\n\n—\n\nLooking for red sneakers in men's size 9 for you..."
    • Invokes Shopping Assistant Agent. Agent uses SemanticProductSearch tool with "red sneakers men's size 9". Result: "Found 3 matching pairs: [Details of Shoe A, Shoe B, Shoe C]." This is also added to ChatState.
  4. Final Unified Reply (composed from output events by ChatState):
    "Okay, for your order #123, it shipped yesterday and the tracking ID is TRK456.
    For red sneakers in men's size 9, I found these options:
    • The 'Speedster Red Edition' - Great for running.
    • The 'Casual Crimson Loafer' - Perfect for everyday wear.
    • The 'Ruby Runner Pro' - Top reviews for comfort."

Implementation Tips & Key Pitfalls to Avoid

Below are the practical lessons that saved us countless engineering hours (and a few grey hairs). Use them as a checklist before you scale a planner-led, multi-agent stack.

  1. Code-first approach: Fancy agentic-framework libraries (LangGraph, LlamaIndex, etc.) hide bugs and limit control. Write the orchestration yourself; we wrote most of ours by hand, with light use of the code-first OpenAI Agents SDK.
  2. Invest in prompts: A great Planner prompt and clear specialist prompts solve 80% of routing mistakes. Tweak them often.
  3. MCP for tooling: Use Model Context Protocol (MCP) servers for tool execution and hosting. A future article will cover MCP in depth.
  4. Trace every event: Record PLAN, MESSAGE, TOOL CALL, and HANDOFF with timestamps. Debugging later will be painless.
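
The tracing tip can be sketched with a tiny append-only log; the event shape mirrors the output-event snippets earlier in this article, with a timestamp added:

```python
import time

# Append-only trace of PLAN / MESSAGE / TOOL_CALL / HANDOFF events, each
# timestamped so a conversation can be replayed step by step when debugging.
trace: list[dict] = []

def record_event(event_type: str, event_data: dict) -> dict:
    event = {
        "timestamp": time.time(),
        "event_type": event_type,
        "event_data": event_data,
    }
    trace.append(event)
    return event

record_event("PLAN", {"task_count": 2})
record_event("TOOL_CALL", {"agent_name": "OrderManagementAgent",
                           "tool_name": "get_order_status_api"})
print(trace[-1]["event_data"]["tool_name"])
```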

Conclusion: Specialization and Orchestration Are the Future of E-commerce CX

By moving from a single, overburdened AI to an orchestrated team of focused agents, we've unlocked a new level of capability, efficiency, and adaptability for our e-commerce interactions. This modular architecture not only resolves the pain points of monolithic systems but also empowers us to rapidly develop and integrate new specialized agents (e.g., a WarrantyClaimAgent or a ProactiveCartRecoveryAgent) with minimal disruption to the existing ecosystem.

At Alhena, we are leading the industry in transforming customer support with an agentic AI architecture. Implementing multi-agent systems is inherently complex. Instead of building one from scratch, consider leveraging solutions from experts like Alhena.

Power Up Your Store with Revenue-Driven AI