Cookbook: Customer Support
A tiered multi-agent customer support system that classifies each query, routes it to a specialist bot, answers simple questions with a cheap model and reserves the expensive model for complex issues, and escalates to a human agent when the classifier flags a query it should not handle automatically.
Patterns used: Supervisor, TalkerReasoner, HumanInTheLoop, BoundedExecution, GuardrailChain, RouterMiddleware
Architecture
flowchart TD
C[Customer Query] --> G[Input Guardrails\nPII redaction]
G --> S[Supervisor\nClassify intent]
S -->|billing| B[TalkerReasoner\nBilling Bot]
S -->|technical| T[TalkerReasoner\nTech Support]
S -->|account| A[TalkerReasoner\nAccount Bot]
S -->|escalate| H[Human-in-the-Loop]
B -->|easy| BR1[Fast Response\ngpt-4o-mini]
B -->|complex| BR2[Deep Response\nclaude-sonnet]
T -->|easy| TR1[Quick Fix\ngpt-4o-mini]
T -->|complex| TR2[Full Debug\nclaude-sonnet]
H --> HR[Human Agent\nvia ticket queue]
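The flow in the diagram reduces to two decisions: which specialist receives the query, and whether that specialist answers with its fast or its deep model. The sketch below shows that logic in plain Python with no library calls. The keyword classifier, the `estimate_difficulty` heuristic, and the 0-10 difficulty scale are illustrative assumptions, not a description of how `Supervisor` or `TalkerReasoner` work internally.

```python
# Illustrative stand-ins only: a real system would use an LLM for both steps.
ROUTE_KEYWORDS = {
    "billing": ("invoice", "charged", "refund", "payment"),
    "technical": ("api", "webhook", "error", "bug"),
    "account": ("password", "permission", "sso", "gdpr"),
}


def classify(query: str) -> str:
    """Return the first route whose keywords appear in the query, else 'escalate'."""
    text = query.lower()
    for route, keywords in ROUTE_KEYWORDS.items():
        if any(word in text for word in keywords):
            return route
    return "escalate"


def estimate_difficulty(query: str) -> int:
    """Hypothetical 0-10 score: longer queries count as harder."""
    return min(10, len(query.split()) // 3)


def pick_model(query: str, handoff_threshold: int) -> str:
    """Difficulty at or above the threshold goes to the deep model."""
    return "deep" if estimate_difficulty(query) >= handoff_threshold else "fast"
```

A lower `handoff_threshold` sends more traffic to the deep model, which is why the three bots in this recipe can use different values.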
Implementation
import asyncio
from pyagent_patterns.base import Agent, Message
from pyagent_patterns.orchestration import Supervisor
from pyagent_patterns.advanced import TalkerReasoner, HumanInTheLoop
from pyagent_patterns.advanced.human_in_the_loop import HumanDecision
from pyagent_patterns.recovery import BoundedExecution
from pyagent_patterns.guardrails import GuardrailChain, PIIGuard, LengthGuard
from pyagent_router.middleware import RouterMiddleware
from pyagent_providers import AnthropicLLM, OpenAILLM
# ── LLMs ──────────────────────────────────────────────────────────────────────
fast_llm = OpenAILLM("gpt-4o-mini")
smart_llm = AnthropicLLM("claude-sonnet-4-20250514")
model_registry = {
    "gpt-4o-mini": fast_llm,
    "claude-sonnet-4-20250514": smart_llm,
}
router = RouterMiddleware(model_registry=model_registry)
# ── Guardrails ─────────────────────────────────────────────────────────────────
input_guard = GuardrailChain([
    LengthGuard(max_chars=3_000, truncate=True),
    PIIGuard(redact=True),  # protect customer PII in logs
])
# ── Tier 1: Billing bot (TalkerReasoner) ─────────────────────────────────────
billing_bot = TalkerReasoner(
    talker=router.wrap(
        Agent("billing_fast", fast_llm,
              system_prompt=(
                  "You are a billing support agent. Answer quickly and clearly. "
                  "You handle: invoice questions, payment methods, refund policies, "
                  "subscription changes, and pricing. Be concise — 2-3 sentences max."
              )),
    ),
    reasoner=router.wrap(
        Agent("billing_deep", smart_llm,
              system_prompt=(
                  "You are a senior billing specialist. Handle complex cases: "
                  "disputed charges, partial refunds, multi-seat subscription adjustments, "
                  "enterprise billing questions. Provide step-by-step resolution."
              )),
    ),
    handoff_threshold=5,  # difficulty ≥ 5 goes to reasoner
)
# ── Tier 2: Technical support bot ────────────────────────────────────────────
tech_bot = TalkerReasoner(
    talker=router.wrap(
        Agent("tech_fast", fast_llm,
              system_prompt=(
                  "You are a tech support agent. Handle common issues: login problems, "
                  "password resets, browser compatibility, basic integrations. "
                  "Give step-by-step instructions."
              )),
    ),
    reasoner=router.wrap(
        Agent("tech_deep", smart_llm,
              system_prompt=(
                  "You are a senior engineer doing technical support. Handle complex issues: "
                  "API integration failures, webhook debugging, performance problems, "
                  "data sync issues, custom configurations. Ask clarifying questions if needed."
              )),
    ),
    handoff_threshold=4,
)
# ── Tier 3: Account management bot ───────────────────────────────────────────
account_bot = TalkerReasoner(
    talker=router.wrap(
        Agent("account_fast", fast_llm,
              system_prompt=(
                  "You are an account support agent. Handle: username changes, "
                  "email updates, team member management, permissions, and SSO setup."
              )),
    ),
    reasoner=router.wrap(
        Agent("account_deep", smart_llm,
              system_prompt=(
                  "You are a senior account specialist. Handle: GDPR data requests, "
                  "account mergers, complex permission structures, enterprise SSO, "
                  "data export and deletion requests."
              )),
    ),
    handoff_threshold=6,
)
# ── Tier 4: Human escalation ──────────────────────────────────────────────────
def queue_human_review(output: str, metadata: dict) -> HumanDecision:
    """Route to human agent via your support queue (e.g. Zendesk, Linear)."""
    ticket_id = _create_support_ticket(
        summary=output[:200],
        priority="high" if "urgent" in output.lower() else "normal",
        metadata=metadata,
    )
    print(f"[TICKET CREATED] #{ticket_id}")
    # Return a holding response while human picks it up
    return HumanDecision(
        approved=True,
        modified_output=(
            "I've escalated your issue to our specialist team. "
            f"Ticket #{ticket_id} has been created. "
            "You'll hear back within 2 business hours."
        ),
    )


human_handler = HumanInTheLoop(
    agent=router.wrap(
        Agent("human_prep", fast_llm,
              system_prompt=(
                  "Prepare a concise summary for the human support agent: "
                  "1. Customer issue (1 sentence) "
                  "2. What was already tried "
                  "3. Recommended action"
              )),
    ),
    review_fn=queue_human_review,
    high_risk_keywords=["legal", "lawsuit", "fraud", "hacked", "emergency"],
)
# ── Classifier (Supervisor routing) ──────────────────────────────────────────
classifier = Agent(
    "classifier", fast_llm,
    system_prompt=(
        "Classify customer support queries into exactly one category. "
        "Reply with ONLY the category name.\n"
        "Categories:\n"
        " billing — payments, invoices, refunds, subscriptions\n"
        " technical — bugs, errors, API, integrations, performance\n"
        " account — login, permissions, team, SSO, data requests\n"
        " escalate — angry customers, legal threats, complex edge cases, "
        " anything you're unsure about"
    ),
)

supervisor = Supervisor(
    classifier=classifier,
    routes={
        "billing": billing_bot,
        "technical": tech_bot,
        "account": account_bot,
        "escalate": human_handler,
    },
)
# ── Recovery wrapper ──────────────────────────────────────────────────────────
safe_supervisor = BoundedExecution(
    pattern=supervisor,
    fallback=Agent(
        "fallback_agent", fast_llm,
        system_prompt=(
            "You are a helpful support agent. A technical issue occurred with our "
            "routing system. Apologise briefly and ask the customer to try again "
            "or contact support@example.com."
        ),
    ),
    max_retries=2,
    timeout_seconds=25.0,
)
# ── Main handler ──────────────────────────────────────────────────────────────
async def handle_query(customer_query: str) -> dict:
    # Guardrail check
    check = input_guard.check(customer_query)
    if not check.passed:
        return {"response": "I'm sorry, I couldn't process your message.", "blocked": True}
    safe_query = check.sanitized_content or customer_query
    # Run through support workflow
    result = await safe_supervisor.run(safe_query)
    return {
        "response": result.output,
        "category": result.metadata.get("route"),
        "model_used": result.metadata.get("routed_model", "unknown"),
        "recovery_level": result.metadata.get("recovery_level", 0),
        "escalated": result.metadata.get("route") == "escalate",
    }


def _create_support_ticket(summary: str, priority: str, metadata: dict) -> str:
    # Replace with your ticketing system integration
    return "SUP-" + str(hash(summary))[-6:]
# ── Run it ────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
    queries = [
        "Why was I charged twice this month?",
        "My API webhooks stopped firing after the latest update",
        "I need to delete all my data immediately under GDPR",
        "This is unacceptable, I'm calling my lawyer",
    ]

    async def demo():
        for query in queries:
            print(f"\nQuery: {query}")
            result = await handle_query(query)
            print(f"Category: {result['category']}")
            print(f"Model: {result['model_used']}")
            print(f"Response: {result['response'][:120]}...")

    asyncio.run(demo())
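The recovery wrapper combines three limits: a per-attempt timeout, a retry budget, and a fallback that always answers. A minimal sketch of that idea using only `asyncio` follows. It is not `BoundedExecution`'s implementation; `run_bounded`, its parameters, and the reading of `max_retries` as "retries after the first attempt" are assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable


async def run_bounded(
    primary: Callable[[str], Awaitable[str]],
    fallback: Callable[[str], Awaitable[str]],
    query: str,
    max_retries: int = 2,
    timeout_seconds: float = 25.0,
) -> tuple[str, int]:
    """Try `primary` up to 1 + max_retries times, then use `fallback`.

    Returns (answer, attempts_made) so callers can log how much recovery was needed.
    """
    attempts = 0
    for _ in range(1 + max_retries):
        attempts += 1
        try:
            # wait_for cancels the attempt if it exceeds the timeout
            answer = await asyncio.wait_for(primary(query), timeout=timeout_seconds)
            return answer, attempts
        except Exception:
            # Covers timeouts and errors raised by the primary handler
            continue
    return await fallback(query), attempts
```

Returning the attempt count alongside the answer mirrors the `recovery_level` field that `handle_query` reads from the result metadata.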
Expected Output
Query: Why was I charged twice this month?
Category: billing
Model: gpt-4o-mini ← easy billing question, fast model
Response: This can happen if two subscriptions are active simultaneously or if a
payment retry was processed. To confirm, please check your invoices at
Settings → Billing → Invoice history...
Query: My API webhooks stopped firing after the latest update
Category: technical
Model: claude-sonnet-4-20250514 ← complex technical issue, smart model
Response: Webhook failures after an update are often caused by: 1) SSL certificate
changes, 2) payload format changes in the new version, 3) endpoint timeout
thresholds. Let's debug this step by step: First, check your webhook logs...
Query: I need to delete all my data immediately under GDPR
Category: account
Model: claude-sonnet-4-20250514 ← GDPR = high difficulty, smart model
Response: We take GDPR requests seriously. Here's the process: 1) Submit a formal
Data Subject Access Request at privacy.example.com/dsar. 2) We'll confirm
receipt within 24 hours. 3) Deletion completes within 30 days per GDPR Art. 17...
Query: This is unacceptable, I'm calling my lawyer
[TICKET CREATED] #SUP-847291
Category: escalate
Model: gpt-4o-mini + human
Response: I've escalated your issue to our specialist team. Ticket #SUP-847291 has
been created. You'll hear back within 2 business hours.
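The ticket number will differ on every run: `_create_support_ticket` derives it from Python's built-in `hash()`, which is salted per process for strings unless `PYTHONHASHSEED` is set. If reproducible placeholder IDs are useful (for tests or log correlation), one option is a content digest. The helper name `stable_ticket_id` and the "SUP-" plus six digits format are assumptions chosen to match the sample output; a real ticketing system would return its own ID.

```python
import hashlib


def stable_ticket_id(summary: str) -> str:
    """Derive a repeatable placeholder ticket ID from the summary text."""
    digest = hashlib.sha256(summary.encode("utf-8")).hexdigest()
    # Map the first 8 hex digits onto a six-digit number
    number = int(digest[:8], 16) % 1_000_000
    return f"SUP-{number:06d}"
```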
Customisation
Add a knowledge base tool
from pyagent_patterns.advanced import ReAct


def search_kb(query: str) -> str:
    # Search your Confluence/Notion/Help Center.
    # kb_client is a placeholder for your own search client; it is not defined in this recipe.
    return kb_client.search(query, top_k=3)


tech_bot_with_kb = ReAct(
    agent=Agent("tech_kb", smart_llm, system_prompt="Answer using the knowledge base."),
    tools={"search_kb": search_kb},
    max_steps=3,
)
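`kb_client` stands for whatever search client you already have. To try the tool wiring without one, an in-memory stand-in can work. Everything in this sketch (the `ARTICLES` data, the word-overlap scoring, the output format) is hypothetical; the only constraint taken from the snippet above is that the tool takes a query string and returns a string.

```python
# Hypothetical sample articles, for local experiments only.
ARTICLES = {
    "Resetting your password": "Go to Settings, then Security, and choose Reset password.",
    "Webhook delivery failures": "Check the endpoint URL, TLS certificate, and response time.",
    "Exporting account data": "Admins can request an export from the account settings page.",
}


def search_kb(query: str, top_k: int = 3) -> str:
    """Rank articles by how many query words appear in the title or body."""
    words = set(query.lower().split())
    scored = []
    for title, body in ARTICLES.items():
        text = f"{title} {body}".lower()
        score = sum(1 for word in words if word in text)
        if score:
            scored.append((score, title, body))
    # Highest overlap first; sort is stable, so ties keep insertion order
    scored.sort(key=lambda item: item[0], reverse=True)
    if not scored:
        return "No matching articles."
    return "\n".join(f"{title}: {body}" for _, title, body in scored[:top_k])
```

Returning a short "no match" string rather than an empty one gives the agent something explicit to reason about when the search comes back empty.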
Multi-turn conversation
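# Demo only: a single shared history. Keep one history per customer session in production.
conversation_history: list[Message] = []


async def chat(user_message: str) -> str:
    # This calls the supervisor directly, so input_guard is not applied here.
    conversation_history.append(Message.user(user_message))
    result = await safe_supervisor.run(conversation_history)
    conversation_history.append(Message.assistant(result.output))
    return result.output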
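A single module-level list mixes every customer's messages together, so a deployed service needs one history per conversation, usually with a cap so prompts do not grow without bound. The sketch below shows that bookkeeping independently of the library. `SessionStore`, the `(role, text)` tuples, and the 20-message cap are illustrative assumptions; you would swap the tuples for `Message` objects before passing a history to the supervisor.

```python
from collections import defaultdict, deque


class SessionStore:
    """Keeps a bounded message history per session ID."""

    def __init__(self, max_messages: int = 20) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full
        self._histories = defaultdict(lambda: deque(maxlen=max_messages))

    def append(self, session_id: str, role: str, text: str) -> None:
        self._histories[session_id].append((role, text))

    def history(self, session_id: str) -> list:
        return list(self._histories[session_id])
```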
SLA-based routing
from typing import Optional


def route_by_sla(query: str, customer_tier: str) -> Optional[str]:
    if customer_tier == "enterprise":
        return "escalate"  # enterprise customers always get a human
    if "urgent" in query.lower() or "down" in query.lower():
        return "escalate"
    return None  # let the classifier decide
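`route_by_sla` returns `None` when it has no opinion, so the caller has to fall through to the classifier. The sketch below shows one way to wire that, using a variant of `route_by_sla` with whole-word matching so that "download" does not trigger the "down" rule. `choose_route` and the `classify` callable are assumptions; how you hook this into `Supervisor` depends on its API, which this recipe does not show.

```python
import re
from typing import Callable, Optional

# \b anchors the match to whole words, so "download" is not treated as "down"
URGENT_WORDS = re.compile(r"\b(urgent|down)\b", re.IGNORECASE)


def route_by_sla(query: str, customer_tier: str) -> Optional[str]:
    """Return 'escalate' for SLA-driven cases, or None to defer to the classifier."""
    if customer_tier == "enterprise":
        return "escalate"
    if URGENT_WORDS.search(query):
        return "escalate"
    return None


def choose_route(query: str, customer_tier: str, classify: Callable[[str], str]) -> str:
    """SLA rules win; otherwise use the classifier's category."""
    override = route_by_sla(query, customer_tier)
    return override if override is not None else classify(query)
```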
Cost Profile
| Query type | Typical model | Avg cost per query | Monthly cost at 1k queries/day |
|---|---|---|---|
| Simple billing | gpt-4o-mini | $0.0003 | $9/mo |
| Complex billing | claude-sonnet | $0.003 | — |
| Simple tech | gpt-4o-mini | $0.0003 | $9/mo |
| Complex tech | claude-sonnet | $0.004 | — |
| GDPR / legal | claude-sonnet | $0.005 | — |
| Human escalation | gpt-4o-mini + human | $0.0003 + agent time | — |
| Blended average | mix | ~$0.001 | ~$30/mo |
At a blended ~$0.001 per query versus $0.003 to $0.005 for claude-sonnet, routing saves roughly 70% compared with sending every query to claude-sonnet.
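The blended figure depends on your traffic mix, and the arithmetic can be checked in a few lines. The per-query costs below are taken from the table; the 75/25 split between simple and complex queries, and the choice of the complex-tech rate as the "always claude-sonnet" baseline, are assumptions for illustration rather than measurements.

```python
# Per-query costs from the table above (USD)
COST_FAST = 0.0003  # gpt-4o-mini, simple query
COST_DEEP = 0.004   # claude-sonnet, complex tech query


def blended_cost(share_simple: float) -> float:
    """Average cost per query when `share_simple` of traffic uses the fast model."""
    return share_simple * COST_FAST + (1 - share_simple) * COST_DEEP


def savings_vs_deep_only(share_simple: float) -> float:
    """Fraction saved compared with sending every query to the deep model."""
    return 1 - blended_cost(share_simple) / COST_DEEP


mix = 0.75  # assumed: 75% of queries are simple
print(f"blended: ${blended_cost(mix):.4f} per query")
print(f"savings: {savings_vs_deep_only(mix):.0%}")
```

With this assumed mix the blended cost comes to about $0.0012 per query and the saving to about 69%, in line with the figures quoted above. Substituting your own measured mix shows how sensitive the saving is to the share of complex queries.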