pyagent-router
Automatic model routing based on task difficulty and cost — score each task 1-10, pick the cheapest model that can handle it, and route calls transparently without changing your agent code.
Why Routing Matters
Running every task through gpt-4o or claude-sonnet is the easiest default — and the most expensive one. Most tasks in a real workflow are easy.
"What is 2+2?" → difficulty 1 → gpt-4.1-nano ($0.000001)
"Summarise this email" → difficulty 3 → gpt-4o-mini ($0.000015)
"Design a distributed consensus algorithm" → difficulty 8 → claude-sonnet ($0.000180)
Routing the first two tasks to cheaper models cuts the cost of those calls by roughly 90–99% compared with sending them to claude-sonnet. At 10,000 calls per day, the savings add up quickly.
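The savings range can be checked directly from the per-call prices quoted in the examples above. The following is a minimal sketch of that arithmetic; it treats claude-sonnet as the baseline, and the helper name `savings_vs_baseline` is made up for this illustration.

```python
# Per-call prices copied from the examples above (illustrative, not live pricing).
PER_CALL_COST = {
    "gpt-4.1-nano": 0.000001,
    "gpt-4o-mini": 0.000015,
    "claude-sonnet": 0.000180,
}


def savings_vs_baseline(model: str, baseline: str = "claude-sonnet") -> float:
    """Return the fraction of cost saved by using `model` instead of `baseline`."""
    return 1 - PER_CALL_COST[model] / PER_CALL_COST[baseline]


for name in ("gpt-4.1-nano", "gpt-4o-mini"):
    print(f"{name}: {savings_vs_baseline(name):.1%} cheaper than claude-sonnet")
```

With these prices the two cheaper models come out at about 99% and 92% cheaper than the baseline.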
Architecture
flowchart LR
T[Task text] --> DS[DifficultyScorer\n1–10 score]
DS --> MS[ModelSelector\nfilter by capability + cost]
MS --> R[RoutedAgent\ncall optimal LLM]
R --> L[routing_log]
subgraph CostEstimator
P[Pricing registry\n$/1M tokens]
end
MS --> P
DifficultyScorer
Score any text 1-10 using heuristics — no LLM call required.
from pyagent_router.scorer import DifficultyScorer
scorer = DifficultyScorer()
easy = scorer.score("What is the capital of France?")
print(easy.score) # 2
print(easy.category) # "easy"
print(easy.is_easy) # True
hard = scorer.score(
    "Design a Byzantine fault-tolerant consensus protocol that "
    "achieves sub-second finality under 33% adversarial nodes. "
    "Prove safety and liveness properties formally."
)
print(hard.score) # 9
print(hard.is_hard) # True
print(hard.signals) # {"length": 0.4, "complexity_keywords": 0.9, ...}
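To make "heuristics, no LLM call" concrete, here is a toy scorer that blends a length signal with a keyword signal. It is a sketch under stated assumptions, not the algorithm `DifficultyScorer` actually uses: the keyword list, the weights, and the names `toy_score` and `ToyScore` are all invented for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword stems; the real scorer's signals and weights may differ.
HARD_KEYWORDS = ("prove", "design", "formally", "protocol", "optimize", "consensus")


@dataclass
class ToyScore:
    score: int
    signals: dict = field(default_factory=dict)


def toy_score(text: str) -> ToyScore:
    """Blend a length signal and a keyword signal into a 1-10 score."""
    words = re.findall(r"[a-z']+", text.lower())
    length = min(len(words) / 50, 1.0)    # saturates at 50 words
    hits = sum(any(w.startswith(k) for k in HARD_KEYWORDS) for w in words)
    keywords = min(hits / 4, 1.0)         # saturates at 4 keyword hits
    blended = 0.3 * length + 0.7 * keywords
    return ToyScore(
        score=1 + round(9 * blended),
        signals={"length": length, "complexity_keywords": keywords},
    )


easy = toy_score("What is the capital of France?")
hard = toy_score(
    "Design a Byzantine fault-tolerant consensus protocol that "
    "achieves sub-second finality under 33% adversarial nodes. "
    "Prove safety and liveness properties formally."
)
print(easy.score, hard.score)
```

With these made-up weights the factual question lands in the easy range and the protocol-design prompt in the hard range. Because every signal is a cheap string operation, scoring adds negligible latency before the real model call.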
Score ranges
| Range | Category | Examples |
|---|---|---|
| 1–3 | easy | Factual lookups, simple arithmetic, translations |
| 4–6 | medium | Summaries, code explanations, comparisons |
| 7–10 | hard | System design, proofs, multi-step reasoning, synthesis |
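The table above amounts to a simple threshold mapping. A minimal sketch follows; the function name `categorize` is hypothetical, and the library may expose this differently (for example through the `category` attribute shown earlier).

```python
def categorize(score: int) -> str:
    """Map a 1-10 difficulty score to the category names in the table above."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 3:
        return "easy"
    if score <= 6:
        return "medium"
    return "hard"


for s in (2, 5, 9):
    print(s, categorize(s))
```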
CostEstimator
Estimate cost before a call and compare across models.
from pyagent_router.estimator import CostEstimator
estimator = CostEstimator()
# Estimate cost for a specific model
estimate = estimator.estimate("gpt-4o-mini", input_tokens=1_000, output_tokens=500)
print(f"${estimate.total_cost:.6f}") # $0.000225
# Compare across models to find the cheapest fit
estimates = estimator.compare("Explain async/await in Python",
                              models=["gpt-4.1-nano", "gpt-4o-mini", "gpt-4o"])
for e in estimates:
    print(f"{e.model:20} ${e.total_cost:.6f}")
# gpt-4.1-nano $0.000003
# gpt-4o-mini $0.000023
# gpt-4o $0.000187
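The architecture diagram describes the pricing registry in dollars per 1M tokens, so an estimate reduces to a per-token multiplication. The sketch below shows that arithmetic with placeholder prices; the model names, the prices, and the names `toy_estimate` and `ToyEstimate` are assumptions for illustration and do not reflect pyagent-router's actual registry.

```python
from dataclasses import dataclass

# Placeholder prices in USD per 1M tokens: (input, output).
# Made-up values for illustration, not pyagent-router's registry.
PRICING = {
    "small-model": (0.10, 0.40),
    "large-model": (2.50, 10.00),
}


@dataclass(frozen=True)
class ToyEstimate:
    model: str
    input_tokens: int
    output_tokens: int
    total_cost: float


def toy_estimate(model: str, input_tokens: int, output_tokens: int) -> ToyEstimate:
    """Cost = tokens * (price per 1M tokens) / 1_000_000, summed over input and output."""
    in_price, out_price = PRICING[model]
    total = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return ToyEstimate(model, input_tokens, output_tokens, total)


# Compare models for the same token counts, cheapest first.
estimates = sorted(
    (toy_estimate(m, input_tokens=1_000, output_tokens=500) for m in PRICING),
    key=lambda e: e.total_cost,
)
for e in estimates:
    print(f"{e.model:12} ${e.total_cost:.6f}")
```

Output tokens are usually priced higher than input tokens, which is why the estimate takes the two counts separately.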
ModelSelector
Combines DifficultyScorer and CostEstimator to pick the cheapest model that meets the difficulty and capability requirements.
from pyagent_router.selector import ModelSelector, Capability
selector = ModelSelector()
# Auto-select based on difficulty
result = selector.select("What is 2+2?")
print(result.model) # "gpt-4.1-nano"
print(result.difficulty) # DifficultyScore(score=1, ...)
print(result.estimate) # CostEstimate(total_cost=...)
# Require a specific capability
result = selector.select(
    "Write a Python function to implement Dijkstra's algorithm",
    required_capability=Capability.CODE,
)
print(result.model) # "gpt-4o-mini" or better (has CODE capability)
Capabilities
from pyagent_router.selector import Capability
Capability.GENERAL # default, all models
Capability.CODE # code generation and review
Capability.MATH # numerical and symbolic reasoning
Capability.REASONING # multi-step logical reasoning
Capability.CREATIVE # creative writing and ideation
Capability.VISION # image understanding (multimodal)
Custom model specs
from pyagent_router.selector import ModelSelector, ModelSpec, Capability
selector = ModelSelector(specs=[
    ModelSpec("my-cheap-model", min_difficulty=1, max_difficulty=4,
              capabilities={Capability.GENERAL}, max_context=32_000),
    ModelSpec("my-smart-model", min_difficulty=3, max_difficulty=10,
              capabilities={Capability.GENERAL, Capability.CODE, Capability.REASONING},
              max_context=200_000),
])
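One plausible reading of "cheapest model that meets difficulty and capability requirements" is a filter followed by a minimum. The sketch below illustrates that idea with stand-in types. `Cap`, `ToySpec`, `toy_select`, and the flat `cost_per_call` field are all hypothetical; the real `ModelSelector` may rank candidates differently, for instance by a token-based estimate.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Cap(Enum):  # stand-in for the Capability enum
    GENERAL = auto()
    CODE = auto()


@dataclass(frozen=True)
class ToySpec:
    name: str
    min_difficulty: int
    max_difficulty: int
    capabilities: frozenset
    cost_per_call: float  # hypothetical flat cost, used only for ranking


def toy_select(specs, difficulty: int, required: Cap = Cap.GENERAL) -> ToySpec:
    """Return the cheapest spec whose difficulty range and capabilities fit."""
    candidates = [
        s for s in specs
        if s.min_difficulty <= difficulty <= s.max_difficulty
        and required in s.capabilities
    ]
    if not candidates:
        raise LookupError(f"no model handles difficulty={difficulty}, {required}")
    return min(candidates, key=lambda s: s.cost_per_call)


SPECS = [
    ToySpec("my-cheap-model", 1, 4, frozenset({Cap.GENERAL}), 0.00001),
    ToySpec("my-smart-model", 3, 10, frozenset({Cap.GENERAL, Cap.CODE}), 0.0002),
]

print(toy_select(SPECS, difficulty=3).name)
print(toy_select(SPECS, difficulty=3, required=Cap.CODE).name)
```

At difficulty 3 both specs are eligible, so the cheaper one wins; adding the CODE requirement removes it from the candidates. Overlapping difficulty ranges, as in the specs above, are what give the selector a choice to make.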
RouterMiddleware
Wrap any agent to route each call automatically — zero changes to the agent or the caller.
import asyncio
from pyagent_patterns.base import Agent, Message
from pyagent_patterns.orchestration import Pipeline
from pyagent_providers import AnthropicLLM, OpenAILLM
from pyagent_router.middleware import RouterMiddleware
from pyagent_router.selector import Capability
# Build a model registry mapping name → LLM callable
model_registry = {
    "gpt-4.1-nano": OpenAILLM("gpt-4.1-nano"),
    "gpt-4o-mini": OpenAILLM("gpt-4o-mini"),
    "gpt-4o": OpenAILLM("gpt-4o"),
    "claude-sonnet-4-20250514": AnthropicLLM("claude-sonnet-4-20250514"),
}
middleware = RouterMiddleware(model_registry=model_registry)
# Wrap individual agents
agent = Agent("analyst", OpenAILLM("gpt-4o"),
              system_prompt="Analyse the given data.")
routed = middleware.wrap(agent)
result = asyncio.run(routed.run([Message.user("What is revenue growth?")]))
print(result.metadata["routed_model"]) # "gpt-4o-mini" (easy task)
print(routed.routing_log[-1]) # SelectionResult(model=..., difficulty=...)
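The wrapping idea can be pictured as a thin object that picks a model per call, records the choice, and delegates. The sketch below is self-contained and does not use pyagent-router: `ToyRoutedAgent`, `make_model`, and the length-based routing rule are stand-ins invented for this illustration.

```python
import asyncio
from typing import Awaitable, Callable

# A "model" here is just an async function from prompt to reply.
Model = Callable[[str], Awaitable[str]]


def make_model(name: str) -> Model:
    async def call(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return call


class ToyRoutedAgent:
    """Chooses a model per call and records the decision."""

    def __init__(self, registry: dict, pick: Callable[[str], str]):
        self.registry = registry
        self.pick = pick              # maps prompt text -> model name
        self.routing_log: list = []

    async def run(self, prompt: str) -> str:
        model_name = self.pick(prompt)
        self.routing_log.append(model_name)
        return await self.registry[model_name](prompt)


registry = {name: make_model(name) for name in ("small", "large")}
# Stand-in routing rule: long prompts go to the larger model.
agent = ToyRoutedAgent(registry, pick=lambda p: "large" if len(p) > 80 else "small")

print(asyncio.run(agent.run("What is revenue growth?")))
print(agent.routing_log)
```

Because the wrapper exposes the same `run` coroutine as the thing it wraps, callers do not need to know that routing happens.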
Require a capability per agent
# Wrap agents with different capability requirements
code_agent = middleware.wrap(
    Agent("coder", OpenAILLM("gpt-4o"), system_prompt="Write production Python code."),
    required_capability=Capability.CODE,
)
vision_agent = middleware.wrap(
    Agent("vision", OpenAILLM("gpt-4o"), system_prompt="Describe this image."),
    required_capability=Capability.VISION,
)
Wrap a full pipeline
pipeline = Pipeline(stages=[
    middleware.wrap(extractor_agent),
    middleware.wrap(analyst_agent),
    middleware.wrap(writer_agent),
])
result = asyncio.run(pipeline.run(document))
Routing Log
Every routed call is recorded — useful for cost analysis and debugging.
import asyncio
from pyagent_router.estimator import CostEstimator

routed = middleware.wrap(agent)

async def run_all(tasks):
    # Run several tasks through the routed agent
    for task in tasks:
        await routed.run([Message.user(task)])

asyncio.run(run_all(tasks))

# Inspect routing decisions
for entry in routed.routing_log:
    print(f"{entry.model:25} difficulty={entry.difficulty.score} "
          f"cost=${entry.estimate.total_cost:.6f}")

# Aggregate cost savings against a gpt-4o-only baseline
estimator = CostEstimator()
actual_cost = sum(e.estimate.total_cost for e in routed.routing_log)
premium_cost = sum(
    estimator.estimate("gpt-4o", e.estimate.input_tokens,
                       e.estimate.output_tokens).total_cost
    for e in routed.routing_log
)
if premium_cost > 0:  # avoid dividing by zero on an empty log
    print(f"Saved: ${premium_cost - actual_cost:.4f} "
          f"({(1 - actual_cost/premium_cost):.0%} reduction)")
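For cost analysis it can also help to break spend down per model. The sketch below does this over stand-in `(model, cost)` pairs with made-up values. With a real routing log, the same loop would presumably read `entry.model` and `entry.estimate.total_cost` from each entry instead.

```python
from collections import defaultdict

# Stand-in log entries as (model, cost) pairs with made-up values.
log = [
    ("small-model", 0.000003),
    ("small-model", 0.000004),
    ("large-model", 0.000180),
]

calls = defaultdict(int)
spend = defaultdict(float)
for model, cost in log:
    calls[model] += 1
    spend[model] += cost

# Most expensive model first.
for model in sorted(spend, key=spend.get, reverse=True):
    print(f"{model:12} calls={calls[model]} total=${spend[model]:.6f}")
```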
Integration with ProviderRouter
pyagent-router handles model selection (which model to use). pyagent-providers' ProviderRouter handles provider selection (which API endpoint to use). They compose naturally:
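from pyagent_patterns.base import Agent
from pyagent_router.middleware import RouterMiddleware
from pyagent_providers.router import ProviderRouter, RoutingStrategy
from pyagent_providers.registry import ProviderRegistry
# `registry` is a configured ProviderRegistry; `model_registry` is the name → LLM
# mapping from the RouterMiddleware section. `llm_from_provider` is not defined on
# this page and is assumed to be an application helper that returns an LLM callable.
# 1. Route to the right provider (lowest cost, best health)
provider_router = ProviderRouter(registry, strategy=RoutingStrategy.COST_FIRST)
# 2. Route to the right model (based on task difficulty)
model_middleware = RouterMiddleware(model_registry=model_registry)
agent = model_middleware.wrap(
    Agent("analyst", llm_from_provider(provider_router))
)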
See Also
- Patterns Package: Agent, Pipeline, FanOutFanIn, and all orchestration patterns
- Providers Package: ProviderRouter for provider-level routing and fallback
- API Reference