Skip to content

pyagent-context

Three-tier memory with trust-aware context ledger — working memory for a single run, session memory across turns, and semantic memory for long-term recall. Agents share context without overloading LLM context windows.

pip install pyagent-context                   # Core (working + session memory)
pip install pyagent-context[compress]         # + ContextCompressor
pip install pyagent-context[chromadb]         # + ChromaDB semantic backend

Architecture

flowchart TD
    A[Agent] --> WM[WorkingMemory\nbounded deque, evicts oldest]
    A --> SM[SessionMemory\nJSON or SQLite, persists across turns]
    A --> SEM[SemanticMemory\nChromaDB, vector similarity search]
    WM --> CL[ContextLedger\nappend-only log, query by trust]
    SM --> CL
    SEM --> CL
    CL --> MSG[to_messages(max_tokens=)\nconvert to LLM message list]

Three tiers:

Tier Class Scope Backend
Working WorkingMemory Single pattern run In-memory deque
Session SessionMemory Multi-turn conversation JSON file or SQLite
Semantic SemanticMemory Long-term recall ChromaDB (vector)

Core Concept — ContextItem

Every piece of memory is a ContextItem — content plus trust metadata.

from pyagent_context.item import ContextItem, TrustLevel

item = ContextItem(
    content="Customer prefers email communication, timezone UTC+5:30",
    source="crm_agent",
    trust_level=TrustLevel.VERIFIED,   # VERIFIED > INFERRED > SPECULATIVE
)

print(item.id)              # UUID
print(item.token_estimate)  # rough token count
print(item.timestamp)       # creation time (float)
print(item.to_dict())       # serialise to JSON-compatible dict

TrustLevel controls what agents can rely on:

Level Meaning Example
VERIFIED Confirmed by a reliable source DB lookup, user-provided
INFERRED Derived by an agent — probably right LLM classification result
SPECULATIVE Agent's guess — use with caution Predicted intent

Tier 1 — WorkingMemory

Bounded deque for a single run. Automatically evicts the oldest items when capacity (count or tokens) is exceeded.

from pyagent_context.memory.working import WorkingMemory
from pyagent_context.item import ContextItem, TrustLevel

wm = WorkingMemory(max_items=50, max_tokens=20_000)

# Add items — eviction happens automatically
wm.add(ContextItem(content="User asked about Q3 revenue", source="input"))
wm.add(ContextItem(content="Extractor found: $25.2B revenue", source="extractor_agent"))
wm.add(ContextItem(content="Fact check: VERIFIED", source="fact_checker", trust_level=TrustLevel.VERIFIED))

print(f"Items: {len(wm)}")
print(f"Tokens: {wm.total_tokens}")
print(f"Utilisation: {wm.utilization:.1%}")

# Eviction returns what was dropped
evicted = wm.add(ContextItem(content="new item", source="agent_x"))
if evicted:
    print(f"Evicted {len(evicted)} item(s) to stay within budget")

Use in a Pipeline stage

import asyncio
from pyagent_patterns.base import Agent, Message
from pyagent_patterns.orchestration import Pipeline
from pyagent_context.memory.working import WorkingMemory
from pyagent_context.item import ContextItem
from pyagent_providers import AnthropicLLM

wm = WorkingMemory(max_tokens=10_000)

async def run_with_memory(task: str) -> str:
    # Seed working memory with prior context
    wm.add(ContextItem(content=f"User goal: {task}", source="user"))

    pipeline = Pipeline(stages=[
        Agent("extractor", AnthropicLLM("claude-haiku-3-5-20241022"),
              system_prompt="Extract key claims and figures."),
        Agent("analyst",   AnthropicLLM("claude-sonnet-4-20250514"),
              system_prompt="Analyse the extracted data and give a recommendation."),
    ])

    result = await pipeline.run(task)

    # Store the result back into working memory
    wm.add(ContextItem(content=result.output, source="pipeline", trust_level=ContextItem.INFERRED))
    print(f"Memory: {len(wm)} items, {wm.total_tokens} tokens")
    return result.output

asyncio.run(run_with_memory("Analyse Tesla Q3 2025 earnings"))

Tier 2 — SessionMemory

Persists across turns — store conversation history so agents remember prior interactions.

from pyagent_context.memory.session import SessionMemory
from pyagent_context.item import ContextItem, TrustLevel

# JSON backend (simple, default)
session = SessionMemory(session_id="user-42-support-ticket-991", backend="json")

# Add items during turn 1
session.add(ContextItem(content="User reported: API returns 429 on /v2/search", source="user_turn_1"))
session.add(ContextItem(content="Agent diagnosed: rate limit exceeded, quota 1000/min", source="agent_turn_1",
                        trust_level=TrustLevel.INFERRED))
session.save()   # persist to disk

# ---- next turn / new process ----

session2 = SessionMemory(session_id="user-42-support-ticket-991", backend="json")
history = session2.get_all()   # load all prior items

print(f"Prior turns: {len(history)} items")
for item in history:
    print(f"  [{item.trust_level}] {item.source}: {item.content[:60]}")

SQLite backend for concurrent multi-agent access

# Multiple agents writing to the same session safely
session = SessionMemory(
    session_id="team-analysis-run-001",
    backend="sqlite",
    storage_path=".sessions",
)

session.add(ContextItem(content="bull_agent: Strong revenue growth supports premium", source="bull_agent"))
session.add(ContextItem(content="bear_agent: P/E 45x unsustainable", source="bear_agent"))
session.save()

Tier 3 — ContextLedger

Append-only log with trust-aware querying and token-budget-aware message conversion.

from pyagent_context.ledger import ContextLedger
from pyagent_context.item import ContextItem, TrustLevel

ledger = ContextLedger()

# Add items with different trust levels
ledger.add("User is a senior developer", source="profile_agent", trust_level=TrustLevel.VERIFIED)
ledger.add("User prefers concise answers", source="style_inference", trust_level=TrustLevel.INFERRED)
ledger.add("User may be frustrated", source="sentiment_agent", trust_level=TrustLevel.SPECULATIVE)

# Query by trust — only act on verified and inferred context
reliable = ledger.query(min_trust=TrustLevel.INFERRED)
print(f"Reliable context items: {len(reliable)}")

# Query by recency
recent = ledger.query(max_age_seconds=300)   # last 5 minutes

# Query by source
from_profile = ledger.query(source="profile_agent")

# Convert to messages with token budget for LLM injection
messages = ledger.to_messages(max_tokens=2000)  # most recent first, within budget
print(f"Injecting {len(messages)} context messages ({ledger.total_tokens} total tokens)")

# Snapshot for persistence
snapshot = ledger.snapshot()
restored = ContextLedger.from_snapshot(snapshot)

Inject ledger context into an agent call

from pyagent_patterns.base import Agent, Message
from pyagent_providers import AnthropicLLM

ledger = ContextLedger()
ledger.add("User is CFO at a 500-person SaaS company", source="crm", trust_level=TrustLevel.VERIFIED)
ledger.add("User asked about churn forecasting last week", source="history", trust_level=TrustLevel.VERIFIED)

agent = Agent("advisor", AnthropicLLM("claude-sonnet-4-20250514"),
              system_prompt="You are a financial advisor. Use the provided context.")

# Build messages: context first, then the actual query
context_messages = ledger.to_messages(max_tokens=1500)
query = Message.user("What's the best way to reduce churn in our enterprise tier?")

import asyncio
result = asyncio.run(agent.run_messages([*context_messages, query]))
print(result.output)

Hook Integration — Automatic Context Wiring

The hook system wires a ContextLedger directly into an agent: the agent reads context before each LLM call and writes its output back to the ledger automatically — no manual to_messages() needed.

import asyncio
from pyagent_context.ledger import ContextLedger
from pyagent_context.item import ContextItem, TrustLevel
from pyagent_patterns.base import Agent
from pyagent_providers import AnthropicLLM

ledger = ContextLedger()
ledger.add("User is CFO at a 500-person SaaS company", source="crm", trust_level=TrustLevel.VERIFIED)
ledger.add("User asked about churn forecasting last week", source="history", trust_level=TrustLevel.VERIFIED)

agent = (
    Agent("advisor", AnthropicLLM("claude-sonnet-4-20250514"),
          system_prompt="You are a financial advisor. Use the provided context.")
    .set_context(ledger)   # reads ledger before LLM call; writes output back after
)

result = asyncio.run(agent.run("What's the best way to reduce churn in our enterprise tier?"))
print(result.output)
print(f"Ledger now has {len(ledger)} items")  # original 2 + agent output = 3

To wire context on all agents in a blueprint at once:

from pyagent_blueprint import load_blueprint, BlueprintCompiler

graph = BlueprintCompiler().compile(load_blueprint("blueprint.yaml"))
graph.wire_context(ledger)   # sets context hook on every agent in the graph

→ See the full Hooks Guide for all four hook types.


SemanticMemory (ChromaDB)

Long-term recall via vector similarity — retrieve the most relevant past context, not just the most recent.

pip install pyagent-context[chromadb]
from pyagent_context.memory.semantic import SemanticMemory
from pyagent_context.item import ContextItem, TrustLevel

# Persistent collection backed by ChromaDB
sem = SemanticMemory(collection_name="agent_knowledge_base", persist_directory=".chromadb")

# Add domain knowledge
sem.add(ContextItem(content="PCI-DSS v4 requires tokenisation of all cardholder data at rest",
                    source="compliance_docs", trust_level=TrustLevel.VERIFIED))
sem.add(ContextItem(content="SOC2 Type II audit completed 2024-11, no critical findings",
                    source="audit_report", trust_level=TrustLevel.VERIFIED))
sem.add(ContextItem(content="Customer XYZ has a data residency requirement: EU only",
                    source="crm", trust_level=TrustLevel.VERIFIED))

# Retrieve by semantic similarity
results = sem.search("What are our data compliance requirements?", top_k=3)
for item, score in results:
    print(f"[{score:.2f}] {item.content[:80]}")

# Combine with ledger
ledger = ContextLedger(items=results[0])  # inject top semantic results

ContextCompressor

When the ledger grows beyond the LLM's context window, compress older items while preserving high-trust content.

pip install pyagent-context[compress]
from pyagent_context.compression import ContextCompressor
from pyagent_context.ledger import ContextLedger
from pyagent_context.item import TrustLevel

compressor = ContextCompressor(target_tokens=4000)

# Ledger with 12k tokens of history
large_ledger = ContextLedger(items=long_history)

# Compress: summarise SPECULATIVE items, keep VERIFIED intact
compressed = compressor.compress(large_ledger)
print(f"Compressed: {large_ledger.total_tokens}{compressed.total_tokens} tokens")
print(f"Verified items preserved: all")

Redaction

Strip PII before items leave the trust boundary.

from pyagent_context.redaction import Redactor

redactor = Redactor(patterns=["email", "phone", "credit_card", "ssn"])

safe = redactor.redact("Contact john.doe@acme.com or call +1-555-867-5309")
print(safe)
# → "Contact [EMAIL] or call [PHONE]"

Full Example — Multi-Turn Customer Support Agent

import asyncio
from pyagent_context.ledger import ContextLedger
from pyagent_context.memory.session import SessionMemory
from pyagent_context.item import ContextItem, TrustLevel
from pyagent_patterns.base import Agent, Message
from pyagent_providers import AnthropicLLM

SESSION_ID = "support-ticket-4421"

async def handle_turn(user_message: str) -> str:
    # Load prior session
    session = SessionMemory(session_id=SESSION_ID, backend="sqlite")
    ledger = ContextLedger(items=session.get_all())

    # Add verified facts about the user (from CRM, set once)
    if len(ledger) == 0:
        ledger.add("User: Enterprise customer, plan: Pro, billing monthly",
                   source="crm", trust_level=TrustLevel.VERIFIED)

    # Add this turn's input
    ledger.add(user_message, source="user", trust_level=TrustLevel.VERIFIED)

    agent = Agent(
        "support_agent",
        AnthropicLLM("claude-sonnet-4-20250514"),
        system_prompt="You are a helpful support agent. Use prior context to avoid repetition.",
    )

    context_msgs = ledger.to_messages(max_tokens=3000)
    result = await agent.run_messages([*context_msgs, Message.user(user_message)])

    # Persist agent response back to session
    ledger.add(result.output, source="agent", trust_level=TrustLevel.INFERRED)
    session._items = ledger.items
    session.save()

    return result.output

# Simulated multi-turn conversation
for msg in ["I can't access my dashboard", "I already tried reloading", "Same issue on mobile too"]:
    reply = asyncio.run(handle_turn(msg))
    print(f"User: {msg}\nAgent: {reply}\n")

When to Use Which Tier

Situation Use
Single pattern run, ephemeral context WorkingMemory
Multi-turn chat / conversation history SessionMemory
Domain knowledge, long-term recall SemanticMemory (ChromaDB)
Trust-filtered context injection ContextLedger
Token-budget enforcement ContextLedger.to_messages(max_tokens=)
PII protection before LLM injection Redactor

See Also