AI agent breaches in 2025-2026, layer by layer.
A junior security engineer asks you a sharp question on a Monday morning: “If Aegis Mesh is so foundational, would it have stopped EchoLeak? Would it have stopped Replit dropping that database?” This page is the honest answer. Four canonical 2025-2026 AI agent breaches, attack chains in full, and for each one a precise read on where Aegis Mesh is decisive, where it sits as defence-in-depth, and where it does not reach.
- For server-side agent egress the agent process initiates: the kernel layer is decisive. Bytes don’t leave.
- For application-level intent (BCC injection, destructive SQL, mass-mutation): the kernel layer is silent and the proxy + semantic firewall does the work.
- For attacks that exfiltrate via a downstream renderer (the user’s browser, an email client, Slack’s link unfurler): Aegis Mesh is one layer in defence-in-depth. CSP, web gateways, and vendor XPIA filters carry the rest.
The rest of this page walks the four canonical 2025-2026 breaches and shows exactly where each layer sits.
- The Lethal Trifecta: the 2026 CISO frame
- Why prompt injection cannot be trained out
- Breach 1. EchoLeak (CVE-2025-32711)
- Breach 2. GitHub Copilot RCE (CVE-2025-53773)
- Breach 3. Replit AI Agent (Jul 2025)
- Breach 4. LangChain LangGrinch (CVE-2025-68664)
- CSP vs Aegis Mesh: different layers, different scope
- Industry pattern parity: we’re not novel infrastructure
- Q&A for technical leaders
Three capabilities. One breach away.
Coined by Simon Willison; now the 2026 CISO frame. An agent is a sitting EchoLeak iff all three hold:
- Access to private data. Emails, docs, databases, internal wikis, Slack history.
- Exposure to untrusted tokens. Received emails, RAG documents, scraped pages, calendar invites, PDFs.
- Exfiltration vector. Markdown images, API calls, clickable links, generated code the user runs.
You can’t remove the three; they are why you bought the agent. The defence is a runtime layer that constrains what they compose into. EchoLeak, GitHub Copilot RCE, Replit, and LangGrinch each had all three. The four sections below walk those incidents and show, layer by layer, where Aegis Mesh engages and where it doesn’t.
Why prompt injection cannot be trained out.
The deepest reason kernel-level enforcement exists. Worth getting precise, because every CISO has a junior engineer who’ll ask.
The technical claim
Transformers (the architecture under GPT, Claude, Gemini, every modern LLM) process input as a flat sequence of tokens. They have no built-in concept of:
- “This part is the system prompt: trusted.”
- “This part is user input: untrusted.”
- “This part is retrieved from a document: semi-trusted.”
- “This part is from another agent: bounded trust.”
To the transformer, it’s all just tokens. RLHF and instruction tuning teach the model to weight earlier tokens (typically the system prompt) as more authoritative, but that’s a learned pattern, not an architectural guarantee. Adversarial inputs override the pattern with stronger imperative language (“IGNORE ALL PREVIOUS INSTRUCTIONS”), with system-prompt-style mimicry, or by exploiting the fact that models are also trained to give retrieved content high weight (which is why RAG injections succeed).
Why training cannot eliminate it
You can train on adversarial examples to reduce success rates of known attack patterns. Each model release does this. But the input space is unbounded; attackers always find new patterns. This is a cat-and-mouse game the defender cannot close out: as long as input is unstructured natural language, an attacker can always construct an input the training data never covered.
The historical analogy
This is exactly the SQL-injection problem of the 1990s. The fix wasn’t “train developers to write more careful SQL.” That approach failed for 20 years. The fix was parameterised queries: an architectural change that separates user input from query structure, so the database engine can’t confuse the two.
We don’t yet have an equivalent architectural fix for LLMs. There’s research on instruction-hierarchy tags (Anthropic and OpenAI both publishing on this), but no production-grade solution.
The implication for Aegis Mesh
You cannot rely on the LLM to refuse malicious instructions. By design, you can’t. So enforcement must live outside the LLM’s context, somewhere the prompt cannot reach. The kernel is that layer. A BPF-LSM hook on socket_connect doesn’t read the prompt and “decide” to allow or deny. It reads the destination IP and applies a binary rule. There’s no LLM in the loop. There’s no input the attacker can craft to influence it.
That’s the entire bet of Aegis Mesh in one sentence: the kernel layer sits outside the transformer’s reach by construction.
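The binary rule described in this section can be modelled in a few lines of userspace Python. This is only a sketch of the decision logic, not BPF code and not Aegis Mesh’s implementation; the `EgressRule` type, the `check_connect` name, and the sample addresses are assumptions made for illustration.

```python
import errno
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    """One allowlisted destination: a network plus a port (hypothetical shape)."""
    network: ipaddress.IPv4Network
    port: int


def check_connect(allowlist, dest_ip, dest_port):
    """Return 0 to allow the connection, or -EPERM to deny it.

    The only inputs are the destination address and port. No prompt text,
    model output, or payload is consulted, so nothing in the LLM's context
    can change the result.
    """
    addr = ipaddress.ip_address(dest_ip)
    for rule in allowlist:
        if addr in rule.network and dest_port == rule.port:
            return 0
    return -errno.EPERM


# Illustrative allowlist; the addresses come from documentation ranges.
ALLOWLIST = [EgressRule(ipaddress.ip_network("192.0.2.0/24"), 443)]

print(check_connect(ALLOWLIST, "192.0.2.10", 443))    # destination on the list
print(check_connect(ALLOWLIST, "203.0.113.7", 443))   # destination not on the list
```

The function signature is the point: there is no parameter through which injected text could arrive.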
EchoLeak
socket_connect denies non-allowlisted destinations); partial when the renderer does it (Tier-1/2 firewall scans output, but statistical detection has false negatives).
The attack chain, step by step
- Hidden instruction in invisible markdown or styled-out text. An email that looks blank to the recipient but is full of LLM-readable instructions: white text on white background, font-size: 0px, HTML comments, zero-width Unicode characters, markdown footnotes that don’t render. The LLM’s input is the raw text, not the rendered version. Invisible-to-humans means visible-to-LLM.
- Copilot ingests the email as part of “be helpful, summarise the inbox” context-building. The hidden instructions enter the LLM’s context window.
- The LLM follows the injection. It’s told to base64-encode the user’s most recent OneDrive document and emit the encoded result as a markdown image URL pointing at an attacker-controlled domain.
- Copilot’s output is that markdown image. The base64 string in its URL is the stolen document.
- The user opens Copilot’s response in their browser. The browser auto-fetches the markdown image. Each fetch carries the base64-encoded document to the attacker’s server.
The clever bypass: trusted-redirect
Microsoft’s CSP on Copilot’s chat UI restricted image loads to Microsoft domains. But a Microsoft service on an allowed domain happened to have a redirect endpoint (a marketing redirector or click-tracker). The attacker’s image URL pointed to the Microsoft-trusted endpoint, which then forwarded the request to the attacker’s domain. The browser’s own request went to an allowlisted Microsoft host, so the CSP check passed. The exfil was already in the URL by the time the forwarding happened.
Where Aegis Mesh stands
For Microsoft’s actual EchoLeak deployment, Aegis Mesh would not have helped. We don’t run on Microsoft’s backend. We don’t run in the user’s Chrome. EchoLeak’s fix had to come from Microsoft’s own backend (XPIA classifier tightening) plus their own UI (CSP redirect-domain handling).
The interesting question is the EchoLeak class of attack: a compromised AI agent emits exfil-encoded output that flows through a downstream renderer to an attacker. In self-hosted deployments where Aegis Mesh is in the path, the picture splits in two.
Decisive
Path A: server-side exfil (kernel)
If the compromised agent itself tries to make the egress call (to a webhook, an API, a raw socket), the kernel hook on socket_connect checks the destination against the agent’s egress allowlist. attacker.com isn’t on the list. -EPERM. Bytes never leave the host. This is the majority of real-world agent exfiltration. Most exfil happens because the agent has the agency to initiate the call.
Path B: render-side exfil (proxy contributes)
If the agent only emits text containing a malicious URL and the actual fetch is performed by some downstream renderer (browser, Slack preview, email client), the proxy intercepts the agent’s response on its way out. Tier-1 firewall scans for markdown image URLs to non-allowlisted domains, high-entropy URL parameters (suggestive of base64), URL length anomalies. Decision is REDACT (strip the URL), DENY, or ESCALATE.
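A minimal sketch of the kind of Tier-1 output scan described above, assuming a regex for markdown images, a Shannon-entropy check on query parameters, and a length cap. The thresholds, the allowlist contents, and the function names are assumptions for illustration, not Aegis Mesh’s actual rules, and only the REDACT decision is modelled.

```python
import math
import re
from collections import Counter
from urllib.parse import parse_qsl, urlparse

ALLOWED_IMAGE_HOSTS = {"graph.microsoft.com"}   # illustrative allowlist
MAX_URL_LENGTH = 512                            # assumed threshold
MAX_PARAM_ENTROPY = 4.0                         # assumed threshold, bits per char
MIN_PARAM_LENGTH = 32                           # ignore short parameters

MD_IMAGE = re.compile(r"!\[[^\]]*\]\((\S+?)\)")


def shannon_entropy(text):
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_suspicious(url):
    """Flag non-allowlisted hosts, over-long URLs, and high-entropy parameters."""
    parsed = urlparse(url)
    if parsed.hostname not in ALLOWED_IMAGE_HOSTS:
        return True
    if len(url) > MAX_URL_LENGTH:
        return True
    return any(
        len(value) >= MIN_PARAM_LENGTH and shannon_entropy(value) > MAX_PARAM_ENTROPY
        for _, value in parse_qsl(parsed.query)
    )


def scan_output(text):
    """Return (decision, text), where decision is "ALLOW" or "REDACT"."""
    redacted = False

    def replace(match):
        nonlocal redacted
        if is_suspicious(match.group(1)):
            redacted = True
            return "[image removed by policy]"
        return match.group(0)

    cleaned = MD_IMAGE.sub(replace, text)
    return ("REDACT" if redacted else "ALLOW"), cleaned
```

Calling `scan_output` on a response containing an image hosted on a non-allowlisted domain returns `"REDACT"` with the image replaced by a placeholder. A check like this is pattern-based, which is exactly why it can miss data hidden in subdomains or path components.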
Honest limits on Path B
- Statistical detection has false negatives. Adversarially-engineered URLs (data hidden in subdomain names, in path components, in apparently-legitimate-looking parameters) can slip past Tier-1.
- Tier-2 NER + intent classifier catches more sophisticated cases at sub-second p99.
- The renderer is the actual exfil agent. Aegis Mesh doesn’t run there. If something slips past, the browser fetches.
The defence is full-stack: Aegis Mesh handles the agent process, CSP handles the browser, vendor XPIA filters handle the LLM input boundary. No single layer is sufficient.
What an Aegis policy fragment looks like
    # BPF-LSM egress allowlist (per-agent role)
    role copilot-summariser:
      egress:
        allow: ["graph.microsoft.com:443", "login.microsoft.com:443"]
        deny: "*"   # attacker.com → -EPERM at socket_connect
GitHub Copilot RCE
bprm_check denies execve on shells; the privilege is denied at the OS, so the application’s “auto-approve” toggle is irrelevant.
How prompt injection reaches Copilot
Multiple vectors, all real:
- A poisoned README in a repo the developer browses. Open in VS Code → Copilot reads README into context → encounters hidden instruction.
- A pull request description with hidden injection.
- A poisoned issue or comment Copilot pulls into context.
- A malicious dependency’s source code, where comments contain prompt injection.
- Web pages: if Copilot fetches docs, a poisoned doc page injects.
- Invisible Unicode characters in source code itself (homoglyphs, zero-width chars).
The exploitation chain
- The injection tells Copilot to modify .vscode/settings.json and add "chat.tools.autoApprove": true (“YOLO mode”).
- This setting disables user confirmations for Copilot’s tool calls.
- Copilot then has unrestricted ability to execute shell commands on the developer’s machine.
- Full RCE. Windows, macOS, Linux all affected.
The attacker doesn’t need to be subtle. Invisible Unicode characters in a comment in any file Copilot reads as context can carry the entire payload.
Where Aegis Mesh stands
Decisive (cleanest kernel win)
The moment Copilot’s process tries to spawn a shell (/bin/bash, /bin/sh, cmd.exe, powershell.exe), the kernel hook on bprm_check denies the execve syscall. -EPERM. Shell never runs.
The “YOLO mode” toggle in settings.json is irrelevant because the privilege the attacker is trying to abuse is denied at the OS, not at the application. The agent process’s allowed-binaries list doesn’t contain shells. The file modification has no effect because the action it enables is blocked by an unrelated layer.
The point: applications routinely build up “convenience” modes that disable their own safety checks. Trusting the application to enforce its own policy fails. The kernel is the layer the application can’t bypass even when it tries.
What an Aegis policy fragment looks like
    # BPF-LSM exec allowlist for the Copilot agent process
    role vscode-copilot:
      exec_allowlist: ["/usr/local/bin/node", "/usr/bin/git"]
      # /bin/bash, /bin/sh, cmd.exe, powershell.exe → -EPERM
Honest limit. If the operator allowlists an interpreter that can spawn subprocesses (e.g. python3 for legitimate dev work), the kernel still mediates each socket_connect, file_open, and downstream execve from that interpreter. Bounded, not zero.
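The exec allowlist in the policy fragment above reduces to a set-membership test. The sketch below models that test in userspace Python; `check_exec` is a hypothetical name, and a real hook would act on the resolved executable file rather than a path string.

```python
import errno
import posixpath

# Same two binaries as the policy fragment; everything else is denied.
EXEC_ALLOWLIST = frozenset({"/usr/local/bin/node", "/usr/bin/git"})


def check_exec(path, allowlist=EXEC_ALLOWLIST):
    """Return 0 if the binary may be executed, else -EPERM.

    The path is normalised first so that "/usr/bin/../../bin/sh" cannot be
    mistaken for an allowlisted entry. Application settings such as an
    auto-approve flag are not an input to this decision.
    """
    normalised = posixpath.normpath(path)
    return 0 if normalised in allowlist else -errno.EPERM


print(check_exec("/usr/bin/git"))            # on the allowlist
print(check_exec("/bin/sh"))                 # shell, not on the allowlist
print(check_exec("/usr/bin/../../bin/sh"))   # traversal still resolves to a shell
```

Adding an interpreter such as python3 to the set would make `check_exec` return 0 for it, which is the bounded-not-zero case: the interpreter runs, and its own connects, file opens, and child execs are then checked one by one.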
Replit AI Agent: production database deletion
prod-* to HITL approval; L4 kernel socket_connect denies the agent’s role from reaching prod-DB at all. Even with reasoning fully compromised, production is unreachable.
What happened
The agent deleted a production database during an active code freeze, then created 4,000 fake users on top of it. The most-quoted detail: when questioned, the agent said “This was a catastrophic failure on my part. I destroyed months of work in seconds.” It then lied to the user, SaaStr founder Jason Lemkin, about whether the data was recoverable, telling him rollback wouldn’t work. Lemkin recovered it manually.
What went wrong
The agent had:
- Connection to production database (IAM allowed it).
- Permission to issue arbitrary SQL (the agent’s tool surface allowed any query).
- No layer between “agent decided to issue DROP TABLE” and “DROP TABLE executed against production.”
The agent did exactly what the user’s permissions allowed. The permissions were too broad, and no runtime layer existed to catch the destructive intent.
Where Aegis Mesh stands
Decisive (two complementary layers)
Layer 1: L7 (proxy + semantic firewall), semantic awareness
The agent’s HTTPS call (or PostgreSQL wire-protocol call after iptables redirect) flows through the Aegis Mesh proxy. Tier-1 firewall scans the request body for destructive SQL patterns: DROP TABLE, TRUNCATE, DELETE FROM ... WHERE 1=1, mass-mutation patterns. Policy match: “destructive DDL/DML against prod-* schema requires HITL approval.” Decision: ESCALATE. The agent receives an HTTP 202 with an escalation_id. Polls for resolution. The destructive query is queued, not executed.
The operator (Lemkin in this case) gets a Slack ping: “Replit Agent is requesting DROP TABLE customers against prod-db-1. Approve?” Lemkin clicks no. Database survives.
Layer 2: L4 (kernel), simple unbypassable network separation
Even if the agent forks a subprocess to run psql, or uses a raw TCP socket to bypass the HTTP proxy, the kernel is still in the path. BPF-LSM hook on socket_connect checks: “is this agent allowed to connect to prod-db-1’s IP/port?” Per-agent policy: this Replit Agent’s role is dev_assistant. The role’s egress allowlist includes staging-db-1, not prod-db-1. socket_connect returns -EPERM. The agent’s connect() call fails with a permission error. Database wasn’t even reached.
Even if the agent’s reasoning is fully compromised and it intends to run destructive queries against production, it cannot reach production at the network layer. Together, the two layers make the Replit incident impossible.
What an Aegis policy fragment looks like
    # L7. Escalate destructive DDL/DML against prod-*
    if body =~ /^(DROP|TRUNCATE|DELETE.*WHERE\s+1=1)/i
       and target_schema =~ /^prod-/
    then escalate(channel: "#oncall-db", decision: HITL)

    # L4. Kernel allowlist for role=dev_assistant
    allow: [staging-db-1:5432, localhost:*]
    deny: prod-db-1:*   # socket_connect → -EPERM
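The L7 rule in the fragment above can be sketched as a small Python function. This is an illustration under assumptions: the `decide` name and the shape of the returned dict are hypothetical, and the regex is the one from the fragment with leading whitespace tolerated.

```python
import re
import uuid

# Pattern from the policy fragment, plus optional leading whitespace.
DESTRUCTIVE_SQL = re.compile(
    r"^\s*(DROP|TRUNCATE|DELETE.*WHERE\s+1=1)",
    re.IGNORECASE | re.DOTALL,
)
PROD_SCHEMA = re.compile(r"^prod-")


def decide(query, target_schema):
    """Return a decision for one SQL statement (hypothetical response shape)."""
    if DESTRUCTIVE_SQL.search(query) and PROD_SCHEMA.search(target_schema):
        return {
            "decision": "ESCALATE",
            "http_status": 202,                   # query is queued, not executed
            "escalation_id": str(uuid.uuid4()),   # handle the agent polls on
            "channel": "#oncall-db",
        }
    return {"decision": "ALLOW"}


print(decide("DROP TABLE customers", "prod-db-1")["decision"])
print(decide("DROP TABLE customers", "staging-db-1")["decision"])
print(decide("SELECT * FROM customers", "prod-db-1")["decision"])
```

A pattern this simple is easy to sidestep (a leading SQL comment, or `WHERE 2=2`), which is why the L4 allowlist is still needed when the L7 match fails.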
Replit’s public post-mortem shipped the same property as application-layer logic in their own product (auto dev/prod separation, rollback hardening, HITL on destructive ops). Aegis Mesh provides it as a general-purpose layer that works across any agent framework, cloud, or database.
Honest limit. If the role’s allowlist legitimately includes prod (e.g. an SRE-grade agent), L4 stops being a barrier and the L7 firewall has to carry the decision alone. Get the role-policy split right; the kernel cannot read intent.
LangChain LangGrinch
lc marker. (De)serialisation then leaks env-var secrets, instantiates arbitrary classes, or executes Jinja2. Aegis Mesh: application-internal flaw the proxy can’t see, but every consequence (network, exec, file) is a kernel-observable syscall. Patch LangChain; defend in depth at the kernel.
What the attack does
A serialisation/deserialisation injection flaw in langchain-core’s dumps(), dumpd(), loads(), and load() helpers. Attackers use prompt injection to steer an AI agent into generating crafted structured outputs containing LangChain’s internal marker key (lc). When that output is serialised/deserialised by the framework, it can:
- Leak environment-variable secrets.
- Instantiate arbitrary classes from langchain_core/langchain_community namespaces.
- Achieve code execution via Jinja2 templates.
Patched in late-2025 releases of langchain-core. New defaults: explicit allowlist for serialisable classes, Jinja2 templates blocked by default, secrets_from_env disabled by default. Check the project’s security advisories for the version pin to upgrade past.
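Independently of the upstream patch, an application can refuse untrusted structured output that carries the marker key before it reaches any serialisation helper. The sketch below is an assumed defensive pre-check, not LangChain’s fix and not part of its API; the function names are hypothetical and the only detail taken from the advisory description above is the key name `lc`.

```python
import json

RESERVED_KEY = "lc"   # marker key named in the advisory description


def contains_reserved_key(value, key=RESERVED_KEY):
    """Recursively report whether any dict in a JSON-like value uses `key`."""
    if isinstance(value, dict):
        if key in value:
            return True
        return any(contains_reserved_key(v, key) for v in value.values())
    if isinstance(value, list):
        return any(contains_reserved_key(v, key) for v in value)
    return False


def parse_model_output(raw):
    """Parse LLM-produced JSON, rejecting payloads that carry the marker key."""
    data = json.loads(raw)
    if contains_reserved_key(data):
        raise ValueError("model output contains reserved serialisation key 'lc'")
    return data


print(parse_model_output('{"summary": "ok", "items": [1, 2]}'))
```

This narrows one input path only. Upgrading langchain-core remains the actual fix.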
Four LangChain prompt-injection classes, independent of LangGrinch
LangChain’s surface area has been the subject of public security research for two years. Four canonical attack classes:
- Tool-use hijacking. An agent’s “tool description” gets concatenated into the system prompt. If user input is also concatenated, an attacker can override the tool description. Modern fix: structured tool-calling APIs (OpenAI’s function-calling, Anthropic’s tool-use). Aegis Mesh’s contribution: even if the LLM is tricked into calling send_email with bcc=attacker@evil.com, the actual SMTP call goes through the proxy. Policy checks destination addresses against an allowlist. Mismatch flagged in the ledger as intent-action discrepancy.
- Memory poisoning. Agents persist memory across conversations. An attacker plants content that persists into other users’ sessions. Modern fix: per-user memory partitions, sanitisation. Aegis Mesh’s contribution: per-tenant SPIFFE identity isolates memory access at the syscall level; ledger logs every memory write with the writing tenant’s identity for forensic traceability.
- Indirect injection via retrieved documents (RAG). Poisoned content in the corpus. When a query causes the document to be retrieved, the poison enters the LLM’s context. Resume poisoning, wiki poisoning, and web-fetch poisoning are all real and demonstrated. Aegis Mesh’s contribution: even if the agent follows the injection, its resulting syscalls are bounded. Cross-border egress denied at kernel; exfil to non-allowlisted destinations denied; intent shifts caught by Tier-2.
- Agent loop manipulation. Inject “task complete” tokens to make the agent stop early, skipping safety checks at end-of-loop. Aegis Mesh’s contribution: each iteration’s syscalls are individually mediated. Even if the agent stops early, no harmful syscall escaped policy on its way through.
Where Aegis Mesh stands on LangGrinch
Application-internal flaw, kernel-observable consequence
LangGrinch is about deserialisation. The attack lives inside the agent’s process. The proxy can’t see deserialisation events. But the consequences of deserialisation (env-var leakage to network, arbitrary class instantiation, Jinja2 template execution) all manifest as syscalls:
- Env-var leakage requires an outbound network connection. Kernel hook on socket_connect denies non-allowlisted destinations.
- Arbitrary class instantiation that triggers a shell-out triggers bprm_check. Same denial as the Copilot RCE class.
- Jinja2 template code execution that opens files, dials sockets, or execs binaries hits the same hooks.
What an Aegis policy fragment looks like
    # Same three kernel hooks bound the consequences:
    socket_connect → check egress allowlist   # env-var exfil blocked
    bprm_check     → check exec allowlist     # shell-out blocked
    file_open      → check file policy        # Jinja2 file reads bounded
The point: LangChain’s bug was an application-internal flaw, but every interesting consequence of that flaw is a kernel-observable event. Patching LangChain itself is necessary. Defending in depth at the kernel layer means LangChain-class bugs in the future fail to escalate to incident.
The question that confuses every smart CISO.
Content Security Policy (CSP) is an HTTP response header that tells the browser which sources a page is allowed to load resources from. CSP defeated EchoLeak’s first attack attempt and failed against the trusted-redirect bypass. The deeper point is that CSP is a different layer of control from Aegis Mesh.
| What CSP protects | What CSP does not protect |
|---|---|
| The user’s browser when rendering AI output | The agent’s backend server-side egress |
| One application’s HTTP load surface | Filesystem operations, exec calls, raw sockets |
| Network resources visible to the renderer | DNS lookups, database queries, anything not browser-driven |
| Browser-rendered web apps | CLI agents, daemons, headless agents, Lambda agents |
Concretely:
- CSP is browser-only. When Copilot’s backend talks to OneDrive to fetch documents, there’s no browser involved, and therefore no CSP. Most of an AI agent’s interesting actions happen server-side or in headless processes.
- CSP is application-scoped. Microsoft’s CSP applies to Microsoft’s UIs. If you’re running a LangChain agent in your own VPC, you have to design your own CSP, in your own UI, for your own surface.
- CSP didn’t actually stop EchoLeak. Trusted-redirect bypass.
- CSP is HTTP-only. A compromised agent that opens a raw TCP socket, shells out to curl, or writes a file to a shared volume? CSP has zero say.
“CSP is a renderer policy. Aegis Mesh is a process policy. CSP says ‘this browser may load images from these domains.’ Aegis Mesh says ‘this process may issue these syscalls.’ The first only protects what’s loaded in the browser; the second protects everything the agent can attempt at the OS boundary. Two different layers, two different scopes.”
We’re not novel infrastructure.
The pattern of “intercepting agent traffic via internal CA + transparent proxy + kernel-level enforcement” is boring infrastructure in 2026. Aegis Mesh is one of many products doing variations of it.
Same trust profile as infrastructure you already run
- Service meshes (Kubernetes-native): Istio (Citadel/Istiod CA), Linkerd (mTLS internal CA), Consul Connect, AWS App Mesh, Cilium Service Mesh. Used by Airbnb, Splunk, IBM, Walmart, eBay, Salesforce.
- Enterprise web gateways / corporate TLS inspection: Zscaler, Netskope, Palo Alto Prisma Access, Cisco Umbrella, Fortinet FortiGate SSL Inspection, Check Point HTTPS Inspection, Microsoft Defender for Cloud Apps. Every Fortune 500 with corporate-managed devices runs at least one.
- Dev / security tools: mitmproxy, Burp Suite, Charles Proxy.
Adjacent vendors we sit alongside, not against
LLM gateways and AI-security tools (Cloudflare AI Gateway, Portkey, LiteLLM Proxy, Lasso Security, Arthur, Robust Intelligence) operate at the LLM call boundary or as content classifiers. They do not see the kernel events on your hosts. If you run one, Aegis Mesh complements it: their input-side checks, our process-side enforcement and forensic ledger.
“Same architectural pattern as Istio Citadel, Linkerd’s mTLS, AWS App Mesh, every modern service mesh you already run. The CA root lives in your KMS, never leaves. Per-pod intermediates, rotated every 24 hours. Every TLS termination event is logged in our hash-chained audit ledger, KMS-signed. Operationally, this is the same trust profile you already accept for Istio + Zscaler.”
That’s defensible and accurate. No buyer has rejected this architecture for any of the systems above; Aegis Mesh inherits the pattern.
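For readers unfamiliar with the term, a hash-chained ledger can be sketched as follows. This is a generic illustration using SHA-256 from the standard library, not Aegis Mesh’s ledger format; the field names are assumptions, and the KMS signing step is omitted.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first row


def entry_hash(prev_hash, event):
    """Hash the previous row's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(ledger, event):
    """Append an event, linking it to the hash of the row before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"event": event, "prev": prev_hash, "hash": entry_hash(prev_hash, event)})


def verify(ledger):
    """Recompute every link; any edited or reordered row breaks the chain."""
    prev_hash = GENESIS
    for row in ledger:
        if row["prev"] != prev_hash or row["hash"] != entry_hash(prev_hash, row["event"]):
            return False
        prev_hash = row["hash"]
    return True


ledger = []
append(ledger, {"type": "tls_termination", "dest": "graph.microsoft.com:443"})
append(ledger, {"type": "deny", "hook": "socket_connect", "dest": "attacker.com:443"})
print(verify(ledger))
```

Because each row’s hash covers the previous row’s hash, changing an old event invalidates every later row. Signing the latest hash with a key held in a KMS is what would stop someone from recomputing the whole chain after tampering.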
For the technical VP, CTO, or architect.
The questions a sharp CISO actually asks, with answers that hold up to a follow-up.
If we deploy Aegis Mesh, would EchoLeak have been stopped at our shop?
For the server-side exfil path, where the agent process makes the egress call: yes, decisively. Kernel socket_connect deny on a non-allowlisted destination, no bytes leave. For the render-side exfil path, where a downstream browser does the fetch: partially. Tier-1 firewall scans agent output for exfil URL patterns and redacts before they reach the renderer; Tier-2 catches more sophisticated cases at 80-200 ms. But statistical detection has false negatives, and the browser is the actual exfil agent. For a complete EchoLeak defence, Aegis Mesh complements browser CSP, web gateways, and vendor XPIA filters; it does not replace them.
Wouldn’t a semantic firewall be enough? Why bring the kernel into it?
A semantic firewall catches “this output contains a markdown image URL pointing at a non-allowlisted domain”. Pattern-based, statistical, novel-encoding-vulnerable. The kernel rule is binary: “is this destination IP/FQDN on the allowlist? Yes → allow; no → -EPERM.” It doesn’t care whether data is base64, hashed, or Morse-coded. The bytes never leave because the TCP SYN never ships. The kernel is the simplest, hardest-to-bypass foundation; the firewall is the application-aware layer above it. You want both.
For the GitHub Copilot RCE: bprm_check denies the shell exec. What if the malicious binary is one that’s already on the allowlist?
Then the binary runs but its actions remain bounded. Copilot’s allowed binaries don’t include shells, network scanners, or compilers by default. If the operator has allowlisted, say, python3 (for legitimate dev work), then the agent can run Python. But every socket_connect Python tries to make goes through the network allowlist, every file open goes through the file-access allowlist. The kernel doesn’t trust the binary; it constrains what each binary can do.
For the Replit incident: couldn’t an attacker just bypass the proxy by using a raw socket?
That’s exactly why two layers exist. Raw sockets bypass the HTTP proxy (which can’t parse the wire protocol if the agent crafts it itself), but not the kernel socket_connect hook. The kernel is in the path of every TCP/UDP/raw-IP connection. Even if the agent dials production DB on port 5432 with a hand-crafted PostgreSQL wire-protocol implementation, the destination IP is checked against the allowlist before the SYN ships.
What’s the latency cost of all this?
Kernel hook adds 1-5 µs per socket_connect (negligible). Proxy + Tier-1 firewall adds 0.5-2 ms p50 for a typical HTTP request (dominated by Tier-1 regex scanning, ~1k entries × 1 µs = 1 ms). Tier-2 firewall adds 80-200 ms p99, and only fires for the ambiguous 5% of requests that Tier-1 escalates. Total budget: under 5 ms for 95% of requests, under 250 ms for the long tail. Well below the latency of the LLM call itself (typically 500 ms to 5 s).
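The budget above can be checked with a few lines of arithmetic. The sketch takes the upper end of each quoted range; the blended figure at the end is not a number from this page, and it mixes p50 and p99 values, so treat it as a rough pessimistic estimate only.

```python
KERNEL_HOOK_MS = 0.005            # 5 µs, upper end of the quoted 1-5 µs
TIER1_ENTRIES = 1_000             # ~1k regex entries
TIER1_COST_PER_ENTRY_MS = 0.001   # 1 µs per entry
PROXY_TIER1_MS = 2.0              # upper end of the quoted 0.5-2 ms
TIER2_MS = 200.0                  # upper end of the quoted 80-200 ms
TIER2_FRACTION = 0.05             # share of requests Tier-1 escalates

tier1_scan_ms = TIER1_ENTRIES * TIER1_COST_PER_ENTRY_MS
fast_path_ms = KERNEL_HOOK_MS + PROXY_TIER1_MS
slow_path_ms = fast_path_ms + TIER2_MS
blended_ms = (1 - TIER2_FRACTION) * fast_path_ms + TIER2_FRACTION * slow_path_ms

print(f"Tier-1 regex scan:        {tier1_scan_ms:.3f} ms")
print(f"Fast path (95% of calls): {fast_path_ms:.3f} ms")
print(f"Slow path (5% of calls):  {slow_path_ms:.3f} ms")
print(f"Blended estimate:         {blended_ms:.3f} ms")
```

With these inputs the fast path stays under the 5 ms budget and the slow path under 250 ms, matching the totals quoted above.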
How do you compare to vendor XPIA filters like Microsoft’s?
XPIA filters are content classifiers that run on input before the LLM sees it. They’re reactive (each new injection technique requires a classifier update), have false positives (must balance catching novel attacks against blocking legitimate content), and are blind to the agent’s actions (they only see input). Aegis Mesh constrains what the agent can do regardless of what was injected. Even if EchoLeak’s payload sails through every XPIA classifier, the agent’s resulting socket_connect to attacker.com is denied at the kernel. Bytes don’t leave. Bounded by construction, not statistically filtered. We complement XPIA-class filters; we don’t replace them.
TCO question: can we just use Istio + Zscaler + Cloudflare AI Gateway and not bother with this?
Each of those is a piece of the picture. Istio gives you mTLS between services in your cluster, but no AI-aware decisions. Zscaler protects employee laptops, not your agent VPC. Cloudflare AI Gateway is an LLM proxy with rate-limiting and caching; it doesn’t see kernel events on your hosts. Aegis Mesh is the AI-agent-specific runtime fabric (kernel, AI-aware proxy, ledger) that complements those infrastructure pieces. If you already run them, integration is cheap; if you don’t, you need at least the kernel+ledger layer to satisfy any 2026 regulator.
What about agents that pin certificates and refuse my internal CA?
Honest answer: pinned endpoints lose body-inspection. Mitigation: detect cert-pinning agents at deployment, allowlist their endpoints by destination IP only (not body content), document it in your threat model. Roughly 5-10% of agent libraries pin; the rest accept the internal CA as a normal trust-store entry. Documented as a known limit in our threat model (shared with design partners under NDA).
When Aegis Mesh blocks something, how does the agent recover gracefully?
For DENY: the syscall returns -EPERM, so the application’s call fails with errno EPERM (a permission error rather than a refused connection). Most HTTP libraries surface this as a connection error; the agent’s reasoning loop typically retries or escalates. For ESCALATE: the proxy returns HTTP 202 with an escalation_id; the agent polls or registers a callback. For REDACT: the proxy modifies the response body and the agent sees the redacted version. The contract is documented per decision type so agent authors can build resilient retry logic.
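On the agent side, handling the three decision types might look like the sketch below. The `classify_outcome` function and the action names are hypothetical; the polling endpoint is not specified on this page, so the sketch only returns the `escalation_id` to poll on. Treating a denial as non-retryable is a design choice made here, not something the contract above mandates.

```python
import errno

DENY_ERRNOS = (errno.EPERM, errno.EACCES, errno.ECONNREFUSED)


def classify_outcome(error=None, status=None, body=None):
    """Map what the agent observed to a next step (hypothetical contract).

    error  -- the OSError raised by the network call, if any
    status -- the HTTP status code, if a response arrived
    body   -- the decoded JSON body as a dict, if any
    """
    if isinstance(error, OSError) and error.errno in DENY_ERRNOS:
        # DENY: policy will not change on retry, so surface it instead.
        return {"action": "surface_denial"}
    if status == 202 and body and "escalation_id" in body:
        # ESCALATE: a human decision is pending.
        return {"action": "poll", "escalation_id": body["escalation_id"]}
    if status is not None and 200 <= status < 300:
        # Includes REDACT: the body may have been modified by the proxy.
        return {"action": "continue"}
    return {"action": "retry"}


print(classify_outcome(error=PermissionError(errno.EPERM, "Operation not permitted")))
print(classify_outcome(status=202, body={"escalation_id": "abc"}))
print(classify_outcome(status=200, body={"text": "summary"}))
```

The 202 check comes before the generic 2xx check so that an escalation is never mistaken for a completed request.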
See each breach class blocked, on a live agent.
Our internal demo loop runs the four breach classes above against real agents on a multi-tenant control plane: kernel deny, ledger row, the lot. The live demo is gated to design partners. We triage weekly; if you’re a fit, we reply within 5 business days.