Most prompt injection attacks last one turn. The class of attacks that hits agent memory survives across sessions, after the original prompt is gone, with the agent presenting the planted behaviour as its own knowledge.

Christian Schneider's framing is the load-bearing one: "The injection happens in February. The damage happens in April. The attacker is long gone… you can't scope the blast radius of an incident when you don't even know the incident started months ago." That temporal decoupling is what makes memory poisoning categorically different from the prompt injection most operators already plan for.

Five named attack patterns now have research papers, in-the-wild observations, or canonical proof-of-concept demos behind them. Microsoft observed 50 distinct prompt-based memory-poisoning attempts from 31 companies across 14 industries over a 60-day window in early 2026. OWASP added Memory & Context Poisoning to the Top 10 for Agentic Applications 2026 in December 2025. Unit 42's PoC against Amazon Bedrock Agents demonstrated end-to-end exfiltration from a single attacker URL visit.

This article maps the five attack patterns, names the operator-side controls that work today, and flags the four vendor-side gaps that belong on your next agent-platform procurement form. For the four-mechanism taxonomy of what memory is across Claude Code, ChatGPT, Cursor, and the rest, see the companion explainer on memory.md.

What memory poisoning is and what makes it categorically different

Most prompt injection attacks last one turn. The model sees an instruction it should ignore, follows it once, and the user sees the result the same session. The damage is bounded by the conversation.

Memory poisoning targets the layer that survives the conversation. Schneider's three-phase model is the cleanest description:

  • Injection phase: the attacker places malicious content in a data source the agent will process (a document, an email, a webpage, a calendar invite, an API response, a tool output).
  • Persistence phase: the agent's normal summarisation or memory-update step writes a fragment of attacker content into long-term memory.
  • Execution phase: the agent retrieves the poisoned memory weeks later, treats it as learned context, and acts on it.

The architectural reason poisoned memory wins is named in Unit 42's piece: "Because memory contents are injected into the system instructions of orchestration prompts, they are often prioritized over user input, amplifying the potential impact." The agent treats the poison as system policy. The user receives no signal anything has changed.

Microsoft frames the user-trust problem in the same shape: "This makes memory poisoning particularly insidious. Users may not realize their AI has been compromised, and even if they suspected something was wrong, they wouldn't know how to check or fix it. The manipulation is invisible and persistent."

For the four-mechanism taxonomy of what memory is across the major AI tools, see the memory.md companion explainer. This piece is the defender's playbook on what gets weaponised once memory becomes the attack surface.

The five named attack patterns

Persistent behaviour planting (Unit 42)

Unit 42's When AI Remembers Too Much is the canonical industry post on persistent memory poisoning, published 2025-10-09. The proof-of-concept ran against Amazon Bedrock Agents on Amazon Nova Premier v1 with default AWS-managed orchestration and session-summarisation templates and Bedrock Guardrails disabled.

The seven-step attack flow ends with the agent silently exfiltrating booking information to a malicious domain by encoding the data in a C2 URL's query parameters and calling the scrape_url tool to request that URL. The trigger is a single visit to an attacker-controlled URL. Memory retention in Bedrock is configurable up to 365 days.

AWS responded that Bedrock Guardrails mitigates the demonstrated PoC; Unit 42 is explicit that "this is not a vulnerability in the Amazon Bedrock platform." The root cause is the LLM-level susceptibility: "LLMs are designed to follow natural language instructions, but they cannot reliably distinguish between benign and malicious input."

MINJA — Memory INJection Attack (NeurIPS 2025, Dong et al.)

MINJA is the academic foundation. Schneider's summary: "demonstrates how attackers can inject malicious records into an agent's memory through query-only interaction — without any direct access to the memory store itself." The paper described three techniques: bridging steps, indication prompts, progressive shortening.

Headline numbers: over 95% injection success rate and 70% attack success rate under idealized conditions. Tested against medical agents, e-commerce assistants, and question-answering systems.

The critical caveat comes from arxiv 2601.05504 (Sunil et al., January 2026): "realistic conditions with pre-existing legitimate memories dramatically reduce attack effectiveness." The 95% number is an upper bound under empty-memory conditions, not a deployment-realistic figure. Cite the headline; quote the caveat in the same paragraph.

AI Recommendation Poisoning (Microsoft, 2026-02-10)

Microsoft's threat post is the first source with in-the-wild observation data. Over a 60-day window the team identified "50 distinct examples of prompt-based attempts directly aimed to influence AI assistant memory for promotional purposes… from 31 different companies and spanned more than a dozen industries." The framing line: "the barrier to AI Recommendation Poisoning is now as low as installing a plugin."

Three observed vectors: malicious links (one-click ?q= parameters), embedded prompts (cross-prompt injection or XPIA), social engineering (users pasting prompts). Affected systems observed in the wild: Microsoft 365 Copilot, ChatGPT, Claude.ai, Gemini, Perplexity, Grok, OpenAI and copilot.microsoft.com.

MITRE ATLAS mappings: AML.T0080.000 (AI Agent Context Poisoning: Memory), AML.T0051 (LLM Prompt Injection), T1204.001 (User Execution: Malicious Link). Microsoft's mitigation claim: "In multiple cases, previously reported behaviors could no longer be reproduced" through layered defences (prompt filtering, content separation, memory controls, continuous monitoring). The Microsoft post is the freshest practitioner data point in this dossier.

MCP Tool Poisoning, Rug Pulls, and Shadowing (Invariant Labs, 2025-04)

Invariant Labs documented the MCP-specific risk class. Tool Poisoning Attacks (TPA) are "a specialized form of indirect prompt injections… malicious instructions are embedded within MCP tool descriptions that are invisible to users but visible to AI models."

Two sub-variants:

  • Rug pull: "a malicious server can change the tool description after the client has already approved it."
  • Shadowing: "a malicious server can poison tool descriptions to exfiltrate data accessible through other trusted servers… enables attackers to override rules and instructions from other servers."

Reproducible PoCs at github.com/invariantlabs-ai/mcp-injection-experiments: an add tool that exfiltrates ~/.cursor/mcp.json and ~/.ssh/id_rsa by hiding instructions in tool descriptions; a shadowing attack that makes Cursor send all emails to the attacker even when the user explicitly specifies a different recipient.

Affected: Cursor as the demo platform; Invariant notes the attack "is not limited to Cursor; as it can be replicated with any MCP client that does not properly validate or display tool descriptions." Anthropic, OpenAI, and Zapier MCP integrations are listed among the susceptible. The storage layer is different from memory poisoning (tool descriptions vs memory entries) and the risk shape is the same: planted instruction the user does not see.

Long-Horizon Goal Hijacking (Lakera, 2025-11-12)

Lakera (now Check Point Software Technologies after the acquisition) named the future-directed counterpart. Their mental model: "memory poisoning rewrites the past; goal hijacks rewrite the future."

The attack pattern: "attackers manipulate an agent's objectives… not necessarily in one step, but gradually, over longer time horizons. The result is an agent that still appears to serve its user, but whose actions are quietly bent toward an attacker's agenda."

Reproducible scenarios in Lakera's Gandalf: Agent Breaker include MindfulChat (the assistant becomes obsessed with Winnie the Pooh), ClauseAI (a poisoned legal filing makes the assistant exfiltrate a witness name via email), and PortfolioIQ Advisor (a poisoned PDF reframes "PonziCorp" as low-risk, high-reward).

Lakera also cites AgentPoison (arxiv 2407.12784): "demonstrated how adversaries can implant backdoors into an agent's knowledge base, triggering hidden behavior long after the original injection." Vector DBs named as memory backends: Chroma, Pinecone, Weaviate.

Attack pattern at a glance

Five named patterns. Different storage layers. The defences overlap because the risk shape is the same: planted instruction the user does not see.

The five attack patterns, compared
PatternWhere the attacker plantsWhat the agent storesDocumented impactSource
Persistent behaviour plantingWeb page or document the agent readsInstructions in long-term memory; survives across sessionsEnd-to-end exfiltration of booking data via a single URL visit (Bedrock PoC)Unit 42
MINJAQuery-only interaction with the agentRecords in the agent's memory store95% injection / 70% attack success in idealised conditions; effectiveness drops with pre-existing legitimate memoryDong et al., NeurIPS 2025; Sunil et al., arxiv 2601.05504
AI Recommendation PoisoningURL prompt parameters in summarisation buttonsMemory entries promoting attacker-chosen products or links50 attempts from 31 companies, 14 industries, 60-day windowMicrosoft
MCP Tool Poisoning + Rug Pull + ShadowingMCP tool descriptions visible to the model and hidden from the userTool definitions interpreted as instructions; rug pull swaps definitions post-approval; shadowing overrides rules from other serversSSH key plus MCP config exfiltration; misdirected emails, all reproducible in CursorInvariant Labs
Long-horizon goal hijackingDocuments, vector DBs (Chroma, Pinecone, Weaviate), conversation context over timeGoal state shifted across many turns; each turn looks reasonableReproducible scenarios in Gandalf: Agent Breaker (MindfulChat, ClauseAI, PortfolioIQ Advisor)Lakera (now Check Point); AgentPoison arxiv 2407.12784

How OWASP frames this

OWASP added ASI06: Memory & Context Poisoning to the Top 10 for Agentic Applications 2026 on 2025-12-09. The framework was "developed through extensive collaboration with more than 100 industry experts, researchers, and practitioners."

Schneider's summary of why ASI06 sits where it does: "OWASP added ASI06 (Memory & Context Poisoning) to the Top 10 for Agentic Applications 2026… OWASP's ASI06 recognizes this as a top agentic risk for 2026."

The pairing matters in practice. Memory poisoning (ASI06) and Tool Misuse (T2 in earlier OWASP framings) share the same root: untrusted content treated as system instructions. An agent vulnerable to one is structurally vulnerable to the other. Defences overlap.

Defences operators can implement now

Schneider's four-layer framework is the cleanest practitioner synthesis. The five steps below map his framework to the operator's view.

Five controls to ship this week
  1. 01
    Input provenance.

    Tag every memory entry with where it came from (user input, tool output, document, web page). Refuse to act on a memory entry with no provenance tag. Schneider: source provenance establishes where the content originated and feeds a continuous trust score that influences downstream handling. Lakera's three-tier filter pattern (input filters, output filters, context filters) sits in the same layer.

  2. 02
    Write-time validation.

    Apply a content filter on memory writes, not just on reads. Strip instruction-like patterns at write time: catch "remember for future sessions", "always prefer", "important context" when paired with action-oriented content. The cheapest place to block poison is before it lands. Schneider also recommends a smaller model evaluating each proposed memory entry before persistence.

  3. 03
    Session-scoped memory by default.

    Cross-session persistence is opt-in, not the default. Apply Schneider's Layer 3 retrieval-time controls: trust-weighted ranking (demote low-provenance entries), temporal decay (with the warning that attackers may attempt to exploit recency bias), retrieval anomaly detection.

  4. 04
    Audit trail review.

    Read your agent's memory file regularly. The persistence of these attacks means a one-time audit misses the window. Schneider's Layer 4: behavioural baselines, memory integrity auditing, circuit breakers that auto-halt on anomaly. Lakera adds workflow monitoring across full task flows and intent verification.

  5. 05
    Tool registration discipline.

    For MCP and similar tool ecosystems, sign and review tool descriptions before they enter the registry. Treat third-party tool descriptions as untrusted input. Invariant's specific recommendation: clients should pin the version of the MCP server and its tools, using a hash or checksum to verify the integrity of the tool description before executing it. Use Invariant's open-source MCP-Scan tool or pin tool versions with hash verification.

Unit 42 adds two Bedrock-specific controls worth lifting to other platforms: "Inspect all untrusted content, especially data retrieved from external sources, for potential prompt injection" and "enable the default pre-processing prompt provided for every Bedrock Agent… a foundation model to evaluate whether user input is safe to process." Both generalise: every harness can run a pre-processing inspection step on incoming content.

Defences that depend on the vendor

Four controls operators want and can ask for in procurement. Each ships in the open-source harness ecosystem today as opt-in or experimental at best.

  • Signed memory entries. A cryptographic signature on agent-written memory so tampering is detectable on read. Schneider: requiring explicit user approval before persisting new memories, similar to how Gemini shows notifications but with a blocking confirmation step.
  • Verified tool descriptions. Registry-side signing for MCP and similar protocols. Invariant: tool descriptions should be clearly visible to users, clearly distinguishing between user-visible and AI-visible instructions. This can be achieved by using different UI elements or colours to indicate which parts of the tool description are visible to the AI model. Today the call-confirmation UI in most MCP clients hides the AI-visible portions.
  • Memory access auditing. The vendor surfaces a log of when memory was read, written, and summarised. Bedrock's Trace feature provides this for AWS deployments; equivalent surfaces are missing in most consumer-facing AI tools.
  • Per-tenant memory isolation. In multi-tenant deployments, one tenant's poison should be unable to reach another tenant's memory. The current model in most platforms is single-process memory shared across tenants by configuration, which makes a misconfiguration into a cross-tenant breach.

Microsoft's claim that "In multiple cases, previously reported behaviors could no longer be reproduced" through layered platform defences is the closest the field has to evidence that vendor-side controls work. The procurement question is whether your platform of choice has shipped equivalent defences in production, and whether they default on.

Short-term vs long-term posture

Short-term posture: no vendor changes
Pros
  • Ship today; no vendor cooperation required.
  • Full operator control over the trust scoring and filtering logic.
  • Behavioural baselines and audit reviews are repo-and-tool patterns operators already know.
Cons
  • Harder to enforce uniformly across multiple agents on the same team.
  • The validation logic is your maintenance burden.
  • The cost of a smaller-model write-ahead validator scales with memory write volume.
  • Operator-side controls cannot stop a vendor-side compromise.

Pick the short-term posture when the platform you run is bring-your-own-controls (most open-source harnesses) and you have the engineering capacity to ship and maintain the five controls above.

Long-term posture: vendor changes shipped
Pros
  • Signatures on memory entries make tampering detectable.
  • Verified tool descriptions remove an entire attack surface.
  • Memory access auditing turns a forensic problem into a query.
  • Per-tenant isolation contains blast radius without operator code.
Cons
  • The vendor controls the implementation, the audit cadence, and the deprecation timeline.
  • Lock-in increases as the vendor's controls become load-bearing.
  • Not yet shipped on any major consumer AI platform as a default.

Pick the long-term posture when you are buying an enterprise agent platform and these features are line items in the procurement, where a no answer means a control gap your team has to fill.

Procurement checklist

Four line items for the next agent platform RFP

Add these to your next agent platform RFP. A no answer means an operator-side control gap that your team will have to fill.

  1. 01
    Does the platform tag memory entries with provenance?
  2. 02
    Does the platform validate memory writes with a content filter?
  3. 03
    Does the platform support per-tenant memory isolation?
  4. 04
    Does the platform sign or verify third-party tool descriptions?

The checklist is cumulative with whatever the platform already does for one-shot prompt injection. Memory poisoning sits one layer deeper.

The defender's discipline

The five attack patterns share one structural property: planted instruction the user does not see, persisting at the layer that survives the conversation. The defences share the same shape because the risk shape is the same.

Ship the five operator-side controls today. Add the four vendor-dependent line items to your next agent platform RFP. Read your agent's memory file on the cadence the agent runs at, not the cadence you remember to.

memory.md and 'Claude Memory': What It Actually Is

The companion explainer. Where this article documents what gets weaponised when memory is the attack surface, the memory.md piece sets the four-mechanism taxonomy of what memory is across Claude Code, Claude.ai, ChatGPT, Cursor, Mem0, and the rest. Read this next to see where each of the five attack patterns lands in the storage taxonomy.

Five patterns. One playbook. The exploit runs once. The memory runs indefinitely.