The three-capability threat model that enables AI-driven data exfiltration - private data access, untrusted content exposure, and external communication.
Edison Watch prevents data exfiltration by detecting and blocking the combination of capabilities required for an attack.
AI agents are vulnerable to prompt injection - malicious instructions hidden in external content (like a web page or file) that manipulate the AI into exfiltrating sensitive data.
Exfiltration requires three capabilities simultaneously. Edison Watch tracks these via per-session monotonic flags:
| Capability | Security Flag | Action |
|---|---|---|
| Private Data Access | `read_private_data` | AI reads internal files, DBs, or docs. |
| Untrusted Content | `read_untrusted_public_data` | AI fetches data from the internet. |
| External Communication | `write_operation` | AI sends data out (Slack, email, APIs). |
Enforcement Logic: If a session has accessed both Private Data AND Untrusted Content, any subsequent External Communication is paused for human approval.
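The enforcement rule above can be sketched in a few lines. The flag names come from the table; the data structure and function are illustrative, not Edison Watch's actual API:

```python
# Minimal sketch of the trifecta enforcement check. Flag names match the
# table above; SessionFlags and requires_approval are hypothetical.
from dataclasses import dataclass


@dataclass
class SessionFlags:
    read_private_data: bool = False
    read_untrusted_public_data: bool = False


def requires_approval(flags: SessionFlags, is_write_operation: bool) -> bool:
    """Pause a write for human approval only when the session has already
    touched both private data and untrusted content."""
    return (is_write_operation
            and flags.read_private_data
            and flags.read_untrusted_public_data)


# A session that has only read private data may still write freely:
safe = SessionFlags(read_private_data=True)
assert requires_approval(safe, is_write_operation=True) is False

# Once untrusted content is also in context, any write is paused:
risky = SessionFlags(read_private_data=True, read_untrusted_public_data=True)
assert requires_approval(risky, is_write_operation=True) is True
```

Note that reads are never blocked by this rule; only the outbound write at the end of the chain triggers the approval gate.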
An attacker needs all three to succeed. Remove any one capability and exfiltration becomes impossible:
- Without private data access → nothing valuable to steal
- Without untrusted content → no way to inject malicious instructions
- Without external comms → no way to send stolen data out
State is tracked in the Edison server and is monotonic: once a flag is set (e.g., Private Data accessed), it cannot be unset for that session. This prevents "reset" attacks where a malicious prompt tries to clear the session's threat state.
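The monotonic property can be captured by a flag store that only ever raises flags. This is an illustrative sketch, assuming a simple in-memory container rather than Edison Watch's real server-side state:

```python
# Hypothetical sketch of monotonic session flags: a flag can be set but
# never cleared, so a prompt-injected "reset" attempt is rejected.
class MonotonicFlags:
    def __init__(self) -> None:
        self._flags: set[str] = set()

    def set(self, name: str) -> None:
        # Raising a flag is always allowed.
        self._flags.add(name)

    def clear(self, name: str) -> None:
        # Deliberately refused: threat state never decreases in a session.
        raise PermissionError(f"flag {name!r} cannot be unset")

    def is_set(self, name: str) -> bool:
        return name in self._flags


flags = MonotonicFlags()
flags.set("read_private_data")
assert flags.is_set("read_private_data")
```

Because `clear` raises unconditionally, even code paths that an attacker can influence have no way to lower the session's threat state.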
Access control lists (ACLs) add an independent layer of protection: they prevent sensitive data from flowing to lower-sensitivity destinations regardless of the Trifecta state.
| Level | Rule |
|---|---|
| `PUBLIC` | Can flow anywhere. |
| `PRIVATE` | Cannot flow to `PUBLIC`. |
| `SECRET` | Cannot flow to `PRIVATE` or `PUBLIC`. |
Example: If an agent reads a database marked SECRET, it is immediately blocked from posting to a PUBLIC Slack channel - even if the Trifecta hasn't been fully triggered.
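These rules form a simple ordering: data may only flow to a destination at an equal or higher sensitivity level. A minimal sketch, where the level names come from the table but the numeric ranking and helper function are illustrative:

```python
# Hypothetical flow check over the sensitivity levels from the ACL table.
# Higher numbers mean more sensitive; data may never flow "downhill".
LEVELS = {"PUBLIC": 0, "PRIVATE": 1, "SECRET": 2}


def flow_allowed(source_level: str, dest_level: str) -> bool:
    """Allow a flow only if the destination is at least as sensitive."""
    return LEVELS[dest_level] >= LEVELS[source_level]


assert flow_allowed("PUBLIC", "SECRET")       # PUBLIC can flow anywhere
assert not flow_allowed("SECRET", "PUBLIC")   # SECRET -> PUBLIC blocked
assert not flow_allowed("SECRET", "PRIVATE")  # SECRET -> PRIVATE blocked
```

In the Slack example above, the agent's context carries `SECRET` data and the channel's destination level is `PUBLIC`, so the check fails immediately, with no trifecta evaluation needed.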
Traditional security tools operate at the network or identity layer. They verify who is making a request, not what data is in the AI's context window. The Lethal Trifecta is a context-level threat model - it tracks what the agent has seen and what it's about to do, then makes a real-time decision about whether that combination is dangerous.