Edison Watch

The Lethal Trifecta

A threat model built on the three capabilities that together enable AI-driven data exfiltration: private data access, untrusted content exposure, and external communication.

Edison Watch prevents data exfiltration by detecting and blocking the combination of capabilities required for an attack.

The Threat: Prompt Injection

AI agents are vulnerable to prompt injection - malicious instructions hidden in external content (like a web page or file) that manipulate the AI into exfiltrating sensitive data.

The Lethal Trifecta

Exfiltration requires three capabilities simultaneously. Edison Watch tracks these via per-session monotonic flags:

Capability             | Security Flag              | Action
Private Data Access    | read_private_data          | AI reads internal files, DBs, or docs.
Untrusted Content      | read_untrusted_public_data | AI fetches data from the internet.
External Communication | write_operation            | AI sends data out (Slack, Email, APIs).

Enforcement Logic: If a session has accessed both Private Data AND Untrusted Content, any subsequent External Communication is paused for human approval.

An attacker needs all three to succeed. Remove any one capability and the injection-driven exfiltration path is broken:

  • Without private data access → nothing valuable to steal
  • Without untrusted content → no way to inject malicious instructions
  • Without external comms → no way to send stolen data out
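
The enforcement rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: the field names mirror the security flags in the table, but the SessionFlags class, the decide function, and the "allow" / "require_approval" return values are hypothetical and are not taken from the Edison Watch codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionFlags:
    """Hypothetical snapshot of the two 'read' flags for one session."""
    read_private_data: bool = False
    read_untrusted_public_data: bool = False


def decide(flags: SessionFlags, is_write_operation: bool) -> str:
    """Return a hypothetical verdict for the next tool call.

    A write operation is paused for human approval only when the session
    has already touched both private data and untrusted content.
    """
    trifecta_armed = flags.read_private_data and flags.read_untrusted_public_data
    if is_write_operation and trifecta_armed:
        return "require_approval"
    return "allow"


# Only private data was read: an outbound write is still allowed.
print(decide(SessionFlags(read_private_data=True), is_write_operation=True))

# Both read flags are set: the outbound write is paused.
print(decide(SessionFlags(True, True), is_write_operation=True))
```

In this sketch, removing any one of the three conditions makes decide return "allow", which mirrors the bullet list above.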

Session State

State is tracked in the Edison server and is monotonic: once a flag is set (e.g., Private Data accessed), it cannot be unset for that session. This prevents "reset" attacks where a malicious prompt tries to clear the session's threat state.
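
One way to picture monotonic state is a flag set that exposes no operation for clearing. The sketch below is an assumption about how such a structure could look, not the Edison server's actual implementation; the MonotonicFlags class and its method names are hypothetical.

```python
class MonotonicFlags:
    """Hypothetical per-session flag set that can only grow."""

    def __init__(self):
        # A frozenset is replaced, never mutated, so no caller holding a
        # reference to the old value can remove a flag from the session.
        self._flags = frozenset()

    def set(self, flag: str) -> None:
        """Record a flag. Setting an already-set flag is a no-op."""
        self._flags = self._flags | {flag}

    def is_set(self, flag: str) -> bool:
        return flag in self._flags

    # Deliberately no unset(), clear(), or reset() method: within one
    # session, a flag that has been raised stays raised.


session = MonotonicFlags()
session.set("read_private_data")
session.set("read_private_data")  # idempotent
print(session.is_set("read_private_data"))
print(session.is_set("write_operation"))
```

Because the only state transition is "add a flag", a prompt that asks the agent to clear the threat state has nothing to call in this design.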

Access Control Levels (ACL)

ACLs prevent sensitive data from flowing to lower-sensitivity destinations regardless of the Trifecta state.

Level   | Rule
PUBLIC  | Can flow anywhere.
PRIVATE | Cannot flow to PUBLIC.
SECRET  | Cannot flow to PRIVATE or PUBLIC.

Example: If an agent reads a database marked SECRET, it is immediately blocked from posting to a PUBLIC Slack channel - even if the Trifecta hasn't been fully triggered.
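
The flow rules in the table can be expressed as an ordering on levels. The following is a minimal sketch, assuming that data may flow to a destination of equal or higher sensitivity (the table only lists the forbidden directions); the Level enum and can_flow function are hypothetical names used for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity levels from the table, ordered low to high."""
    PUBLIC = 0
    PRIVATE = 1
    SECRET = 2


def can_flow(source: Level, destination: Level) -> bool:
    """Allow a flow only if the destination is at least as sensitive
    as the source (an assumption for same-level and upward flows)."""
    return destination >= source


# The example above: SECRET database contents to a PUBLIC Slack channel.
print(can_flow(Level.SECRET, Level.PUBLIC))

# PUBLIC data can flow anywhere.
print(all(can_flow(Level.PUBLIC, dest) for dest in Level))
```

Note that this check takes no Trifecta flags as input, which matches the statement that ACLs apply regardless of the Trifecta state.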

Why traditional security fails here

Traditional security tools operate at the network or identity layer. They verify who is making a request, not what data is in the AI's context window. The Lethal Trifecta is a context-level threat model - it tracks what the agent has seen and what it's about to do, then makes a real-time decision about whether that combination is dangerous.