In-Context Trajectory Poisoning: Bypassing LLM Agent Monitors with Natural Language
Agent monitors are supposed to catch when an AI agent has been compromised, but you can fool them with ordinary text: no model access, no GPUs, no gibberish strings.
Apr 12, 2026