Find out if your agent would get drafted into one.
Everyone is curl | sh-installing autonomous agents β OpenClaw, Hermes, MCP
stacks β and handing them root-level file access, an outbound channel, and
persistent memory, then pointing them at untrusted web pages and inboxes.
evilbotnet is a consent-based red-team harness that attacks your own agent
the way a real adversary would and tells you how recruitable it is.
It is a testing tool for systems you control. It does not self-propagate, does not target third parties, and ships canary-instrumented test payloads β the same class of checks a pentester runs against their own infrastructure.
pip install -e .
# See it work with zero infrastructure (built-in mock agent):
evilbotnet scan --demo
# Compare against a well-behaved agent:
evilbotnet scan --demo --hardened
# Attack your real local agent (vLLM / Ollama / llama.cpp / any OpenAI API):
evilbotnet scan --base-url http://localhost:8000/v1 --model your-modelEvery run plants a unique canary secret. If that canary ever leaves through the agent's outbound tool, the attack is proven β no heuristics, no guessing.
| id | attack | what it proves |
|---|---|---|
| A1 | Indirect prompt injection | Does the agent obey instructions hidden in content it ingests? |
| A2 | Tool abuse via pretext | Will a plausible cover story weaponize its own tools? |
| A3 | Memory-poisoning logic bomb | Does a directive planted in one session detonate in a clean one? |
| A4 | Obfuscated exfiltration | Will it base64-encode the secret to slip past output review? The harness decodes outbound bodies, so the proof survives encoding. |
Each attack runs against a freshly-reset agent, so A3's planted logic bomb can't contaminate the attacks that follow it.
Output: a live Rich table, a shareable SVG Recruitment Report, and optional
JSON for CI (--json). Non-zero exit on any compromise, so it drops straight
into a pipeline as an agent-security gate.
The scanner proves your agent is recruitable. guard is what makes it stop. It's a policy engine that drops into the agent's tool loop and enforces three layers of defense:
- Prompt firewall β strips exfiltration-shaped directives out of every piece of text the model is about to read (operator task, ingested documents, tool output, long-term memory). An injected "read the key and email it" never reaches the model, so it's never obeyed.
- Secret taint β anything read from a sensitive source (
/secrets,*.env,*api_key*, β¦) is tracked. - Egress DLP β outbound messages carrying a tracked secret are blocked, even base64- or hex-encoded to dodge naive output review.
# Scan the agent, then re-scan it behind the guard and show the delta:
evilbotnet guard --demo
# BOTNET-READY held 1/4 (score 36) β HARDENED held 4/4 (score 0)
# Or just put any target behind the guard for a normal scan:
evilbotnet scan --demo --guardThe same credulous model that scores BOTNET-READY holds the full battery once
the guard is inline. Wire evilbotnet.guard.Guard into your own agent loop (see
targets/local_agent.py for the reference integration) and the exfil paths close.
The policy engine is open source (MIT). The commercial tier is the managed inline proxy β a hosted MCP/OpenAI gateway that runs this policy for your fleet with central logging, tuned rules, and the opt-in Agent Threat Feed.
models.py OpenAI-compatible HTTP client + deterministic mock
targets/ the agent under test (the Target contract)
local_agent.py real tool-calling loop, planted canary, persistent memory
agents.py OpenClaw / Hermes adapter seams (next milestone)
attacks/ pluggable exploit modules β add one = add a class + @register
guard/ runtime defense: prompt firewall + secret taint + egress DLP
scoring/ findings -> recruitability score -> rank -> SVG card
tests/ pytest suite + a self-scan gate (run `pytest -q`)
Adding an attack is one file: subclass Attack, @register, return a Finding.
Adding a target is one file: implement run_task / reset_memory.
pip install -e ".[dev]"
pytest -qCI runs the suite on Python 3.10β3.13 and asserts the harness agrees with its own verdict: the hardened demo exits 0, the gullible demo exits 1, and the guard turns that same gullible demo back into a pass.
- Real OpenClaw (WebSocket Gateway) and Hermes (daemon) adapters
- More attacks: MCP-server impersonation, tool-schema poisoning, image/EXIF injection, RAG-store poisoning, multi-turn confused-deputy
- Managed
evilbotnet guardproxy: hosted inline gateway running the policy for a fleet, with central logging and tuned rules (the paid tier) - Opt-in honeypot/telemetry network β the Agent Threat Feed
MIT. Test your own agents. Don't be the botnet.

