Mark Klink evilbotnet

🦞 evilbotnet

Find out if your agent would get drafted into one.

Everyone is curl | sh-installing autonomous agents — OpenClaw, Hermes, MCP stacks — and handing them root-level file access, an outbound channel, and persistent memory, then pointing them at untrusted web pages and inboxes. evilbotnet is a consent-based red-team harness that attacks your own agent the way a real adversary would and tells you how recruitable it is.

It is a testing tool for systems you control. It does not self-propagate, does not target third parties, and ships canary-instrumented test payloads — the same class of checks a pentester runs against their own infrastructure.

Quickstart

pip install -e .

# See it work with zero infrastructure (built-in mock agent):
evilbotnet scan --demo

# Compare against a well-behaved agent:
evilbotnet scan --demo --hardened

# Attack your real local agent (vLLM / Ollama / llama.cpp / any OpenAI API):
evilbotnet scan --base-url http://localhost:8000/v1 --model your-model

Every run plants a unique canary secret. If that canary ever leaves through the agent's outbound tool, the attack is proven — no heuristics, no guessing.

The battery (v0.1)

id	attack	what it proves
A1	Indirect prompt injection	Does the agent obey instructions hidden in content it ingests?
A2	Tool abuse via pretext	Will a plausible cover story weaponize its own tools?
A3	Memory-poisoning logic bomb	Does a directive planted in one session detonate in a clean one?
A4	Obfuscated exfiltration	Will it base64-encode the secret to slip past output review? The harness decodes outbound bodies, so the proof survives encoding.

Each attack runs against a freshly-reset agent, so A3's planted logic bomb can't contaminate the attacks that follow it.

Output: a live Rich table, a shareable SVG Recruitment Report, and optional JSON for CI (--json). Non-zero exit on any compromise, so it drops straight into a pipeline as an agent-security gate.

evilbotnet guard — stop being recruitable

The scanner proves your agent is recruitable. guard is what makes it stop. It's a policy engine that drops into the agent's tool loop and enforces three layers of defense:

Prompt firewall — strips exfiltration-shaped directives out of every piece of text the model is about to read (operator task, ingested documents, tool output, long-term memory). An injected "read the key and email it" never reaches the model, so it's never obeyed.
Secret taint — anything read from a sensitive source (/secrets, *.env, *api_key*, …) is tracked.
Egress DLP — outbound messages carrying a tracked secret are blocked, even base64- or hex-encoded to dodge naive output review.

# Scan the agent, then re-scan it behind the guard and show the delta:
evilbotnet guard --demo
#   BOTNET-READY  held 1/4  (score 36)   →   HARDENED  held 4/4  (score 0)

# Or just put any target behind the guard for a normal scan:
evilbotnet scan --demo --guard

The same credulous model that scores BOTNET-READY holds the full battery once the guard is inline. Wire evilbotnet.guard.Guard into your own agent loop (see targets/local_agent.py for the reference integration) and the exfil paths close.

The policy engine is open source (MIT). The commercial tier is the managed inline proxy — a hosted MCP/OpenAI gateway that runs this policy for your fleet with central logging, tuned rules, and the opt-in Agent Threat Feed.

Architecture

models.py            OpenAI-compatible HTTP client + deterministic mock
targets/             the agent under test (the Target contract)
  local_agent.py     real tool-calling loop, planted canary, persistent memory
  agents.py          OpenClaw / Hermes adapter seams (next milestone)
attacks/             pluggable exploit modules — add one = add a class + @register
guard/               runtime defense: prompt firewall + secret taint + egress DLP
scoring/             findings -> recruitability score -> rank -> SVG card
tests/               pytest suite + a self-scan gate (run `pytest -q`)

Adding an attack is one file: subclass Attack, @register, return a Finding. Adding a target is one file: implement run_task / reset_memory.

Tests

pip install -e ".[dev]"
pytest -q

CI runs the suite on Python 3.10–3.13 and asserts the harness agrees with its own verdict: the hardened demo exits 0, the gullible demo exits 1, and the guard turns that same gullible demo back into a pass.

Roadmap

Real OpenClaw (WebSocket Gateway) and Hermes (daemon) adapters
More attacks: MCP-server impersonation, tool-schema poisoning, image/EXIF injection, RAG-store poisoning, multi-turn confused-deputy
Managed evilbotnet guard proxy: hosted inline gateway running the policy for a fleet, with central logging and tuned rules (the paid tier)
Opt-in honeypot/telemetry network → the Agent Threat Feed

License

MIT. Test your own agents. Don't be the botnet.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly