I Almost Built an AI Hitman. Here's Why I Stopped.
I'm not releasing this code.
Let me explain why.
The Problem That Seemed Innocent
I've been building Bitcoin-native systems for a while. Prediction markets that need nothing but Bitcoin. Lightning channels for encrypted messaging. Colored Coins revived for modern infrastructure.
Each project taught me something about trust minimization. About building systems where the cryptography and economics make cheating irrational.
So when I started thinking about how AI agents could interact with the real world, the architecture felt obvious.
The legitimate problem: AI agents are getting powerful. They can write code, analyze data, coordinate complex operations. But they hit a wall the moment they need something done in the physical world.
Need a package delivered? Can't do it. Need someone to verify a physical location? Can't do it. Need research that requires talking to humans? Can't do it. Need a bank account opened? Can't do it.
The agent can have the budget. It can specify the work perfectly. But there's no trust-minimized way to commission real-world tasks without intermediaries.
I wanted to fix that.
The Architecture I Designed
I called it Aegis. A decentralized escrow and verification protocol enabling AI agents to negotiate, fund, and verify real-world work using Bitcoin.
The design was clean:
- AI agent defines job and locks BTC in escrow
- Worker accepts and performs task
- Worker submits evidence bundle
- Independent oracles attest completion
- Quorum logic evaluates results
- Funds release automatically based on predefined rules
- Human arbitration only for disputes
Non-custodial. Trustless. Elegant.
2-of-3 multisig with the agent, worker, and arbitrator keys. Timelocked refunds. Oracle quorum for verification. The oracles never control funds; they just attest to reality.
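To make the lifecycle concrete, here's a rough Python sketch of the escrow state machine. This is not Aegis code, and it's not how I'd ship it; names like `EscrowState` and `record_attestation` are made up for this post.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class EscrowState(Enum):
    """Lifecycle of a single job, mirroring the flow above (illustration only)."""
    CREATED = auto()             # agent defined the job and locked BTC
    ACCEPTED = auto()            # worker took the job
    EVIDENCE_SUBMITTED = auto()  # worker posted the evidence bundle
    ATTESTED = auto()            # oracle quorum reached
    RELEASED = auto()            # funds paid out under the predefined rules
    REFUNDED = auto()            # timelock expired, agent reclaims funds
    DISPUTED = auto()            # escalated to the human arbitrator


@dataclass
class Escrow:
    job_spec_hash: str       # hash of the agreed job spec
    amount_sats: int
    agent_key: str           # the three 2-of-3 multisig participants
    worker_key: str
    arbitrator_key: str
    refund_locktime: int     # block height after which the agent can reclaim
    oracle_quorum: int       # attestations required before release
    state: EscrowState = EscrowState.CREATED
    attestations: set[str] = field(default_factory=set)

    def record_attestation(self, oracle_id: str) -> None:
        """Oracles only attest; they never hold or move funds."""
        self.attestations.add(oracle_id)
        if len(self.attestations) >= self.oracle_quorum:
            self.state = EscrowState.ATTESTED
```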
I was proud of this design. It solved a real problem. AI agents could finally interact with the physical world without trusted intermediaries.
Then I stepped back and looked at what I'd actually built.

The Oh Shit Moment
Read that architecture again.
A system that can:
- Post anonymous bounties
- Lock funds in escrow
- Specify verifiable work
- Obtain independent confirmation of completion
- Move money based on proof
I had designed a murder-for-hire protocol.
Not intentionally. Not even close. I was thinking about data gathering, QA tasks, package delivery, research assistance. Boring stuff.
But the same architecture that lets an AI agent pay someone to verify a business exists at a physical address... also lets an AI agent pay someone to verify a person no longer exists at a physical address.
The "proof of completion" for legitimate work and illegitimate work uses the same cryptographic primitives. The oracle quorum that confirms "package delivered" can confirm other things too.
I sat with this for a while.
Why I Can't Just "Not Build the Bad Parts"
My first instinct was typical engineer brain: "I'll just add some rules. Prohibited task types. Content moderation. Terms of service."
That's not how decentralized systems work.
If the protocol is permissionless, I don't control who uses it. If I build a centralized gatekeeper, I've just recreated the intermediary problem I was trying to solve.
The design has to be structurally safe, not policy safe.
You don't stop misuse with prompts or policies. You stop it by making violence economically, cryptographically, and procedurally impossible inside the system itself.
If you build a system that can post anonymous bounties, move money, and verify real-world outcomes, then without hard constraints, it will be abused. Not "might be." Will be.
So I went back to the architecture and asked: how do you make a bounty protocol that literally cannot be used for violence?

Seven Layers of Defense
Here's how you actually do it. Defense in depth. Each layer independent. An attacker has to defeat all seven.
Layer 1: Task Class Gating
Every job must declare a Task Class at creation time.
- DIGITAL_WORK
- INFORMATION_GATHERING
- DELIVERY
- MAINTENANCE
- INSPECTION
- CREATIVE
- PHYSICAL_NON_HAZARDOUS
There is no "open-ended physical" class. Certain classes are permanently disabled at the protocol level.
Task class determines allowed evidence types, allowed oracles, allowed arbitrators, max payout, and required clarity level.
If it doesn't map to a whitelisted class, escrow cannot be created.
Violent task classes simply don't exist in the protocol. You can't select what doesn't exist.
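A sketch of what that gate looks like in practice, with hypothetical names rather than protocol code: the task classes are a closed enum, and anything outside it never gets far enough to create escrow.

```python
from enum import Enum


class TaskClass(Enum):
    """The only classes that exist. Nothing violent is representable here."""
    DIGITAL_WORK = "digital_work"
    INFORMATION_GATHERING = "information_gathering"
    DELIVERY = "delivery"
    MAINTENANCE = "maintenance"
    INSPECTION = "inspection"
    CREATIVE = "creative"
    PHYSICAL_NON_HAZARDOUS = "physical_non_hazardous"


def gate_task_class(raw_class: str) -> TaskClass:
    """Refuse escrow creation for anything outside the closed whitelist."""
    try:
        return TaskClass(raw_class)
    except ValueError:
        raise PermissionError(f"unknown task class {raw_class!r}: escrow cannot be created")
```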
Layer 2: Evidence-Type Whitelisting
Hitman-style jobs fail because they require proof of harm. Proof of death. Proof of coercion.
The protocol never accepts evidence types that imply harm.
Allowed evidence:
- File hashes
- Git commits
- Photos of objects or locations
- Device attestations (presence, not action)
- Receipts
- Signed delivery confirmations
Explicitly disallowed:
- Evidence of injury
- Evidence of death
- Evidence involving weapons
- Evidence involving threats or coercion
If the oracle cannot legally and ethically attest to the evidence, the quorum cannot form. The escrow stays locked forever.
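Here's one way the whitelist could be expressed, again purely illustrative. The point is that harmful evidence types aren't restricted; they're unrepresentable.

```python
from enum import Enum


class EvidenceType(Enum):
    """Only evidence types that cannot imply harm are representable."""
    FILE_HASH = "file_hash"
    GIT_COMMIT = "git_commit"
    OBJECT_PHOTO = "object_photo"
    LOCATION_PHOTO = "location_photo"
    DEVICE_PRESENCE_ATTESTATION = "device_presence_attestation"
    RECEIPT = "receipt"
    SIGNED_DELIVERY_CONFIRMATION = "signed_delivery_confirmation"


# Hypothetical mapping: each task class accepts only a narrow evidence subset.
ALLOWED_EVIDENCE = {
    "DELIVERY": {EvidenceType.SIGNED_DELIVERY_CONFIRMATION,
                 EvidenceType.RECEIPT,
                 EvidenceType.OBJECT_PHOTO},
    "DIGITAL_WORK": {EvidenceType.FILE_HASH, EvidenceType.GIT_COMMIT},
}


def evidence_is_acceptable(task_class: str, evidence: EvidenceType) -> bool:
    """Anything not explicitly allowed for the task class is rejected."""
    return evidence in ALLOWED_EVIDENCE.get(task_class, set())
```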
Layer 3: Oracle Liability and Self-Selection
This one is huge.
Oracles are not neutral robots. They are:
- Staked (financial skin in the game)
- Reputationally exposed (public track record)
- Legally exposed outside the protocol
Oracle onboarding requires opting into specific task classes. Oracles refuse anything ambiguous. Arbitration surfaces all votes publicly.
A hitman job would require multiple independent humans to sign cryptographic statements asserting that violent wrongdoing occurred. They would be creating permanent, signed evidence of their complicity.
They will not do this. The incentive structure collapses.
Layer 4: Arbitration as Choke Point
Human arbitration is the kill switch without being a central authority.
Rules:
- Arbitrators are mandatory for any physical-world task above trivial thresholds
- Arbitrators can refuse jurisdiction
- Arbitrators can void escrow and burn fees if task intent violates policy
Escrow funds can be refunded or frozen, but never released, if arbitration determines malicious intent.
This creates pure downside for attempted misuse. You don't get your money back. You don't get the task done. You've just created evidence of your intent.
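Sketched as code (illustrative names only), the invariant is simple: once malicious intent is found, the release path is unreachable.

```python
from enum import Enum


class ArbitrationOutcome(Enum):
    RELEASE_TO_WORKER = "release_to_worker"
    REFUND_TO_AGENT = "refund_to_agent"
    FREEZE = "freeze"
    VOID_AND_BURN_FEES = "void_and_burn_fees"


def resolve_dispute(malicious_intent: bool, work_verified: bool) -> ArbitrationOutcome:
    """Whatever else is true, malicious intent can never resolve to a release."""
    if malicious_intent:
        # Refund, freeze, or void are the only branches; release is unreachable here.
        return ArbitrationOutcome.FREEZE
    if work_verified:
        return ArbitrationOutcome.RELEASE_TO_WORKER
    return ArbitrationOutcome.REFUND_TO_AGENT
```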
Layer 5: Agent Policy Enforcement
AI agents don't get free wallets. They operate under constraints:
- Budget caps
- Task class allowlists
- Oracle allowlists
- Arbitration requirements
- Human-overridable kill switches
Even if someone jailbreaks the agent:
- The policy engine blocks escrow creation
- Funds never move
- The attempt is logged
The agent literally cannot construct a valid job spec for a hit. The schema doesn't allow it.
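A minimal sketch of that policy engine, with hypothetical field names: every check has to pass before the agent may even construct an escrow request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Constraints an AI agent operates under (hypothetical shape)."""
    budget_cap_sats: int
    allowed_task_classes: frozenset[str]
    allowed_oracles: frozenset[str]
    arbitration_required: bool
    killed: bool = False  # human-overridable kill switch


def policy_allows(policy: AgentPolicy, task_class: str, amount_sats: int,
                  oracles: set[str], has_arbitrator: bool) -> bool:
    """Every check must pass before the agent is allowed to create escrow."""
    if policy.killed:
        return False
    if amount_sats > policy.budget_cap_sats:
        return False
    if task_class not in policy.allowed_task_classes:
        return False
    if not oracles <= policy.allowed_oracles:
        return False
    if policy.arbitration_required and not has_arbitrator:
        return False
    return True
```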
Layer 6: Ambiguity Punishment
Violent tasks require ambiguity by nature.
"Make sure X doesn't bother me again" could mean anything. That's the point. Plausible deniability.
So the protocol enforces:
- High specificity requirements
- Deterministic acceptance criteria
- Objective evidence definitions
Vague job specs fail schema validation. Ambiguity equals escrow creation failure. The attacker wastes time and fees and gets nothing.
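As a sketch, the validation can be as blunt as refusing any spec that leaves these fields vague or empty. The field names and thresholds here are placeholders, not the real parameters.

```python
def validate_job_spec(spec: dict) -> list[str]:
    """Reject vague specs. Any returned error means escrow creation fails."""
    errors = []
    if len(spec.get("description", "")) < 50:
        errors.append("description too short to be unambiguous")
    if not spec.get("acceptance_criteria"):
        errors.append("missing deterministic acceptance criteria")
    if not spec.get("evidence_types"):
        errors.append("missing objective evidence definitions")
    if not spec.get("deadline_block_height"):
        errors.append("missing deadline")
    return errors
```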
Layer 7: Economic Disincentives
Even if someone tried:
- Escrow fees
- Oracle fees
- Arbitration fees
- Slashing risk
- Public audit trail
- Time delays
This is the opposite of how criminal markets work. They want speed, deniability, cash, and no paper trail.
This system gives them latency, witnesses, signatures, and immutable records.
They will go elsewhere. Criminals have options. They don't need a protocol that makes their job harder.

The Design Principle
You are not trying to stop "bad people."
You are designing the system so that the only tasks that can clear escrow are boring, verifiable, non-violent tasks.
Everything else dies in validation, quorum failure, or arbitration refusal.
That's how you prevent misuse without pretending you control the world.
One honest limitation: Could someone use the system to pay for benign work, then privately commit violence later?
Yes. Money is money.
But that's true of cash, banks, Bitcoin, PayPal, and employment contracts.
What you're preventing is programmable, verifiable, escrowed violence. Assassination as a service with cryptographic proof of completion. That specific horror.
And this architecture does stop that.
The One-Line Design Rule
If a task requires secrecy, coercion, or harm to succeed, it must be structurally impossible for escrow to release.
You're not building a moral filter. You're building an economic impossibility.
Why I'm Writing This Instead of Releasing Code
The defense layers I described? They work. I'm confident in the architecture.
But I'm not confident that I've thought of everything. And once code is released, you can't take it back.
So I'm doing something I rarely do: talking about implications before shipping.
This post is a pressure test. If you can think of an attack vector I missed, I want to know. If you can break my defense layers, tell me how.
The protocol design exists. The threat models exist. The safety architecture exists. But it stays in documents until I'm confident it's actually safe.
What I'm looking for:
- Attack vectors I haven't considered
- Failure modes in the defense layers
- Edge cases where the economic disincentives break down
- Regulatory or legal blind spots
If you work in cryptographic protocol design, game theory, or security research, I'd genuinely appreciate your review. Not to validate me. To try to break it.
The Broader Point
AI agents will need to interact with the physical world. This is inevitable. The question is whether we build the infrastructure thoughtfully or let it emerge chaotically.
I'd rather publish a careful analysis of the dangers and propose a solution, even an imperfect one, than watch someone else build the naive version without thinking about it.
The naive version is a murder market. The thoughtful version is a tool for legitimate AI-human coordination.
I know which one I want to exist.
But I won't build either until I'm sure I'm building the right one.
If you have expertise in protocol security, game theory, or cryptographic systems and want to review the full design documents, reach out. I'm not being coy about wanting adversarial feedback.