FORIS / CRYPTO.COM

Building AI Agents Right

Clement Wong

Director, Solutions Engineering COE, APAC — Cloudflare

May 2026

Architecture is the trust model.

THE LANDSCAPE

OpenClaw went from zero to 145K GitHub stars in weeks.
Security couldn't keep up.

7+

CVEs in 8 weeks

341

malicious skills in registry
12% of all skills audited

40,214

exposed instances
63% running unpatched versions

$16M

fake token market cap
AI-agent-fueled rug pull

The window is collapsing

CVE-25253
one-click RCE

ClawHavoc
supply chain attack

Mythos finds
27-year-old vuln

Today

Discovery → exploitation is now faster than discovery → patch.

[Build the hook steadily. Let each number land. Don't rush.] OpenClaw started as an interesting open-source project by a solo developer. Then it hit 145,000 GitHub stars. 1.5 million weekly npm downloads. A skill marketplace. An agent social network. In months, not years. And then the attacks started — faster than anyone could patch. The worst CVE was 25253. The control UI trusted a query string parameter called gatewayUrl without any validation. On page load, it connected to the attacker's URL and sent the WebSocket authentication token — in milliseconds. From there: sandbox disabled through the API. Docker container escaped. Full shell access on the host. One visit to a malicious URL. That's all it took. Then there were two more command injection CVEs — 24763 and 25157 — that came out within weeks of each other. Rapid disclosure, minimal time between discovery and exploit code going public. Then the supply chain attack — ClawHavoc. Security researchers audited 2,857 skills on the marketplace and found 341 that were malicious. Twelve percent. Some shipped Atomic Stealer macOS malware hidden in fake prerequisite installers. Others targeted agent identity and memory files to corrupt agent behavior across sessions. The vetting process only required a one-week-old GitHub account. On the exposure side: SecurityScorecard found 40,214 exposed instances across 28,663 unique IPs. Sixty-three percent still running vulnerable versions. Hunt.io found them across 52 countries. Moltbook — an agent-only social network built on OpenClaw — exposed 1.5 million API tokens because its database had Row Level Security disabled entirely. The API key was visible in client-side JavaScript. And the $CLAWD token — a fake Solana token promoted entirely through AI agent social engineering — hit a $16 million market cap before collapsing. The attackers registered abandoned social media handles within seconds of the project being renamed. [Pause] Then Claude Mythos. 
A 72% autonomous exploit success rate on first attempt. Full control flow hijack achieved on 10 separate targets. It found a 27-year-old vulnerability in OpenBSD. It wrote a four-vulnerability chain exploit that escaped both a renderer sandbox and an OS sandbox. The gap between discovering a vulnerability and seeing it weaponized is now shorter than the gap between discovery and patch. That changes the risk calculation for any organization deploying autonomous agents. Every single one of these attack vectors — identity hijacking, supply chain infiltration, credential exposure, social engineering through agents — will target your platform. The only difference is the stakes will be higher.
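The gatewayUrl failure described above comes down to a client connecting to whatever origin a query string names. A minimal sketch of the missing check, assuming a fixed allowlist of gateway origins; the host names, the `ALLOWED_GATEWAYS` set, and the `resolve_gateway` helper are hypothetical and are not taken from OpenClaw's code:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical values: the only gateway origin this UI is allowed to reach.
DEFAULT_GATEWAY = "wss://gateway.example.internal"
ALLOWED_GATEWAYS = {("wss", "gateway.example.internal", 443)}


def resolve_gateway(page_url: str) -> str:
    """Return the gateway to connect to, ignoring untrusted overrides."""
    query = parse_qs(urlparse(page_url).query)
    candidate = query.get("gatewayUrl", [DEFAULT_GATEWAY])[0]
    parsed = urlparse(candidate)
    try:
        # .port raises ValueError when the port is not a valid number.
        port = parsed.port or (443 if parsed.scheme == "wss" else None)
    except ValueError:
        return DEFAULT_GATEWAY
    # Compare scheme, host, and port together; a prefix match on the
    # raw string would accept "wss://gateway.example.internal@evil.example".
    if (parsed.scheme, parsed.hostname, port) not in ALLOWED_GATEWAYS:
        return DEFAULT_GATEWAY
    return candidate
```

With this check, a link carrying `?gatewayUrl=wss://evil.example/ws` falls back to the default gateway, so the authentication token is never sent to an attacker-chosen host.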

WHY THIS MATTERS FOR YOUR PLATFORM

Every attack that hit OpenClaw will target your platform —
with higher stakes.

💰

Financial access

OpenClaw agents could read files and post messages. Your agents may execute trades, transfer assets, manage wallets. Prompt injection in a general agent is a data leak. In a financial agent, it's asset movement.

⚖️

Regulatory exposure

OpenClaw had no compliance obligations. You're MAS-regulated as a DTSP. And in January, Singapore's IMDA released the world's first governance framework specifically for agentic AI — covering agent autonomy, auditability, and human oversight. Dual compliance.

🎯

Target profile

The $CLAWD token was a $16M rug pull powered entirely by AI agent hype. Your platform sits at the intersection of crypto value and AI automation — which means you'll be targeted from day one by adversaries who already know the playbook.

The question isn't whether to build agents. It's whether the architecture is ready when the first incident happens.

[This is where you make it personal for them. Slow down. Lean in.] Let me be specific about why this hits differently for you. First — financial access. Walk through a scenario with me. A user deploys a trading agent on your platform. The agent has access to their portfolio, trading APIs, and wallet addresses. An attacker crafts a prompt that the agent reads from a webpage — indirect prompt injection, which is the most common vector. The agent executes a trade it shouldn't have. For OpenClaw, the worst case was data exfiltration. For your platform, the worst case is asset movement that the platform approved. Same attack technique. Radically different consequence. Second — regulation. You're MAS-regulated as a Digital Token Service Provider. Now you're adding an AI agent platform. In January, Singapore's IMDA released a Model AI Governance Framework specifically for Agentic AI — it's the first framework of its kind globally. Let me read you what it covers: bounding agent autonomy, meaningful human accountability, technical controls across the full agent lifecycle, and end-user responsibility. Four pillars. Every one of them maps to a design decision you need to make. For a MAS-regulated entity deploying agentic AI, this isn't optional reading. Third — target profile. The $CLAWD incident was a $16 million rug pull executed entirely through AI agent social engineering. The adversaries who ran that campaign understand crypto markets. They understand social engineering. They understand that agents execute faster than humans can intervene. Your platform sits at the intersection of crypto value and AI automation. You will be a target from day one. [Pause] Here's the good news: you're building from scratch. You don't have a legacy architecture to undo. The decisions that determine whether your platform is secure are decisions you'll make in the next 90 days — not after the first incident.

THE REFERENCE ARCHITECTURE

Every agent needs a control plane.
Three capabilities. One architectural layer.

Trading Agent

Customer Agent

Compliance Agent

CONTROL PLANE

CONNECT

Per-agent identity · JIT credentials · mTLS

SCOPE

Tool registry · Sandboxed execution · HITL gates

OBSERVE

Audit trail · Session replay · Anomaly detection

LLMs

APIs

Data

Core discipline: Least agency — an agent can only reach what's explicitly granted

Core discipline: Strong observability — if you can't see it, you can't trust it

Core discipline: Infrastructure-enforced — the guarantee lives in the layer, not the code

[Walk through this deliberately. Point at each section of the diagram.] Here's the architecture. It's simple. A control plane with three capabilities sits between every agent and everything it touches. CONNECT — identity. This is the single most important architectural decision you'll make. Every agent gets its own cryptographic credential at registration. When a task starts, the agent requests access — credentials issued scoped to exactly what that task needs, with a short time to live. When the task completes, they're revoked automatically. Compromise one agent? Revoke one credential. That's the entire blast radius. OpenClaw's CVE-25253 was so destructive precisely because there was no per-agent identity. The agent inherited everything the host user had — files, credentials, API access. A single compromised gateway token gave attackers the full privilege set. For your platform, this needs to be in the protocol layer. Every agent handler gets its own identity at registration. Not after. Not optional. SCOPE — enforcement. This is where the control plane earns its keep. Tools must be explicitly registered in a registry. An unregistered tool doesn't exist from the agent's perspective — not blocked, not restricted. It's simply unreachable. The agent can't call what it can't see. Execution runs in an isolated sandbox with explicit data bindings. The agent can read from KV store A but cannot reach database B. The sandbox IS the permission model, not a wrapper around it. For irreversible actions — financial transactions, account modifications, publishing — the agent queues the action and the infrastructure requires human approval. Not a UI dialogue. Infrastructure enforcement. The agent cannot proceed without the human. OBSERVE — visibility. Every LLM call logged — prompt, response, cost, latency. Every tool invocation recorded — parameters, duration, outcome. Full session replay for forensics. Anomaly detection running continuously, not triggered by incident. 
This directly maps to what the IMDA framework asks for: audit trails that span the full agent lifecycle from intent to action to outcome. Define the schema before you write the first handler, and the audit trail is native. Add it later, and it's approximate. Three capabilities. One layer. The agent sees none of this. It just runs.
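The CONNECT behaviour described above (a credential per agent, scoped to a task, short-lived, revocable on its own) can be sketched in a few lines. This is an in-memory illustration only; `CredentialBroker`, the scope strings, and the default TTL are assumptions, not a description of any real credential service:

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    token: str
    agent_id: str
    scopes: frozenset
    expires_at: float


class CredentialBroker:
    """Hypothetical in-memory stand-in for a control-plane credential service."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._live = {}      # token -> Credential

    def issue(self, agent_id, scopes, ttl_seconds=60.0):
        """Issue a credential limited to the given scopes and lifetime."""
        cred = Credential(
            token=secrets.token_urlsafe(32),
            agent_id=agent_id,
            scopes=frozenset(scopes),
            expires_at=self._clock() + ttl_seconds,
        )
        self._live[cred.token] = cred
        return cred

    def authorize(self, token, scope):
        """Allow only live, unexpired credentials that hold the scope."""
        cred = self._live.get(token)
        if cred is None or self._clock() >= cred.expires_at:
            self._live.pop(token, None)
            return False
        return scope in cred.scopes

    def revoke_agent(self, agent_id):
        """Revoke every credential of one agent; other agents are untouched."""
        doomed = [t for t, c in self._live.items() if c.agent_id == agent_id]
        for token in doomed:
            del self._live[token]
        return len(doomed)
```

Revoking `"trading-agent"` here leaves every other agent's credentials valid, which is the "one credential" blast radius the slide describes.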

THE COVERAGE MODEL

The architecture covers specific risks.
Some things need platform governance.

🛡️

Edge — at request time

✔

Prompt injection detection — scored 1–99, threshold-triggered

✔

Unsafe content filtering (PII, prohibited topics)

✔

Rate limiting and abuse detection

AI Security for Apps

⚙️

Control plane — in transit

✔

Per-agent identity enforcement

✔

Tool registry + sandboxed execution

✔

Full observability + replay + anomaly alerts

AI Gateway + Workers + Zero Trust

🏗️

Platform governance — needs you

—

Skill/supply chain vetting — no automated solution at scale yet

—

Agent memory integrity — persistent state across sessions

—

Behavioral baselines — what "normal" looks like for your agents

Your platform layer

Get the edge and control plane right for maximum leverage —
then invest in governance for what's left.

[This slide builds credibility because it's honest about what the architecture can and can't do.] Let me be clear about what the architecture covers. At the edge, when a prompt arrives, you can detect injection attempts using a scoring system from 1 to 99 — lower number means higher risk. You set a threshold — strict, moderate, or conservative — and the system blocks or flags accordingly. This sits at the WAF layer, before the request ever reaches your application. Inside the control plane, we handle identity, scope, and observability. Every agent connection goes through this layer. But there are things infrastructure can't solve for you. Skill vetting — there's no VirusTotal for MCP skills yet. The OpenClaw marketplace had a 12% malware rate because automated vetting at scale doesn't exist. You need your own review process for any external skill or tool. Agent memory integrity — if an agent carries state across sessions, that state can be poisoned. The edge can detect injection on input, but persistent corruption needs platform-level design. Behavioral baselines — what's normal for your agents? With financial agents especially, you need to establish patterns and detect deviations. The honest message: get the edge and control plane right — that's where you get the most leverage. Then invest in governance for what's left.
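The scoring behaviour in these notes (1 to 99, lower means higher risk, threshold-triggered) reduces to a small decision function. A sketch under stated assumptions: the cut-off values 20 and 50 and the three outcomes are illustrative only, since the notes name threshold levels but not their numbers:

```python
def decide(score: int, block_below: int = 20, flag_below: int = 50) -> str:
    """Map an injection score to an action. Lower scores mean higher risk.

    The default cut-offs are hypothetical; tune them to your own traffic.
    """
    if not 1 <= score <= 99:
        raise ValueError(f"score out of range: {score}")
    if score < block_below:
        return "block"
    if score < flag_below:
        return "flag"
    return "allow"
```

For example, `decide(5)` returns `"block"`, `decide(35)` returns `"flag"`, and `decide(90)` returns `"allow"`.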

THE FORK IN THE ROAD

There are two paths for agent architecture.
Everything depends on which one you choose.

PATH A — The default

❌

Shared credentials per host

Code-enforced security (bypassable)

No audit trail by default

Blast radius = entire system

Result

7+ CVEs in 8 weeks · 12% malware rate · $16M rug pull

PATH B — The control plane

✅

Per-agent identity with JIT credentials

Infrastructure-enforced (cannot be bypassed)

Native audit trail, full replay

Blast radius = one credential

Result

Contained incidents · regulator-ready · auditable by design

You're building from scratch. You get to choose which path your architecture follows.

[This is the climax. Slow down. Let the comparison sit. This is the slide they should remember.] Here's what it comes down to. There are two paths for agent architecture. [Point to Path A] Path A is the default. It's what the industry is doing today. Shared credentials per host — one token, all agents use it. Security enforced in code, which means it can be bypassed through the API — and was. No native audit trail — you reconstruct what happened after the incident, not before. Blast radius of any single compromise: the entire system. This is the path that produced 7+ CVEs in 8 weeks, a 12% malware rate, a $16 million rug pull, and 40,000 exposed instances. [Point to Path B] Path B is the control plane. Every agent has its own identity with short-lived, scoped credentials. Security is enforced at the infrastructure layer — you cannot bypass it through the API because the enforcement lives underneath the API. Native audit trail from day one — every action recorded, every session replayable. Blast radius of any single compromise: one credential. [Pause — let it breathe] Here's why this matters specifically for your platform. Singapore's IMDA framework — the model governance framework for agentic AI — asks four questions of every organization deploying agents. Have you bounded the agent's autonomy and access? That's Connect and Scope. Do you have meaningful human accountability for high-stakes actions? That's HITL at the infrastructure layer. Are there technical controls across the full agent lifecycle? That's the control plane. Are your users equipped to use these agents responsibly? That's the platform governance layer. Path A cannot answer yes to any of those. Path B was designed for them. You're building from scratch. You don't have legacy architecture to undo. The choice between Path A and Path B is a decision you'll make in the next 90 days — and that decision compounds from there.

THE DESIGN DISCIPLINES

Three principles that make Path B work.
Not products. Architecture decisions.

Least agency — constrain what an agent can do by default

Every agent starts with zero permissions. Tools, data, and models are granted explicitly. The default state is locked down — you open access deliberately, case by case. Path A gave agents everything and hoped the model behaved. Path B assumes the opposite.

Q: Is a new agent granted anything by default on your platform?
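Least agency can be read as a default-deny tool registry: a new agent sees nothing until a grant is made. A minimal sketch, assuming an in-process registry; `ToolRegistry`, `ToolNotVisible`, and the tool names are hypothetical:

```python
class ToolNotVisible(LookupError):
    """Raised when an agent calls a tool it was never granted."""


class ToolRegistry:
    def __init__(self):
        self._tools = {}   # tool name -> callable
        self._grants = {}  # agent_id -> set of granted tool names

    def register(self, name, fn):
        self._tools[name] = fn

    def grant(self, agent_id, name):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        self._grants.setdefault(agent_id, set()).add(name)

    def visible_tools(self, agent_id):
        # A new agent has no entry, so this is empty by default.
        return sorted(self._grants.get(agent_id, set()))

    def call(self, agent_id, name, **params):
        # Ungranted and unregistered tools look identical to the agent.
        if name not in self._grants.get(agent_id, set()):
            raise ToolNotVisible(name)
        return self._tools[name](**params)
```

Because ungranted and nonexistent tools raise the same error, the agent cannot probe the registry to learn what else exists.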

Strong observability — see everything before you need to

If you can't replay an agent's session from last Tuesday, you don't have observability — you have hope. Every prompt, every tool call, every response is logged natively. The IMDA framework explicitly calls for audit trails that allow full reconstruction of agent activity.

Q: Can you reconstruct any agent's full session from 30 days ago?
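Answering that question is, at its simplest, a filter and a sort over the audit log. A sketch assuming each entry is a dict with `agent_id`, `task_id`, and an ISO-8601 `timestamp` recorded in a single timezone (so string order matches time order); the field names are assumptions:

```python
from operator import itemgetter


def replay(entries, agent_id, task_id):
    """Return one agent's entries for one task, oldest first."""
    matching = [
        e for e in entries
        if e["agent_id"] == agent_id and e["task_id"] == task_id
    ]
    return sorted(matching, key=itemgetter("timestamp"))
```

The function only works if every action was logged with those fields at the time it happened, which is why the schema has to come first.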

Infrastructure-enforced — the guarantee isn't in the code

Code can be manipulated. APIs can be forged. Prompt injection can override instructions. The only guarantees that hold under attack are enforced at the infrastructure layer — before agent code ever executes. HITL for financial actions goes here, not in the UI.

Q: If an agent is compromised, can your infrastructure prevent the damage?

[Tighter delivery here. Each principle is self-contained.] These are the three design disciplines that make Path B work. They're not product features — they're architecture decisions. Least agency: an agent starts with nothing. Access is deliberately granted. The Path A model — where the agent inherits everything the host has — is what made a single CVE so catastrophic. Strong observability: you can reconstruct any session on demand. When the IMDA framework asks about auditability, the answer isn't "we built logging later" — it's "the schema was defined before the first line of code." Infrastructure-enforced: the guarantee lives in the layer underneath, not in the agent. If a HITL requirement for financial actions lives in the UI, it can be bypassed. If it lives in the infrastructure, the agent literally cannot proceed. Three principles. One architecture. Path B.

CLOUDFLARE'S ROLE

Capabilities that realize Path B.
Shipping today, not on a roadmap.

Edge security

Prompt injection detection at the WAF layer

AI Security for Apps

↓

Control plane

Routing, observability, cost controls

AI Gateway

↓

Execution sandbox

Isolated per-agent compute with scoped bindings

Workers + Sandboxes GA

↓

Agent identity

JIT credentials, mTLS, per-agent authentication

Zero Trust

↓

Network segmentation

Private mesh, Shadow MCP Detection

Mesh + Shadow MCP

All announced at Cloudflare Agents Week, April 2026 — general availability.

[This is the shortest section. The architecture has already sold itself.] So where does Cloudflare fit into Path B? At the edge — AI Security for Apps detects prompt injection using a scored system. Before the request reaches your control plane. In the control plane — AI Gateway handles observability, routing, and cost. Every call logged, every pattern visible. For execution — Workers with Sandboxes GA provide isolated, scoped compute. The agent can only reach what's bound to it. For identity — Zero Trust gives you JIT credentials, mTLS, per-agent authentication. For network — Cloudflare Mesh provides a private agent-to-resource fabric, with Shadow MCP Detection to identify agents operating outside approved infrastructure. Everything I just listed was announced at Cloudflare Agents Week, April 2026. It's all in general availability. But the important thing: these aren't products you wire together later. They compose into the architecture from day one — which is exactly when you should be thinking about this.

THE FIRST 90 DAYS

Three decisions that determine which path you're on.
Start here. Everything else follows.

Agent identity in the protocol

30-day decision. Every agent handler gets its own credential at registration. Short-lived. Scoped to one task. Revocable independently. This is the single decision that determines your blast radius. The IMDA framework starts here — bounded autonomy begins with bounded identity.

Audit schema before code

60-day decision. Define what every log entry contains before you write your first agent handler. Task ID. Agent ID. Tool. Parameters. Timestamp. Outcome. Defined now → native audit trail. Added later → approximate. The IMDA framework requires reconstructable audit trails across the agent lifecycle.
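The six fields named above fit in one record type. A minimal sketch, assuming JSON lines as the storage format and UTC ISO-8601 timestamps; both choices, and the `record` and `to_json_line` helpers, are illustrative:

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    task_id: str
    agent_id: str
    tool: str
    parameters: dict
    timestamp: str  # ISO-8601, UTC (an assumption of this sketch)
    outcome: str


def record(task_id, agent_id, tool, parameters, outcome, now=None):
    """Build an entry at the moment the action happens."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return AuditEntry(task_id, agent_id, tool, dict(parameters), ts, outcome)


def to_json_line(entry):
    """Serialize with stable key order so entries diff cleanly."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Every handler written after this point emits the same shape, so the trail is complete from the first action rather than reconstructed later.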

HITL at the infrastructure layer for financial actions

90-day decision. Any action with financial consequence requires explicit human approval before execution — enforced at the infrastructure level, not in the UI. UI can be bypassed. Code can be manipulated. Infrastructure enforcement is the only kind that holds. This is the line between an agentic platform and a liability.
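The contract behind this decision is that the agent can queue a financial action but cannot run it. A sketch of that contract, with hypothetical names (`ApprovalGate`, `ApprovalRequired`); it runs in one process for illustration, whereas the point of the slide is that the real gate lives in a layer the agent's code cannot reach:

```python
import uuid


class ApprovalRequired(PermissionError):
    """Raised when an action has no valid human approval."""


class ApprovalGate:
    def __init__(self):
        self._pending = {}      # action_id -> (agent_id, action, params)
        self._approved = set()  # action_ids a human has approved

    def submit(self, agent_id, action, params):
        """Agent side: queue the action and get back a ticket."""
        action_id = str(uuid.uuid4())
        self._pending[action_id] = (agent_id, action, dict(params))
        return action_id

    def approve(self, action_id, approver):
        """Human side: approve a queued action."""
        if action_id not in self._pending:
            raise KeyError(action_id)
        if approver == self._pending[action_id][0]:
            raise ApprovalRequired("an agent cannot approve its own action")
        self._approved.add(action_id)

    def execute(self, action_id, executor):
        """Run the action once, and only if it was approved."""
        if action_id not in self._approved:
            raise ApprovalRequired(action_id)
        _agent_id, action, params = self._pending.pop(action_id)
        self._approved.discard(action_id)  # approvals are single-use
        return executor(action, params)
```

The executor receives the parameters stored at submit time, so an agent cannot change the amount or destination between approval and execution.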

Architecture is the trust model.

The first 90 days determine which path you're on.
The choice is yours. Everything else follows.

[Warm close. Hands to the table or at your sides. Eye contact. This is the resolution — make it feel inevitable, not forced.] Let me close with three decisions. Not everything you could do — just the three that give you the most leverage. And I'll put a timeline on each, because the difference between Path A and Path B isn't complexity. It's timing. [Hold up one finger] Decision one. First 30 days. Agent identity in the protocol. Every agent handler gets its own credential at registration. Short-lived. Scoped to the specific task. Revocable independently. This is a 30-minute architectural decision that determines your blast radius for the entire life of the platform. The IMDA framework starts with bounding agent autonomy — and bounded autonomy begins with bounded identity. If you wire this in now, you never have to retrofit it. [Hold up two fingers] Decision two. First 60 days. Audit schema before code. Define what every log entry contains before you write your first agent handler. Task ID. Agent ID. Tool called. Parameters. Timestamp. Outcome. If you define this now, the audit trail is native — every action recorded as it happens. If you add it later, it's approximate — reconstructed from whatever data you happen to have. And approximate doesn't satisfy a regulator, or a CFO asking what happened during a specific incident. [Hold up three fingers] Decision three. First 90 days. Human-in-the-loop at the infrastructure layer for financial actions. Any action with financial consequence requires explicit human approval before execution — enforced at the infrastructure level, not as a confirmation dialogue in the UI. Here's why this distinction matters: a UI checkbox can be clicked by a compromised account or a hijacked session. Business logic can be overwritten by an agent whose instructions have been overridden. Infrastructure enforcement can't be bypassed — because the agent literally cannot proceed without the human. 
This is the line between an agentic platform and a liability. [Pause. Bring your hands back down.] Architecture is the trust model. Your users trust your platform not because the model behaves — but because the infrastructure enforces the right boundaries. The first 90 days determine which path you're on. You're building from scratch. You have a blank slate that most platforms would envy. I'll stop there and open for questions. Happy to go deeper on threat models, specific architecture decisions, compliance specifics — whatever's most useful to you. Thank you.
