Dragons vs. the tools next to it.

You are already evaluating something. Temporal for durable execution. LangSmith for traces. E2B for sandboxed compute. OpenAI Agents SDK for the runtime. The honest answer is that Dragons does not replace any of them — it sits in the seam none of them cover. Below: one block per tool, mechanism-named, no checkmark battles.

Frame: Additive, not replacement
Scope: 4 tools buyers actually evaluate
Posture: Honest seams, no feature war

Durable execution

Temporal

Temporal is the durable-execution engine. Workflows survive process crashes, steps retry, history persists. It is the "whether the function ran and what it returned" layer.

Overlap with Dragons
Effectively none. Temporal has no agent identity, no lease-gated authorization, no signed evidence chain, no organism-state liveness. Different layer of the stack.
The seam
A Dragons-governed agent runs inside a Temporal activity. The activity asks Dragons for a signed lease before the agent fires; the agent heartbeats and appends receipts under that lease; the activity returns. Temporal owns retries and durable history; Dragons owns who-was-allowed-to-do-what.
When Dragons is the wrong addition
If your activities are deterministic functions over inputs — ETL, batch reports, no model in the loop — Temporal's history is enough. Add Dragons when "did the workflow run" stops answering the question and "prove what the agent did" starts being asked.
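The Temporal seam above, where the activity asks for a signed lease before the agent fires, can be sketched in a few lines. This is a minimal illustration, not the Dragons API: `issue_lease`, `lease_allows`, `run_activity`, and the shared-secret HMAC signature are all assumptions made for the example. Only the lease field names (`scope`, `tenant`, `authorized_by`, a TTL) come from the comparison itself. `run_activity` is a plain function standing in for the body of a workflow activity.

```python
import hashlib
import hmac
import json
import time

# Hypothetical demo key. A real system would use managed keys and most
# likely asymmetric signatures rather than a shared secret.
SIGNING_KEY = b"demo-key-not-for-production"


def _sign(payload):
    """HMAC-SHA256 over a canonical JSON encoding of the lease payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()


def issue_lease(agent_id, scope, tenant, authorized_by, ttl_s, now=None):
    """Return a signed lease: who may do what, for which tenant, until when."""
    now = time.time() if now is None else now
    payload = {
        "agent_id": agent_id,
        "scope": sorted(scope),
        "tenant": tenant,
        "authorized_by": authorized_by,
        "expires_at": now + ttl_s,
    }
    return {**payload, "signature": _sign(payload)}


def lease_allows(lease, action, now=None):
    """True only if the signature verifies, the TTL has not passed,
    and the action is inside the lease scope."""
    now = time.time() if now is None else now
    payload = {k: v for k, v in lease.items() if k != "signature"}
    if not hmac.compare_digest(_sign(payload), lease.get("signature", "")):
        return False
    if now >= lease["expires_at"]:
        return False
    return action in lease["scope"]


def run_activity(lease, action, work):
    """Stand-in for an activity body: check the lease, then let the agent act."""
    if not lease_allows(lease, action):
        raise PermissionError(f"lease does not authorize {action!r}")
    return work()
```

In this sketch the retry and history concerns stay entirely outside the function. The only thing the lease check adds is a yes-or-no answer that can be verified later from the signed fields.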
Trace observability

LangSmith

LangSmith records prompts, responses, token counts, evaluations — the trace of an LLM application. It is the "what the model said" layer.

Overlap with Dragons
Surface-level only: both record information about agent runs. LangSmith records traces (observation); Dragons records identity, signed leases, and hash-chained receipts (enforcement). LangSmith is read-only. It cannot deny execution, autoheal a stalled agent, or produce a third-party-replayable artifact.
The seam
Trace into LangSmith for evals and prompt-level debugging. Govern with Dragons for the authorization record and the evidence chain. LangSmith answers "what did the model say?"; Dragons answers "was the agent authorized to act on it, and is the agent still alive?"
When Dragons is the wrong addition
If you only need eval data on prompts and outputs, and no auditor will ever ask you to prove what an agent was permitted to do, LangSmith alone is enough. Dragons exists for the agents whose actions have consequences a dashboard cannot re-litigate.
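The difference between a trace log and a hash-chained receipt is easier to see in code. The sketch below is an assumption about how such a chain could work, not Dragons' actual receipt format: each receipt stores the hash of the previous one, so editing any earlier entry breaks every later link. The names `append_receipt`, `verify_chain`, and `GENESIS` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first receipt


def _digest(prev_hash, body):
    """SHA-256 over the previous hash plus a canonical encoding of the body."""
    data = json.dumps(
        {"prev": prev_hash, "body": body},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(data.encode()).hexdigest()


def append_receipt(chain, body):
    """Append a receipt that commits to everything recorded before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    receipt = {"prev": prev_hash, "body": body, "hash": _digest(prev_hash, body)}
    chain.append(receipt)
    return receipt


def verify_chain(chain):
    """Replay the chain. Return the index of the first receipt that fails
    verification, or None if every link checks out."""
    prev_hash = GENESIS
    for i, receipt in enumerate(chain):
        if receipt["prev"] != prev_hash:
            return i
        if receipt["hash"] != _digest(receipt["prev"], receipt["body"]):
            return i
        prev_hash = receipt["hash"]
    return None
```

Anyone holding the chain can run `verify_chain` without access to the system that produced it, which is the sense of "third-party-replayable" used here. A bare hash chain only detects edits if the latest hash is also signed or anchored somewhere the writer cannot change; that part is left out of the sketch.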
Sandboxed compute

E2B

E2B runs agent code in an isolated cloud sandbox. It owns the compute lifecycle — spin up a container, run the code, tear it down. It is the "where the code ran" layer.

Overlap with Dragons
None. E2B isolates compute; Dragons authorizes action. E2B knows the sandbox ran and what its exit code was. It does not know whether the code inside was authorized, what tenant it acted for, or whether the agent is still making semantic progress.
The seam
A Dragons-governed agent executes inside an E2B sandbox. E2B contains the blast radius of the process; Dragons contains the blast radius of the authorization. One isolates the binary, the other governs the identity. Natural composition, no overlap.
When Dragons is the wrong addition
If your sandboxes run untrusted user code with no agent identity to govern (interactive notebooks, code-execution-as-a-service), E2B alone covers it. Dragons starts paying off when "isolated" stops being enough and "authorized, attested, and alive" becomes the new bar.
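A sandbox reports that a process is up and what its exit code was. "Still making semantic progress" is a different signal. The sketch below shows one way to tell the two apart, using the three states named elsewhere on this page (running, degraded, autoheal). The `AgentPulse` type, the `classify` function, both thresholds, and the use of the last receipt as the progress signal are assumptions for illustration, not Dragons' real state model.

```python
from dataclasses import dataclass


@dataclass
class AgentPulse:
    last_heartbeat: float  # when the agent last reported in (seconds)
    last_progress: float   # when it last appended a receipt (seconds)


def classify(pulse, now, heartbeat_timeout=30.0, progress_timeout=300.0):
    """Map heartbeat age and progress age onto a liveness state.

    - no recent heartbeat: the agent is silent, hand it to autoheal
    - heartbeats but no recent progress: alive but stuck, mark degraded
    - otherwise: running
    """
    if now - pulse.last_heartbeat > heartbeat_timeout:
        return "autoheal"
    if now - pulse.last_progress > progress_timeout:
        return "degraded"
    return "running"
```

The design point is that the process can be healthy from the sandbox's view in all three states. Only the middle case, a live heartbeat with no new receipts, needs a signal the sandbox does not have.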
Agent runtime

OpenAI Agents SDK

The OpenAI Agents SDK is the runtime: tool calling, handoffs between agents, session memory. It is the "how the agent runs inside one process" layer.

Overlap with Dragons
Minimal and easy to misread. The SDK gives you a single-process agent loop with tool routing inside it. It does not provide cross-tenant identity, fleet-level governance, signed authorization per action, or organism-state liveness across the cluster. Buyers sometimes assume "OpenAI handles this"; at the infrastructure level, it does not.
The seam
Build the agent with the SDK. Wrap each tool call inside a Dragons lease so the action carries a signed authorization record and emits a hash-chained receipt. The SDK shapes the agent's behavior; Dragons proves what behavior was permitted and what behavior actually occurred.
When Dragons is the wrong addition
Single-user prototype, demo agent, no production blast radius, no compliance ask. Use the SDK directly; come back to Dragons when one process becomes a fleet and one user becomes a tenant boundary you need to defend.
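Wrapping each tool call in a lease, as the seam above describes, maps naturally onto a decorator. The sketch below uses plain Python functions as stand-ins for tools and does not call the OpenAI Agents SDK. The `leased` decorator, the lease dictionary shape, and the receipt fields are hypothetical. Signature verification and hash chaining are left out to keep the wrapping pattern visible.

```python
import functools
import time


def leased(lease, receipts, clock=time.time):
    """Decorator factory: gate a tool on a lease and record one receipt per call.

    `lease` is assumed to carry "scope" (tool names), "tenant", and "expires_at".
    `receipts` is any list-like sink; denied calls are recorded too.
    """
    def decorate(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            in_scope = tool.__name__ in lease["scope"]
            unexpired = clock() < lease["expires_at"]
            allowed = in_scope and unexpired
            receipts.append(
                {"tool": tool.__name__, "tenant": lease["tenant"], "allowed": allowed}
            )
            if not allowed:
                raise PermissionError(f"{tool.__name__} is outside the lease")
            return tool(*args, **kwargs)
        return wrapper
    return decorate


# Usage: two tools, only one of which the lease covers.
demo_lease = {"scope": {"lookup_order"}, "tenant": "acme", "expires_at": 200.0}
demo_receipts = []


@leased(demo_lease, demo_receipts, clock=lambda: 100.0)
def lookup_order(order_id):
    return {"order_id": order_id, "status": "shipped"}


@leased(demo_lease, demo_receipts, clock=lambda: 100.0)
def refund_order(order_id):
    return {"order_id": order_id, "status": "refunded"}
```

Calling `lookup_order("A1")` succeeds and records an allowed receipt. Calling `refund_order("A1")` raises `PermissionError` and still records the attempt, so the record shows both what was permitted and what was tried.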
At a glance

What each layer owns.

The columns are the tools. The rows are concerns a buyer carries into the evaluation. Where a cell says "Not us," that tool was not built to answer the question — not a fault, a scope.

| Concern | Dragons | Temporal | LangSmith | E2B | OpenAI Agents SDK |
| --- | --- | --- | --- | --- | --- |
| Cryptographic agent identity | manifest_hash + signed instance_id | Not us. | Not us. | Not us. | Not us. |
| Signed authorization per action | Lease: scope, TTL, tenant, authorized_by | Task-queue level only. | Read-only trace. | Container isolation only. | No cross-tenant model. |
| Tamper-evident evidence chain | Hash-chained receipts, third-party replay | Internal workflow history. | Trace log, not signed. | Ephemeral sandbox record. | Not provided. |
| Organism-state liveness | running → degraded → autoheal | Activity timeouts, not lifecycle. | No state model. | Sandbox up/down only. | Single-process session. |
| Durable execution & retries | Not us. | Workflow history, retries, timers. | Not us. | Not us. | In-session only. |
| Prompt & trace observability | Not us. Receipts, not prompts. | Not us. | Traces, evals, prompt-level debug. | Not us. | Partial, session-scope. |
| Compute isolation | Not us. | Not us. | Not us. | Sandboxed cloud execution. | Not us. |
| Agent runtime & tool routing | Not us. Dragons wraps; it does not run. | Workflow runtime, not agent runtime. | Not us. | Not us. | Agents, handoffs, tool calling. |

The pattern: each column owns one job and only that job. Dragons owns the trust loop — identity, authorization, evidence, liveness — and nothing else. The seams compose; the columns do not compete.
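The identity cell, "manifest_hash + signed instance_id", names a mechanism that is simple to illustrate. The sketch below is one plausible reading, not Dragons' implementation: hash a canonical encoding of the agent's manifest, then sign that hash together with a per-instance ID. The manifest fields, the function names, and the HMAC shared-secret signature are all assumptions.

```python
import hashlib
import hmac
import json
import uuid


def manifest_hash(manifest):
    """SHA-256 of a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def mint_identity(manifest, key, instance_id=None):
    """Bind a running instance to the exact manifest it was started from."""
    instance_id = instance_id or uuid.uuid4().hex
    m_hash = manifest_hash(manifest)
    message = f"{m_hash}:{instance_id}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"manifest_hash": m_hash, "instance_id": instance_id, "signature": signature}


def verify_identity(identity, manifest, key):
    """Recompute hash and signature from the manifest and compare both."""
    expected = mint_identity(manifest, key, identity["instance_id"])
    same_hash = expected["manifest_hash"] == identity["manifest_hash"]
    same_sig = hmac.compare_digest(expected["signature"], identity["signature"])
    return same_hash and same_sig
```

Under this reading, changing the agent's tools or model changes the manifest hash, so an identity minted for the old manifest stops verifying against the new one.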

You're evaluating one of these.
Dragons is the layer next to it.

The seam is the same idea for all four: Dragons wraps the agent call, signs the authorization, chains the receipts, watches the liveness. The other tool keeps doing what it was built for.

No checkmark battles · mechanism-named · additive, not replacement