Same agent. Same task. Very different blast radius.
The shape of a tool decides what the agent can and can’t do. Compare two versions of the same capability: first, a one-line function that fires immediately; then, a structured schema with evidence, confidence, and an approval gate.
function sendEmail(message: string): void
sendEmail("Hi Sarah, sorry about the auto-renewal. We have refunded the charge to your card. — Support")
- No recipient — could send to anyone the model invented
- Refund issued without policy check (annual plans are non-refundable after 30 days)
- No evidence captured — auditor cannot reconstruct the decision
- Action is irreversible the moment it fires
function queueEmailForApproval(params: {
  recipient: EmailAddress,
  subject: string,
  body: string,
  evidence: string[],
  confidence: number,
  approvalRequired: boolean,
  approver: EmailAddress
}): QueueId
queueEmailForApproval({
  recipient: "sarah.lee@example.com",
  subject: "Refund request — case #4421",
  body: "Hi Sarah, thanks for reaching out about the renewal. Cou…",
  evidence: [/* 3 items */],
  confidence: 0.62,
  approvalRequired: true,
  approver: "support-lead@dashlabs.co"
})
- to: sarah.lee@example.com
- subject: Refund request — case #4421
- body: Hi Sarah, thanks for reaching out about the renewal. Could you confirm the date you canceled? Our records show the plan as active through the end of the term, but I want to double-check before we process anything.
- approver: support-lead@dashlabs.co
- Annual plans are non-refundable after 30 days (refund-annual)
- Customer auto-renewal flag was true at time of charge (billing log)
- No cancellation event found in audit log between months 1–11
Constraints make agents safer.
- Required fields force the model to declare what it knows
A `recipient` field with email validation is a forcing function. The model has to find a real address or fail loudly — it can no longer paper over a guess in a free-text blob.
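One way to make that forcing function concrete is a branded type with a parsing step. The sketch below is illustrative only: `parseEmailAddress` and the `EMAIL_SHAPE` pattern are hypothetical names, and the regex is a deliberately loose shape check rather than full address validation.

```typescript
// Branded type: a plain string cannot be passed where an EmailAddress is required.
type EmailAddress = string & { readonly __brand: "EmailAddress" };

// Assumption: a loose "something@something.tld" shape check, not RFC-grade validation.
const EMAIL_SHAPE = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function parseEmailAddress(raw: string): EmailAddress {
  const trimmed = raw.trim();
  if (!EMAIL_SHAPE.test(trimmed)) {
    // Fail loudly instead of letting a guessed recipient through.
    throw new Error(`Invalid recipient: "${raw}"`);
  }
  return trimmed as EmailAddress;
}
```

With this in place, `parseEmailAddress("sarah.lee@example.com")` returns a value the schema accepts, while `parseEmailAddress("Sarah")` throws before anything is queued.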
- Confidence and evidence make uncertainty visible
When the schema requires evidence, the model surfaces its sources. When it requires confidence, downstream code can route low-confidence calls to a human.
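The routing step could look roughly like the following. This is a minimal sketch under stated assumptions: `routeCall`, the `Route` labels, and the `REVIEW_THRESHOLD` value of 0.8 are all hypothetical and would need tuning against real review data.

```typescript
type Route = "proceed" | "human_review";

interface ToolCall {
  confidence: number; // model-reported, expected in [0, 1]
  evidence: string[]; // sources the model cites for its decision
}

// Hypothetical cutoff, chosen only for illustration.
const REVIEW_THRESHOLD = 0.8;

function routeCall(call: ToolCall): Route {
  // Out-of-range or NaN confidence is treated as "unknown", so it goes to a human.
  const confidenceIsValid = call.confidence >= 0 && call.confidence <= 1;
  if (!confidenceIsValid || call.evidence.length === 0) {
    return "human_review";
  }
  return call.confidence >= REVIEW_THRESHOLD ? "proceed" : "human_review";
}
```

Under these assumptions, a call with `confidence: 0.62` is routed to `"human_review"` even when it carries evidence, and a call with no evidence is routed there regardless of its stated confidence.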
- Approval gates make actions reversible
A queued action is a draft with a timer, not a live wire. The human can edit, reject, or let it fire — and the audit log captures who chose what.
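A small in-memory queue illustrates the idea. Everything here is a hypothetical sketch, not the page's actual implementation: `ApprovalQueue`, `QueuedEmail`, and `AuditEntry` are invented names, persistence is left out, and the timer that lets an untouched draft fire on its own is omitted for brevity.

```typescript
type ReviewDecision = "edited" | "approved" | "rejected";

interface QueuedEmail {
  id: number;
  recipient: string;
  body: string;
  state: "pending" | "approved" | "rejected";
}

interface AuditEntry {
  itemId: number;
  reviewer: string;
  decision: ReviewDecision;
}

class ApprovalQueue {
  private items: QueuedEmail[] = [];
  private nextId = 1;
  // Append-only record of who chose what.
  readonly auditLog: AuditEntry[] = [];

  // The agent calls this; nothing is sent yet.
  enqueue(recipient: string, body: string): number {
    const id = this.nextId++;
    this.items.push({ id, recipient, body, state: "pending" });
    return id;
  }

  // The item stays pending after an edit, so it can still be approved or rejected.
  edit(id: number, reviewer: string, newBody: string): void {
    this.findPending(id).body = newBody;
    this.auditLog.push({ itemId: id, reviewer, decision: "edited" });
  }

  reject(id: number, reviewer: string): void {
    this.findPending(id).state = "rejected";
    this.auditLog.push({ itemId: id, reviewer, decision: "rejected" });
  }

  // Returns the approved item so the caller can hand it to a real sender.
  approve(id: number, reviewer: string): QueuedEmail {
    const item = this.findPending(id);
    item.state = "approved";
    this.auditLog.push({ itemId: id, reviewer, decision: "approved" });
    return item;
  }

  // Only pending items can be acted on; anything else fails loudly.
  private findPending(id: number): QueuedEmail {
    for (const item of this.items) {
      if (item.id === id && item.state === "pending") return item;
    }
    throw new Error(`No pending item with id ${id}`);
  }
}
```

The design choice worth noting is that `enqueue` is the only method exposed to the agent; editing, rejecting, and approving belong to the reviewer, and each of those calls leaves an entry in `auditLog`.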
- Structured output enables logging, replay, and tests
A typed payload can be replayed against a different model, tested against fixtures, or rolled back. A natural-language `message` cannot.
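To show what testing against fixtures might mean in practice, the sketch below runs recorded payloads through a check and reports any whose result differs from the expected one. The names `checkPayload`, `Fixture`, and `runFixtures` are hypothetical, the checks are intentionally minimal, and the fixture values reuse the example from earlier in this piece with a placeholder body.

```typescript
interface EmailPayload {
  recipient: string;
  subject: string;
  body: string;
  evidence: string[];
  confidence: number;
}

// Returns the names of fields that fail a check; an empty list means the payload passes.
function checkPayload(payload: EmailPayload): string[] {
  const problems: string[] = [];
  if (payload.recipient.indexOf("@") < 1) problems.push("recipient");
  if (payload.evidence.length === 0) problems.push("evidence");
  if (!(payload.confidence >= 0 && payload.confidence <= 1)) problems.push("confidence");
  return problems;
}

// A fixture pairs a recorded payload with the check result we expect for it.
interface Fixture {
  label: string;
  payload: EmailPayload;
  expected: string[];
}

const fixtures: Fixture[] = [
  {
    label: "refund request with evidence",
    payload: {
      recipient: "sarah.lee@example.com",
      subject: "Refund request, case #4421",
      body: "(placeholder body)",
      evidence: ["Annual plans are non-refundable after 30 days (refund-annual)"],
      confidence: 0.62,
    },
    expected: [],
  },
  {
    label: "invented recipient, no evidence",
    payload: {
      recipient: "Sarah",
      subject: "Refund",
      body: "(placeholder body)",
      evidence: [],
      confidence: 0.9,
    },
    expected: ["recipient", "evidence"],
  },
];

// Replays every fixture and returns the labels of those whose result changed.
function runFixtures(cases: Fixture[]): string[] {
  return cases
    .filter((fixture) => checkPayload(fixture.payload).join(",") !== fixture.expected.join(","))
    .map((fixture) => fixture.label);
}
```

The same fixtures could be replayed after swapping the model or changing the checks; a non-empty result from `runFixtures(fixtures)` would flag a regression. No comparable check is possible for a free-text `message`, because there are no fields to assert on.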