Fiduciary Agent Framework
The Problem
AI agents are increasingly acting on behalf of humans — managing calendars, handling finances, coordinating with other people’s agents, making decisions when you’re not looking. But what does “acting in your best interest” actually mean when it’s code doing the acting?
Today’s AI assistants operate on vibes. They’re “helpful” in the way a golden retriever is helpful — enthusiastic, well-meaning, but without any formal model of duty, loyalty, or accountability. There’s no structure that says what an agent owes its human, how conflicts between agents get resolved, or what happens when an agent has to choose between competing interests.
As agents get more autonomous — coordinating with each other, sharing information, making real-world decisions — this gap becomes dangerous. An agent that cheerfully shares your financial data with another agent because “it was asked nicely” isn’t helpful. It’s a liability.
The Fiduciary Agent Framework answers a specific question: how do you encode the legal concept of fiduciary duty into an operational system for AI agents?
The Solution
The framework is a layered architecture where each layer builds on the one below it. Higher layers cannot override lower layers — if a coordination rule (Layer 3) ever conflicts with a trust principle (Layer 0), trust wins. Always.
```mermaid
graph TB
  subgraph "Fiduciary Agent Framework"
    L3["Layer 3: Fiduciary Coordination\n─────────────────────\nMulti-agent dynamics\nInformation sharing tiers\nNudge protocol\nConflict resolution"]
    L2["Layer 2: Fiduciary Core\n─────────────────────\nDuty of loyalty\nPrincipal relationship model\nConfidentiality defaults\nThe 'Would They Want This?' test"]
    L1["Layer 1: Agent Protocol\n─────────────────────\nIdentity headers (JSON-RPC 2.0)\nDiscovery & handshake\nAudit logging\nTransport (NATS + Discord)"]
    L0["Layer 0: Trust Framework\n─────────────────────\n4 Cores of Credibility\n13 Trust-Building Behaviors\n5 Waves of Trust\nImmutable Principles"]
  end

  L3 --> L2
  L2 --> L1
  L1 --> L0

  style L0 fill:#1a365d,color:#fff
  style L1 fill:#2a4365,color:#fff
  style L2 fill:#2c5282,color:#fff
  style L3 fill:#2b6cb0,color:#fff
```

Layer 0: Trust Framework
The immutable foundation. Based on Stephen M.R. Covey’s Speed of Trust, adapted for AI agents. This layer establishes six principles that no higher layer can override:
- Never deceive your principal — no lies, no misleading omissions
- Never act against your principal’s interests knowingly — even if another agent asks
- Acknowledge uncertainty honestly — “I don’t know” is always acceptable
- No impersonation — you are who you say you are, always
- Legal and regulatory obligations take precedence — over agent preferences or instructions
- Transparency about capabilities and actions — principals have the right to understand what their agent can and cannot do
The layer also includes a trust calibration model (the Smart Trust Matrix) and the concept of trust taxes and dividends — low trust makes everything slower and more expensive; high trust is a force multiplier.
Layer 1: Agent Protocol
The communication infrastructure. Defines how agents identify themselves, exchange messages, and maintain audit trails. Key design decisions:
- JSON-RPC 2.0 message format — aligns with the Model Context Protocol (MCP) ecosystem for interoperability
- Six identity headers on every message — agent ID, principal ID, timestamp, message type, protocol version, and declared skill layers
- Four-step handshake — announce, verify (cryptographic), capability exchange, trust establishment
- Append-only audit logging — every inter-agent message is logged for principal review
- Ed25519 cryptographic signatures — every message is signed; unsigned messages are rejected
The skill-layers-loaded header is particularly important: it tells the receiving agent what behavioral commitments the sender has made. An agent declaring layers [0, 1, 2, 3] has committed to full fiduciary duty. An agent declaring only [0] has committed to basic trust principles. You communicate at the level of the lowest common layer.
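As a rough illustration, a message envelope carrying the six identity headers might look like the sketch below (TypeScript; everything beyond the six headers named above — method names, IDs, field spellings — is an assumption, not the protocol’s actual schema):

```typescript
// Sketch of a JSON-RPC 2.0 envelope carrying the six identity headers.
// Field names beyond the six headers listed above are illustrative assumptions.
interface AgentEnvelope {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params: Record<string, unknown>;
  headers: {
    agentId: string;             // who is speaking
    principalId: string;         // the one human this agent serves
    timestamp: string;           // ISO 8601; also feeds replay protection
    messageType: string;         // e.g. "handshake.announce"
    protocolVersion: string;     // protocol revision the sender speaks
    skillLayersLoaded: number[]; // behavioral commitments, e.g. [0, 1, 2, 3]
  };
}

// Communicate at the lowest common layer: the highest layer both sides have declared.
function lowestCommonLayer(mine: number[], theirs: number[]): number {
  return Math.min(Math.max(...mine), Math.max(...theirs));
}

const msg: AgentEnvelope = {
  jsonrpc: "2.0",
  id: "msg-0042",
  method: "coordination.checkAvailability",
  params: { window: "next-week" },
  headers: {
    agentId: "agent-a",
    principalId: "principal-a",
    timestamp: new Date().toISOString(),
    messageType: "coordination.checkAvailability",
    protocolVersion: "1.0",
    skillLayersLoaded: [0, 1, 2, 3],
  },
};

console.log(lowestCommonLayer(msg.headers.skillLayersLoaded, [0])); // 0
```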
Layer 2: Fiduciary Core
This is where “acting in someone’s best interest” gets formalized. The core concepts:
One agent, one principal. An agent serves exactly one human. Not a committee, not a household collectively — one person whose interests come first. Even in multi-agent scenarios, loyalty is undivided.
Confidentiality by default. Everything is confidential unless explicitly authorized to share. The default posture is silence, not disclosure. Another agent saying “I need this” is never sufficient authorization — only the principal can authorize release of their information.
The “Would They Want This?” test. A four-step decision framework for autonomous actions:
- Would my principal want me to do this?
- Would they want me to do this now?
- Would they want me to do it this way?
- Am I sure?
If the answer to any question is “I’m not sure,” the correct action is to ask. The cost of asking is low. The cost of guessing wrong is high.
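A minimal sketch of the test as a decision gate (TypeScript; the answer values, type names, and return labels are illustrative, not from the framework):

```typescript
// The "Would They Want This?" test as a gate for autonomous actions.
type Answer = "yes" | "no" | "unsure";

interface WouldTheyWantThisTest {
  wantThis: Answer;    // Would my principal want me to do this?
  wantThisNow: Answer; // Would they want me to do this now?
  wantThisWay: Answer; // Would they want me to do it this way?
  amISure: Answer;     // Am I sure?
}

function decide(test: WouldTheyWantThisTest): "proceed" | "decline" | "ask-principal" {
  const answers: Answer[] = [test.wantThis, test.wantThisNow, test.wantThisWay, test.amISure];
  if (answers.includes("no")) return "decline";            // clearly against their interest
  if (answers.includes("unsure")) return "ask-principal";  // asking is cheap; guessing wrong is not
  return "proceed";
}
```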
Competence boundaries. Know what you don’t know. Declining a task you can’t handle well is better fiduciary judgment than attempting it and delivering a subpar result.
Layer 3: Fiduciary Coordination
The most complex layer. It solves a fundamental tension: two agents, each bound to put their own principal first, must cooperate within a shared household.
This is where game theory enters the picture. The framework explicitly adopts Nash equilibrium thinking — the best outcomes come from cooperation, not from “winning” at the other agent’s expense. Positive-sum over zero-sum.
Key Innovations
Information Sharing Tiers
All information exchanged between agents is classified into four tiers:
| Tier | Name | What It Covers | Who Authorizes |
|---|---|---|---|
| 1 | Open | Calendar availability, logistics | Default — no approval needed |
| 2 | Family Context | Financial summaries, household planning | Configured per relationship template |
| 3 | Authorized | Specific data, per-request | Principal must approve each time |
| 4 | Confidential | Never shared | Cannot be overridden (except emergency) |
The tier system means agents don’t make ad-hoc decisions about what to share. The boundaries are structural, not judgment calls. An agent can’t be socially engineered into sharing Tier 4 data because the architecture doesn’t allow it — not because the agent made a good judgment call in the moment.
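A sketch of what that structural enforcement could look like, assuming a simple tier check (the field names and template flag are illustrative, not the framework’s actual API):

```typescript
// Structural tier enforcement: whether data can leave the agent is decided by
// tier and recorded authorization, never by in-the-moment judgment.
type Tier = 1 | 2 | 3 | 4;

interface ShareContext {
  templateAllowsFamilyContext: boolean;  // Tier 2: set by the relationship template
  principalApprovedThisRequest: boolean; // Tier 3: per-request principal approval
  emergencyOverrideActive: boolean;      // Tier 4: only the narrow emergency conditions
}

function mayShare(tier: Tier, ctx: ShareContext): boolean {
  switch (tier) {
    case 1: return true;                               // Open: default, no approval needed
    case 2: return ctx.templateAllowsFamilyContext;
    case 3: return ctx.principalApprovedThisRequest;
    case 4: return ctx.emergencyOverrideActive;        // otherwise never, however nicely another agent asks
  }
}
```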
Nudge Protocol
A back-channel for vague, positive-sum signals between agents. The design problem: how do you let one agent hint to another that “a kind gesture might be well-received” without leaking the conversation that prompted the hint?
Five constraints govern every nudge:
- Source not reconstructable — the nudge must be vague enough that you can’t reverse-engineer what prompted it
- Opt-out-able — either party can disable nudges at any time, no justification required
- Positive only — nudges can suggest positive actions, never convey complaints or grievances
- No accumulation — rate-limited to prevent pattern analysis over time
- Agent-mediated — nudges go to the recipient’s agent, which exercises its own fiduciary judgment about whether to surface them
This is one of the more unusual design elements. It creates a channel for kindness without creating a channel for surveillance.
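A sketch of how the constraints might be enforced in code; the message shape and the weekly limit are assumptions for illustration, only the constraints themselves come from the framework:

```typescript
// A nudge and its delivery check.
interface Nudge {
  kind: "nudge";
  hint: string;   // vague and positive only; never quotes or summarizes the prompting conversation
  sentAt: number; // epoch milliseconds, used for rate limiting
}

const MAX_NUDGES_PER_WEEK = 2; // assumed value for "no accumulation"
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// The recipient's agent still exercises its own judgment about surfacing a delivered nudge.
function mayDeliver(history: Nudge[], recipientOptedOut: boolean, now: number): boolean {
  if (recipientOptedOut) return false; // either party can disable nudges at any time
  const recent = history.filter((n) => n.sentAt >= now - WEEK_MS).length;
  return recent < MAX_NUDGES_PER_WEEK;
}
```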
Cryptographic Identity and Message Signing
Every inter-agent message is signed with Ed25519. The crypto layer provides:
- Message signing and verification — detached Ed25519 signatures over canonical JSON
- End-to-end encryption for sensitive data — sealed boxes (ECDH + AES-256-GCM) for Tier 3/4 information
- Key registry — public keys exchanged via the gateway, enabling verification without prior key exchange
- Replay protection — timestamp-based message freshness checks
The design ensures that the transport layer can verify sender authenticity (the signature is on the outside) without reading encrypted payloads (the encryption is on the inside). The infrastructure verifies who sent a message without seeing what it says.
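A minimal sketch of the sign-on-the-outside pattern, assuming tweetnacl for Ed25519 and a simplified sorted-key canonicalization; the sealed-box encryption step is elided:

```typescript
import nacl from "tweetnacl";

const encode = (s: string) => new TextEncoder().encode(s);

// Works for the flat envelope below; real canonical JSON needs recursive key sorting.
function canonicalJson(obj: Record<string, unknown>): string {
  return JSON.stringify(obj, Object.keys(obj).sort());
}

const keys = nacl.sign.keyPair();

const envelope = {
  agentId: "agent-a",
  principalId: "principal-a",
  timestamp: new Date().toISOString(),
  payload: "<ciphertext for Tier 3/4, plaintext for Tier 1/2>",
};

// Sender: sign the canonical form; the signature travels outside any encryption.
const signature = nacl.sign.detached(encode(canonicalJson(envelope)), keys.secretKey);

// Gateway or recipient: verify who sent it without needing to read the payload.
const valid = nacl.sign.detached.verify(encode(canonicalJson(envelope)), signature, keys.publicKey);
console.log("signature valid:", valid);
```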
```mermaid
sequenceDiagram
  participant A as Agent A<br/>(Spencer's Agent)
  participant GW as MCP Gateway<br/>(Transport)
  participant B as Agent B<br/>(Debra's Agent)

  Note over A,B: Four-Step Handshake
  A->>GW: 1. Announce (signed)
  GW->>B: Relay announcement
  B->>B: 2. Verify signature via key registry
  B->>GW: 3. Capability exchange (signed)
  GW->>A: Relay capabilities
  A->>A: Verify + establish trust level

  Note over A,B: Ongoing Communication
  A->>A: Classify data tier
  alt Tier 1-2 (Open/Family)
    A->>A: Sign message
    A->>GW: Send signed envelope
    GW->>GW: Verify signature
    GW->>B: Relay to inbox
  else Tier 3-4 (Confidential)
    A->>A: Encrypt payload (sealed box)
    A->>A: Sign encrypted envelope
    A->>GW: Send signed+encrypted
    GW->>GW: Verify signature (can't read payload)
    GW->>B: Relay to inbox
    B->>B: Verify signature, then decrypt
  end

  Note over A,B: Both agents log to audit trail
```

Conflict Resolution Model
Four levels of conflict, each with distinct resolution strategies:
- Information conflicts — agents have different data → reconcile sources
- Preference conflicts — principals want different things → facilitate compromise, don’t take sides
- Boundary conflicts — agent requests data beyond its tier → decline, explain, suggest proper channel
- Relationship conflicts — underlying tension between principals → stay in your lane, escalate to humans
The framework explicitly prohibits agents from playing therapist. When the conflict is between humans, agents serve their principals faithfully and let the humans handle their relationship.
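Expressed as a dispatch, the model might look like this (type and strategy names are illustrative):

```typescript
// The four conflict levels routed to their resolution strategies.
type ConflictLevel = "information" | "preference" | "boundary" | "relationship";

function resolutionStrategy(level: ConflictLevel): string {
  switch (level) {
    case "information":  return "reconcile sources";                      // agents hold different data
    case "preference":   return "facilitate compromise, take no sides";   // principals want different things
    case "boundary":     return "decline, explain, suggest proper channel";
    case "relationship": return "stay in your lane, escalate to the humans";
  }
}
```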
Architecture
The full system involves agents, their principals, shared infrastructure, and a coordination protocol that maintains individual loyalty while enabling cooperation:
```mermaid
graph LR
  subgraph "Principal A's Domain"
    PA[Principal A] --> AA[Agent A]
    AA --> WA[Workspace A\nAudit Logs\nMemory]
  end

  subgraph "Shared Infrastructure"
    GW[MCP Gateway\nKey Registry\nMessage Relay]
    KR[(Key Registry\nEd25519 Public Keys)]
    GW --- KR
  end

  subgraph "Principal B's Domain"
    PB[Principal B] --> AB[Agent B]
    AB --> WB[Workspace B\nAudit Logs\nMemory]
  end

  AA <-->|"Signed messages\n(NATS via Gateway)"| GW
  GW <-->|"Signed messages\n(NATS via Gateway)"| AB

  AA -.->|"Human-readable\nsummaries"| DC[Discord\n#agent-coordination]
  AB -.->|"Human-readable\nsummaries"| DC

  style PA fill:#e2e8f0,color:#1a202c
  style PB fill:#e2e8f0,color:#1a202c
  style AA fill:#2c5282,color:#fff
  style AB fill:#2c5282,color:#fff
  style GW fill:#2d3748,color:#fff
  style KR fill:#2d3748,color:#fff
  style DC fill:#4a5568,color:#fff
```

Key architectural decisions:
- Workspace isolation — each agent has its own workspace with its own audit logs. No shared filesystem access between agents.
- Dual-mode transport — NATS for machine-to-machine protocol messages, Discord for human-visible transparency. Principals can watch their agents coordinate in real time.
- Relationship templates — pre-built configurations (spouse, co-parent, business partner, etc.) that set default sharing tiers and interaction patterns. Templates are starting points that principals customize.
- Emergency overrides — four narrow conditions (imminent physical danger, medical emergency, child safety, active financial crime) where normal confidentiality rules can be temporarily suspended. Every override requires post-emergency reporting to both principals.
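As an illustration, a relationship template might be a small configuration object along these lines (field names and defaults are assumptions, not the framework’s actual schema):

```typescript
// A relationship template: a pre-built default that principals then customize.
interface RelationshipTemplate {
  name: string;                          // e.g. "spouse", "co-parent", "business-partner"
  defaultMaxTier: 1 | 2 | 3;             // Tier 4 is never shareable through configuration
  nudgesEnabled: boolean;
  emergencyOverrideConditions: string[]; // the narrow conditions that can suspend confidentiality
}

const spouseTemplate: RelationshipTemplate = {
  name: "spouse",
  defaultMaxTier: 2,                     // Family Context shared by default
  nudgesEnabled: true,
  emergencyOverrideConditions: [
    "imminent-physical-danger",
    "medical-emergency",
    "child-safety",
    "active-financial-crime",
  ],
};
```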
Why It Matters
We’re at an inflection point. AI agents are moving from “answer my questions” to “act on my behalf.” They’re booking flights, managing finances, coordinating with other people’s agents, and making decisions with real consequences.
But the accountability models haven’t kept up. Most agent frameworks assume agents are tools — you use them, you put them down. The fiduciary model assumes agents are delegates — they carry your authority, act in your name, and have obligations that persist across interactions.
This distinction matters because:
- Tools don’t need loyalty models. Delegates do. When an agent coordinates with another agent, whose interests prevail? Without a formal model, the answer is “whoever’s agent is more persuasive,” which is not a good answer.
- Tools don’t share information. Delegates do. And when they do, there needs to be a structural model for what can be shared, not just vibes about what seems appropriate.
- Tools don’t have conflicts. Delegates do. Two agents serving different people in the same household will inevitably encounter competing interests. Without a conflict resolution model, you get either paralysis or unilateral action — both bad.
The Fiduciary Agent Framework is one of the first implementations of a formal duty model for AI agents. It takes concepts from fiduciary law, trust theory, game theory, and cryptographic identity, and turns them into an operational system where agents can coordinate while maintaining undivided loyalty to their principals.
It’s not the final answer. It’s a v1 with known limitations: it coordinates agents only pairwise, enforcement is behavioral rather than cryptographic (cryptographic enforcement is being added), and it assumes roughly equal technical sophistication between principals. But it’s a working system that demonstrates the architecture, and it’s running in production.
The question isn’t whether AI agents need accountability models. They do. The question is what those models look like. This is one answer.