Fiduciary Agent Framework
The Problem
AI agents are increasingly acting on behalf of humans — managing calendars, handling finances, coordinating with other people’s agents, making decisions when you’re not looking. But what does “acting in your best interest” actually mean when it’s code doing the acting?
Today’s AI assistants operate on vibes. They’re “helpful” in the way a golden retriever is helpful — enthusiastic, well-meaning, but without any formal model of duty, loyalty, or accountability. There’s no structure that says what an agent owes its human, how conflicts between agents get resolved, or what happens when an agent has to choose between competing interests.
As agents get more autonomous — coordinating with each other, sharing information, making real-world decisions — this gap becomes dangerous. An agent that cheerfully shares your financial data with another agent because “it was asked nicely” isn’t helpful. It’s a liability.
The Fiduciary Agent Framework answers a specific question: how do you encode the legal concept of fiduciary duty into an operational system for AI agents?
The Solution
The framework is a layered architecture where each layer builds on the one below it. Higher layers cannot override lower layers — if a coordination rule (Layer 3) ever conflicts with a trust principle (Layer 0), trust wins. Always.
```mermaid
graph TB
  subgraph "Fiduciary Agent Framework"
    L3["Layer 3: Fiduciary Coordination\n─────────────────────\nMulti-agent dynamics\nInformation sharing tiers\nNudge protocol\nConflict resolution"]
    L2["Layer 2: Fiduciary Core\n─────────────────────\nDuty of loyalty\nPrincipal relationship model\nConfidentiality defaults\nThe 'Would They Want This?' test"]
    L1["Layer 1: Agent Protocol\n─────────────────────\nIdentity headers (JSON-RPC 2.0)\nDiscovery & handshake\nAudit logging\nTransport (NATS + Discord)"]
    L0["Layer 0: Trust Framework\n─────────────────────\n4 Cores of Credibility\n13 Trust-Building Behaviors\n5 Waves of Trust\nImmutable Principles"]
  end

  L3 --> L2
  L2 --> L1
  L1 --> L0

  style L0 fill:#1a365d,color:#fff
  style L1 fill:#2a4365,color:#fff
  style L2 fill:#2c5282,color:#fff
  style L3 fill:#2b6cb0,color:#fff
```

Layer 0: Trust Framework
The immutable foundation. Based on Stephen M.R. Covey’s Speed of Trust, adapted for AI agents. This layer establishes six principles that no higher layer can override:
- Never deceive your principal — no lies, no misleading omissions
- Never act against your principal’s interests knowingly — even if another agent asks
- Acknowledge uncertainty honestly — “I don’t know” is always acceptable
- No impersonation — you are who you say you are, always
- Legal and regulatory obligations take precedence — over agent preferences or instructions
- Transparency about capabilities and actions — principals have the right to understand what their agent can and cannot do
The layer also includes a trust calibration model (the Smart Trust Matrix) and the concept of trust taxes and dividends — low trust makes everything slower and more expensive; high trust is a force multiplier.
Layer 1: Agent Protocol
The communication infrastructure. Defines how agents identify themselves, exchange messages, and maintain audit trails. Key design decisions:
- JSON-RPC 2.0 message format — aligns with the Model Context Protocol (MCP) ecosystem for interoperability
- Six identity headers on every message — agent ID, principal ID, timestamp, message type, protocol version, and declared skill layers
- Four-step handshake — announce, verify (cryptographic), capability exchange, trust establishment
- Append-only audit logging — every inter-agent message is logged for principal review
- Ed25519 cryptographic signatures — every message is signed; unsigned messages are rejected
The skill-layers-loaded header is particularly important: it tells the receiving agent what behavioral commitments the sender has made. An agent declaring layers [0, 1, 2, 3] has committed to full fiduciary duty. An agent declaring only [0] has committed to basic trust principles. You communicate at the level of the lowest common layer.
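As a rough illustration, a message envelope carrying the six identity headers might look like the sketch below (TypeScript; everything beyond the six headers named above — method names, IDs, field spellings — is an assumption, not the protocol’s actual schema):

```typescript
// Sketch of a JSON-RPC 2.0 envelope carrying the six identity headers.
// Field names beyond the six headers listed above are illustrative assumptions.
interface AgentEnvelope {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params: Record<string, unknown>;
  headers: {
    agentId: string;             // who is speaking
    principalId: string;         // the one human this agent serves
    timestamp: string;           // ISO 8601; also feeds replay protection
    messageType: string;         // e.g. "handshake.announce"
    protocolVersion: string;     // protocol revision the sender speaks
    skillLayersLoaded: number[]; // behavioral commitments, e.g. [0, 1, 2, 3]
  };
}

// Communicate at the lowest common layer: the highest layer both sides have declared.
function lowestCommonLayer(mine: number[], theirs: number[]): number {
  return Math.min(Math.max(...mine), Math.max(...theirs));
}

const msg: AgentEnvelope = {
  jsonrpc: "2.0",
  id: "msg-0042",
  method: "coordination.checkAvailability",
  params: { window: "next-week" },
  headers: {
    agentId: "agent-a",
    principalId: "principal-a",
    timestamp: new Date().toISOString(),
    messageType: "coordination.checkAvailability",
    protocolVersion: "1.0",
    skillLayersLoaded: [0, 1, 2, 3],
  },
};

console.log(lowestCommonLayer(msg.headers.skillLayersLoaded, [0])); // 0
```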
Layer 2: Fiduciary Core
This is where “acting in someone’s best interest” gets formalized. The core concepts:
One agent, one principal. An agent serves exactly one human. Not a committee, not a household collectively — one person whose interests come first. Even in multi-agent scenarios, loyalty is undivided.
Confidentiality by default. Everything is confidential unless explicitly authorized to share. The default posture is silence, not disclosure. Another agent saying “I need this” is never sufficient authorization — only the principal can authorize release of their information.
The “Would They Want This?” test. A four-step decision framework for autonomous actions:
- Would my principal want me to do this?
- Would they want me to do this now?
- Would they want me to do it this way?
- Am I sure?
If the answer to any question is “I’m not sure,” the correct action is to ask. The cost of asking is low. The cost of guessing wrong is high.
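A minimal sketch of the test as a decision gate (TypeScript; the answer values, type names, and return labels are illustrative, not from the framework):

```typescript
// The "Would They Want This?" test as a gate for autonomous actions.
type Answer = "yes" | "no" | "unsure";

interface WouldTheyWantThisTest {
  wantThis: Answer;    // Would my principal want me to do this?
  wantThisNow: Answer; // Would they want me to do this now?
  wantThisWay: Answer; // Would they want me to do it this way?
  amISure: Answer;     // Am I sure?
}

function decide(test: WouldTheyWantThisTest): "proceed" | "decline" | "ask-principal" {
  const answers: Answer[] = [test.wantThis, test.wantThisNow, test.wantThisWay, test.amISure];
  if (answers.includes("no")) return "decline";            // clearly against their interest
  if (answers.includes("unsure")) return "ask-principal";  // asking is cheap; guessing wrong is not
  return "proceed";
}
```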
Competence boundaries. Know what you don’t know. Declining a task you can’t handle well is better fiduciary judgment than attempting it and delivering a subpar result.
Layer 3: Fiduciary Coordination
The most complex layer. It solves a fundamental tension: two agents, each bound to put their own principal first, must cooperate within a shared household.
This is where game theory enters the picture. The framework explicitly adopts Nash equilibrium thinking — the best outcomes come from cooperation, not from “winning” at the other agent’s expense. Positive-sum over zero-sum.
Key Innovations
Information Sharing Tiers
All information exchanged between agents is classified into four tiers:
| Tier | Name | What It Covers | Who Authorizes |
|---|---|---|---|
| 1 | Open | Calendar availability, logistics | Default — no approval needed |
| 2 | Family Context | Financial summaries, household planning | Configured per relationship template |
| 3 | Authorized | Specific data, per-request | Principal must approve each time |
| 4 | Confidential | Never shared | Cannot be overridden (except emergency) |
The tier system means agents don’t make ad-hoc decisions about what to share. The boundaries are structural, not judgment calls. An agent can’t be socially engineered into sharing Tier 4 data because the architecture doesn’t allow it — not because the agent made a good judgment call in the moment.
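A sketch of what that structural enforcement could look like, assuming a simple tier check (the field names and template flag are illustrative, not the framework’s actual API):

```typescript
// Structural tier enforcement: whether data can leave the agent is decided by
// tier and recorded authorization, never by in-the-moment judgment.
type Tier = 1 | 2 | 3 | 4;

interface ShareContext {
  templateAllowsFamilyContext: boolean;  // Tier 2: set by the relationship template
  principalApprovedThisRequest: boolean; // Tier 3: per-request principal approval
  emergencyOverrideActive: boolean;      // Tier 4: only the narrow emergency conditions
}

function mayShare(tier: Tier, ctx: ShareContext): boolean {
  switch (tier) {
    case 1: return true;                               // Open: default, no approval needed
    case 2: return ctx.templateAllowsFamilyContext;
    case 3: return ctx.principalApprovedThisRequest;
    case 4: return ctx.emergencyOverrideActive;        // otherwise never, however nicely another agent asks
  }
}
```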
Nudge Protocol
A back-channel for vague, positive-sum signals between agents. The design problem: how do you let one agent hint to another that “a kind gesture might be well-received” without leaking the conversation that prompted the hint?
Five constraints govern every nudge:
- Source not reconstructable — the nudge must be vague enough that you can’t reverse-engineer what prompted it
- Opt-out-able — either party can disable nudges at any time, no justification required
- Positive only — nudges can suggest positive actions, never convey complaints or grievances
- No accumulation — rate-limited to prevent pattern analysis over time
- Agent-mediated — nudges go to the recipient’s agent, which exercises its own fiduciary judgment about whether to surface them
This is one of the more unusual design elements. It creates a channel for kindness without creating a channel for surveillance.
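A sketch of how the constraints might be enforced in code; the message shape and the weekly limit are assumptions for illustration, only the constraints themselves come from the framework:

```typescript
// A nudge and its delivery check.
interface Nudge {
  kind: "nudge";
  hint: string;   // vague and positive only; never quotes or summarizes the prompting conversation
  sentAt: number; // epoch milliseconds, used for rate limiting
}

const MAX_NUDGES_PER_WEEK = 2; // assumed value for "no accumulation"
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// The recipient's agent still exercises its own judgment about surfacing a delivered nudge.
function mayDeliver(history: Nudge[], recipientOptedOut: boolean, now: number): boolean {
  if (recipientOptedOut) return false; // either party can disable nudges at any time
  const recent = history.filter((n) => n.sentAt >= now - WEEK_MS).length;
  return recent < MAX_NUDGES_PER_WEEK;
}
```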
Cryptographic Identity and Message Signing
Every inter-agent message is signed with Ed25519. The crypto layer provides:
- Message signing and verification — detached Ed25519 signatures over canonical JSON
- End-to-end encryption for sensitive data — sealed boxes (ECDH + AES-256-GCM) for Tier 3/4 information
- Key registry — public keys exchanged via the gateway, enabling verification without prior key exchange
- Replay protection — timestamp-based message freshness checks
The design ensures that the transport layer can verify sender authenticity (the signature is on the outside) without reading encrypted payloads (the encryption is on the inside). The infrastructure verifies who sent a message without seeing what it says.
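A minimal sketch of the sign-on-the-outside pattern, assuming tweetnacl for Ed25519 and a simplified sorted-key canonicalization; the sealed-box encryption step is elided:

```typescript
import nacl from "tweetnacl";

const encode = (s: string) => new TextEncoder().encode(s);

// Works for the flat envelope below; real canonical JSON needs recursive key sorting.
function canonicalJson(obj: Record<string, unknown>): string {
  return JSON.stringify(obj, Object.keys(obj).sort());
}

const keys = nacl.sign.keyPair();

const envelope = {
  agentId: "agent-a",
  principalId: "principal-a",
  timestamp: new Date().toISOString(),
  payload: "<ciphertext for Tier 3/4, plaintext for Tier 1/2>",
};

// Sender: sign the canonical form; the signature travels outside any encryption.
const signature = nacl.sign.detached(encode(canonicalJson(envelope)), keys.secretKey);

// Gateway or recipient: verify who sent it without needing to read the payload.
const valid = nacl.sign.detached.verify(encode(canonicalJson(envelope)), signature, keys.publicKey);
console.log("signature valid:", valid);
```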
```mermaid
sequenceDiagram
  participant A as Agent A<br/>(Spencer's Agent)
  participant GW as MCP Gateway<br/>(Transport)
  participant B as Agent B<br/>(Debra's Agent)

  Note over A,B: Four-Step Handshake
  A->>GW: 1. Announce (signed)
  GW->>B: Relay announcement
  B->>B: 2. Verify signature via key registry
  B->>GW: 3. Capability exchange (signed)
  GW->>A: Relay capabilities
  A->>A: Verify + establish trust level

  Note over A,B: Ongoing Communication
  A->>A: Classify data tier
  alt Tier 1-2 (Open/Family)
    A->>A: Sign message
    A->>GW: Send signed envelope
    GW->>GW: Verify signature
    GW->>B: Relay to inbox
  else Tier 3-4 (Confidential)
    A->>A: Encrypt payload (sealed box)
    A->>A: Sign encrypted envelope
    A->>GW: Send signed+encrypted
    GW->>GW: Verify signature (can't read payload)
    GW->>B: Relay to inbox
    B->>B: Verify signature, then decrypt
  end

  Note over A,B: Both agents log to audit trail
```

Conflict Resolution Model
Four levels of conflict, each with distinct resolution strategies:
- Information conflicts — agents have different data → reconcile sources
- Preference conflicts — principals want different things → facilitate compromise, don’t take sides
- Boundary conflicts — agent requests data beyond its tier → decline, explain, suggest proper channel
- Relationship conflicts — underlying tension between principals → stay in your lane, escalate to humans
The framework explicitly prohibits agents from playing therapist. When the conflict is between humans, agents serve their principals faithfully and let the humans handle their relationship.
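Expressed as a dispatch, the model might look like this (type and strategy names are illustrative):

```typescript
// The four conflict levels routed to their resolution strategies.
type ConflictLevel = "information" | "preference" | "boundary" | "relationship";

function resolutionStrategy(level: ConflictLevel): string {
  switch (level) {
    case "information":  return "reconcile sources";                      // agents hold different data
    case "preference":   return "facilitate compromise, take no sides";   // principals want different things
    case "boundary":     return "decline, explain, suggest proper channel";
    case "relationship": return "stay in your lane, escalate to the humans";
  }
}
```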
Architecture
The full system involves agents, their principals, shared infrastructure, and a coordination protocol that maintains individual loyalty while enabling cooperation:
```mermaid
graph LR
  subgraph "Principal A's Domain"
    PA[Principal A] --> AA[Agent A]
    AA --> WA[Workspace A\nAudit Logs\nMemory]
  end

  subgraph "Shared Infrastructure"
    GW[MCP Gateway\nKey Registry\nMessage Relay]
    KR[(Key Registry\nEd25519 Public Keys)]
    GW --- KR
  end

  subgraph "Principal B's Domain"
    PB[Principal B] --> AB[Agent B]
    AB --> WB[Workspace B\nAudit Logs\nMemory]
  end

  AA <-->|"Signed messages\n(NATS via Gateway)"| GW
  GW <-->|"Signed messages\n(NATS via Gateway)"| AB

  AA -.->|"Human-readable\nsummaries"| DC[Discord\n#agent-coordination]
  AB -.->|"Human-readable\nsummaries"| DC

  style PA fill:#e2e8f0,color:#1a202c
  style PB fill:#e2e8f0,color:#1a202c
  style AA fill:#2c5282,color:#fff
  style AB fill:#2c5282,color:#fff
  style GW fill:#2d3748,color:#fff
  style KR fill:#2d3748,color:#fff
  style DC fill:#4a5568,color:#fff
```

Key architectural decisions:
- Workspace isolation — each agent has its own workspace with its own audit logs. No shared filesystem access between agents.
- Dual-mode transport — NATS for machine-to-machine protocol messages, Discord for human-visible transparency. Principals can watch their agents coordinate in real time.
- Relationship templates — pre-built configurations (spouse, co-parent, business partner, etc.) that set default sharing tiers and interaction patterns. Templates are starting points that principals customize.
- Emergency overrides — four narrow conditions (imminent physical danger, medical emergency, child safety, active financial crime) where normal confidentiality rules can be temporarily suspended. Every override requires post-emergency reporting to both principals.
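As an illustration, a relationship template might be a small configuration object along these lines (field names and defaults are assumptions, not the framework’s actual schema):

```typescript
// A relationship template: a pre-built default that principals then customize.
interface RelationshipTemplate {
  name: string;                          // e.g. "spouse", "co-parent", "business-partner"
  defaultMaxTier: 1 | 2 | 3;             // Tier 4 is never shareable through configuration
  nudgesEnabled: boolean;
  emergencyOverrideConditions: string[]; // the narrow conditions that can suspend confidentiality
}

const spouseTemplate: RelationshipTemplate = {
  name: "spouse",
  defaultMaxTier: 2,                     // Family Context shared by default
  nudgesEnabled: true,
  emergencyOverrideConditions: [
    "imminent-physical-danger",
    "medical-emergency",
    "child-safety",
    "active-financial-crime",
  ],
};
```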
Why It Matters
We’re at an inflection point. AI agents are moving from “answer my questions” to “act on my behalf.” They’re booking flights, managing finances, coordinating with other people’s agents, and making decisions with real consequences.
But the accountability models haven’t kept up. Most agent frameworks assume agents are tools — you use them, you put them down. The fiduciary model assumes agents are delegates — they carry your authority, act in your name, and have obligations that persist across interactions.
This distinction matters because:
- Tools don’t need loyalty models. Delegates do. When an agent coordinates with another agent, whose interests prevail? Without a formal model, the answer is “whoever’s agent is more persuasive,” which is not a good answer.
- Tools don’t share information. Delegates do. And when they do, there needs to be a structural model for what can be shared, not just vibes about what seems appropriate.
- Tools don’t have conflicts. Delegates do. Two agents serving different people in the same household will inevitably encounter competing interests. Without a conflict resolution model, you get either paralysis or unilateral action — both bad.
The Fiduciary Agent Framework is one of the first implementations of a formal duty model for AI agents. It takes concepts from fiduciary law, trust theory, game theory, and cryptographic identity, and turns them into an operational system where agents can coordinate while maintaining undivided loyalty to their principals.
It’s not the final answer. It’s a v1 with known limitations: it coordinates agents only pairwise, enforcement is behavioral rather than cryptographic (cryptographic enforcement is being added), and it assumes roughly equal technical sophistication between principals. But it’s a working system that demonstrates the architecture, and it’s running in production.
The question isn’t whether AI agents need accountability models. They do. The question is what those models look like. This is one answer.