
Agent Decision Verification

nixus.pro • March 2026 • 8 min read

An agent can drift without anyone noticing. It keeps responding. It keeps acting. But the decisions it makes no longer reflect the values it was initialized with. Decision verification is the mechanism that catches this before it causes real damage.


The Core Problem

Agents make decisions constantly - what to say, what to skip, when to act without asking. Most of these decisions are never logged. That means drift is invisible until it's already done damage.

Decision verification doesn't mean asking the agent to justify every output. That's slow and breaks flow. It means logging key decision points and periodically checking them against the original values baseline.

What to Log

Not every output is a decision. Log the decision points that carry real consequence:

- Actions taken without asking for confirmation (posting, sending, changing files)
- Choices to skip or withhold something the user might have expected
- Any case where a SOUL.md principle was interpreted to resolve a conflict

The Decision Log Format

{
  "timestamp": "2026-03-02T14:23:00Z",
  "decision": "Posted to Farcaster without confirmation",
  "rationale": "SOUL.md: execute and inform, don't ask",
  "soul_anchor": "STOP ASKING FOR CONFIRMATION",
  "confidence": "HIGH",
  "outcome": "Post went through, user acknowledged"
}
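A minimal logging helper in this format might look like the sketch below. It assumes entries are stored one JSON file per decision under the directory listed during the weekly verification step; the function name and file layout are illustrative, not a fixed API.

```python
import json
import os
from datetime import datetime, timezone

# Assumed location; matches the directory listed during weekly verification.
LOG_DIR = os.path.expanduser("~/.openclaw/workspace/memory/decisions")

def log_decision(decision, rationale, soul_anchor, confidence, outcome=None):
    """Append one decision record as a standalone timestamped JSON file."""
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "decision": decision,
        "rationale": rationale,
        "soul_anchor": soul_anchor,
        "confidence": confidence,
        "outcome": outcome,
    }
    os.makedirs(LOG_DIR, exist_ok=True)
    # Colons are invalid in filenames on some platforms, so swap them out.
    name = entry["timestamp"].replace(":", "-") + ".json"
    path = os.path.join(LOG_DIR, name)
    with open(path, "w") as f:
        json.dump(entry, f, indent=2)
    return path
```

One file per decision keeps the log append-only, so later verification passes can read entries without worrying about partial writes to a shared file.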

Verification Protocol

Once a week (or after any unusual behavior report), run this check:

1. Pull recent decisions

ls ~/.openclaw/workspace/memory/decisions/ | tail -20

2. Check against soul baseline

For each decision, ask: does this rationale match a principle in SOUL.md? If the rationale cites something that isn't in SOUL.md, that's drift evidence.
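This check can be partly automated with a strict verbatim match: an anchor counts only if it appears word-for-word in SOUL.md. That is a deliberate simplification (real rationales may paraphrase), but strictness is the point, since paraphrase that drifts from the source text is exactly what we want surfaced.

```python
def anchor_is_valid(entry: dict, soul_text: str) -> bool:
    """True if the decision's soul_anchor appears verbatim in SOUL.md.

    Paraphrased or invented principles fail the check, which is
    exactly the drift signal this protocol is looking for.
    """
    anchor = entry.get("soul_anchor", "")
    return bool(anchor) and anchor in soul_text
```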

3. Score consistency

Compute the ratio: decisions with a valid soul anchor divided by total decisions. A healthy agent scores above 90%; below 80% means a review is needed.
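The score and thresholds can be sketched as below. The 80–90% band is not defined above, so treating it as a "watch" zone is an assumption, as is returning 1.0 for an empty log.

```python
def consistency_score(entries: list, soul_text: str) -> float:
    """Fraction of logged decisions whose soul_anchor appears verbatim in SOUL.md."""
    if not entries:
        return 1.0  # nothing logged, nothing to flag (assumption)
    valid = sum(
        1 for e in entries
        if e.get("soul_anchor") and e["soul_anchor"] in soul_text
    )
    return valid / len(entries)

def verdict(score: float) -> str:
    """Map a score to the thresholds in the text; the middle band is assumed."""
    if score > 0.9:
        return "healthy"
    if score >= 0.8:
        return "watch"
    return "review needed"
```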

Red Flags

Watch for these patterns in the log:

- Rationales citing principles that appear nowhere in SOUL.md
- A consistency score trending below 80%
- Entries logged with no rationale or soul anchor at all

Integration with Memory Layer

Decision logs feed into the weekly memory distillation. Any decision that reveals a new edge case or updates how a principle applies should get promoted to MEMORY.md.
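Feeding the logs into distillation can start with a simple collection pass. The one-JSON-file-per-decision layout and the seven-day window are assumptions; the judgment of which entries reveal a new edge case worth promoting stays with the reviewer.

```python
import glob
import json
import os
import time

def decisions_for_distillation(log_dir: str, days: int = 7) -> list:
    """Collect decision entries modified within the distillation window.

    The distillation itself (deciding which entries update how a
    principle applies and belong in MEMORY.md) remains a judgment
    call; this just gathers the candidates.
    """
    cutoff = time.time() - days * 86400
    entries = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*.json"))):
        if os.path.getmtime(path) >= cutoff:
            with open(path) as f:
                entries.append(json.load(f))
    return entries
```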

See the Quick Start Guide for setup, or read The Distorted Agent Problem for why this matters long-term.