Learning Engine

The Learning Engine observes sprint outcomes — what succeeded, what failed, how long tasks took, which mandates were triggered — and uses that data to improve future sprint planning. It is a feedback loop that makes the Conductor smarter over time without requiring manual configuration.

What it learns

The Learning Engine operates on the append-only event log. After each sprint completes or fails, it runs an analysis pass and updates several internal models:

1. Task duration model

Every task has an estimatedTokens value from the Architect. The learning engine tracks the ratio of actual to estimated tokens and uses it to adjust the Architect’s estimation calibration:

interface TaskOutcome {
  taskId: string;
  taskType: string;        // 'db-migration' | 'ui-component' | 'api-endpoint' | ...
  estimatedTokens: number;
  actualTokens: number;
  ratio: number;           // actual / estimated
  sprintId: string;
  projectVertical: string;
}

After enough observations, the Architect receives a calibration hint: “UI component tasks in B2B SaaS projects typically run 1.4x the initial estimate.”
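
A minimal sketch of how such a hint could be derived from recorded outcomes, assuming a grouping key of taskType plus projectVertical and an illustrative 10-observation threshold (the function name, threshold, and trimmed interface are assumptions, not Defiant’s actual API):

```typescript
// Trimmed-down TaskOutcome with only the fields this sketch needs.
interface TaskOutcome {
  taskType: string;
  projectVertical: string;
  ratio: number; // actual / estimated
}

const MIN_OBSERVATIONS = 10; // assumed: no hint until enough data exists

// Returns a map from "taskType:projectVertical" to a calibration multiplier.
function calibrationHints(outcomes: TaskOutcome[]): Map<string, number> {
  const buckets = new Map<string, number[]>();
  for (const o of outcomes) {
    const key = `${o.taskType}:${o.projectVertical}`;
    const list = buckets.get(key) ?? [];
    list.push(o.ratio);
    buckets.set(key, list);
  }
  const hints = new Map<string, number>();
  buckets.forEach((ratios, key) => {
    if (ratios.length < MIN_OBSERVATIONS) return;
    // Median is more robust than the mean against one-off token blowouts.
    const sorted = [...ratios].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    const median =
      sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
    hints.set(key, median);
  });
  return hints;
}
```

The median choice here is a design sketch: a single runaway task can skew a mean-based multiplier badly, while the median only moves once a pattern recurs.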

2. Mandate trigger patterns

The engine tracks which mandates are triggered most frequently and in what contexts:

interface MandateTrigger {
  mandateId: string;
  ruleId: string;
  context: string;             // 'payment-feature' | 'auth-flow' | 'db-migration' | ...
  resolution: string;          // how the Builder fixed it
  resolutionTokenCost: number;
}

This data is used to improve the Captain’s mandate preflight: if “payment” in the goal has a 90% correlation with mandate_f1 (PCI) triggering, the Captain preemptively flags it and the Architect plans around it before BUILD starts.
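
The correlation check could be sketched roughly as follows, assuming a hypothetical SprintRecord shape and a 0.9 preflight threshold (both illustrative, not Defiant internals):

```typescript
// Hypothetical per-sprint record: the goal text plus which mandates fired.
interface SprintRecord {
  goal: string;
  triggeredMandates: string[];
}

// Fraction of keyword-matching sprints in which the mandate triggered.
function mandateCorrelation(
  history: SprintRecord[],
  keyword: string,
  mandateId: string
): number {
  const matching = history.filter(s => s.goal.toLowerCase().includes(keyword));
  if (matching.length === 0) return 0;
  const triggered = matching.filter(s => s.triggeredMandates.includes(mandateId));
  return triggered.length / matching.length;
}

const PREFLIGHT_THRESHOLD = 0.9; // assumed cutoff for preemptive flagging

function shouldPreflight(
  history: SprintRecord[],
  keyword: string,
  mandateId: string
): boolean {
  return mandateCorrelation(history, keyword, mandateId) >= PREFLIGHT_THRESHOLD;
}
```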

3. Agent selection accuracy

After each sprint, the learning engine notes which agents were dispatched but produced no output (indicating they were not needed), and which undispatched agents left a gap that caused a BLOCKED state:

interface AgentSelectionOutcome {
  sprintId: string;
  selectedAgents: AgentId[];
  superfluousAgents: AgentId[]; // dispatched but no-op
  missingAgents: AgentId[];     // not dispatched; caused a gap
}

Over time, the Captain’s agent selection becomes more accurate for similar sprint types.
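
One plausible reading of the accuracy statistic reported by the CLI is the fraction of sprints with neither superfluous nor missing agents. A sketch under that assumption (the metric definition is illustrative):

```typescript
type AgentId = string;

// Trimmed AgentSelectionOutcome with only the fields this sketch needs.
interface AgentSelectionOutcome {
  selectedAgents: AgentId[];
  superfluousAgents: AgentId[]; // dispatched but no-op
  missingAgents: AgentId[];     // not dispatched; caused a gap
}

// Fraction of sprints where the Captain's selection was exactly right.
function selectionAccuracy(outcomes: AgentSelectionOutcome[]): number {
  if (outcomes.length === 0) return 1;
  const clean = outcomes.filter(
    o => o.superfluousAgents.length === 0 && o.missingAgents.length === 0
  );
  return clean.length / outcomes.length;
}
```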

4. Failure pattern library

Failed sprints are the most valuable learning signal. Each Medic diagnosis is stored in the failure pattern library:

interface FailurePattern {
  id: string;
  patternName: string;
  description: string;
  triggerConditions: string[]; // what goal/context patterns correlate with this failure
  rootCause: string;
  resolution: string;
  preventionAdvice: string;    // what the Architect should do differently
}

Example: if three sprints fail because the Builder assumed a Supabase Edge Function would be available but the project hadn’t enabled it, the engine creates a pattern: “Edge Function dependency without infrastructure check.” The Navigator is then prompted to always check for Edge Function availability when the goal involves edge-side logic.
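
Matching a new goal against the pattern library could be as simple as a substring check over triggerConditions, as in this sketch (the matching strategy and trimmed interface are assumptions):

```typescript
// Trimmed FailurePattern with only the fields this sketch needs.
interface FailurePattern {
  patternName: string;
  triggerConditions: string[]; // goal/context substrings that correlate with this failure
  preventionAdvice: string;
}

// Returns every library pattern whose trigger conditions appear in the goal.
function matchPatterns(goal: string, library: FailurePattern[]): FailurePattern[] {
  const g = goal.toLowerCase();
  return library.filter(p =>
    p.triggerConditions.some(c => g.includes(c.toLowerCase()))
  );
}
```

A real implementation would likely use fuzzier matching than substrings, but even this naive version lets the Navigator surface preventionAdvice before planning begins.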

Data flow

Sprint event log
        ↓
Learning Engine (runs after each sprint)
  ├── Task duration model       → Architect calibration hints
  ├── Mandate trigger patterns  → Captain preflight improvements
  ├── Agent selection outcomes  → Captain agent selection model
  └── Failure patterns          → Navigator and Medic knowledge base

Accessing learning data

# View learning engine statistics
defiant learning stats

# Output:
# Sprints analyzed: 47
# Task estimation accuracy: 1.23x average (23% underestimate)
# Most triggered mandates: mandate_15 (42%), mandate_27 (31%), mandate_7 (28%)
# Most common failure pattern: "missing RLS policy on new table" (8 occurrences)
# Captain agent selection accuracy: 89%

# View failure patterns
defiant learning patterns

# View calibration hints for the Architect
defiant learning calibration

Privacy and scope

The Learning Engine operates entirely locally. Sprint data stays in ~/.defiant/events.db. No data is sent to Defiant’s servers or to Anthropic. The learning models are project-scoped by default — a fintech project’s patterns do not pollute a solo-founder project’s calibration.

Cross-project learning (within the same local installation) can be enabled:

~/.defiant/config.json
{
  "learning": {
    "crossProjectLearning": true,          // default: false
    "sharePatternsBetweenVerticals": false
  }
}

Resetting the learning engine

If the learning data becomes stale (e.g., after a major codebase restructure):

# Reset all learning data for a project
defiant learning reset --project proj_01hw...

# Reset the failure pattern library only
defiant learning reset --patterns

# Reset task duration calibration only
defiant learning reset --calibration

Roadmap

The Learning Engine in Defiant 2.0 is v1. Planned improvements:

  • Cross-installation sharing: opt-in aggregated pattern sharing across Defiant installations, with differential privacy
  • Vertical-specific calibration: separate duration models per vertical, since Healthcare sprints consistently run longer than Solo Founder sprints
  • Proactive recommendations: the Captain proactively suggests splitting large goals based on historical patterns, before the user submits them
  • Test failure pattern library: learn which types of test failures recur and seed the Builder with prevention guidance