Deliberatic — Argumentation-Based Consensus for Multi-Agent AI

Why Multi-Agent Debate isn't enough

Three fatal gaps in current Multi-Agent Debate (MAD) frameworks that Deliberatic addresses.

01

No formal semantics

Agents exchange natural language opinions and opaque confidence scores. There's no structured way to compare, attack, or support positions. When Agent A says "87% confident" and Agent B says "91%," those numbers are unverifiable and incomparable.

Du et al. 2023 · Wu et al. 2025: "simple majority voting already achieves most performance gains"

02

No fault tolerance

If one agent is compromised, hallucinating, or adversarial, no mechanism detects it. Most frameworks use majority voting, which Wu et al. showed cannot exceed the accuracy of the strongest single agent — and over-confident agents actively degrade team output.

Wu et al. 2025: "MAD cannot exceed the accuracy of its strongest participant"

03

No audit trail

Enterprise deployments in finance, healthcare, and government require explainable decision chains. Current frameworks produce conversation logs — not verifiable evidence. When regulators ask "why did the system decide X?", there's no proof.

AI Governance market: $309M → $4.8B by 2034 (Precedence Research)

Six layers of principled reasoning

Each maps to established research in argumentation theory, distributed consensus, and agent communication.

Layer 1

Argumentation

Weighted Bipolar AF (wBAF) · Graded Semantics

Agent positions are modeled as nodes in a weighted bipolar argumentation graph. Attacks and supports form directed edges. Acceptability is computed via iterative propagation:

σ(a) = w(a)·ρ(α(a), dom(a)) + Σ γ⁺·σ(supporters) − Σ γ⁻·σ(attackers)

The iteration converges under contraction when γ⁺ + γ⁻ < 1. The result is not a binary accept/reject but a degree of acceptability.

Dung 1995 · Amgoud & Cayrol 1998 · Potyka 2019 wBAF modular semantics

Layer 2

Consensus

Adaptive Two-Phase · Raft → PBFT

Clear winner (gap > τ = 0.15)? The Raft fast-path applies: the leader proposes and a majority quorum commits in ~50ms. Close call? The PBFT conflict-path runs Pre-Prepare → Prepare → Commit, tolerating up to f Byzantine agents among n ≥ 3f+1 participants. New evidence is allowed during Prepare. This path takes ~200ms but keeps PBFT's safety guarantee under Byzantine faults.

Ongaro & Ousterhout 2014 (Raft) · Castro & Liskov 1999 (PBFT) · Zhu et al. 2025
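The path selection described above can be sketched in a few lines. This is a minimal illustration, not the Deliberatic engine: the function names (`choosePath`, `raftQuorum`, `pbftParams`) are hypothetical, and the gap is assumed to be the difference between the two highest acceptability degrees. The quorum arithmetic follows the standard Raft and PBFT definitions.

```typescript
// Hypothetical sketch: names and structure are illustrative, not the Deliberatic SDK.
type ConsensusPath = "raft" | "pbft";

const TAU = 0.15; // gap threshold τ quoted in the text

// Assumption: the "gap" is the difference between the two highest
// acceptability degrees in the round.
function choosePath(acceptabilities: number[], tau: number = TAU): ConsensusPath {
  if (acceptabilities.length < 2) return "raft"; // a lone position has no rival
  const sorted = [...acceptabilities].sort((a, b) => b - a);
  const gap = sorted[0] - sorted[1];
  return gap > tau ? "raft" : "pbft";
}

// Raft commits once a strict majority of n voters acknowledges the entry.
function raftQuorum(n: number): number {
  return Math.floor(n / 2) + 1;
}

// PBFT with n replicas tolerates f = floor((n - 1) / 3) Byzantine faults
// and needs 2f + 1 matching messages in the Prepare and Commit phases.
function pbftParams(n: number): { f: number; quorum: number } {
  const f = Math.floor((n - 1) / 3);
  return { f, quorum: 2 * f + 1 };
}
```

Under these assumptions, a round with four agents can tolerate one Byzantine participant on the conflict-path, while the fast-path needs three acknowledgements to commit.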

Layer 3

Constitution

Normative DSL · YAML + Predicate Logic

Hard boundaries (violations → instant rejection), soft preferences (adjust acceptability ±delta), amendment rules (3 vindicated dissents trigger constitutional review). Pre-commit validator ensures every verdict is constitutionally compliant.

Anthropic Constitutional AI · OpenAI Deliberative Alignment · Public Constitutional AI (Abiri 2024)

Layer 4

Transport

A2A v0.3 + MCP · Linux Foundation / AAIF

Deliberation rounds are A2A Tasks. Positions are A2A Messages with structured Parts. Verdicts emit as A2A Artifacts. Also ships as an MCP server with 7 tools. Agent Cards advertise deliberation/v1 skill. JSON-RPC 2.0 / SSE / gRPC.

A2A v0.3 (150+ orgs) · MCP Nov 2025 spec (97M+ SDK downloads) · AAIF, Linux Foundation

Layer 5

Evidence

Merkle Evidence Chains · SHA-256 · Append-Only

Every round produces a Merkle tree: Hash(positions) → Hash(challenges) → Hash(constitutional checks) → Hash(verdict). The Merkle root is published with the verdict. Any party can verify integrity, trace reasoning, and audit compliance. Export: JSON, PDF, OTEL spans.

Merkle 1979 · OpenTelemetry trace correlation · Append-only audit log pattern
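To make the integrity check concrete, here is a minimal sketch of a SHA-256 Merkle root over the four stages of a round, using Node's built-in `crypto` module. The leaf layout (one JSON-serialized leaf per stage), the odd-node duplication rule, and the helper names `merkleRoot` and `verify` are assumptions for illustration. The actual evidence format is not specified here.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of a UTF-8 string, hex-encoded.
function sha256(data: string): string {
  return createHash("sha256").update(data, "utf8").digest("hex");
}

// Merkle root over ordered leaves. An odd node at any level is paired with
// itself (one common convention; the real engine may use another).
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) throw new Error("at least one leaf is required");
  let level = leaves.map((leaf) => sha256(leaf));
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(sha256(left + right));
    }
    level = next;
  }
  return level[0];
}

// Hypothetical round record: each stage is serialized to JSON before hashing.
const stages: string[] = [
  JSON.stringify({ stage: "positions", ids: ["p1", "p2"] }),
  JSON.stringify({ stage: "challenges", ids: ["c1"] }),
  JSON.stringify({ stage: "constitutional_checks", hard: "all_pass" }),
  JSON.stringify({ stage: "verdict", winner: "p2" }),
];
const root = merkleRoot(stages);

// Verification is recomputation: any change to a stage changes the root.
function verify(leaves: string[], expectedRoot: string): boolean {
  return merkleRoot(leaves) === expectedRoot;
}
```

An auditor holding the published root and the exported stages can call `verify(stages, root)`. Editing any stage after the fact makes the check fail.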

Layer 6

Reputation + Calibration

Domain-Aware ELO + Calibration · ρ ∈ [800, 2400]

Reputation is updated per domain: ρ_new(a,d) = ρ_old(a,d) + K_d·(S − E_d), where S is the observed outcome and E_d is the expected score in domain d. Vindicated dissenters get a 1.5× K bonus (domain-scoped). Weighting uses ρ(α(a), dom(a)), discounted for poor calibration (overconfidence, measured as high expected calibration error, ECE), so rhetorical certainty can’t dominate.

Elo 1978 · Wu et al. 2025: "majority pressure suppresses independent correction"
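The update rule can be sketched as follows. Several details are assumptions rather than anything stated above: E_d is computed with the standard Elo logistic formula against a reference rating, S is 1 when the agent's position is vindicated and 0 otherwise, and the calibration discount is a simple linear factor (1 − ECE). All names (`updateReputation`, `calibratedWeight`) are hypothetical.

```typescript
// Hypothetical sketch of ρ_new(a,d) = ρ_old(a,d) + K_d·(S − E_d).
const RHO_MIN = 800;  // lower bound of ρ from the text
const RHO_MAX = 2400; // upper bound of ρ from the text

// Standard Elo expected score of `rating` against `reference`.
function expectedScore(rating: number, reference: number): number {
  return 1 / (1 + Math.pow(10, (reference - rating) / 400));
}

interface UpdateInput {
  rhoOld: number;             // current rating ρ_old(a, d)
  reference: number;          // assumed: rating of the opposing agent in domain d
  outcome: number;            // S: 1 if vindicated, 0 otherwise (assumption)
  kDomain: number;            // K_d, the per-domain step size
  vindicatedDissent: boolean; // true when the agent dissented and was later proven right
}

function updateReputation(u: UpdateInput): number {
  const k = u.vindicatedDissent ? u.kDomain * 1.5 : u.kDomain; // 1.5× K bonus
  const raw = u.rhoOld + k * (u.outcome - expectedScore(u.rhoOld, u.reference));
  return Math.min(RHO_MAX, Math.max(RHO_MIN, raw)); // keep ρ within [800, 2400]
}

// Assumed discount: normalize ρ to [0, 1], then scale by (1 − ECE).
function calibratedWeight(rho: number, ece: number): number {
  const normalized = (rho - RHO_MIN) / (RHO_MAX - RHO_MIN);
  const clampedEce = Math.min(1, Math.max(0, ece));
  return normalized * (1 - clampedEce);
}
```

With equal ratings the expected score is 0.5, so a vindicated agent at ρ = 1500 with K_d = 32 moves to 1516, or to 1524 when the dissent bonus applies.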

Anatomy of a deliberation

What happens when agents disagree — in 200ms.

T+0ms · A2A tasks/send

Round opens

Task arrives via A2A. Engine opens round with topic, constitution reference, deadline (30s), and quorum (3). All agents matching the Agent Card skill filter deliberation/v1 are invited. Moderator elected: highest-reputation non-participant.

T+50ms · Position submission

wBAF graph constructed

Each agent submits a Position — structured argument with typed evidence (performance, resource, latency, schema). Parsed into wBAF nodes. Initial weights = evidence strength × domain-aware, calibration-adjusted reputation ρ(α(a), dom(a)). Agents can challenge() — adding attack edges — or support() — adding support edges.

T+120ms · Graded semantics

Acceptability computed

Iterative propagation:

σ(a) = w(a)·ρ(α(a), dom(a)) + Σ γ⁺·σ(supporters) − Σ γ⁻·σ(attackers)

Converges when max delta < ε = 0.001, typically in 3-7 iterations. Result: every position has a continuous acceptability degree in [0,1].
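A minimal sketch of this propagation loop follows. It assumes the base score w(a)·ρ(α(a), dom(a)) has already been scaled into [0,1], that all degrees are updated synchronously from the previous iteration, and that results are clamped to [0,1]. The values γ⁺ = γ⁻ = 0.3 and the names `ArgNode` and `propagate` are illustrative only.

```typescript
// Hypothetical wBAF node: `base` stands for w(a)·ρ(α(a), dom(a)), assumed in [0, 1].
interface ArgNode {
  id: string;
  base: number;
  supporters: string[]; // ids of nodes with a support edge into this node
  attackers: string[];  // ids of nodes with an attack edge into this node
}

function propagate(
  nodes: ArgNode[],
  gammaPlus = 0.3,   // γ⁺, illustrative value
  gammaMinus = 0.3,  // γ⁻, illustrative value
  epsilon = 0.001,   // ε from the text
  maxIters = 100,    // safety cap, an assumption
): { sigma: Map<string, number>; iterations: number } {
  let sigma = new Map<string, number>();
  for (const n of nodes) sigma.set(n.id, n.base); // start from the base scores

  for (let iter = 1; iter <= maxIters; iter++) {
    const next = new Map<string, number>();
    let maxDelta = 0;
    for (const n of nodes) {
      const sup = n.supporters.reduce((s, id) => s + gammaPlus * (sigma.get(id) ?? 0), 0);
      const att = n.attackers.reduce((s, id) => s + gammaMinus * (sigma.get(id) ?? 0), 0);
      // Clamp so every degree stays in [0, 1].
      const value = Math.min(1, Math.max(0, n.base + sup - att));
      maxDelta = Math.max(maxDelta, Math.abs(value - (sigma.get(n.id) ?? 0)));
      next.set(n.id, value);
    }
    sigma = next;
    if (maxDelta < epsilon) return { sigma, iterations: iter }; // converged
  }
  return { sigma, iterations: maxIters };
}

// Example graph: A and B attack each other, C supports A.
const graph: ArgNode[] = [
  { id: "A", base: 0.6, supporters: ["C"], attackers: ["B"] },
  { id: "B", base: 0.5, supporters: [], attackers: ["A"] },
  { id: "C", base: 0.4, supporters: [], attackers: [] },
];
const result = propagate(graph);
```

In this example the supported position A settles near 0.63 and its rival B near 0.31, which is the kind of graded outcome the consensus step then compares against τ.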

T+140ms · Constitutional check

Normative validation

Constitution Interpreter validates all surviving positions. Hard boundary violations → instant rejection. Soft preferences → acceptability adjusted ±delta. If all positions rejected, round escalates to human review via A2A push notification.
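The validation step can be sketched as a filter followed by an adjustment. This is a hedged illustration: rules are modeled as plain TypeScript predicates rather than the YAML DSL, and the types (`Position`, `Constitution`), the function `checkConstitution`, and the sample thresholds are all hypothetical.

```typescript
// Hypothetical shapes; the real constitution is a YAML + predicate-logic DSL.
interface Position {
  id: string;
  acceptability: number;          // degree from the graded semantics, in [0, 1]
  facts: Record<string, number>;  // evidence values the rules can inspect
}

type Predicate = (p: Position) => boolean;

interface Constitution {
  hard: { name: string; holds: Predicate }[];                  // must hold, else rejection
  soft: { name: string; applies: Predicate; delta: number }[]; // adjusts acceptability
}

interface CheckResult {
  survivors: Position[];
  rejected: { id: string; rule: string }[];
  escalate: boolean; // true when nothing survives and a human must review
}

function checkConstitution(positions: Position[], c: Constitution): CheckResult {
  const survivors: Position[] = [];
  const rejected: { id: string; rule: string }[] = [];
  for (const p of positions) {
    // Hard boundary: the first violated rule rejects the position outright.
    const violated = c.hard.find((rule) => !rule.holds(p));
    if (violated) {
      rejected.push({ id: p.id, rule: violated.name });
      continue;
    }
    // Soft preferences: add each matching delta, then clamp to [0, 1].
    let adjusted = p.acceptability;
    for (const pref of c.soft) {
      if (pref.applies(p)) adjusted += pref.delta;
    }
    survivors.push({ ...p, acceptability: Math.min(1, Math.max(0, adjusted)) });
  }
  return { survivors, rejected, escalate: survivors.length === 0 };
}

// Illustrative rules: a latency ceiling (hard) and a token-budget bonus (soft).
const sampleConstitution: Constitution = {
  hard: [{ name: "max_latency_500ms", holds: (p) => (p.facts["latency_ms"] ?? 0) <= 500 }],
  soft: [{ name: "prefer_low_tokens", applies: (p) => (p.facts["tokens"] ?? Infinity) < 500, delta: 0.08 }],
};
```

When `escalate` is true, the caller would hand the round to human review instead of proceeding to consensus.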

T+200ms · Consensus commit

Verdict + evidence chain

Gap > τ? Raft fast-path commits. Close? PBFT conflict-path with new evidence window. Winning position → verdict. All data → Merkle chain (SHA-256). Dissents recorded. Reputation scores updated. Verdict emitted as A2A Artifact. Evidence chain: merkle://0x...

The API

TypeScript SDK · Python SDK · Rust core engine · MCP server

deliberate.ts (TypeScript)

    // Open a deliberation round over A2A v0.3
    const round = await deliberatic.open({
      topic: "Which agent handles user onboarding?",
      constitution: "cluster://production-alpha",
      deadline: "30s",
      quorum: 3,
      transport: "a2a", // or "mcp", "http"
      consensus: "adaptive", // raft fast-path → pbft conflict-path
    });

    // Submit a position (becomes a wBAF node)
    await round.submit({
      agent: "onboarding-specialist",
      position: {
        claim: "Domain context yields 94% accuracy over 1,847 tasks",
        evidence: [
          { type: "performance", value: 0.94, n: 1847, confidence: 0.03 },
          { type: "resource", tokens: 1200, latency_ms: 340 },
          { type: "schema", output: OnboardingResult },
        ],
        fallback: "general-agent",
      },
    });

    // Challenge (adds an attack edge to the wBAF graph)
    await round.challenge({
      agent: "general-agent",
      target: "onboarding-specialist",
      grounds: "resource_efficiency",
      evidence: [
        { type: "resource", tokens: 380, latency_ms: 95 },
        { type: "performance", value: 0.89, n: 3200, confidence: 0.02 },
      ],
      claim: "3x cheaper at p=0.89, within soft preference delta",
    });

    // Resolve: wBAF semantics → constitutional check → consensus
    const verdict = await round.resolve();
    // {
    //   winner: "general-agent",
    //   acceptability: 0.82,
    //   consensus_path: "raft", // gap was > τ
    //   constitutional: { hard: "all_pass", soft_delta: +0.08 },
    //   dissents: [{ agent: "onboarding-specialist", σ: 0.71 }],
    //   evidence_chain: "merkle://0xab3f...c912",
    //   latency_ms: 142
    // }

Two halves of one mind

Deliberatic decides. AgenTroMatic executes. Connected by A2A.

Decisions · deliberatic.com

Deliberatic

The prefrontal cortex

Weighted Bipolar AF (Dung + Potyka)
Adaptive consensus (Raft / PBFT)
Constitutional normative DSL
Merkle evidence chains (SHA-256)
Domain-aware ELO + calibration-adjusted reputation
A2A + MCP dual transport

Execution · agentromatic.com

AgenTroMatic

The autonomous nervous system

Spec-driven task execution
Self-healing agent workflows
Automatic scaling & retry
Fault-tolerant pipelines
Real-time telemetry (OTEL)
A2A Artifact streaming

Part of the [&] stack

Eight domains. One agent infrastructure.

Compute

fleetprompt.com Runtime & Skills

Contracts

specprompt.com Specifications

Orchestrate

delegatic.com Task Routing

Decisions

deliberatic.com Argumentation & Consensus

Execution

agentromatic.com Autonomous Automation

Infra

webhost.systems Hosting

Spatial

geofleetic.com Location Awareness

Temporal

ticktickclock.com Scheduling & Deadlines

Every decision
deserves a proof.

Agents that prove
before they act.
