Distributed Hive Mind — Design Document

Master Issue: #2710 PR: #2717

Problem Statement

Goal-seeking agents generated by the Goal Agent Generator operate with isolated memory. Each agent has its own Kuzu graph DB scoped by agent_id. When multiple agents work on related tasks, they cannot share discoveries, leading to duplicated effort and missed cross-domain insights.

Solution: Layered Hive Mind Architecture

The Unified Hive Mind composes four independent mechanisms into a layered architecture where each layer solves a distinct problem:

┌─────────────────────────────────────────────────────┐
│  Layer 4: QUERY                                     │
│  Content-hash deduplication across all sources       │
│  Keyword + topic retrieval, merged result sets       │
├─────────────────────────────────────────────────────┤
│  Layer 3: DISCOVERY (Gossip Protocol)               │
│  Periodic top-K fact sharing for "unknown unknowns"  │
│  Lamport clocks, configurable fanout                 │
├─────────────────────────────────────────────────────┤
│  Layer 2: TRANSPORT (Event Bus)                     │
│  FACT_PROMOTED events propagated to peers            │
│  Append-only event log for audit trail               │
├─────────────────────────────────────────────────────┤
│  Layer 1: STORAGE (Hierarchical Graph)              │
│  Local subgraph (private) + Hive subgraph (shared)   │
│  Promotion with configurable consensus policy        │
└─────────────────────────────────────────────────────┘
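The discovery layer in the diagram relies on Lamport clocks to order gossip messages without synchronized wall clocks. As a reminder of the mechanism, here is a minimal sketch; the class name `LamportClock` and its methods are illustrative and are not taken from `hive_mind/gossip.py`.

```python
class LamportClock:
    """Logical clock: orders events without synchronized wall clocks."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event or just before sending."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Merge the timestamp of a received message, then advance."""
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
sent = a.tick()            # a stamps an outgoing gossip message with 1
b.tick()
b.tick()                   # b has had two local events, so its clock is 2
merged = b.receive(sent)   # b merges the message: max(2, 1) + 1
```

The receive rule guarantees that a message's receipt is always stamped later than its send, which is the only ordering property the sketch claims.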

Why Four Layers?

Each layer addresses a distinct concern that the others cannot:

Layer	Concern	Without It
Storage	Where do shared facts live?	No persistence or access control
Transport	How do promotions propagate?	Agents must poll for changes
Discovery	How do agents find facts they didn't know to look for?	Only query-based retrieval
Query	How are results from all layers merged?	Duplicate facts in results
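To make the query layer's role concrete, the following sketch shows one way content-hash deduplication across several result sources could work. The function names and the whitespace/case normalization step are assumptions for illustration; the document does not specify how the real implementation hashes content.

```python
import hashlib


def content_hash(content: str) -> str:
    """Hash normalized text so trivially different copies collide."""
    # Assumption: lowercase and collapse whitespace before hashing.
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def merge_deduplicated(*sources: list[dict]) -> list[dict]:
    """Merge result lists in order, keeping the first fact per content hash."""
    seen: set[str] = set()
    merged: list[dict] = []
    for source in sources:
        for fact in source:
            key = content_hash(fact["content"])
            if key not in seen:
                seen.add(key)
                merged.append(fact)
    return merged


local = [{"content": "PostgreSQL runs on port 5432", "source": "local"}]
hive = [
    {"content": "postgresql runs on  port 5432", "source": "hive"},
    {"content": "Redis runs on port 6379", "source": "hive"},
]
results = merge_deduplicated(local, hive)
```

Because sources are merged in argument order, whichever source is passed first wins when two facts share a hash.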

API

Quick Start

from amplihack.agents.goal_seeking.hive_mind.unified import (
    UnifiedHiveMind,
    HiveMindAgent,
    HiveMindConfig,
)

# Create hive with default config
hive = UnifiedHiveMind()

# Register agents
hive.register_agent("agent_a")
hive.register_agent("agent_b")

# Convenience wrappers
alice = HiveMindAgent("agent_a", hive)
bob = HiveMindAgent("agent_b", hive)

# Alice learns a fact (stored locally)
alice.learn("PostgreSQL runs on port 5432", confidence=0.95, tags=["infra"])

# Alice promotes her best fact to the hive
alice.promote("PostgreSQL runs on port 5432", confidence=0.95, tags=["infra"])

# Bob can now find Alice's promoted fact
results = bob.ask("What port does PostgreSQL use?")
# → [{"content": "PostgreSQL runs on port 5432", ...}]

# Gossip spreads unpromoted facts too
hive.run_gossip_round()
hive.process_events()

Configuration

config = HiveMindConfig(
    promotion_confidence_threshold=0.7,  # Min confidence to promote
    promotion_consensus_required=2,       # Number of agents that must agree
    gossip_interval_rounds=5,            # Auto-gossip every N rounds
    gossip_top_k=10,                     # Facts per gossip message
    gossip_fanout=2,                     # Peers per gossip round
    event_relevance_threshold=0.3,       # Min relevance to incorporate
    enable_gossip=True,                  # Toggle gossip layer
    enable_events=True,                  # Toggle event layer
)
hive = UnifiedHiveMind(config)
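To illustrate how `gossip_top_k` and `gossip_fanout` interact, here is a simplified, standalone gossip round. It is a sketch under stated assumptions, not the `GossipNetwork` implementation: the function `gossip_round` is hypothetical, facts are plain strings mapped to a confidence, and top-K selection is a deterministic sort by confidence rather than the weighted random sampling the real protocol is described as using.

```python
import random


def gossip_round(agents: dict[str, dict[str, float]], top_k: int, fanout: int,
                 rng: random.Random) -> None:
    """Each agent sends its top_k highest-confidence facts to `fanout` random peers."""
    # Snapshot outgoing messages first so a fact travels one hop per round.
    outgoing = {
        name: sorted(facts.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        for name, facts in agents.items()
    }
    for sender, message in outgoing.items():
        peers = [name for name in agents if name != sender]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            for content, confidence in message:
                # Keep the highest confidence seen so far for each fact.
                known = agents[peer].get(content, 0.0)
                agents[peer][content] = max(known, confidence)


agents = {
    "agent_a": {"PostgreSQL runs on port 5432": 0.95},
    "agent_b": {"Redis runs on port 6379": 0.90},
    "agent_c": {},
}
gossip_round(agents, top_k=10, fanout=2, rng=random.Random(0))
```

With three agents and a fanout of 2, every agent reaches both peers, so one round is enough for full convergence here. Larger networks need several rounds, and a smaller `top_k` slows the spread of low-confidence facts.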

Experiment Results

Five experiments were conducted, each testing a different approach:

#	Approach	Tests	Overall Score
1	Shared Blackboard	31	47%
2	Event-Sourced	47	49%
3	Gossip Protocol	42	54%
4	Hierarchical Graph	54	57%
5	Unified (combined)	40	94%

The unified approach outperforms the best individual experiment by 37 percentage points (94% vs. 57%), confirming that the mechanisms are complementary.

Detailed Metrics

Experiment 1 (Blackboard): +75pp cross-agent recall. Simple but no access control — all facts immediately shared. Good for small agent groups.

Experiment 2 (Event-Sourced): +65pp cross-domain quality. Complete audit trail via append-only event log. 0.013ms publish latency. Late joiner replay in 0.02ms.

Experiment 3 (Gossip): >95% knowledge convergence in 7 rounds for 5 agents. Weighted random sampling ensures all facts eventually propagate. Scales sub-linearly for small networks.

Experiment 4 (Hierarchical): +4.2pp with zero local regression. Most conservative — only high-confidence facts with consensus are promoted. Best autonomy preservation.

Experiment 5 (Unified): 100% local, 100% cross-domain, 81% combined = 94% overall. Composes all four layers for best-of-all-worlds.
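The document does not state how the three Experiment 5 sub-scores combine into the overall figure. The numbers are consistent with an unweighted mean, which is an assumption here; the check below also reproduces the 37-point margin from the results table.

```python
# Overall scores (percent) from the results table.
overall = {
    "Shared Blackboard": 47,
    "Event-Sourced": 49,
    "Gossip Protocol": 54,
    "Hierarchical Graph": 57,
    "Unified": 94,
}

# Margin of the unified approach over the best single-mechanism experiment.
best_individual = max(v for k, v in overall.items() if k != "Unified")
margin = overall["Unified"] - best_individual  # percentage points

# Assumption: the overall score is the unweighted mean of the three sub-scores.
unified_parts = {"local": 100, "cross_domain": 100, "combined": 81}
mean_score = sum(unified_parts.values()) / len(unified_parts)
```

Under that assumption the mean is about 93.7, which rounds to the reported 94%.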

Module Reference

`hive_mind/unified.py` — Unified Hive Mind

HiveMindConfig — Configuration dataclass
UnifiedHiveMind — Main orchestrator composing all layers
HiveMindAgent — Per-agent convenience wrapper

`hive_mind/blackboard.py` — Shared Blackboard

SharedFact — Fact dataclass with content hash
HiveMemoryStore — Shared fact CRUD with dedup
HiveMemoryBridge — Local ↔ shared bridge
HiveRetrieval — MemoryAgent-compatible strategy
MultiAgentHive — Agent registry + coordinator

`hive_mind/event_sourced.py` — Event Sourcing

HiveEvent — Immutable event dataclass
HiveEventBus — Thread-safe pub/sub
EventLog — Append-only log with persistence
EventSourcedMemory — Memory + event publishing
HiveOrchestrator — Event bus coordinator
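The combination of immutable events, an append-only log, and late-joiner replay can be sketched as follows. The classes `Event` and `AppendOnlyLog` are hypothetical stand-ins, not the real `HiveEvent` or `EventLog`; in particular, this sketch has no locking or persistence, whereas the real bus is described as thread-safe.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    """Immutable record; frozen=True blocks attribute changes after creation."""
    seq: int
    event_type: str
    agent_id: str
    payload: str


@dataclass
class AppendOnlyLog:
    _events: list[Event] = field(default_factory=list)
    _subscribers: list[Callable[[Event], None]] = field(default_factory=list)

    def publish(self, event_type: str, agent_id: str, payload: str) -> Event:
        """Append an event and deliver it to all current subscribers."""
        event = Event(len(self._events), event_type, agent_id, payload)
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)
        return event

    def subscribe(self, handler: Callable[[Event], None], replay: bool = True) -> None:
        """Register a handler; a late joiner can replay the full history first."""
        if replay:
            for event in self._events:
                handler(event)
        self._subscribers.append(handler)


log = AppendOnlyLog()
log.publish("FACT_PROMOTED", "agent_a", "PostgreSQL runs on port 5432")

seen_by_late_joiner: list[Event] = []
log.subscribe(seen_by_late_joiner.append)  # joins after the first event
log.publish("FACT_PROMOTED", "agent_b", "Redis runs on port 6379")
```

The late joiner ends up with both events in publication order: the first through replay, the second through live delivery.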

`hive_mind/gossip.py` — Gossip Protocol

GossipFact / GossipMessage — Gossip data types
GossipProtocol — Per-agent gossip logic
GossipNetwork — Network coordinator
GossipMemoryAdapter — Memory store bridge

`hive_mind/hierarchical.py` — Hierarchical Graph

HiveFact / LocalFact — Two-level fact types
PromotionPolicy — Configurable promotion rules
PromotionManager — Propose/vote/promote lifecycle
PullManager — Hive query + pull to local
HierarchicalKnowledgeGraph — Two-level orchestrator
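The propose/vote/promote lifecycle can be illustrated with a small standalone sketch. `ConsensusPromoter` and `Proposal` are hypothetical names rather than the real `PromotionManager` or `PromotionPolicy` API; the defaults simply mirror the `promotion_confidence_threshold=0.7` and `promotion_consensus_required=2` values shown in the configuration example.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    content: str
    confidence: float
    votes: set[str] = field(default_factory=set)


class ConsensusPromoter:
    """Promotes a fact once enough distinct agents have endorsed it."""

    def __init__(self, confidence_threshold: float = 0.7,
                 consensus_required: int = 2) -> None:
        self.confidence_threshold = confidence_threshold
        self.consensus_required = consensus_required
        self.pending: dict[str, Proposal] = {}
        self.hive: list[str] = []

    def propose(self, agent_id: str, content: str, confidence: float) -> bool:
        """Reject low-confidence facts; otherwise open a proposal and count the proposer's vote."""
        if confidence < self.confidence_threshold:
            return False
        if content not in self.hive:
            self.pending.setdefault(content, Proposal(content, confidence))
        return self.vote(agent_id, content)

    def vote(self, agent_id: str, content: str) -> bool:
        """Record a vote; return True once the fact is in the hive."""
        proposal = self.pending.get(content)
        if proposal is None:
            return content in self.hive
        proposal.votes.add(agent_id)  # a set ignores repeat votes from one agent
        if len(proposal.votes) >= self.consensus_required:
            self.hive.append(self.pending.pop(content).content)
            return True
        return False


promoter = ConsensusPromoter()
first = promoter.propose("agent_a", "PostgreSQL runs on port 5432", 0.95)
second = promoter.vote("agent_b", "PostgreSQL runs on port 5432")
rejected = promoter.propose("agent_a", "Maybe port 5433?", 0.4)
```

Here the proposer's own vote counts toward consensus, so one additional agent is enough to promote. Whether the real policy counts the proposer is not stated in this document.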

Current State

All five original "future work" items have been implemented:

Real Kuzu Integration — Done. Each agent owns a Kuzu DB via KuzuGraphStore.
LearningAgent Bridge — Done. FederatedGraphStore composes local + hive.
Full Eval Harness — Done. 1000-turn eval with 12 agents across a 5-hive federation.
Distributed Mode — Done. EventBus with Local/Redis/Azure Service Bus backends.
HiveGraph Protocol — Done. Swappable backends (InMemory, PeerHive with Raft).

CognitiveAdapter Hive Integration

The bridge between LearningAgent and the hive mind is in CognitiveAdapter (src/amplihack/agents/goal_seeking/cognitive_adapter.py).

How Facts Flow

LearningAgent.learn_from_content(content)
  → LLM extracts structured facts
  → CognitiveAdapter.store_fact(context, fact, confidence)
    → Stores in local Kuzu DB (CognitiveMemory.store_fact)
    → _promote_to_hive() — auto-promotes to shared hive if connected
      → hive.promote_fact(agent_name, HiveFact(...))

LearningAgent.answer_question(question)
  → CognitiveAdapter.search(query) or get_all_facts()
    → Queries local Kuzu DB
    → _search_hive(query) — queries shared hive
    → _merge_results() — deduplicates, local facts prioritized
  → LLM synthesizes answer from merged fact set

Usage

from pathlib import Path

from amplihack.agents.goal_seeking.learning_agent import LearningAgent
from amplihack.agents.goal_seeking.hive_mind.hive_graph import InMemoryHiveGraph

hive = InMemoryHiveGraph("shared")
hive.register_agent("agent_a")

agent = LearningAgent(
    agent_name="agent_a",
    storage_path=Path("/tmp/agent_a"),
    use_hierarchical=True,
    hive_store=hive,  # Enables auto-promotion + hive retrieval
)

Key Design Decisions

Auto-promotion on store: Every store_fact() call auto-promotes to hive. Simpler than explicit promotion — no facts missed, no extra caller code.
Local-first merge: Local facts take priority over hive facts in dedup. Agents trust their own extractions more than shared knowledge.
Silent failure: Hive promotion errors are logged but never raised. Local storage always succeeds even if the hive is unavailable.
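The three decisions above can be seen together in a small standalone sketch. `AdapterSketch` and `UnavailableHive` are hypothetical classes, not `CognitiveAdapter` itself; `promote_fact` follows the call shown in the fact flow, while `query_facts` is an assumed name. Facts are plain strings and query filtering is omitted to keep the example short.

```python
import logging

logger = logging.getLogger("hive_sketch")


class UnavailableHive:
    """Stand-in hive whose promotion always fails, to exercise the error path."""

    def promote_fact(self, agent_name: str, fact: str) -> None:
        raise ConnectionError("hive unreachable")

    def query_facts(self, query: str) -> list[str]:
        return ["PostgreSQL runs on port 5432", "Redis runs on port 6379"]


class AdapterSketch:
    def __init__(self, agent_name: str, hive=None) -> None:
        self.agent_name = agent_name
        self.hive = hive
        self.local_facts: list[str] = []

    def store_fact(self, fact: str) -> None:
        # The local write happens first, so it succeeds regardless of hive state.
        self.local_facts.append(fact)
        if self.hive is None:
            return
        try:
            self.hive.promote_fact(self.agent_name, fact)  # auto-promotion
        except Exception:
            # Silent failure: log the error, never propagate it to the caller.
            logger.exception("hive promotion failed for %s", self.agent_name)

    def search(self, query: str) -> list[str]:
        # Local-first merge: local facts come first, hive duplicates are dropped.
        merged = list(self.local_facts)
        if self.hive is not None:
            merged += [f for f in self.hive.query_facts(query) if f not in merged]
        return merged


adapter = AdapterSketch("agent_a", hive=UnavailableHive())
adapter.store_fact("PostgreSQL runs on port 5432")  # logs the error, does not raise
results = adapter.search("port")
```

One consequence of silent failure is that callers cannot tell from `store_fact` alone whether a fact reached the hive, so the log is the only record of a failed promotion.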

See TUTORIAL.md in this directory to get started.