
Architecture

Overview

amplihack-memory-lib is organized into three layers, each building on the one below:

+------------------------------------------------------------+
|                     Public API Layer                        |
|  HierarchicalMemory  |  CognitiveMemory  |  ExperienceStore|
+------------------------------------------------------------+
|                   Shared Utilities Layer                    |
|  similarity  |  entity_extraction  |  contradiction        |
|  MemoryClassifier  |  pattern_recognition  |  security     |
+------------------------------------------------------------+
|                    Storage Backend Layer                    |
|  MemoryConnector  ->  KuzuBackend  |  SQLiteBackend        |
+------------------------------------------------------------+
|                     Kuzu / SQLite                           |
+------------------------------------------------------------+

Core Components

HierarchicalMemory (Graph RAG)

File: hierarchical_memory.py

The primary memory system for AI agents that need structured knowledge with relationships. It manages a Kuzu knowledge graph containing:

  • SemanticMemory nodes -- Distilled facts with concept labels, confidence scores, tags, and entity names
  • EpisodicMemory nodes -- Raw source content (episodes) with provenance labels
  • SIMILAR_TO edges -- Computed via Jaccard text similarity at storage time
  • DERIVES_FROM edges -- Link facts to their source episodes (provenance)
  • SUPERSEDES edges -- Track temporal updates (newer fact replaces older)
  • TRANSITIONED_TO edges -- Explicit value transition chains for temporal reasoning

Key methods:

Method                          | Purpose
store_knowledge()               | Store a fact node, auto-classify, compute similarity edges
store_episode()                 | Store a raw episode node
retrieve_subgraph()             | Graph RAG retrieval: keyword match + similarity traversal
get_all_knowledge()             | Get all fact nodes for an agent
export_graph() / import_graph() | Serialize/deserialize the full graph

Protocol-compatible aliases (store_fact, search_facts, get_all_facts) provide interop with ExperienceStore consumers.

CognitiveMemory (Six-Type)

File: cognitive_memory.py

A higher-level memory system modeled after human cognition with six distinct memory types, each stored in its own Kuzu node table:

Memory Type | Table             | Purpose                      | Lifecycle
Sensory     | SensoryMemory     | Raw short-lived observations | Auto-expires via TTL
Working     | WorkingMemory     | Active task context          | Bounded capacity (20 slots), evicts lowest relevance
Episodic    | EpisodicMemory    | Autobiographical events      | Consolidatable into summaries
Semantic    | SemanticMemory    | Distilled facts/knowledge    | Persistent, searchable by keyword
Procedural  | ProceduralMemory  | Step-by-step procedures      | Usage-count tracked, searchable
Prospective | ProspectiveMemory | Trigger-action pairs         | Pending -> triggered -> resolved
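
The bounded-capacity behaviour listed for Working memory can be sketched in plain Python. This is an in-memory illustration only, not the library's code: the class BoundedWorkingMemory, its add() method, and the tie-breaking rule are hypothetical. Only the 20-slot capacity and the lowest-relevance eviction come from the table above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BoundedWorkingMemory:
    """Hypothetical stand-in for a capacity-bounded working memory."""

    capacity: int = 20
    slots: Dict[str, float] = field(default_factory=dict)  # label -> relevance

    def add(self, label: str, relevance: float) -> Optional[str]:
        """Insert a slot; return the evicted label, or None if nothing was evicted."""
        self.slots[label] = relevance
        if len(self.slots) <= self.capacity:
            return None
        # Evict the slot with the lowest relevance (ties: earliest inserted).
        victim = min(self.slots, key=self.slots.get)
        del self.slots[victim]
        return victim
```

Under this rule a new slot can itself be evicted immediately if its relevance is the lowest in the set.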

Relationship edges:

  • SIMILAR_TO (SemanticMemory -> SemanticMemory)
  • DERIVES_FROM (SemanticMemory -> EpisodicMemory)
  • PROCEDURE_DERIVES_FROM (ProceduralMemory -> EpisodicMemory)
  • CONSOLIDATES (ConsolidatedEpisode -> EpisodicMemory)
  • ATTENDED_TO (SensoryMemory -> EpisodicMemory)

ExperienceStore (Legacy/Simple)

File: store.py

A simpler, flat storage system for experience records. Uses MemoryConnector to delegate to either the Kuzu or SQLite backend. Provides:

  • Automatic compression of old experiences
  • Retention policies (age limit, count limit)
  • Duplicate detection
  • Full-text search
  • Storage quota enforcement

MemoryConnector (Backend Factory)

File: connector.py

Factory class that creates and manages the appropriate backend:

# Kuzu backend (default)
connector = MemoryConnector(agent_name="my-agent", backend="kuzu")

# SQLite fallback
connector = MemoryConnector(agent_name="my-agent", backend="sqlite")

Each agent gets isolated storage under ~/.amplihack/memory/<agent_name>/.


Shared Utilities

Similarity (similarity.py)

Deterministic text similarity without ML embeddings:

  • compute_word_similarity(a, b) -- Jaccard coefficient over word tokens, with stop words removed
  • compute_tag_similarity(a, b) -- Jaccard coefficient on tag lists
  • compute_similarity(node_a, node_b) -- Weighted composite: 0.5 * word + 0.2 * tag + 0.3 * concept
  • rerank_facts_by_query(facts, query) -- Rerank retrieved facts by keyword relevance; boost temporal facts when query contains temporal cues
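
The Jaccard-based scoring above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of similarity.py: the function names, the tiny stop-word list, the tokenizer, and the choice to score concept labels with word-level Jaccard are all assumptions. Only the 0.5 / 0.2 / 0.3 weights are taken from the list above.

```python
import re

# Illustrative subset; the real stop-word list is not shown in this document.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to"}


def jaccard(a: set, b: set) -> float:
    """|A & B| / |A | B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_similarity(text_a: str, text_b: str) -> float:
    """Jaccard over lowercase word tokens with stop words removed."""
    def tokens(text: str) -> set:
        return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}
    return jaccard(tokens(text_a), tokens(text_b))


def tag_similarity(tags_a, tags_b) -> float:
    """Jaccard over case-folded tag lists."""
    return jaccard({t.lower() for t in tags_a}, {t.lower() for t in tags_b})


def composite_similarity(node_a: dict, node_b: dict) -> float:
    """Weighted composite: 0.5 * word + 0.2 * tag + 0.3 * concept."""
    word = word_similarity(node_a["content"], node_b["content"])
    tag = tag_similarity(node_a["tags"], node_b["tags"])
    # Assumption: concept labels are compared with the same word-level Jaccard.
    concept = word_similarity(node_a["concept"], node_b["concept"])
    return 0.5 * word + 0.2 * tag + 0.3 * concept
```

For example, word_similarity("the cat sat", "a cat sat down") compares {cat, sat} with {cat, sat, down} and yields 2/3. Because the score depends only on token sets, the same inputs always produce the same result.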

Entity Extraction (entity_extraction.py)

Extracts proper nouns from text using regex heuristics:

  • Handles apostrophe names (O'Brien), hyphenated names (Al-Hassan), multi-word names (Sarah Chen)
  • Checks concept field first (more specific), then content
  • Returns the extracted name in lowercase for consistent indexing
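
A regex heuristic of this kind might look like the sketch below. The pattern and the function name extract_entity are assumptions for illustration and are not taken from entity_extraction.py. The concept-first lookup and the lowercase return value follow the list above.

```python
import re

# One capitalized word, or capitalized parts joined by apostrophes/hyphens
# (O'Brien, Al-Hassan). Hypothetical pattern, not the library's own.
_NAME_WORD = r"(?:[A-Z][a-z]*(?:['-][A-Z][a-z]+)+|[A-Z][a-z]+)"
# One or more such words separated by single spaces (Sarah Chen).
_NAME_RE = re.compile(rf"\b{_NAME_WORD}(?: {_NAME_WORD})*\b")


def extract_entity(content: str, concept: str = "") -> str:
    """Return the first proper-noun-like span, lowercased, or "" if none."""
    # The concept field is checked first because it is usually more specific.
    for text in (concept, content):
        match = _NAME_RE.search(text)
        if match:
            return match.group(0).lower()
    return ""
```

A limitation of this sketch is that any sentence-initial capitalized word also matches, so a practical heuristic needs extra filtering for ordinary words at the start of a sentence.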

Contradiction Detection (contradiction.py)

Detects when two facts about the same concept contain conflicting numerical values:

  • Requires overlapping concept words (at least one meaningful word in common)
  • Extracts numbers from both facts
  • Flags the pair when each fact contains at least one number the other lacks (a potential update or conflict)
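
The three checks above can be sketched as a single function. This is a hedged illustration: the name numbers_conflict, the stop-word list, the number regex, and the return shape are assumptions, not the API of contradiction.py.

```python
import re
from typing import Optional

# Illustrative filter for "meaningful" concept words.
_STOP = {"the", "a", "an", "of", "in", "on", "and", "for"}


def _concept_words(concept: str) -> set:
    return {w for w in re.findall(r"[a-z]+", concept.lower()) if w not in _STOP}


def _numbers(text: str) -> set:
    # Integers and simple decimals, kept as strings.
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def numbers_conflict(fact_a: dict, fact_b: dict) -> Optional[dict]:
    """Return the differing numbers if the facts may conflict, else None."""
    # 1. The concepts must share at least one meaningful word.
    if not (_concept_words(fact_a["concept"]) & _concept_words(fact_b["concept"])):
        return None
    # 2. Extract numbers from both facts.
    nums_a, nums_b = _numbers(fact_a["content"]), _numbers(fact_b["content"])
    # 3. Flag only when each side has a number the other lacks.
    only_a, only_b = nums_a - nums_b, nums_b - nums_a
    if only_a and only_b:
        return {"only_in_a": sorted(only_a), "only_in_b": sorted(only_b)}
    return None
```

With "The team has 5 members" and "The team has 7 members" stored under concepts that both contain "team", the sketch reports 5 and 7 as the conflicting values. Two facts with identical numbers, or with unrelated concepts, return None.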

MemoryClassifier (hierarchical_memory.py)

Rule-based keyword classifier:

Keywords                        | Category
step, how to, procedure, recipe | PROCEDURAL
plan, goal, future, will, todo  | PROSPECTIVE
happened, event, observed       | EPISODIC
(default)                       | SEMANTIC
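
A rule table like this maps directly onto a few lines of Python. The sketch below is not the MemoryClassifier source: the function name classify, whole-word matching, and the top-to-bottom rule priority are assumptions. The keywords and category names are copied from the table.

```python
import re

# Rules are tried in order; the first category with a matching keyword wins.
RULES = [
    ("PROCEDURAL", ("step", "how to", "procedure", "recipe")),
    ("PROSPECTIVE", ("plan", "goal", "future", "will", "todo")),
    ("EPISODIC", ("happened", "event", "observed")),
]


def classify(text: str) -> str:
    """Return the first matching category, or SEMANTIC by default."""
    lowered = text.lower()
    for category, keywords in RULES:
        # Whole-word match, so "will" does not fire on "William".
        if any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in keywords):
            return category
    return "SEMANTIC"
```

Whole-word matching trades recall for precision: it avoids false hits such as "event" inside "prevent", but it also misses plurals such as "steps". Plain substring matching makes the opposite trade.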

Security Layer (security.py)

  • AgentCapabilities -- Capability-based access control (scope levels, allowed types, query cost limits)
  • CredentialScrubber -- Regex-based detection and redaction of API keys, passwords, tokens, SSH keys, DB URLs
  • QueryValidator -- SQL query cost estimation and safety validation
  • SecureMemoryBackend -- Wrapper that enforces all security policies
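
Regex-based redaction of the kind CredentialScrubber performs can be illustrated with two patterns. This is a deliberately small sketch: the patterns, the scrub() function, and the [REDACTED] placeholder are assumptions, and a production scrubber needs far broader coverage (SSH keys, provider-specific token formats, and so on).

```python
import re

_PATTERNS = [
    # key/value secrets such as "api_key=...", "password: ...", "token=..."
    (
        re.compile(r"\b(api[_-]?key|password|token)\b(\s*[:=]\s*)(\S+)", re.IGNORECASE),
        r"\1\2[REDACTED]",
    ),
    # password embedded in a URL: scheme://user:password@host
    (
        re.compile(r"(\b[a-z][a-z0-9+]*://[^\s:/@]+:)[^\s@]+(@)"),
        r"\1[REDACTED]\2",
    ),
]


def scrub(text: str) -> str:
    """Replace secret values with a placeholder, keeping the surrounding text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Keeping the key name and the URL structure intact, and replacing only the secret value, leaves the stored memory readable while removing the sensitive part.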

Pattern Recognition (pattern_recognition.py)

  • PatternDetector -- Tracks recurring patterns across discoveries and recognizes a pattern once its occurrence threshold is reached
  • recognize_patterns() -- Batch pattern recognition with known-pattern filtering
  • Confidence formula: min(0.5 + occurrences * 0.1, 0.95), adjusted by validation success rate
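
The base confidence formula is easy to check numerically. The function name base_confidence is hypothetical. The adjustment by validation success rate is not specified in this document, so the sketch stops at the unadjusted value.

```python
def base_confidence(occurrences: int) -> float:
    """min(0.5 + occurrences * 0.1, 0.95), before any validation adjustment."""
    return min(0.5 + occurrences * 0.1, 0.95)


# Confidence grows by 0.1 per occurrence and is capped at 0.95.
table = [round(base_confidence(n), 2) for n in range(1, 7)]
# → [0.6, 0.7, 0.8, 0.9, 0.95, 0.95]
```

The cap is reached at the fifth occurrence, so occurrence counts alone never yield full certainty.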

Design Philosophy

Ruthless Simplicity

Every component has a single, clear purpose. No unnecessary abstractions. The similarity module uses Jaccard coefficients instead of ML embeddings -- simple, deterministic, and sufficient for the use case.

Zero-BS Implementation

No stubs, no placeholders, no fake implementations. Every function works or does not exist. The security layer actually scrubs credentials. The pattern detector actually tracks occurrences.

Regeneratable (Bricks & Studs)

Each module is a self-contained "brick" with a well-defined public API ("stud"). The similarity module, entity extraction, and contradiction detection are all independent -- they can be replaced, tested, or regenerated from their specification without affecting other modules.


Data Flow

Store Knowledge Flow

User calls store_knowledge(content, concept, ...)
    |
    v
MemoryClassifier assigns category (if not given)
    |
    v
extract_entity_name() extracts proper nouns
    |
    v
CREATE SemanticMemory node in Kuzu
    |
    +---> If source_id: CREATE DERIVES_FROM edge
    |
    +---> If temporal: _detect_supersedes()
    |       |
    |       +---> Find existing facts for same entity
    |       +---> detect_contradiction() checks for conflicts
    |       +---> CREATE SUPERSEDES edge + TRANSITIONED_TO edge
    |
    +---> _create_similarity_edges()
            |
            +---> compute_similarity() against recent nodes
            +---> CREATE SIMILAR_TO edges for scores > 0.3
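
The final branch of the flow above (score the new node against recent nodes, keep edges above 0.3) can be sketched without a database. Everything here is an in-memory stand-in: similarity_edges and word_jaccard are hypothetical names, nodes are plain dicts, and a simple word-level Jaccard substitutes for compute_similarity(). Only the 0.3 threshold comes from the flow.

```python
def word_jaccard(a: str, b: str) -> float:
    """Stand-in scorer: Jaccard over lowercase whitespace-separated words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def similarity_edges(new_node, recent_nodes, threshold=0.3, score=word_jaccard):
    """Return (src_id, dst_id, score) tuples for scores strictly above threshold."""
    edges = []
    for other in recent_nodes:
        if other["memory_id"] == new_node["memory_id"]:
            continue  # never link a node to itself
        s = score(new_node["content"], other["content"])
        if s > threshold:
            edges.append((new_node["memory_id"], other["memory_id"], s))
    return edges
```

Comparing only against recent nodes keeps the cost of each store call bounded, at the price of possibly missing links to older facts.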

Retrieve Subgraph Flow (Graph RAG)

User calls retrieve_subgraph(query, max_nodes=20)
    |
    v
Keyword matching: CONTAINS on concept + content
    |
    v
Entity-centric retrieval: extract_entity_name(query)
    +---> Match on entity_name field
    |
    v
Merge direct matches (deduped by memory_id)
    |
    v
Graph traversal: follow SIMILAR_TO edges (1 hop)
    |
    v
Follow SUPERSEDES chain for temporal context
    |
    v
Follow TRANSITIONED_TO chain for value transitions
    |
    v
Collect all edges between result nodes
    |
    v
Return KnowledgeSubgraph(nodes, edges, query)
    |
    v
User calls subgraph.to_llm_context() for LLM-ready text
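
A reduced version of this flow, covering keyword matching, one-hop SIMILAR_TO expansion, and edge collection, can be sketched over plain dicts. This is an assumption-laden illustration rather than the retrieve_subgraph() implementation: the function retrieve, the data shapes, and the treatment of SIMILAR_TO as undirected are hypothetical, and the entity, SUPERSEDES, and TRANSITIONED_TO steps are omitted.

```python
def retrieve(query, nodes, similar_to, max_nodes=20):
    """nodes: {memory_id: {"concept": str, "content": str}}; similar_to: [(src, dst)]."""
    terms = query.lower().split()

    # 1. Direct matches: any query term contained in concept + content.
    direct = [
        mid for mid, node in nodes.items()
        if any(t in f'{node["concept"]} {node["content"]}'.lower() for t in terms)
    ]

    # 2. One hop along SIMILAR_TO from direct matches only (deduped by id).
    selected = list(direct)
    for src, dst in similar_to:
        for here, there in ((src, dst), (dst, src)):
            if here in direct and there in nodes and there not in selected:
                selected.append(there)

    # 3. Cap the result size, then keep edges whose endpoints both survived.
    selected = selected[:max_nodes]
    edges = [(s, d) for s, d in similar_to if s in selected and d in selected]
    return {"query": query, "nodes": {m: nodes[m] for m in selected}, "edges": edges}
```

Direct matches are placed before neighbours, so when max_nodes truncates the result the keyword hits are kept first.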