Architecture¶
Overview¶
amplihack-memory-lib is organized into three layers, each building on the one below:
+------------------------------------------------------------+
| Public API Layer |
| HierarchicalMemory | CognitiveMemory | ExperienceStore|
+------------------------------------------------------------+
| Shared Utilities Layer |
| similarity | entity_extraction | contradiction |
| MemoryClassifier | pattern_recognition | security |
+------------------------------------------------------------+
| Storage Backend Layer |
| MemoryConnector -> KuzuBackend | SQLiteBackend |
+------------------------------------------------------------+
| Kuzu / SQLite |
+------------------------------------------------------------+
Core Components¶
HierarchicalMemory (Graph RAG)¶
File: hierarchical_memory.py
The primary memory system for AI agents that need structured knowledge with relationships. It manages a Kuzu knowledge graph containing:
- SemanticMemory nodes -- Distilled facts with concept labels, confidence scores, tags, and entity names
- EpisodicMemory nodes -- Raw source content (episodes) with provenance labels
- SIMILAR_TO edges -- Computed via Jaccard text similarity at storage time
- DERIVES_FROM edges -- Link facts to their source episodes (provenance)
- SUPERSEDES edges -- Track temporal updates (newer fact replaces older)
- TRANSITIONED_TO edges -- Explicit value transition chains for temporal reasoning
Key methods:
| Method | Purpose |
|---|---|
store_knowledge() |
Store a fact node, auto-classify, compute similarity edges |
store_episode() |
Store a raw episode node |
retrieve_subgraph() |
Graph RAG retrieval: keyword match + similarity traversal |
get_all_knowledge() |
Get all fact nodes for an agent |
export_graph() / import_graph() |
Serialize/deserialize the full graph |
Protocol-compatible aliases (store_fact, search_facts, get_all_facts) provide interop with ExperienceStore consumers.
CognitiveMemory (Six-Type)¶
File: cognitive_memory.py
A higher-level memory system modeled after human cognition with six distinct memory types, each stored in its own Kuzu node table:
| Memory Type | Table | Purpose | Lifecycle |
|---|---|---|---|
| Sensory | SensoryMemory |
Raw short-lived observations | Auto-expires via TTL |
| Working | WorkingMemory |
Active task context | Bounded capacity (20 slots), evicts lowest relevance |
| Episodic | EpisodicMemory |
Autobiographical events | Consolidatable into summaries |
| Semantic | SemanticMemory |
Distilled facts/knowledge | Persistent, searchable by keyword |
| Procedural | ProceduralMemory |
Step-by-step procedures | Usage-count tracked, searchable |
| Prospective | ProspectiveMemory |
Trigger-action pairs | Pending -> triggered -> resolved |
Relationship edges:
- SIMILAR_TO (SemanticMemory -> SemanticMemory)
- DERIVES_FROM (SemanticMemory -> EpisodicMemory)
- PROCEDURE_DERIVES_FROM (ProceduralMemory -> EpisodicMemory)
- CONSOLIDATES (ConsolidatedEpisode -> EpisodicMemory)
- ATTENDED_TO (SensoryMemory -> EpisodicMemory)
ExperienceStore (Legacy/Simple)¶
File: store.py
A simpler, flat storage system for experience records. Uses MemoryConnector to delegate to either the Kuzu or SQLite backend. Provides:
- Automatic compression of old experiences
- Retention policies (age limit, count limit)
- Duplicate detection
- Full-text search
- Storage quota enforcement
MemoryConnector (Backend Factory)¶
File: connector.py
Factory class that creates and manages the appropriate backend:
# Kuzu backend (default)
connector = MemoryConnector(agent_name="my-agent", backend="kuzu")
# SQLite fallback
connector = MemoryConnector(agent_name="my-agent", backend="sqlite")
Each agent gets isolated storage under ~/.amplihack/memory/<agent_name>/.
Shared Utilities¶
Similarity (similarity.py)¶
Deterministic text similarity without ML embeddings:
compute_word_similarity(a, b)-- Jaccard coefficient on tokenized words minus stop wordscompute_tag_similarity(a, b)-- Jaccard coefficient on tag listscompute_similarity(node_a, node_b)-- Weighted composite: 0.5 * word + 0.2 * tag + 0.3 * conceptrerank_facts_by_query(facts, query)-- Rerank retrieved facts by keyword relevance; boost temporal facts when query contains temporal cues
Entity Extraction (entity_extraction.py)¶
Extracts proper nouns from text using regex heuristics:
- Handles apostrophe names (O'Brien), hyphenated names (Al-Hassan), multi-word names (Sarah Chen)
- Checks concept field first (more specific), then content
- Returns lowercase for consistent indexing
Contradiction Detection (contradiction.py)¶
Detects when two facts about the same concept contain conflicting numerical values:
- Requires overlapping concept words (at least one meaningful word in common)
- Extracts numbers from both facts
- Flags when facts have numbers unique to each (potential update/conflict)
MemoryClassifier (hierarchical_memory.py)¶
Rule-based keyword classifier:
| Keywords | Category |
|---|---|
| step, how to, procedure, recipe | PROCEDURAL |
| plan, goal, future, will, todo | PROSPECTIVE |
| happened, event, observed | EPISODIC |
| (default) | SEMANTIC |
Security Layer (security.py)¶
- AgentCapabilities -- Capability-based access control (scope levels, allowed types, query cost limits)
- CredentialScrubber -- Regex-based detection and redaction of API keys, passwords, tokens, SSH keys, DB URLs
- QueryValidator -- SQL query cost estimation and safety validation
- SecureMemoryBackend -- Wrapper that enforces all security policies
Pattern Recognition (pattern_recognition.py)¶
- PatternDetector -- Tracks recurring patterns across discoveries, recognizes when threshold is reached
recognize_patterns()-- Batch pattern recognition with known-pattern filtering- Confidence formula:
min(0.5 + occurrences * 0.1, 0.95), adjusted by validation success rate
Design Philosophy¶
Ruthless Simplicity¶
Every component has a single, clear purpose. No unnecessary abstractions. The similarity module uses Jaccard coefficients instead of ML embeddings -- simple, deterministic, and sufficient for the use case.
Zero-BS Implementation¶
No stubs, no placeholders, no fake implementations. Every function works or does not exist. The security layer actually scrubs credentials. The pattern detector actually tracks occurrences.
Regeneratable (Bricks & Studs)¶
Each module is a self-contained "brick" with a well-defined public API ("stud"). The similarity module, entity extraction, and contradiction detection are all independent -- they can be replaced, tested, or regenerated from their specification without affecting other modules.
Data Flow¶
Store Knowledge Flow¶
User calls store_knowledge(content, concept, ...)
|
v
MemoryClassifier assigns category (if not given)
|
v
extract_entity_name() extracts proper nouns
|
v
CREATE SemanticMemory node in Kuzu
|
+---> If source_id: CREATE DERIVES_FROM edge
|
+---> If temporal: _detect_supersedes()
| |
| +---> Find existing facts for same entity
| +---> detect_contradiction() checks for conflicts
| +---> CREATE SUPERSEDES edge + TRANSITIONED_TO edge
|
+---> _create_similarity_edges()
|
+---> compute_similarity() against recent nodes
+---> CREATE SIMILAR_TO edges for scores > 0.3
Retrieve Subgraph Flow (Graph RAG)¶
User calls retrieve_subgraph(query, max_nodes=20)
|
v
Keyword matching: CONTAINS on concept + content
|
v
Entity-centric retrieval: extract_entity_name(query)
+---> Match on entity_name field
|
v
Merge direct matches (deduped by memory_id)
|
v
Graph traversal: follow SIMILAR_TO edges (1 hop)
|
v
Follow SUPERSEDES chain for temporal context
|
v
Follow TRANSITIONED_TO chain for value transitions
|
v
Collect all edges between result nodes
|
v
Return KnowledgeSubgraph(nodes, edges, query)
|
v
User calls subgraph.to_llm_context() for LLM-ready text