Automated Memory-Code Linking¶
Status: ✅ Implemented (Week 3)
Overview¶
The Kuzu memory backend automatically links memories to relevant code entities (files and functions) when memories are stored. This creates a rich graph connecting agent memories with the codebase structure, enabling code-aware memory retrieval.
How It Works¶
Automatic Linking on Storage¶
When KuzuBackend.store_memory() is called, the system automatically:
- Links to files based on file paths in metadata
- Links to functions based on function names mentioned in content
- Assigns relevance scores based on link quality
- Prevents duplicates by checking for existing links
Link Types¶
File Links (RELATES*TO_FILE**)¶
Created when memory metadata contains a file path:
memory = MemoryEntry(
id="mem-1",
memory_type=MemoryType.EPISODIC,
content="Refactored validation logic",
metadata={"file": "src/utils.py"}, # ← Triggers file link
...
)
Relevance Score: 1.0 (metadata is explicit and reliable)
Function Links (RELATES*TO_FUNCTION**)¶
Created when function names appear in memory content:
memory = MemoryEntry(
id="mem-2",
memory_type=MemoryType.SEMANTIC,
content="The validate_input function checks required fields", # ← Mentions function
...
)
Relevance Score: 0.8 (content matching is less precise)
Link Properties¶
Each memory-code link includes:
- relevance_score (DOUBLE): 0.0-1.0 indicating link quality
- context (STRING): Describes how the link was created
"metadata_file_match"- File path from metadata"content_name_match"- Function name in content- timestamp (TIMESTAMP): When the link was created
Usage¶
Enable Auto-Linking (Default)¶
from amplihack.memory.backends.kuzu_backend import KuzuBackend
# Auto-linking enabled by default
backend = KuzuBackend()
backend.initialize()
# Memories will auto-link to code
backend.store_memory(memory)
Disable Auto-Linking¶
Query Memory-Code Links¶
Find code linked to a memory:
# Get all files linked to a memory
result = backend.connection.execute("""
MATCH (m:EpisodicMemory {memory_id: $memory_id})-[r:RELATES_TO_FILE_EPISODIC]->(cf:CodeFile)
RETURN cf.file_path, r.relevance_score
""",
{"memory_id": "mem-1"}
)
Find memories linked to a file:
# Get all memories about a specific file
result = backend.connection.execute("""
MATCH (m:EpisodicMemory)-[r:RELATES_TO_FILE_EPISODIC]->(cf:CodeFile {file_id: $file_id})
RETURN m.memory_id, m.title, r.relevance_score
ORDER BY r.relevance_score DESC
""",
{"file_id": "src/main.py"}
)
Relationship Schema¶
Auto-linking creates 10 relationship types (5 memory types × 2 code targets):
Memory → File¶
RELATES_TO_FILE_EPISODICRELATES_TO_FILE_SEMANTICRELATES_TO_FILE_PROCEDURALRELATES_TO_FILE_PROSPECTIVERELATES_TO_FILE_WORKING
Memory → Function¶
RELATES_TO_FUNCTION_EPISODICRELATES_TO_FUNCTION_SEMANTICRELATES_TO_FUNCTION_PROCEDURALRELATES_TO_FUNCTION_PROSPECTIVERELATES_TO_FUNCTION_WORKING
Performance¶
- <100ms per memory for auto-linking (typically 0-5 links)
- Lazy database queries - only runs when code entities exist
- Deduplication - prevents creating duplicate links
- Non-blocking - linking failures don't break memory storage
Example: Full Workflow¶
from amplihack.memory.backends.kuzu_backend import KuzuBackend
from amplihack.memory.models import MemoryEntry, MemoryType
from datetime import datetime
# Initialize backend
backend = KuzuBackend()
backend.initialize()
# Import code graph (prerequisite)
from amplihack.memory.kuzu.code_graph import KuzuCodeGraph
from amplihack.memory.kuzu.connector import KuzuConnector
connector = KuzuConnector()
connector.connect()
code_graph = KuzuCodeGraph(connector)
code_graph.run_blarify("./src", languages=["python"])
# Store memory with file metadata
memory1 = MemoryEntry(
id="mem-refactor-1",
session_id="session-123",
agent_id="agent-builder",
memory_type=MemoryType.EPISODIC,
title="Refactored Authentication",
content="Moved validation to separate function",
metadata={"file": "src/auth.py", "change_type": "refactor"},
created_at=datetime.now(),
accessed_at=datetime.now(),
)
backend.store_memory(memory1)
# ✓ Automatically linked to src/auth.py with relevance=1.0
# Store memory mentioning function
memory2 = MemoryEntry(
id="mem-pattern-1",
session_id="session-123",
agent_id="agent-builder",
memory_type=MemoryType.SEMANTIC,
title="Validation Pattern",
content="The validate_token function uses JWT decode with verification",
metadata={"category": "security"},
created_at=datetime.now(),
accessed_at=datetime.now(),
)
backend.store_memory(memory2)
# ✓ Automatically linked to validate_token() with relevance=0.8
# Query memories for a file
results = backend.connection.execute("""
MATCH (m)-[r:RELATES_TO_FILE_EPISODIC]->(cf:CodeFile {file_id: 'src/auth.py'})
RETURN m.memory_id, m.title, r.relevance_score
ORDER BY r.relevance_score DESC
""")
for row in results:
print(f"Memory: {row[1]} (score: {row[2]})")
Implementation Details¶
Matching Logic¶
File Matching¶
- Uses substring matching:
cf.file_path CONTAINS $file_path OR $file_path CONTAINS cf.file_path - Supports both absolute and relative paths
- Handles partial path matches
Function Matching¶
- Uses substring matching:
$content CONTAINS f.function_name - Filters out very short names (< 4 chars) to avoid false positives
- Case-sensitive matching
Deduplication¶
- Checks for existing links before creating new ones
- Uses MATCH query to verify relationship doesn't exist
- Prevents duplicate links within same transaction
Error Handling¶
- Auto-linking failures do NOT fail memory storage
- Errors are logged as warnings
- Graceful degradation ensures data integrity
Testing¶
Comprehensive test suite in tests/memory/backends/test_kuzu_auto_linking.py:
- ✅ Basic file linking
- ✅ Basic function linking
- ✅ Relevance scoring
- ✅ Deduplication
- ✅ Context metadata
- ✅ Error handling
- ✅ Performance requirements
- ✅ Disable functionality
Run tests:
Demo¶
Run the interactive demo:
Output shows:
- File-based auto-linking (relevance=1.0)
- Function-based auto-linking (relevance=0.8)
- Mixed linking (both file and function)
- Deduplication behavior
- Disabled linking mode
Future Enhancements (Week 5-6)¶
- Code context injection at retrieval time
- Semantic similarity for fuzzy matching
- Class linking based on class names in content
- Configurable relevance scoring via custom rules
- Background batch linking for historical memories