Blarify Code Graph Integration¶
Complete integration of blarify code graph with Neo4j memory system.
Overview¶
This integration allows the memory system to understand code structure by:
- Converting codebase to graph representation via blarify
- Storing code nodes (files, classes, functions) in Neo4j
- Linking code to memories for context-aware retrieval
- Querying code relationships for agent decision-making
Key Feature: Code graph and memory graph live in the SAME Neo4j database, enabling powerful cross-domain queries.
Architecture¶
Node Types¶
Code Nodes¶
- CodeFile: Source files with language and LOC
- Class: Classes with docstrings and metadata
- Function: Functions/methods with parameters and complexity
- Import: Import statements (as relationships)
Relationship Types¶
- DEFINED_IN: Class/Function → CodeFile
- METHOD_OF: Function → Class
- IMPORTS: CodeFile → CodeFile
- CALLS: Function → Function
- INHERITS: Class → Class
- REFERENCES: Generic references
- RELATES_TO_FILE: Memory → CodeFile
- RELATES_TO_FUNCTION: Memory → Function
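The relationship types above can be captured as a small lookup table, which may be useful for checking edges before they are written to Neo4j. This is an illustrative sketch only: `RELATIONSHIP_ENDPOINTS` and `is_valid_edge` are hypothetical names and are not part of `code_graph.py`.

```python
# Hypothetical lookup table mirroring the relationship list above.
# Each entry maps a relationship type to its allowed (source, target) labels.
# REFERENCES is described as generic, so it maps to None (any endpoints).
RELATIONSHIP_ENDPOINTS = {
    "DEFINED_IN": {("Class", "CodeFile"), ("Function", "CodeFile")},
    "METHOD_OF": {("Function", "Class")},
    "IMPORTS": {("CodeFile", "CodeFile")},
    "CALLS": {("Function", "Function")},
    "INHERITS": {("Class", "Class")},
    "REFERENCES": None,
    "RELATES_TO_FILE": {("Memory", "CodeFile")},
    "RELATES_TO_FUNCTION": {("Memory", "Function")},
}


def is_valid_edge(rel_type, source_label, target_label):
    """Return True if the edge matches the documented endpoint labels."""
    if rel_type not in RELATIONSHIP_ENDPOINTS:
        return False
    allowed = RELATIONSHIP_ENDPOINTS[rel_type]
    if allowed is None:
        return True
    return (source_label, target_label) in allowed
```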
Schema Integration¶
Code schema extends existing memory schema:
// Memory nodes (existing)
(:Memory)-[:HAS_MEMORY]->(:AgentType)
// Code nodes (new)
(:Function)-[:DEFINED_IN]->(:CodeFile)
(:Function)-[:METHOD_OF]->(:Class)
(:Class)-[:DEFINED_IN]->(:CodeFile)
// Code-Memory links (new)
(:Memory)-[:RELATES_TO_FILE]->(:CodeFile)
(:Memory)-[:RELATES_TO_FUNCTION]->(:Function)
Installation¶
Prerequisites¶
- Neo4j Running: Memory system Neo4j instance
- Blarify Installed (optional for testing):
- Optional SCIP for Speed (330x faster):
Supported Languages¶
Blarify supports 6 languages:
- Python
- JavaScript
- TypeScript
- Ruby
- Go
- C#
Usage¶
1. Basic Import¶
Import entire codebase:
This will:
- Run blarify on ./src (default)
- Generate code graph JSON
- Import to Neo4j
- Link to existing memories
- Display statistics
2. Import Specific Directory¶
3. Filter by Languages¶
4. Use Existing Blarify Output¶
Skip blarify run if you already have output:
5. Incremental Update¶
Update only changed files:
6. Link to Project¶
Associate code with specific project:
Programmatic API¶
Initialize Integration¶
from amplihack.memory.neo4j.connector import Neo4jConnector
from amplihack.memory.neo4j.code_graph import BlarifyIntegration
with Neo4jConnector() as conn:
    integration = BlarifyIntegration(conn)
    # Initialize schema
    integration.initialize_code_schema()
Import Code Graph¶
from pathlib import Path
# Import blarify output
counts = integration.import_blarify_output(
    Path(".amplihack/blarify_output.json"),
    project_id="my-project"
)
print(f"Imported {counts['files']} files, {counts['functions']} functions")
Link Code to Memories¶
# Create relationships between code and memories
link_count = integration.link_code_to_memories(project_id="my-project")
print(f"Created {link_count} code-memory relationships")
Query Code Context¶
# Get code context for a memory
context = integration.query_code_context(memory_id="memory-123")
for file in context["files"]:
    print(f"File: {file['path']} ({file['language']})")
for func in context["functions"]:
    print(f"Function: {func['name']} at line {func['line_number']}")
Get Statistics¶
stats = integration.get_code_stats(project_id="my-project")
print(f"Files: {stats['file_count']}")
print(f"Classes: {stats['class_count']}")
print(f"Functions: {stats['function_count']}")
print(f"Total lines: {stats['total_lines']}")
Testing¶
Run Test Suite¶
Tests run with sample data, so you don't need blarify installed to verify integration works.
Test coverage:
- ✓ Schema initialization
- ✓ Sample code import
- ✓ Code-memory relationships
- ✓ Query functionality
- ✓ Incremental updates
Manual Testing¶
# 1. Create sample blarify output (Python)
import json
from scripts.test_blarify_integration import create_sample_blarify_output
sample_data = create_sample_blarify_output()
with open("test_output.json", "w") as f:
    json.dump(sample_data, f, indent=2)
# 2. Import sample data (shell)
python scripts/import_codebase_to_neo4j.py --blarify-json test_output.json
# 3. Query in Neo4j Browser (Cypher)
MATCH (cf:CodeFile) RETURN cf LIMIT 10
Blarify Output Format¶
JSON Structure¶
{
  "files": [
    {
      "path": "src/module/file.py",
      "language": "python",
      "lines_of_code": 150,
      "last_modified": "2025-01-01T00:00:00Z"
    }
  ],
  "classes": [
    {
      "id": "class:MyClass",
      "name": "MyClass",
      "file_path": "src/module/file.py",
      "line_number": 10,
      "docstring": "Class description",
      "is_abstract": false
    }
  ],
  "functions": [
    {
      "id": "func:MyClass.my_method",
      "name": "my_method",
      "file_path": "src/module/file.py",
      "line_number": 20,
      "docstring": "Method description",
      "parameters": ["self", "arg1", "arg2"],
      "return_type": "str",
      "is_async": false,
      "complexity": 5,
      "class_id": "class:MyClass"
    }
  ],
  "imports": [
    {
      "source_file": "src/module/file.py",
      "target_file": "src/other/module.py",
      "symbol": "MyFunction",
      "alias": "my_func"
    }
  ],
  "relationships": [
    {
      "type": "CALLS",
      "source_id": "func:MyClass.method1",
      "target_id": "func:OtherClass.method2"
    }
  ]
}
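A quick structural check before import can catch malformed output early. The sketch below assumes only the field names shown in the JSON structure above; `validate_blarify_output` is a hypothetical helper, not a function provided by the integration. It deliberately checks only that classes and functions point at known files and classes, since imports and relationships may refer to code outside the analyzed set.

```python
def validate_blarify_output(data):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in ("files", "classes", "functions", "imports", "relationships"):
        if not isinstance(data.get(key), list):
            problems.append(f"missing or non-list top-level key: {key}")
    if problems:
        return problems

    file_paths = {f["path"] for f in data["files"]}
    class_ids = {c["id"] for c in data["classes"]}

    # Every class must live in a known file.
    for cls in data["classes"]:
        if cls["file_path"] not in file_paths:
            problems.append(f"class {cls['id']}: unknown file {cls['file_path']}")

    # Every function must live in a known file; methods must name a known class.
    for func in data["functions"]:
        if func["file_path"] not in file_paths:
            problems.append(f"function {func['id']}: unknown file {func['file_path']}")
        class_id = func.get("class_id")
        if class_id is not None and class_id not in class_ids:
            problems.append(f"function {func['id']}: unknown class {class_id}")
    return problems
```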
Custom Blarify Output¶
If your blarify output format differs from the structure above, modify the parsing methods in code_graph.py:
- _import_files(): Parse file nodes
- _import_classes(): Parse class nodes
- _import_functions(): Parse function nodes
- _import_imports(): Parse import relationships
- _import_relationships(): Parse code relationships
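An alternative to editing the parsers is to normalize differing output into the documented structure before importing it. The sketch below shows the idea for file entries only. The alias names (`filepath`, `lang`, `loc`, `mtime`) are invented for illustration and do not describe any real blarify version; `normalize_file_entry` is likewise hypothetical.

```python
# Hypothetical mapping from alternative key names to the names used in the
# documented JSON structure. Adjust it to whatever your tool actually emits.
FILE_KEY_ALIASES = {
    "filepath": "path",
    "lang": "language",
    "loc": "lines_of_code",
    "mtime": "last_modified",
}


def normalize_file_entry(entry, aliases=FILE_KEY_ALIASES):
    """Return a copy of a file entry with aliased keys renamed."""
    normalized = {}
    for key, value in entry.items():
        # Keys without an alias are passed through unchanged.
        normalized[aliases.get(key, key)] = value
    return normalized
```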
Use Cases¶
1. Context-Aware Memory Retrieval¶
Query memories with relevant code context:
MATCH (m:Memory)-[:RELATES_TO_FUNCTION]->(f:Function)
WHERE f.name = 'execute_query'
RETURN m.content, f.docstring, f.file_path
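The same query can be issued from Python with a parameter in place of the hard-coded function name. This is a minimal sketch, assuming a session object with a `run(query, parameters)` method that yields records exposing `data()`, as the official `neo4j` Python driver's `Session` does. `memories_for_function` is a hypothetical helper; whether `Neo4jConnector` offers an equivalent is not covered here.

```python
QUERY = """
MATCH (m:Memory)-[:RELATES_TO_FUNCTION]->(f:Function)
WHERE f.name = $name
RETURN m.content AS content, f.docstring AS docstring, f.file_path AS file_path
"""


def memories_for_function(session, function_name):
    """Run the query and return each result row as a plain dict."""
    # Passing the name as a parameter avoids string concatenation in Cypher.
    result = session.run(QUERY, {"name": function_name})
    return [record.data() for record in result]
```

With the official driver, a suitable session would typically come from `GraphDatabase.driver(uri, auth=(user, password)).session()`.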
2. Code Change Impact Analysis¶
Find memories affected by code changes:
MATCH (cf:CodeFile {path: 'connector.py'})<-[:DEFINED_IN]-(f:Function)
MATCH (f)<-[:RELATES_TO_FUNCTION]-(m:Memory)
RETURN m.content, m.agent_type, f.name
3. Function Call Chain Analysis¶
Trace function calls from memory to implementation:
MATCH (m:Memory)-[:RELATES_TO_FUNCTION]->(f1:Function)
MATCH path = (f1)-[:CALLS*1..3]->(f2:Function)
RETURN path
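To see what the `[:CALLS*1..3]` pattern covers, the traversal can be approximated in plain Python over the `relationships` list from the blarify JSON. This is a sketch for intuition, not how Neo4j evaluates the query: it returns the shortest call depth for each reachable function rather than every path, and `reachable_calls` is a hypothetical name.

```python
from collections import deque


def reachable_calls(relationships, start_id, max_depth=3):
    """Return {function_id: depth} for functions reachable via CALLS edges."""
    # Build an adjacency list from CALLS relationships only.
    callees = {}
    for rel in relationships:
        if rel["type"] == "CALLS":
            callees.setdefault(rel["source_id"], []).append(rel["target_id"])

    # Breadth-first search, so the first depth recorded is the shortest one.
    depths = {}
    queue = deque([(start_id, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for target in callees.get(node, []):
            if target not in depths and target != start_id:
                depths[target] = depth + 1
                queue.append((target, depth + 1))
    return depths
```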
4. Class Hierarchy and Memories¶
Find memories related to class hierarchies:
MATCH (c1:Class)-[:INHERITS]->(c2:Class)
MATCH (c1)<-[:METHOD_OF]-(f:Function)<-[:RELATES_TO_FUNCTION]-(m:Memory)
RETURN c1.name, c2.name, m.content
5. Agent Learning from Code¶
Help agents learn from existing code:
MATCH (f:Function)
WHERE f.complexity > 10
OPTIONAL MATCH (f)<-[:RELATES_TO_FUNCTION]-(m:Memory)
RETURN f.name, f.complexity,
CASE WHEN m IS NULL THEN 'No memory' ELSE m.content END as memory
Performance¶
Optimization Tips¶
- Use SCIP for Speed: 330x faster than LSP
- Incremental Updates: Only import changed files
- Filter Languages: Reduce parsing time
- Neo4j Indexes: Automatically created for performance
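For the incremental-update tip, one plausible way to decide which files changed is to compare the `last_modified` timestamps in the blarify JSON against those recorded at the previous import. The sketch below rests on that assumption and is not necessarily how the importer detects changes; `changed_files` and the `previous` mapping are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2025-01-01T00:00:00Z'."""
    # datetime.fromisoformat() only accepts a trailing 'Z' from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_files(files, previous):
    """Return entries that are new or modified since the previous import.

    files:    list of file entries with 'path' and 'last_modified'
    previous: dict mapping path -> last_modified string from the last import
    """
    changed = []
    for entry in files:
        old = previous.get(entry["path"])
        if old is None or parse_timestamp(entry["last_modified"]) > parse_timestamp(old):
            changed.append(entry)
    return changed
```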
Benchmarks¶
Typical codebase (1000 files, 100K LOC):
| Operation | Time (LSP) | Time (SCIP) |
|---|---|---|
| Blarify Analysis | 5-10 min | ~2 sec |
| Neo4j Import | ~30 sec | ~30 sec |
| Memory Linking | ~10 sec | ~10 sec |
| Total | 6-11 min | ~42 sec |
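The totals in the table follow from adding the three stages. The short calculation below uses only the approximate figures from the table and also shows that, once analysis takes seconds, import and linking dominate the run, so the end-to-end speedup is far smaller than the speedup of the analysis step alone.

```python
# Figures taken from the benchmark table above, converted to seconds.
lsp_analysis = (5 * 60, 10 * 60)  # 5-10 min
scip_analysis = 2                 # ~2 sec
neo4j_import = 30                 # ~30 sec
memory_linking = 10               # ~10 sec

fixed = neo4j_import + memory_linking
lsp_total = tuple(t + fixed for t in lsp_analysis)
scip_total = scip_analysis + fixed

analysis_speedup = tuple(t / scip_analysis for t in lsp_analysis)
end_to_end_speedup = tuple(t / scip_total for t in lsp_total)

print(lsp_total)         # (340, 640) seconds, i.e. about 6-11 min
print(scip_total)        # 42 seconds
print(analysis_speedup)  # (150.0, 300.0)
print([round(s, 1) for s in end_to_end_speedup])  # [8.1, 15.2]
```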
Troubleshooting¶
Blarify Not Installed¶
If blarify is not installed, use sample data for testing (see Manual Testing above).
Neo4j Connection Failed¶
Verify Neo4j is running:
# Check Neo4j status
docker ps | grep neo4j
# Or use memory system tools
python -m amplihack.memory.neo4j.connector
Import Failed¶
Check blarify output format:
import json
with open(".amplihack/blarify_output.json") as f:
    data = json.load(f)
    print(json.dumps(data, indent=2))
Memory Linking Not Working¶
Verify metadata format:
# Memories must have file path in metadata
memory_store.create_memory(
    content="...",
    agent_type="builder",
    metadata={"file": "connector.py"}  # Important!
)
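How the linker matches the metadata `file` value against `CodeFile` paths is not spelled out here. One plausible rule, shown purely as an assumption, is a path-suffix match, so that `connector.py` would match a file stored under a longer path. `matching_code_files` is a hypothetical helper for checking your own metadata against imported paths.

```python
from pathlib import PurePosixPath


def matching_code_files(metadata, code_file_paths):
    """Return code file paths whose trailing components equal metadata['file']."""
    target = metadata.get("file")
    if not target:
        return []
    target_parts = PurePosixPath(target).parts
    matches = []
    for path in code_file_paths:
        parts = PurePosixPath(path).parts
        # Compare whole path components, so 'my_connector.py' does not match.
        if parts[-len(target_parts):] == target_parts:
            matches.append(path)
    return matches
```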
Advanced Configuration¶
Custom Neo4j Instance¶
python scripts/import_codebase_to_neo4j.py \
--neo4j-uri bolt://localhost:7687 \
--neo4j-user neo4j \
--neo4j-password mypassword
Skip Memory Linking¶
Custom Output Path¶
Future Enhancements¶
Planned Features¶
- Real-time Updates: Watch file system for changes
- Vector Embeddings: Semantic code search
- Diff Analysis: Track code evolution over time
- AI-Generated Summaries: Automatic code documentation
- Cross-Language References: Link across language boundaries
Contributing¶
To extend blarify integration:
- Add new node types in code_graph.py
- Create parsers for custom formats
- Add relationship types
- Update schema initialization
- Add tests in test_blarify_integration.py
Support¶
For issues or questions:
- Check test suite: python scripts/test_blarify_integration.py
- Review logs in console output
- Check Neo4j Browser: http://localhost:7474
- See docs/neo4j_memory_system.md for memory system details
Status: Production ready
Last Updated: 2025-01-03
Maintainer: Amplihack Team