Documentation Knowledge Graph - Quick Reference¶
One-page guide for using the documentation graph system
Import Documentation¶
from amplihack.memory.neo4j import Neo4jConnector, DocGraphIntegration
from pathlib import Path
# Setup
connector = Neo4jConnector()
connector.connect()
doc_integration = DocGraphIntegration(connector)
doc_integration.initialize_doc_schema()
# Import single file
stats = doc_integration.import_documentation(
file_path=Path("docs/my_doc.md"),
project_id="my-project"
)
# Import all files in directory
for doc_path in Path("docs/").glob("**/*.md"):
doc_integration.import_documentation(doc_path)
Link to Code¶
# Link documentation to code nodes (from blarify)
link_count = doc_integration.link_docs_to_code()
print(f"Created {link_count} links")
Query Documentation¶
# Search for relevant docs
results = doc_integration.query_relevant_docs("authentication", limit=5)
for doc in results:
print(f"{doc['title']}")
print(f" Path: {doc['path']}")
print(f" Concept matches: {doc['concept_matches']}")
CLI Usage¶
# Import all markdown files
python scripts/import_docs_to_neo4j.py docs/
# Import with linking
python scripts/import_docs_to_neo4j.py --link-code --link-memory docs/
# Test parsing (no Neo4j needed)
python scripts/test_doc_parsing_standalone.py
Graph Schema¶
DocFile → HAS_SECTION → Section
DocFile → DEFINES → Concept
DocFile → REFERENCES → CodeFile
Concept → IMPLEMENTED_IN → Function/Class
Memory → DOCUMENTED_IN → DocFile
Extracted Data¶
From each markdown file:
- Title: First H1 heading
- Sections: All headings (H1-H6)
- Concepts: Headings, bold text, code languages
- Code refs: @file.py, file:line,
inline.py - Links: text
- Metadata: Size, word count, last modified
Example Queries¶
Find documentation about a topic¶
Get all concepts from a document¶
Find code implementing a concept¶
Find documentation referencing specific code¶
Statistics¶
# Get graph statistics
stats = doc_integration.get_doc_stats()
print(f"Documents: {stats['doc_count']}")
print(f"Concepts: {stats['concept_count']}")
print(f"Sections: {stats['section_count']}")
print(f"Code refs: {stats['code_ref_count']}")
Code Reference Patterns¶
The system detects:
Integration¶
With Code Graph (blarify)¶
# 1. Import code graph
blarify = BlarifyIntegration(connector)
blarify.import_blarify_output(Path("code_graph.json"))
# 2. Import documentation
doc_integration.import_documentation(Path("docs/"))
# 3. Link them
doc_integration.link_docs_to_code()
With Memory System¶
# Import docs and link to memories
doc_integration.import_documentation(Path("docs/"))
doc_integration.link_docs_to_memories()
Testing¶
# Full test (requires Neo4j running)
python scripts/test_doc_graph.py
# Parsing only (no Neo4j)
python scripts/test_doc_parsing_standalone.py
Test Results (5 real files):
- 187 sections extracted
- 362 concepts identified
- 5 code references found
- 0 errors ✓
Files¶
Implementation:
src/amplihack/memory/neo4j/doc_graph.py
CLI Tools:
scripts/import_docs_to_neo4j.pyscripts/test_doc_graph.pyscripts/test_doc_parsing_standalone.py
Documentation:
docs/documentation_knowledge_graph.md(full guide)docs/doc_graph_quick_reference.md(this file)
Common Patterns¶
Import all project docs¶
from pathlib import Path
# Find all markdown files
project_root = Path(".")
doc_files = list(project_root.glob("**/*.md"))
# Import each one
for doc_file in doc_files:
try:
stats = doc_integration.import_documentation(doc_file)
print(f"✓ {doc_file.name}: {stats}")
except Exception as e:
print(f"✗ {doc_file.name}: {e}")
Find documentation for current task¶
def get_relevant_docs(task_description: str) -> List[Dict]:
"""Find documentation relevant to a task."""
# Extract key terms from task
key_terms = extract_keywords(task_description)
# Search for each term
all_docs = []
for term in key_terms:
docs = doc_integration.query_relevant_docs(term, limit=3)
all_docs.extend(docs)
# Deduplicate and sort by relevance
unique_docs = {doc['path']: doc for doc in all_docs}
return list(unique_docs.values())
Update documentation graph¶
# Re-import changed files (idempotent)
changed_files = get_changed_markdown_files()
for file_path in changed_files:
doc_integration.import_documentation(file_path)
# Rebuild links
doc_integration.link_docs_to_code()
doc_integration.link_docs_to_memories()
Performance Tips¶
- Batch imports: Import multiple files in one transaction
- Link after import: Import all docs first, then link
- Use project_id: Scope queries to specific projects
- Cache results: Query results are stable until docs change
Troubleshooting¶
Neo4j connection fails:
No concepts extracted:
- Check markdown has headings
- Check for bold text (important)
- Check for code blocks with language
No code links created:
- Ensure code graph imported first (blarify)
- Check code references in docs (@file.py)
- Verify CodeFile nodes exist in Neo4j
Next Steps¶
After importing documentation:
- Test queries: Verify you can find relevant docs
- Link to code: Create doc-code relationships
- Link to memories: Connect learnings to docs
- Use in agents: Provide doc context to agents
Status: Implementation complete ✓ Tested: 5 real files, 0 errors Ready: For production use