Comprehensive Architecture & Design Documentation
Azure Tenant Grapher is a comprehensive cloud infrastructure discovery, documentation, and replication tool. It discovers every resource in an Azure tenant, stores the results in a richly-typed Neo4j graph database, and offers extensive tooling for visualization, analysis, documentation, and Infrastructure-as-Code (IaC) generation.
Comprehensive Azure resource discovery across all subscriptions, with identity import from the Microsoft Graph API. Supports filtering by subscription and resource group.
Rich Neo4j graph with typed nodes, relationships, and RBAC modeling. Includes an extensible relationship engine with modular rules.
Interactive 3D graph visualization with filtering, search, and ResourceGroup labels. Desktop GUI with Electron and React.
Generate Terraform, Bicep, and ARM templates from the graph with transformation rules and deployment scripts.
Natural language queries over the graph using MCP (Model Context Protocol) and AutoGen agents.
Automated DFD creation, threat enumeration using STRIDE methodology, and comprehensive security reports.
src/services/azure_discovery_service.py - Discovers Azure resources using Azure SDK with pagination and rate limiting support.
src/services/resource_processing_service.py (110 lines) - Orchestrates resource processing, coordinates AAD import, and manages concurrent processing with progress tracking.
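Concurrent processing with progress tracking could look roughly like the following `asyncio` sketch. The function and parameter names (`process_all`, `handler`, `max_concurrency`, `on_progress`) are assumptions made for illustration and do not reflect the actual service API.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def process_all(
    resources: Iterable[dict],
    handler: Callable[[dict], Awaitable[None]],
    max_concurrency: int = 5,
    on_progress: Callable[[int, int], None] = lambda done, total: None,
) -> int:
    """Run handler over every resource with bounded concurrency; return the count processed."""
    items = list(resources)
    semaphore = asyncio.Semaphore(max_concurrency)
    done = 0

    async def worker(resource: dict) -> None:
        nonlocal done
        async with semaphore:  # at most max_concurrency handlers run at once
            await handler(resource)
        done += 1
        on_progress(done, len(items))

    await asyncio.gather(*(worker(r) for r in items))
    return done


async def _demo_handler(resource: dict) -> None:
    await asyncio.sleep(0)  # stand-in for writing the resource to the graph


progress_log: list = []
processed = asyncio.run(
    process_all(
        [{"id": f"res-{i}"} for i in range(4)],
        _demo_handler,
        max_concurrency=2,
        on_progress=lambda done, total: progress_log.append((done, total)),
    )
)
```

The semaphore bounds how many handlers run at once, while the progress callback fires once per completed resource.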
src/container_manager.py (746 lines) - Manages the Neo4j Docker container lifecycle.
The relationship rules system is modular and extensible. Each rule implements the RelationshipRule interface and can create specific types of relationships in the graph. Rules are registered in src/relationship_rules/__init__.py.
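As an illustration of the modular rule pattern described above, here is a minimal sketch. The method names (`applies`, `emit`), the `Edge` tuple, the `USES_SUBNET` relationship type, and the `RULES` list are hypothetical; the real `RelationshipRule` interface and registry may be shaped differently.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple

# (source_id, relationship_type, target_id); a hypothetical edge representation
Edge = Tuple[str, str, str]


class RelationshipRule(ABC):
    """Hypothetical shape of the interface; the real method names may differ."""

    @abstractmethod
    def applies(self, resource: dict) -> bool:
        """Return True if this rule can derive edges from the resource."""

    @abstractmethod
    def emit(self, resource: dict) -> Iterator[Edge]:
        """Yield the edges this rule derives from the resource."""


class SubnetRule(RelationshipRule):
    """Example rule: link a network interface to the subnet it uses."""

    def applies(self, resource: dict) -> bool:
        return resource.get("type") == "Microsoft.Network/networkInterfaces"

    def emit(self, resource: dict) -> Iterator[Edge]:
        for subnet_id in resource.get("subnet_ids", []):
            yield (resource["id"], "USES_SUBNET", subnet_id)


# A registry can be as simple as an ordered list of rule instances.
RULES: List[RelationshipRule] = [SubnetRule()]


def derive_edges(resource: dict) -> List[Edge]:
    """Run every applicable rule and collect the edges it yields."""
    edges: List[Edge] = []
    for rule in RULES:
        if rule.applies(resource):
            edges.extend(rule.emit(resource))
    return edges


nic = {
    "id": "nic-1",
    "type": "Microsoft.Network/networkInterfaces",
    "subnet_ids": ["subnet-a"],
}
edges = derive_edges(nic)
```

Under this design, adding a relationship type means adding one class and one registry entry, without touching the processing loop.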
src/iac/traverser.py - Queries Neo4j and extracts resources with relationships.
src/iac/engine.py - Orchestrates the entire generation pipeline.
src/iac/emitters/terraform_emitter.py - Converts resources to Terraform HCL/JSON format with 50+ Azure resource types mapped.
src/iac/validators/ - Validates subnet containment, address space conflicts, and Terraform syntax.
The agent mode leverages the Model Context Protocol (MCP) to provide a standardized interface between the LLM and the Neo4j database. Key components:
src/mcp_server.py - Launches the MCP server process (uvx mcp-neo4j-cypher).
src/agent_mode.py - Orchestrates the agent workflow with multi-step tool chaining.
Dashboard showing Neo4j status, resource counts, and system health metrics.
Interactive scanning interface with real-time progress tracking and log streaming.
Generate tenant specifications with format options (YAML/JSON/Markdown).
IaC generation interface with format selection, subset filtering, and transformation rules.
3D graph visualization with search, filtering, and navigation controls.
Interactive chat interface for natural language queries over the graph.
Generate threat models and security reports for the tenant.
Manage environment variables and Azure credentials.
This threat model follows the Microsoft Threat Modeling Tool methodology using the STRIDE framework. The analysis focuses on the current architecture: a single-user development tool running on a developer's system using Docker for the database.

Spoofing

| Threat | Impact | Mitigation |
|---|---|---|
| T1: Malicious process impersonates CLI/GUI to access Neo4j | HIGH - Unauthorized access to sensitive tenant data | ✅ Neo4j requires password authentication. ⚠️ Password stored in .env file (file permissions critical) |
| T2: Stolen Azure credentials used to scan wrong tenant | MEDIUM - Unauthorized tenant discovery | ✅ Uses Azure CLI authentication (tokens time-limited). ⚠️ No additional verification of tenant ID |
| T3: Man-in-the-middle attack on localhost Neo4j connection | LOW - Requires local access | ⚠️ bolt:// protocol not encrypted. ✅ localhost-only reduces exposure |
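Several rows in this threat model depend on the .env file being readable only by its owner. A check of that kind could be sketched as follows, assuming a POSIX file system. The helper names are hypothetical, and the demonstration uses a throwaway file with a placeholder variable rather than a real .env.

```python
import os
import stat
import tempfile


def is_user_only(path: str) -> bool:
    """Return True if group and others have no permission bits on the file (POSIX)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


def restrict_to_user(path: str) -> None:
    """Set the file to read/write for the owner only (0o600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


# Demonstration on a temporary file; the variable name is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w") as fh:
        fh.write("EXAMPLE_PASSWORD=example\n")
    os.chmod(env_path, 0o644)          # simulate an overly permissive file
    before = is_user_only(env_path)    # group/other can read
    restrict_to_user(env_path)
    after = is_user_only(env_path)     # owner-only after chmod
```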

Tampering

| Threat | Impact | Mitigation |
|---|---|---|
| T4: Malicious modification of graph data in Neo4j | CRITICAL - Corrupted infrastructure documentation leading to failed deployments | ⚠️ No audit logging of graph changes. ⚠️ No backup automation by default |
| T5: Tampering with generated IaC before deployment | CRITICAL - Deployment of malicious infrastructure | ❌ No integrity checks on generated files. ✅ User reviews in deploy.sh |
| T6: Modified .env file with malicious credentials | HIGH - Unauthorized access to Azure resources or LLM usage | ⚠️ .env file has user-only permissions. ❌ No integrity verification |
| T7: Supply chain attack via compromised dependencies | CRITICAL - Arbitrary code execution | ✅ uv.lock pins versions. ⚠️ No vulnerability scanning in CI |
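T5 notes that generated files have no integrity checks. One possible approach, sketched here under the assumption that a SHA-256 manifest is recorded at generation time and stored somewhere an attacker cannot also modify, is to compare hashes before deployment. The function names and file names are illustrative only.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str) -> str:
    """Hash a file in chunks so large outputs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: str) -> dict:
    """Map each file's path (relative to directory) to its SHA-256 digest."""
    manifest = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            manifest[os.path.relpath(full, directory)] = sha256_file(full)
    return manifest


def changed_files(directory: str, manifest: dict) -> list:
    """Return paths that were modified, removed, or added since the manifest was built."""
    current = build_manifest(directory)
    return sorted(
        p for p in set(current) | set(manifest) if current.get(p) != manifest.get(p)
    )


# Demonstration with placeholder output files.
with tempfile.TemporaryDirectory() as out_dir:
    for name, body in [("main.tf", "# terraform\n"), ("deploy.sh", "echo deploy\n")]:
        with open(os.path.join(out_dir, name), "w") as fh:
            fh.write(body)
    manifest = build_manifest(out_dir)
    clean = changed_files(out_dir, manifest)
    with open(os.path.join(out_dir, "deploy.sh"), "a") as fh:
        fh.write("echo injected\n")
    tampered = changed_files(out_dir, manifest)
```

A manifest kept next to the files it describes offers little protection on its own, since both can be edited together.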

Repudiation

| Threat | Impact | Mitigation |
|---|---|---|
| T8: No audit trail for scan operations | MEDIUM - Cannot prove who scanned tenant | ✅ Logs stored in logs/ directory. ⚠️ Logs not signed or timestamped |
| T9: No audit trail for IaC deployments | HIGH - Cannot prove who deployed what resources | ✅ Azure Activity Log tracks deployments. ❌ No local deployment records |

Information Disclosure

| Threat | Impact | Mitigation |
|---|---|---|
| T10: Neo4j data exposed if Docker port is exposed | CRITICAL - Full tenant data exposure | ✅ Default docker-compose binds to localhost only. ⚠️ User can modify |
| T11: Credentials in .env file exposed | CRITICAL - Azure tenant compromise, API key theft | ✅ .env in .gitignore. ⚠️ File permissions not enforced. ❌ No encryption |
| T12: Secrets stored in Neo4j graph | HIGH - Key Vault secrets, connection strings exposed | ⚠️ Graph stores full resource properties. ❌ No secret redaction. ⚠️ Database not encrypted at rest by default |
| T13: Generated IaC files contain sensitive data | HIGH - Secrets, keys, connection strings in plain text | ⚠️ Output directory permissions user-only. ❌ No secret detection |
| T14: Logs contain sensitive information | MEDIUM - Resource names, IDs, configuration details | ⚠️ Logs stored locally with user permissions. ❌ No sensitive data filtering |
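T12 through T14 all point to the absence of secret redaction. A simple key-name heuristic, shown below as a sketch rather than a complete detector, could mask obviously sensitive properties before they are stored or logged. The pattern list, the `redact` helper, and the sample property names are assumptions, and value-based detection (for example, of embedded connection strings) is out of scope here.

```python
import re
from typing import Any

# Hypothetical key-name patterns; a real deployment would need to tune this list.
SENSITIVE_KEY = re.compile(
    r"(password|secret|token|connection_?string|(access|primary|secondary)_?key)",
    re.IGNORECASE,
)
REDACTED = "***REDACTED***"


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive-looking dictionary entries masked."""
    if isinstance(value, dict):
        return {
            k: REDACTED if SENSITIVE_KEY.search(str(k)) else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


# Sample resource properties (invented for the example).
props = {
    "name": "storage1",
    "properties": {
        "primaryKey": "abc123",
        "endpoints": [{"connectionString": "Server=example"}],
        "location": "eastus",
    },
}
clean_props = redact(props)
```

The function returns a new structure and leaves the input untouched, so the original properties remain available to code that legitimately needs them.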

Denial of Service

| Threat | Impact | Mitigation |
|---|---|---|
| T15: Resource exhaustion from scanning large tenants | MEDIUM - Developer workstation becomes unresponsive | ✅ Rate limiting in discovery service. ✅ Pagination support. ⚠️ No memory limits |
| T16: Neo4j container consumes excessive resources | MEDIUM - System performance degradation | ⚠️ No resource limits in docker-compose. ✅ Can be configured by user |
| T17: Disk space exhaustion from graph data | LOW - Tool stops working | ✅ Backup and cleanup utilities available. ⚠️ No automatic cleanup |

Elevation of Privilege

| Threat | Impact | Mitigation |
|---|---|---|
| T18: Docker socket access enables container escape | HIGH - Full system compromise | ✅ Tool only manages Neo4j container. ⚠️ User must have Docker access. ❌ No additional sandboxing |
| T19: Malicious deployment script execution | CRITICAL - Arbitrary Azure resource creation | ⚠️ deploy.sh generated with user review expected. ❌ No script validation |
| T20: Agent mode LLM prompt injection | HIGH - Unauthorized graph queries or modifications | ✅ System message restricts agent scope. ⚠️ No query validation. ⚠️ write_neo4j_cypher tool available |
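T20 lists missing query validation as a gap. As a rough illustration, a conservative pre-check could reject any agent-generated Cypher that contains a write-capable clause. This is a keyword heuristic, not a Cypher parser: it will produce false positives (for example, on a property literally named `set`), it also blocks `CALL` because procedures can write, and the function name `is_read_only` is hypothetical.

```python
import re

# Clauses that can modify the graph. Keyword heuristic only, not a Cypher parser.
WRITE_CLAUSE = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV|FOREACH|CALL)\b",
    re.IGNORECASE,
)

# Matches single- or double-quoted string literals, including escaped quotes.
STRING_LITERAL = re.compile(r"'(?:\\.|[^'\\])*'|\"(?:\\.|[^\"\\])*\"")


def is_read_only(query: str) -> bool:
    """Return True if no write-capable clause appears outside string literals."""
    # Blank out string literals so keywords inside quoted values are not flagged.
    stripped = STRING_LITERAL.sub("''", query)
    return WRITE_CLAUSE.search(stripped) is None


read_ok = is_read_only("MATCH (n:Resource) RETURN n.name LIMIT 5")
write_blocked = is_read_only("MATCH (n) DETACH DELETE n")
literal_ok = is_read_only("MATCH (n) WHERE n.name = 'SET me' RETURN n")
```

A stronger control would be to connect the agent with a database user that has read-only privileges, so the restriction is enforced by Neo4j rather than by string inspection.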