Azdoit¶
This directory contains documentation for the azdoit enhancement to azlin, which provides intelligent Azure operations through natural language commands.
Overview¶
Azdoit is an intelligent orchestration layer that: - Parses natural language commands into Azure operations - Selects optimal execution strategies - Provides cost estimation before execution - Handles failures with automatic recovery - Maintains state across operations
Architecture¶
The system consists of several key components:
Strategy System - Four implementation strategies: - Azure CLI strategy (direct CLI commands) - Terraform strategy (infrastructure as code) - MCP Server strategy (Model Context Protocol) - Custom code strategy (Python code generation)
State Management - JSON-based persistence: - Tracks objective execution - Stores configuration preferences - Maintains operation history
Cost Estimation - Azure Pricing API integration: - Estimates costs before execution - Respects user-defined cost limits - Provides cost breakdowns
Recovery Agent - MS Learn integration: - Analyzes failures - Researches solutions - Retries with corrections
Key Principles¶
Bricks & Studs Philosophy¶
Bricks (Modules): - Self-contained with clear boundaries - Single responsibility - No circular dependencies - Independently testable
Studs (Interfaces): - Stable public APIs - Full type hints - Clear contracts - Observable results
Design Goals¶
- Minimal - Every module must justify its existence
- Clear - Obvious what each module does
- Testable - Easy to write unit and integration tests
- Regeneratable - Can rebuild any module from contracts
- Observable - Returns structured results, no hidden state
Module Summary¶
Core Modules (11 new files)¶
- strategy_selector.py - Choose optimal execution strategy
- objective_state.py - Persist objectives to JSON
- cost_estimator.py - Estimate Azure resource costs
- recovery_agent.py - Research and retry failures
- terraform_generator.py - Generate Terraform HCL
- mcp_client.py - Azure MCP Server client
- mslearn_client.py - MS Learn documentation search
- strategies/base.py - Strategy interface
- strategies/azure_cli.py - Azure CLI strategy
- strategies/terraform.py - Terraform strategy
- strategies/mcp_server.py - MCP Server strategy
- strategies/custom_code.py - Python code gen strategy
Plus: - errors.py - Exception hierarchy - config.py - Configuration management
Integration Points¶
- Existing: IntentParser, CommandExecutor
- New: Strategy system, state management, cost estimation, recovery
JSON Schemas¶
- Objective State - Persistent execution state
- Configuration - User preferences and settings
Implementation Status¶
- API contracts defined
- Architecture documented
- Implementation guide created
- Error hierarchy implemented
- Base strategy interface implemented
- Objective state manager implemented
- Azure CLI strategy implemented
- Strategy selector implemented
- Terraform strategy implemented
- Supporting services implemented
- Advanced strategies implemented
- CLI integration completed
- Tests written
- Documentation updated
Example Usage¶
Basic Flow¶
from azlin.agentic import handle_do_command
# User runs: azlin do "create 3 VMs called test-{1,2,3}"
result = handle_do_command(
"create 3 VMs called test-{1,2,3}",
resource_group="my-rg",
cost_limit=50.0,
prefer_terraform=True
)
if result.success:
print(f"Created {len(result.resources_created)} resources")
print(f"Total cost: ${result.cost_estimate:.2f}")
else:
print(f"Failed: {result.error}")
Strategy Selection¶
from azlin.agentic.strategy_selector import StrategySelector
from azlin.agentic.strategies.base import ExecutionContext
# Create context
context = ExecutionContext(
intent={"intent": "provision_vm", ...},
parameters={"vm_name": "test", "count": 3},
resource_group="my-rg",
cost_limit=50.0
)
# Select strategy
selector = StrategySelector()
strategy = selector.select(context, preferences={"prefer_terraform": True})
# Execute
result = strategy.execute(context)
State Management¶
from azlin.agentic.objective_state import ObjectiveStateManager
# Create objective
manager = ObjectiveStateManager()
objective = manager.create_objective(
user_request="create 3 VMs",
intent={...},
parameters={...}
)
# Update with result
manager.update_objective(
objective.objective_id,
status="completed",
result=execution_result
)
# List all objectives
objectives = manager.list_objectives(status="completed")
Recovery Flow¶
from azlin.agentic.recovery_agent import RecoveryAgent
# Execution failed
if not result.success:
recovery = RecoveryAgent()
# Analyze failure and create recovery plan
plan = recovery.analyze_failure(context, result)
if plan.strategy == "retry":
# Attempt recovery
recovered = recovery.attempt_recovery(context, result, max_attempts=3)
if recovered and recovered.success:
print("Recovery succeeded!")
Testing Strategy¶
Unit Tests¶
Each module has comprehensive unit tests:
pytest tests/unit/test_strategy_selector.py -v
pytest tests/unit/test_objective_state.py -v
pytest tests/unit/strategies/ -v
Integration Tests¶
End-to-end scenarios:
Coverage Goals¶
- Unit tests: >80% coverage per module
- Integration tests: All critical paths
- Error scenarios: All exception types
Configuration¶
User Configuration File¶
~/.azlin/azdoit/config.json:
{
"strategy_preferences": {
"prefer_terraform": false,
"max_cost": 100.0
},
"cost_estimation": {
"enabled": true,
"confirmation_threshold": 10.0
},
"recovery": {
"enabled": true,
"max_attempts": 3
},
"mcp_server": {
"enabled": false,
"url": "https://azure-mcp.azurewebsites.net"
}
}
State Directory¶
~/.azlin/azdoit/state/: - Contains JSON files for each objective - One file per objective: {uuid}.json - Automatic cleanup of old objectives
CLI Commands¶
Enhanced azlin do Command¶
# Basic usage
azlin do "create a VM called test"
# With strategy preference
azlin do "create 3 VMs" --prefer-terraform
# With cost limit
azlin do "provision fleet" --cost-limit 50
# Dry run
azlin do "delete all VMs" --dry-run
# Disable recovery
azlin do "create VM" --no-recovery
# View objectives
azlin do list
azlin do show --objective-id {uuid}
Dependencies¶
New Dependencies¶
anthropic>=0.40.0- Already in PR #156- No additional dependencies needed for core functionality
Optional Dependencies¶
hcl2- For Terraform HCL parsing (if needed)requests- For MCP Server client (standard library alternative:urllib)
Performance¶
Expected Performance¶
- Intent parsing: <2 seconds
- Strategy selection: <100ms
- Azure CLI execution: 30-300 seconds (depends on operation)
- Terraform execution: 60-600 seconds (depends on complexity)
- State persistence: <10ms
Optimization Opportunities¶
- Cache pricing data (reduce API calls)
- Parallel strategy scoring (if multiple strategies viable)
- Lazy load strategies (only load when needed)
- Background state cleanup (don't block user)
Security¶
Considerations¶
- Command Injection - Use list args, not shell=True
- API Key Exposure - Never log full keys
- Path Traversal - Validate file paths
- Cost Limits - Respect user-defined limits
- Resource Cleanup - Always rollback on failure
Best Practices¶
- Validate all user inputs
- Use subprocess safely
- Sanitize file paths
- Respect cost constraints
- Clean up temporary files
Troubleshooting¶
Common Issues¶
Strategy not selected: - Check can_handle() implementation - Verify intent type matches - Check cost limits
State file corrupted: - Delete file from ~/.azlin/azdoit/state/ - Run cleanup: azlin do cleanup
Cost estimation fails: - Check network connectivity - Verify Azure Pricing API access - Use fallback estimates
Recovery fails: - Check error message - Verify MS Learn API access - Try manual recovery
Contributing¶
When adding new features:
- Design API contract first - Update contracts document
- Update architecture - Add to architecture diagram
- Write tests - Before implementation
- Implement module - Follow implementation guide
- Update documentation - Keep docs in sync
Philosophy Alignment¶
This design follows azlin's core philosophy:
- Ruthless Simplicity - Every module justified
- Bricks & Studs - Clear interfaces, isolated modules
- Zero-BS - No stubs, no TODOs, real implementations
- Quality over Speed - Well-tested, robust code
- Regeneratable - Can rebuild from contracts
Next Steps¶
- Review contracts - Provide feedback on APIs
- Approve architecture - Verify design is sound
- Begin implementation - Follow 5-week plan
- Iterate - Improve based on real usage
- Document learnings - Update guides with insights
Questions?¶
For more information, see the main AI Command Documentation or open an issue for clarification.
Design Status: ✅ Complete and ready for review
Implementation Status: 🚧 Pending approval
Integration Status: ⏳ Awaiting implementation