Meta-Delegation Tutorial¶
Learn meta-delegation through hands-on examples in 30 minutes.
Prerequisites¶
Before starting this tutorial, you should have:
- amplihack installed and working
- Familiarity with Python
- Basic understanding of AI agents
- 30 minutes of focused time
What You'll Learn¶
By the end of this tutorial, you'll be able to:
- Run basic meta-delegation tasks
- Choose the right persona for your task
- Write effective success criteria
- Interpret results and evidence
- Handle failures and partial successes
Step 1: Your First Meta-Delegation (5 minutes)¶
Let's start with a simple task: creating a Python function with tests.
Create a Python Script¶
# first_meta_delegation.py
from amplihack.meta_delegation import run_meta_delegation
# Define what we want
goal = "Create a Python function that calculates factorial of a number"
success_criteria = "Function exists, handles edge cases, has tests, all tests pass"
# Run meta-delegation
result = run_meta_delegation(
    goal=goal,
    success_criteria=success_criteria,
    persona_type="junior_dev",
    platform="claude-code"
)
# Check results
print(f"Status: {result.status}")
print(f"Success Score: {result.success_score}/100")
print(f"\nEvidence collected:")
for item in result.evidence:
    print(f" - {item.type}: {item.path}")
Run It¶
python first_meta_delegation.py
Expected Output¶
[Meta-Delegator] Starting subprocess with junior_dev persona
[Meta-Delegator] Subprocess running (PID: 12345)
[Meta-Delegator] Task in progress...
[Meta-Delegator] Task completed in 45 seconds
[Meta-Delegator] Collecting evidence...
[Meta-Delegator] Evaluating success criteria...
Status: SUCCESS
Success Score: 95/100
Evidence collected:
- code_file: factorial.py
- test_file: test_factorial.py
- execution_log: subprocess_output.log
What Happened?¶
- Subprocess Created: A new isolated environment was created
- Agent Executed: The junior_dev agent ran Claude Code/Copilot/Amplifier
- Code Generated: The agent created factorial.py and test_factorial.py
- Tests Run: The agent executed tests to validate the implementation
- Evidence Collected: All artifacts were captured
- Success Evaluated: The evaluator scored the results
Step 2: Understanding Personas (10 minutes)¶
Personas change how the agent approaches your task. Let's compare.
Experiment: Same Task, Different Personas¶
# persona_comparison.py
from amplihack.meta_delegation import run_meta_delegation
goal = "Analyze the security of a user authentication system"
success_criteria = "Identify vulnerabilities, suggest fixes, provide examples"
# Try each persona
personas = ["guide", "qa_engineer", "architect", "junior_dev"]
results = {}
for persona in personas:
    print(f"\n{'='*60}")
    print(f"Running with {persona} persona...")
    print('='*60)
    result = run_meta_delegation(
        goal=goal,
        success_criteria=success_criteria,
        persona_type=persona,
        platform="claude-code"
    )
    results[persona] = result
    print(f"Status: {result.status}")
    print(f"Score: {result.success_score}/100")
    print(f"Evidence: {len(result.evidence)} items")
# Compare results
print("\n" + "="*60)
print("COMPARISON")
print("="*60)
for persona, result in results.items():
    print(f"{persona:12} | Score: {result.success_score:3}/100 | Evidence: {len(result.evidence):2} items")
Expected Output¶
============================================================
Running with guide persona...
============================================================
[Meta-Delegator] Guide persona: Using Socratic approach...
Status: SUCCESS
Score: 88/100
Evidence: 8 items
============================================================
Running with qa_engineer persona...
============================================================
[Meta-Delegator] QA Engineer persona: Using rigorous testing approach...
Status: SUCCESS
Score: 95/100
Evidence: 15 items
============================================================
Running with architect persona...
============================================================
[Meta-Delegator] Architect persona: Using strategic design approach...
Status: SUCCESS
Score: 90/100
Evidence: 10 items
============================================================
Running with junior_dev persona...
============================================================
[Meta-Delegator] Junior Dev persona: Using task-focused approach...
Status: SUCCESS
Score: 78/100
Evidence: 6 items
============================================================
COMPARISON
============================================================
guide        | Score:  88/100 | Evidence:  8 items
qa_engineer  | Score:  95/100 | Evidence: 15 items
architect    | Score:  90/100 | Evidence: 10 items
junior_dev   | Score:  78/100 | Evidence:  6 items
Analysis¶
- QA Engineer: Most thorough, highest score, most evidence (perfect for security analysis)
- Architect: Strategic overview, good documentation, solid score
- Guide: Explanatory, educational approach, balanced evidence
- Junior Dev: Task-focused, less comprehensive but fast
Lesson: Choose persona based on what you need:
- Need teaching? → guide
- Need thoroughness? → qa_engineer
- Need design? → architect
- Need quick implementation? → junior_dev
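The need-to-persona mapping above can be captured in a small lookup. This is only a sketch: `PERSONA_FOR_NEED` and `choose_persona` are hypothetical names invented for illustration and are not part of amplihack. The persona strings are the ones used in this tutorial.

```python
# Hypothetical helper (not an amplihack API): maps a kind of need to one of
# the persona names used in this tutorial.
PERSONA_FOR_NEED = {
    "teaching": "guide",
    "thoroughness": "qa_engineer",
    "design": "architect",
    "quick implementation": "junior_dev",
}


def choose_persona(need: str, default: str = "junior_dev") -> str:
    """Return the persona suggested for a need, or the default when unknown."""
    return PERSONA_FOR_NEED.get(need.strip().lower(), default)


if __name__ == "__main__":
    print(choose_persona("Thoroughness"))
```

The returned string could then be passed as the `persona_type` argument shown in the examples above.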
Step 3: Writing Effective Success Criteria (5 minutes)¶
Success criteria determine how the evaluator scores the agent's work. Let's practice writing good ones.
Bad Success Criteria ❌¶
# Too vague
success_criteria = "Make it good"
# Not measurable
success_criteria = "Code should be clean"
# Too strict
success_criteria = "Must achieve 100% code coverage"
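The first two examples above share a trait: they rely on subjective words that give the evaluator nothing to check. A rough lint can catch some of these before a delegation is started. The sketch below is an assumption-laden illustration: `VAGUE_WORDS` and `vague_checks` are hypothetical names, and the word list is arbitrary rather than anything amplihack defines.

```python
# Hypothetical lint for success criteria; the word list is an illustrative
# assumption, not something defined by amplihack.
VAGUE_WORDS = ("good", "clean", "nice", "better")


def vague_checks(checks: list[str]) -> list[str]:
    """Return the checks that contain at least one word from VAGUE_WORDS."""
    flagged = []
    for check in checks:
        # Compare whole words, ignoring case and trailing punctuation.
        words = [word.strip(".,;:!?") for word in check.lower().split()]
        if any(word in VAGUE_WORDS for word in words):
            flagged.append(check)
    return flagged


if __name__ == "__main__":
    candidates = ["Make it good", "Code should be clean", "All tests pass"]
    for check in vague_checks(candidates):
        print(f"Too vague: {check}")
```

A word list like this only catches the obvious cases. An over-strict criterion such as a 100% coverage target still needs human judgment.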
Good Success Criteria ✅¶
# Specific and measurable
success_criteria = """
- Function accepts integer input
- Returns correct factorial value
- Handles n=0 (returns 1)
- Handles negative numbers (raises ValueError)
- Has at least 5 test cases
- All tests pass
"""
# Clear deliverables
success_criteria = """
Deliverables:
1. Working implementation in calculator.py
2. Test suite in test_calculator.py with >80% coverage
3. README.md with usage examples
4. All tests pass
"""
Exercise: Write Your Own¶
Try improving these criteria:
Original:
Improved:
goal = "Create a REST API for user management"
success_criteria = """
- API has GET /users endpoint (returns user list)
- API has POST /users endpoint (creates user)
- API validates required fields (name, email)
- Returns proper HTTP status codes (200, 201, 400)
- Has tests for each endpoint
- All tests pass
"""
Step 4: Interpreting Evidence (5 minutes)¶
Evidence is proof of what the agent accomplished. Let's analyze it.
Example: Examining Evidence¶
# examine_evidence.py
from amplihack.meta_delegation import run_meta_delegation
result = run_meta_delegation(
    goal="Create a module for parsing CSV files",
    success_criteria="Module can read CSV, handle errors, has tests",
    persona_type="junior_dev"
)
print(f"Status: {result.status}\n")
# Group evidence by type
evidence_by_type = {}
for item in result.evidence:
    if item.type not in evidence_by_type:
        evidence_by_type[item.type] = []
    evidence_by_type[item.type].append(item)
# Display organized evidence
for evidence_type, items in evidence_by_type.items():
    print(f"{evidence_type.upper()}:")
    for item in items:
        print(f" - {item.path}")
        if item.excerpt:
            print(f" Preview: {item.excerpt[:100]}...")
    print()
# Read specific evidence
code_files = result.get_evidence_by_type("code_file")
if code_files:
    main_code = code_files[0]
    print(f"\nGenerated Code ({main_code.path}):")
    print("=" * 60)
    print(main_code.content)
Expected Output¶
Status: SUCCESS
CODE_FILE:
- csv_parser.py
Preview: import csv\nfrom typing import List, Dict\n\nclass CSVParser:\n def __init__(self, file_p...
- __init__.py
Preview: from .csv_parser import CSVParser\n\n__all__ = ['CSVParser']\n...
TEST_FILE:
- test_csv_parser.py
Preview: import pytest\nfrom csv_parser import CSVParser\n\nclass TestCSVParser:\n def test_ba...
DOCUMENTATION:
- README.md
Preview: # CSV Parser Module\n\nA simple CSV parsing module with error handling.\n\n## Usage\...
EXECUTION_LOG:
- subprocess_output.log
Preview: [2026-01-20 10:30:15] Starting task: Create a module for parsing CSV files\n[202...
Generated Code (csv_parser.py):
============================================================
import csv
from typing import List, Dict
class CSVParser:
    def __init__(self, file_path: str):
        self.file_path = file_path
    def parse(self) -> List[Dict[str, str]]:
        """Parse CSV file and return list of dictionaries."""
        try:
            with open(self.file_path, 'r') as f:
                reader = csv.DictReader(f)
                return list(reader)
        except FileNotFoundError:
            raise ValueError(f"File not found: {self.file_path}")
        except csv.Error as e:
            raise ValueError(f"CSV parsing error: {e}")
Evidence Types¶
Common evidence types you'll see:
- code_file: Source code files generated
- test_file: Test files created
- documentation: README, guides, specs
- execution_log: Full subprocess output
- test_results: Test run output
- validation_report: Success criteria evaluation details
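Once evidence is grouped by type, it is easy to check whether an expected type is absent. The sketch below uses a stand-in `EvidenceItem` dataclass with only the `type` and `path` attributes seen in this tutorial. The real evidence class is not shown here, and `group_by_type` and `missing_types` are hypothetical helper names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EvidenceItem:
    """Stand-in for an evidence record; only the fields used below are modeled."""
    type: str
    path: str


def group_by_type(evidence):
    """Map each evidence type to the list of paths collected for it."""
    grouped = defaultdict(list)
    for item in evidence:
        grouped[item.type].append(item.path)
    return dict(grouped)


def missing_types(evidence, required):
    """Return the required evidence types that have no items, in the given order."""
    present = {item.type for item in evidence}
    return [evidence_type for evidence_type in required if evidence_type not in present]


if __name__ == "__main__":
    items = [
        EvidenceItem("code_file", "csv_parser.py"),
        EvidenceItem("test_file", "test_csv_parser.py"),
    ]
    print(group_by_type(items))
    print(missing_types(items, ["code_file", "test_results"]))
```

Using `defaultdict(list)` removes the explicit "is this key present yet" check from the grouping loop.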
Step 5: Handling Failures (5 minutes)¶
Not all delegations succeed. Let's learn to handle failures gracefully.
Example: Dealing with Failure¶
# handle_failures.py
from amplihack.meta_delegation import run_meta_delegation
# Intentionally difficult task
goal = "Implement a quantum computer simulator in Python"
success_criteria = """
- Simulates quantum gates (Hadamard, CNOT, Pauli)
- Handles multi-qubit systems
- Has comprehensive tests
- All tests pass
"""
result = run_meta_delegation(
    goal=goal,
    success_criteria=success_criteria,
    persona_type="junior_dev",
    platform="claude-code"
)
# Handle different outcomes
if result.status == "SUCCESS":
    print("✓ Task completed successfully!")
    print(f"Score: {result.success_score}/100")
elif result.status == "PARTIAL":
    print("⚠ Task partially completed")
    print(f"Score: {result.success_score}/100")
    print(f"\nWhat was achieved:")
    for item in result.evidence:
        print(f" - {item.type}: {item.path}")
    print(f"\nWhat's missing:")
    print(result.partial_completion_notes)
else:  # FAILURE
    print("✗ Task failed")
    print(f"Score: {result.success_score}/100")
    print(f"Reason: {result.failure_reason}")
    print(f"\nDiagnostic information:")
    print(f"Execution time: {result.duration_seconds}s")
    # Check logs for clues
    log_evidence = result.get_evidence_by_type("execution_log")
    if log_evidence:
        log = log_evidence[0]
        print(f"\nLast 20 lines of execution log:")
        lines = log.content.split('\n')
        print('\n'.join(lines[-20:]))
Expected Output (Partial Success)¶
⚠ Task partially completed
Score: 65/100
What was achieved:
- code_file: quantum_simulator.py
- test_file: test_quantum_simulator.py
- documentation: README.md
- execution_log: subprocess_output.log
What's missing:
- Multi-qubit CNOT gate not fully implemented
- Only 8 of 12 tests passing
- Performance tests missing
Recommendations:
1. Review the partial implementation in quantum_simulator.py
2. Complete the CNOT gate implementation
3. Fix the 4 failing tests
4. Consider splitting into multiple delegations
Recovery Strategies¶
When delegation fails or partially succeeds:
- Review Evidence: Check what was produced
- Analyze Logs: Look for error messages
- Adjust Task Scope: Break into smaller pieces
- Try a Different Persona: A QA engineer might find issues that the guide missed
- Refine Success Criteria: Make them more specific or achievable
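The "try a different persona" strategy can be sketched as a loop over personas that stops at the first success. Everything here is an illustrative assumption: `delegate_with_fallback` is a hypothetical helper, and `fake_run` is a canned stand-in for `run_meta_delegation` so that the sketch runs on its own.

```python
from types import SimpleNamespace


def delegate_with_fallback(run, goal, success_criteria, personas):
    """Try each persona in order and return (persona, result) for the first SUCCESS.

    If no attempt succeeds, return the attempt with the highest success_score.
    `run` is any callable accepting the keyword arguments used in this tutorial.
    """
    if not personas:
        raise ValueError("at least one persona is required")
    best = None
    for persona in personas:
        result = run(goal=goal, success_criteria=success_criteria, persona_type=persona)
        if result.status == "SUCCESS":
            return persona, result
        if best is None or result.success_score > best[1].success_score:
            best = (persona, result)
    return best


def fake_run(goal, success_criteria, persona_type):
    """Stand-in for run_meta_delegation with canned outcomes, for illustration only."""
    canned = {"junior_dev": ("PARTIAL", 65), "qa_engineer": ("SUCCESS", 90)}
    status, score = canned.get(persona_type, ("FAILURE", 0))
    return SimpleNamespace(status=status, success_score=score)


if __name__ == "__main__":
    persona, result = delegate_with_fallback(
        fake_run, "example goal", "example criteria", ["junior_dev", "qa_engineer"]
    )
    print(f"{persona}: {result.status} ({result.success_score}/100)")
```

Passing the runner in as an argument keeps the retry logic testable without starting any subprocess.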
Putting It All Together¶
Final Exercise: Complete Workflow¶
Create a meta-delegation for a real task:
# complete_workflow.py
from amplihack.meta_delegation import run_meta_delegation
# Step 1: Design phase
print("Phase 1: Architecture Design")
design_result = run_meta_delegation(
    goal="Design a Python package for sending email notifications",
    success_criteria="""
    - Architecture document with module structure
    - API specification with function signatures
    - Error handling strategy
    - Documentation plan
    """,
    persona_type="architect"
)
if design_result.status != "SUCCESS":
    print(f"Design failed: {design_result.failure_reason}")
    raise SystemExit(1)
print(f"✓ Design complete (score: {design_result.success_score}/100)\n")
# Step 2: Implementation phase
print("Phase 2: Implementation")
impl_result = run_meta_delegation(
    goal="Implement the email notification package following the architecture",
    success_criteria="""
    - email_notifier.py with send_email() function
    - Support for SMTP and SendGrid backends
    - Configuration via environment variables
    - Has comprehensive tests
    - All tests pass
    """,
    persona_type="junior_dev",
    # Pass context from design phase
    context=design_result.get_evidence_by_type("architecture_doc")[0].content
)
if impl_result.status != "SUCCESS":
    print(f"Implementation failed: {impl_result.failure_reason}")
    raise SystemExit(1)
print(f"✓ Implementation complete (score: {impl_result.success_score}/100)\n")
# Step 3: QA phase
print("Phase 3: Quality Assurance")
qa_result = run_meta_delegation(
    goal="Perform thorough QA on the email notification package",
    success_criteria="""
    - Identify any bugs or issues
    - Verify test coverage >85%
    - Check error handling
    - Validate documentation
    - Generate QA report
    """,
    persona_type="qa_engineer"
)
print(f"✓ QA complete (score: {qa_result.success_score}/100)\n")
# Summary
print("="*60)
print("PROJECT COMPLETE")
print("="*60)
print(f"Design score: {design_result.success_score}/100")
print(f"Implementation score: {impl_result.success_score}/100")
print(f"QA score: {qa_result.success_score}/100")
print(f"Overall: {(design_result.success_score + impl_result.success_score + qa_result.success_score) / 3:.1f}/100")
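The implementation phase indexes `[0]` on the list returned by `get_evidence_by_type`, which would raise `IndexError` if the design phase produced no item of that type. A guarded lookup is one way to handle this. The sketch below is hypothetical: `first_evidence_content` is not an amplihack function, and `FakeResult` is a stand-in that models only the `get_evidence_by_type` method used in this tutorial.

```python
from types import SimpleNamespace


def first_evidence_content(result, evidence_type, default=None):
    """Return the content of the first evidence item of a type, or the default."""
    items = result.get_evidence_by_type(evidence_type)
    return items[0].content if items else default


class FakeResult:
    """Stand-in result object; only get_evidence_by_type is modeled."""

    def __init__(self, evidence):
        self.evidence = evidence

    def get_evidence_by_type(self, evidence_type):
        return [item for item in self.evidence if item.type == evidence_type]


if __name__ == "__main__":
    design = FakeResult(
        [SimpleNamespace(type="architecture_doc", content="# Module structure")]
    )
    print(first_evidence_content(design, "architecture_doc"))
    print(first_evidence_content(design, "qa_report", default="(no report)"))
```

With a helper like this, a pipeline can decide whether to continue without context or stop early when the expected document is missing.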
What's Next?¶
You've completed the tutorial! You now know how to:
- ✓ Run basic meta-delegations
- ✓ Choose appropriate personas
- ✓ Write effective success criteria
- ✓ Interpret evidence and results
- ✓ Handle failures and partial successes
Continue Learning¶
- How-To Guide: Common recipes and patterns
- Concepts: Deep dive into architecture
- Reference: Complete API documentation
- Troubleshooting: Fix common problems
Try These Advanced Topics¶
- Custom Scenarios: Use Gadugi to generate test scenarios
- Evidence Filtering: Find specific artifacts in results
- Multi-Platform: Run same task on different platforms
- Chained Delegations: Pipeline multiple agents together
Status: [PLANNED - Implementation Pending]
This tutorial describes the intended usage of the meta-delegation system once implemented.