PM Label-Triggered Delegation - Test Plan¶
Overview¶
This document describes the testing approach for the pm:delegate label-triggered workflow.
Feature: GitHub Actions workflow that triggers PM Architect delegation when pm:delegate label is added to an issue or PR.
Issue: #1523
Components¶
- Workflow File: .github/workflows/pm-label-delegate.yml
- Delegation Script: ~/.amplihack/.claude/skills/pm-architect/scripts/delegate_response.py
- Label: pm:delegate (will be created if it doesn't exist)
Test Strategy¶
Since GitHub Actions can only be fully tested in the CI environment, testing consists of:
- Pre-commit Validation (Completed)
- Manual Testing (After merge)
- Integration Testing (In production)
Pre-commit Validation ✅¶
Status: PASSED
Validated files:
- .github/workflows/pm-label-delegate.yml: YAML syntax valid
- ~/.amplihack/.claude/skills/pm-architect/scripts/delegate_response.py: Python formatting and type checking passed
Manual Testing Plan¶
Test 1: Issue Label Trigger¶
Objective: Verify workflow triggers on issue labeling
Steps:
- Create a test issue with a simple question (e.g., "What is the project structure?")
- Add the pm:delegate label to the issue
- Wait for workflow to complete (check Actions tab)
- Verify comment is posted with PM Architect response
Expected Result:
- Workflow runs successfully
- Comment posted within 5-10 minutes
- Response is relevant and helpful
- Response formatted correctly with header/footer
Success Criteria:
- ✅ Workflow completes without errors
- ✅ Comment posted to issue
- ✅ Response quality is reasonable
- ✅ No secrets exposed in logs
Test 2: PR Label Trigger¶
Objective: Verify workflow triggers on PR labeling
Steps:
- Create a test PR (can be trivial change)
- Add description asking for review feedback
- Add the pm:delegate label to the PR
- Wait for workflow to complete
- Verify comment is posted with PM Architect analysis
Expected Result:
- Workflow runs successfully
- Comment posted within 5-10 minutes
- Response analyzes PR appropriately
- Response formatted correctly
Success Criteria:
- ✅ Workflow completes without errors
- ✅ Comment posted to PR
- ✅ Response addresses PR context
- ✅ No secrets exposed in logs
Test 3: Error Handling¶
Objective: Verify graceful error handling
Steps:
- Create issue with extremely long body (>10KB text)
- Add pm:delegate label
- Verify workflow handles large input gracefully
Expected Result:
- Workflow either succeeds or posts error comment
- No workflow crash or timeout
- Error message is helpful if failure occurs
Success Criteria:
- ✅ Workflow doesn't crash
- ✅ Error message posted if failure
- ✅ No secrets in error output
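The oversized body for this test can be generated rather than typed by hand. The following is a minimal sketch; the helper name `make_large_body` is hypothetical, and it assumes "10KB" means 10 × 1024 bytes of UTF-8 text.

```python
def make_large_body(min_bytes: int = 10 * 1024) -> str:
    """Build an issue body whose UTF-8 size is strictly greater than min_bytes."""
    line = "This is filler text for the large-input test.\n"
    # One extra repetition guarantees the result exceeds min_bytes.
    repeats = min_bytes // len(line.encode("utf-8")) + 1
    return line * repeats


body = make_large_body()
print(len(body.encode("utf-8")))  # size in bytes of the generated body
```

The generated text can be pasted into the issue body before adding the label.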
Test 4: Multiple Labels¶
Objective: Verify selective triggering
Steps:
- Create issue
- Add multiple labels including pm:delegate
- Verify workflow triggers only for pm:delegate
- Remove and re-add pm:delegate
- Verify workflow triggers again
Expected Result:
- Workflow only triggers on pm:delegate label addition
- Works with other labels present
- Can be re-triggered by removing and re-adding label
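The selective-trigger behaviour being tested can be sketched as a predicate over the event payload. This is an illustration only, not the condition used in the workflow file: the function name `should_delegate` is hypothetical, and the payload shape assumes the GitHub `labeled` event, which carries an `action` field and a `label` object with a `name`.

```python
TRIGGER_LABEL = "pm:delegate"


def should_delegate(event: dict) -> bool:
    """Return True only when the event is the addition of the trigger label."""
    if event.get("action") != "labeled":
        return False
    label = event.get("label") or {}
    return label.get("name") == TRIGGER_LABEL


# Illustrative payload fragments (only the fields the predicate reads).
events = [
    {"action": "labeled", "label": {"name": "bug"}},            # other label added
    {"action": "labeled", "label": {"name": "pm:delegate"}},    # trigger label added
    {"action": "unlabeled", "label": {"name": "pm:delegate"}},  # trigger label removed
    {"action": "labeled", "label": {"name": "pm:delegate"}},    # trigger label re-added
]
print([should_delegate(e) for e in events])  # [False, True, False, True]
```

The sequence mirrors the test steps: adding an unrelated label and removing pm:delegate should not start a run, while each addition of pm:delegate should.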
Security Testing¶
Security Test 1: API Key Masking¶
Check: Review workflow logs to ensure API key never appears
Steps:
- Run workflow on test issue
- Download workflow logs
- Search for any occurrence of API key or patterns that look like keys
Expected: No API keys visible in any log output
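The log search can be partly automated. The sketch below assumes the downloaded log archive has been extracted to a directory of `.txt` files; the regular expressions are assumed examples of key-like patterns, not the formats actually used by this project, and should be adjusted to match the real credentials. It reports line numbers only, so the scan itself never reprints a suspected key.

```python
import re
from pathlib import Path

# Assumed patterns; replace with the key formats actually in use.
KEY_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),       # generic "sk-" style API keys
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),  # GitHub-style token prefixes
]


def find_key_like_strings(text: str) -> list[str]:
    """Return one finding per (line, pattern) match, without echoing the match."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern in KEY_PATTERNS:
            if pattern.search(line):
                hits.append(f"line {line_no}: matches {pattern.pattern}")
    return hits


def scan_log_dir(log_dir: str) -> dict[str, list[str]]:
    """Scan every .txt file under log_dir and collect findings per file."""
    results = {}
    for path in Path(log_dir).rglob("*.txt"):
        hits = find_key_like_strings(path.read_text(errors="replace"))
        if hits:
            results[str(path)] = hits
    return results
```

An empty result from `scan_log_dir` is consistent with the expected outcome but does not prove it; a manual review of the logs is still worthwhile.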
Security Test 2: User Input Sanitization¶
Check: Verify malicious user input doesn't break workflow
Steps:
- Create issue with shell-injection-like content (e.g., $(whoami))
- Add pm:delegate label
- Verify workflow handles input safely
Expected: Input treated as literal text, no code execution
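How the workflow hands issue text to the delegation script is not described here. As a point of comparison, the sketch below shows one common way to keep user text literal in Python: passing it as a single argument in a list to `subprocess.run`, with no shell involved. The helper name `echo_via_subprocess` is hypothetical.

```python
import subprocess
import sys


def echo_via_subprocess(user_text: str) -> str:
    """Pass user_text to a child process as one argv entry (no shell is used)."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


payload = "$(whoami)"
print(echo_via_subprocess(payload))  # prints the literal text: $(whoami)
```

Because no shell interprets the argument, the command substitution is never evaluated. If the real workflow interpolates issue text directly into a shell command, this test is where that would show up.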
Security Test 3: Permission Boundaries¶
Check: Verify workflow has minimal required permissions
Review:
- Workflow has read-only access to repo contents
- Workflow can only write comments (not code changes)
- No elevated permissions granted
Expected: Permissions match specification in workflow file
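The review can be backed by a small comparison against an expected permission set. The sketch below is illustrative: `EXPECTED_PERMISSIONS` is an assumption inferred from the review points above (read-only contents, comment-only writes), not the specification from the workflow file, and the `actual` dictionary is expected to be copied by hand from the workflow's `permissions` block.

```python
# Assumed expectation; replace with the permissions the workflow actually specifies.
EXPECTED_PERMISSIONS = {
    "contents": "read",
    "issues": "write",
    "pull-requests": "write",
}


def permission_violations(actual: dict) -> list[str]:
    """List scopes in `actual` that exceed the expected permission set."""
    problems = []
    for scope, level in actual.items():
        if level == "none":
            continue  # explicitly denied scopes are never a violation
        allowed = EXPECTED_PERMISSIONS.get(scope)
        if allowed is None:
            problems.append(f"unexpected scope: {scope}={level}")
        elif level == "write" and allowed == "read":
            problems.append(f"{scope} is write, expected read")
    return problems


print(permission_violations({"contents": "read", "issues": "write"}))  # []
print(permission_violations({"contents": "write"}))
# ['contents is write, expected read']
```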
Performance Testing¶
Performance Test 1: Response Time¶
Objective: Measure typical response time
Steps:
- Add pm:delegate label to test issue
- Note timestamp of label addition
- Note timestamp of response comment
- Calculate duration
Expected: Response within 5-10 minutes for simple queries
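The duration calculation can be sketched as follows, assuming both timestamps are recorded in the ISO 8601 UTC form that GitHub displays in its API (for example `2024-01-15T10:00:00Z`). The function names and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_github_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def response_time(labeled_at: str, commented_at: str) -> timedelta:
    """Time elapsed between label addition and the response comment."""
    return parse_github_timestamp(commented_at) - parse_github_timestamp(labeled_at)


# Sample values, not measurements.
elapsed = response_time("2024-01-15T10:00:00Z", "2024-01-15T10:07:30Z")
print(elapsed)                           # 0:07:30
print(elapsed <= timedelta(minutes=10))  # True
```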
Performance Test 2: Timeout Handling¶
Objective: Verify 30-minute timeout works
Steps:
- (If possible) create scenario that causes long execution
- Verify workflow terminates at 30-minute mark
- Verify timeout error is reported
Expected: Workflow respects timeout, reports timeout error
Integration Testing¶
Integration Test 1: With Existing PM Workflows¶
Objective: Verify no conflicts with other PM workflows
Steps:
- Have multiple PM workflows active (daily status, roadmap review, triage)
- Trigger pm:delegate workflow
- Verify all workflows coexist without issues
Expected: No workflow conflicts or resource contention
Integration Test 2: Auto Mode Integration¶
Objective: Verify auto mode spawns correctly
Steps:
- Check ~/.amplihack/.claude/runtime/logs/ for auto mode session logs
- Verify logs are created when delegation runs
- Verify logs contain expected content
Expected: Auto mode logs created in correct location
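The log check can be scripted as a search for files modified since the delegation started. This is a sketch under assumptions: the file naming and layout inside the log directory are not documented here, so the helper (hypothetically named `recent_logs`) simply lists every file under the directory that changed after a given time. It has to run on the same machine as the delegation, since `~` resolves to that user's home directory.

```python
import time
from pathlib import Path

# Log location taken from the test steps above.
LOG_DIR = Path.home() / ".amplihack" / ".claude" / "runtime" / "logs"


def recent_logs(log_dir: Path, since_epoch: float) -> list[Path]:
    """Return files under log_dir modified at or after since_epoch (Unix time)."""
    if not log_dir.is_dir():
        return []
    return sorted(
        p for p in log_dir.rglob("*")
        if p.is_file() and p.stat().st_mtime >= since_epoch
    )


# Example: list files touched in the last 30 minutes.
for path in recent_logs(LOG_DIR, time.time() - 30 * 60):
    print(path)
```

An empty list after a delegation run would suggest that logs are missing or written elsewhere.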
Test Schedule¶
- Immediate (PR review phase):
  - Security review of workflow file
  - Code review of delegation script
  - Pre-commit validation (DONE)
- After PR Merge:
  - Test 1: Issue label trigger
  - Test 2: PR label trigger
  - Security Test 1-3
- Within 24 Hours of Merge:
  - Test 3: Error handling
  - Test 4: Multiple labels
  - Performance Test 1
- Within 1 Week of Merge:
  - Integration Test 1-2
  - Performance Test 2 (if applicable)
Success Metrics¶
Overall feature is successful if:
- ✅ Reliability: 95%+ of triggers result in successful response
- ✅ Security: No secrets exposed in any logs
- ✅ Performance: 90%+ of responses within 10 minutes
- ✅ Quality: Responses are relevant and actionable
- ✅ Stability: No workflow crashes or hangs
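The reliability and performance thresholds can be computed from a record of observed triggers. The sketch below is one possible reading of the metrics: reliability is the share of triggers that produced a response, and performance is the share of those responses that arrived within 10 minutes. The `TriggerResult` type and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggerResult:
    succeeded: bool
    minutes_to_response: Optional[float]  # None when no response was posted


def summarize(results: list[TriggerResult]) -> dict[str, float]:
    """Compute reliability and within-10-minutes ratios as fractions in [0, 1]."""
    responded = [r for r in results if r.succeeded]
    fast = [
        r for r in responded
        if r.minutes_to_response is not None and r.minutes_to_response <= 10
    ]
    return {
        "reliability": len(responded) / len(results) if results else 0.0,
        "within_10_min": len(fast) / len(responded) if responded else 0.0,
    }


# Sample data, not real measurements.
sample = [
    TriggerResult(True, 4.0),
    TriggerResult(True, 12.0),
    TriggerResult(False, None),
    TriggerResult(True, 9.0),
]
summary = summarize(sample)
print(summary["reliability"] >= 0.95, summary["within_10_min"] >= 0.90)  # False False
```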
Rollback Plan¶
If critical issues are discovered:
- Disable workflow by removing trigger events from YAML
- Push emergency fix
- Re-enable after verification
Alternative: Remove pm:delegate label from repository to prevent triggering
Notes¶
- GitHub Actions workflows cannot be reliably tested locally (neither act nor similar tools are reliable for this)
- Real-world testing is required after merge
- Workflow will use actual API credits during testing
- Consider creating test repository for validation before production use
Sign-off¶
- Pre-commit validation passed
- Security review completed
- Code review completed
- Test plan reviewed and approved
- Ready for merge and testing