Workflow Issue Extraction (Step 03b)¶
Home > Reference > Workflow Issue Extraction
Reference for the three-tier issue-number extraction logic in default-workflow step 03b. This step resolves a GitHub issue number from the workflow context so that downstream steps can link the working branch, commit, and pull request to the correct issue.
Contents¶
- Overview
- Three-Tier Extraction
- Tier 1 — Direct issue URL
- Tier 2 — Pull request URL
- Tier 3 — Bare #N reference
- Output Contract
- Shell Security Constraints
- Configuration
- Examples
- Troubleshooting
- Related
Overview¶
Step 03b accepts a free-form task_description string (e.g. a copied GitHub issue URL, a PR URL, or a plain-text description containing #123) and produces a canonical issue_number integer that the rest of the workflow uses.
task_description ──► 03b extraction ──► issue_number (int | null)
branch_prefix
issue_title (str | null)
If no issue number can be resolved, issue_number is null and the workflow continues with an anonymous branch name.
Three-Tier Extraction¶
The tiers are evaluated in order; the first match wins.
Tier 1 — Direct issue URL¶
Pattern: https://github.com/<owner>/<repo>/issues/<N>
When task_description contains a URL whose path component is /issues/<digits>, the issue number is extracted directly from the URL without any network call.
Input: "Fix the crash described in https://github.com/rysweet/amplihack/issues/3960"
Match: issues/3960
Output: issue_number=3960
This tier is purely regex-based and never calls gh.
Tier 2 — Pull request URL¶
Pattern: https://github.com/<owner>/<repo>/pull/<N>
When task_description contains a PR URL, step 03b calls gh pr view to retrieve the list of issues that the PR closes:
gh pr view <N> \
--repo <owner>/<repo> \
--json closingIssuesReferences \
--jq '.closingIssuesReferences[0].number'
The first closing issue reference is used. If the PR closes no issues, extraction falls through to Tier 3.
Input: "Continue work on https://github.com/rysweet/amplihack/pull/4143"
gh call: closingIssuesReferences → [3960, 3983]
Output: issue_number=3960 (first reference)
Timeout: 60 seconds. If gh does not respond within the timeout, Tier 2 is skipped and extraction falls through to Tier 3.
Tier 3 — Bare #N reference¶
Pattern: #<digits> anywhere in task_description
When neither a direct issue URL nor a PR URL is found, step 03b scans task_description for bare #N patterns. Each candidate is verified with:
The command returns the issue URL. If the URL path contains /issues/, the candidate is accepted. Candidates that return an empty URL (non-existent or wrong repo) are skipped.
Input: "Resume work on #3983 and #3960"
gh verify 3983 → url: "https://github.com/rysweet/amplihack/issues/3983"
→ contains /issues/ → issue_number=3983
Timeout: 60 seconds per candidate.
If no candidate passes verification, issue_number is null.
Output Contract¶
After step 03b completes, the workflow context contains:
| Field | Type | Description |
|---|---|---|
issue_number | int \| null | Resolved GitHub issue number, or null if unresolvable |
Planned enhancements:
issue_title(fetched from GitHub),branch_prefix(inferred from context), andextraction_tier(which tier produced the result) are targeted for a follow-up to improve debuggability.
Example output (Tier 1)¶
Example output (no match)¶
Shell Security Constraints¶
Step 03b follows the same shell-security rules as the rest of the default workflow:
| Constraint | Detail |
|---|---|
No shell=True | All gh subprocess calls use a list-form argv; no shell interpolation |
| Heredoc delimiter single-quoted | The EOFISSUECREATION delimiter in the issue-creation heredoc is always single-quoted ('EOFISSUECREATION') to prevent shell metacharacter expansion from task_description |
| 60-second timeout | Every gh subprocess call carries timeout=60; hung subprocesses do not stall the workflow |
| No credential logging | gh output is captured in memory and never written to disk unredacted |
Configuration¶
Step 03b reads the following values from the workflow context:
| Context key | Required | Description |
|---|---|---|
task_description | Yes | Free-form string describing the task |
repo_path | Yes | Absolute path used to derive owner/repo from git remote get-url origin |
expected_gh_account | No | If set, gh api /user is checked before Tier ⅔ network calls |
No environment variables are specific to step 03b; it relies on the authenticated gh CLI configured in the user's shell.
Examples¶
Resolving from an issue URL¶
# Recipe context snippet
task_description: |
Fix path-traversal bug.
See https://github.com/rysweet/amplihack/issues/3960
Resolving from a PR URL¶
No issues/ URL found → Tier 2
gh pr view 4143 --json closingIssuesReferences
→ [3960, 3983]
issue_number: 3960
Resolving from a bare #N pattern¶
No issues/ URL, no pull/ URL → Tier 3
gh issue view 3983 → url contains /issues/ → accepted
issue_number: 3983
No resolvable issue¶
Troubleshooting¶
gh: command not found
Install and authenticate the GitHub CLI: gh auth login. Step 03b degrades gracefully — Tier 1 (regex-only) still works without gh.
Tier 2 returns no closing issues
The PR may not use closing keywords (Closes #N, Fixes #N). Add a closing keyword to the PR description, or supply the issue URL directly in task_description.
Tier 3 skips a valid issue number
The issue may not exist in the repository that gh is authenticated against, or the gh issue view call returned an empty URL. Check that the issue exists and that gh auth status shows the correct account. Closed issues that are still in the repository will also resolve successfully — verification is URL-based (*/issues/*), not state-based.
Tier 3 does not resolve a #N that is present
The #N pattern requires at least one digit. Values like # alone or #abc are not matched. Check that the issue exists in the resolved repository and that gh issue view <N> returns a URL containing /issues/.
Related¶
- Workflow Execution Guardrails Reference — canonical execution root and GitHub identity gate (also part of the default workflow)
- How to Configure Workflow Execution Guardrails
- Default Workflow — full step list
- Lock Session ID Sanitization — the security fix delivered alongside this extraction improvement (PR #4143)
- Issues #3960 and #3983 — originating work
- PR #4143 — implementation
Metadata
| Field | Value |
|---|---|
| Status | Planned / PR #4143 |
| Issues | #3960, #3983 |
| PR | #4143 |
| Recipe file | amplifier-bundle/recipes/default-workflow.yaml (step 03b) |
| Test file | amplifier-bundle/tools/test_default_workflow_fixes.py |
| Gadugi spec | tests/gadugi/lock-recovery-and-issue-extraction.yaml |