amplihack Recipe Runner
A code-enforced workflow execution engine that reads declarative YAML recipe files and executes them step by step using AI agents. Unlike prompt-based workflow instructions, which a model can interpret loosely or skip, the Recipe Runner drives the execution loop from compiled Rust code, so a model cannot skip a step on its own.
Feature Highlights
- Comprehensive test suite covering unit, integration, recipe, example, and property-based testing
- Parallel step execution, tag filtering, JSONL audit logs
- Recipe composition via extends, pre/post/on_error hooks
- Safe condition language with recursive descent parser
Quick Start
# Build
cargo build --release
# Run a recipe
recipe-runner-rs path/to/recipe.yaml
# With context overrides
recipe-runner-rs recipe.yaml --set task_description="Add auth" --set repo_path="."
# Dry run
recipe-runner-rs recipe.yaml --dry-run
See the Quick Start guide for a more detailed walkthrough.
Quick Start
Get up and running with the amplihack Recipe Runner in minutes.
Install
# Clone the repository
git clone https://github.com/rysweet/amplihack-recipe-runner.git
cd amplihack-recipe-runner
# Build in release mode
cargo build --release
# The binary is at target/release/recipe-runner-rs
# Optionally copy it to your PATH:
cp target/release/recipe-runner-rs ~/.local/bin/
Your First Recipe
Create a file called hello.yaml:
name: "hello-world"
description: "A minimal recipe to verify your setup"
version: "1.0.0"
context:
  greeting: "Hello from the Recipe Runner!"
steps:
  - id: "greet"
    command: "echo '{{greeting}}'"
Run It
recipe-runner-rs hello.yaml
You should see the greeting printed to stdout.
Override Context
Pass --set to override context variables at runtime:
recipe-runner-rs hello.yaml --set greeting="Howdy, partner!"
Dry Run
Use --dry-run to see what would execute without actually running anything:
recipe-runner-rs hello.yaml --dry-run
Using Agent Steps
Agent steps invoke an AI agent instead of a shell command:
name: "analyze-project"
description: "Analyze a codebase with an AI agent"
version: "1.0.0"
context:
  repo_path: "."
steps:
  - id: "analyze"
    agent: "amplihack:core:architect"
    prompt: "Analyze the project at {{repo_path}} and summarize its structure"
    output: "analysis"
    parse_json: true
  - id: "report"
    command: "echo 'Analysis complete'"
    condition: "analysis"
Next Steps
- YAML Recipe Format — Full schema reference
- CLI Reference — All flags and subcommands
- Condition Language — Conditional step execution
- Architecture — How it works under the hood
YAML Recipe Format Reference
Complete schema reference for amplihack recipe runner YAML files.
Top-Level Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | yes | — | Recipe name |
| version | string | no | "1.0" | Semantic version |
| description | string | no | "" | Human-readable description |
| author | string | no | "" | Author name |
| tags | list of strings | no | [] | Recipe tags for categorisation |
| context | map | no | {} | Default variable values for templates |
| extends | string | no | — | Parent recipe name (for inheritance). Note: only single-level inheritance is supported; extended recipes cannot themselves use extends. |
| recursion | RecursionConfig | no | see below | Sub-recipe recursion limits |
| hooks | RecipeHooks | no | — | Lifecycle hooks |
| steps | list of Step | yes | — | Ordered list of steps to execute |
RecursionConfig
Controls sub-recipe nesting limits.
| Field | Type | Default | Description |
|---|---|---|---|
| max_depth | int | 6 | Maximum sub-recipe recursion depth |
| max_total_steps | int | 200 | Maximum total steps across all sub-recipes |
RecipeHooks
Shell commands executed at lifecycle boundaries.
| Field | Type | Description |
|---|---|---|
| pre_step | string | Shell command to run before each step |
| post_step | string | Shell command to run after each step |
| on_error | string | Shell command to run on step failure |
Hook commands receive context variables via template substitution.
Step Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | yes | — | Unique step identifier |
| type | string | no | inferred | "bash", "agent", or "recipe" (see inference rules) |
| command | string | no | — | Shell command (bash steps) |
| agent | string | no | — | Agent reference (agent steps) |
| prompt | string | no | — | Prompt template (agent steps) |
| output | string | no | — | Variable name to store step output in context |
| condition | string | no | — | Expression that must be truthy to execute |
| parse_json | bool | no | false | Extract JSON from step output |
| parse_json_required | bool | no | false | Fail the step if JSON extraction fails (see below) |
| mode | string | no | — | Execution mode |
| working_dir | string | no | — | Override working directory for this step |
| timeout | int | no | — | Step timeout in seconds |
| auto_stage | bool | no | true | Git auto-stage after agent steps |
| model | string | no | — | Model override for agent steps (e.g., "haiku", "sonnet") |
| recipe | string | no | — | Sub-recipe name (recipe steps) |
| recovery_on_failure | bool | no | false | Attempt agentic recovery if sub-recipe fails (see below) |
| context | map | no | — | Context overrides passed to sub-recipe |
| continue_on_error | bool | no | false | Continue execution if this step fails |
| when_tags | list of strings | no | [] | Step only runs when these tags match active tag filters |
| parallel_group | string | no | — | Group name for parallel execution |
Note: The context field provides step-specific variables that are passed as overrides to sub-recipes. In YAML you write context:.
Type Inference Rules
When type is omitted, the effective step type is inferred in this order:
- recipe field present → recipe type
- agent field present → agent type
- prompt present without command → agent type
- Otherwise → bash type (default)
An explicit type value always takes precedence.
# Inferred as bash (has command, no agent/recipe/prompt)
- id: build
  command: cargo build --release
# Inferred as agent (agent field present)
- id: review
  agent: code-reviewer
  prompt: "Review {{file}}"
# Inferred as agent (prompt without command)
- id: summarise
  prompt: "Summarise the changes in {{diff}}"
# Inferred as recipe (recipe field present)
- id: deploy
  recipe: deploy-production
  context:
    env: staging
# Explicit type overrides inference
- id: special
  type: bash
  prompt: "This prompt is ignored because type is bash"
  command: echo "explicit wins"
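The inference order above can be sketched as a small function. This is an illustration only: `StepFields`, `StepType`, and `infer_step_type` are hypothetical names, not the runner's actual types, and falling back to inference for an unrecognised explicit type is an assumption (the real runner may reject it instead).

```rust
/// Step kinds named in the reference above.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum StepType {
    Bash,
    Agent,
    Recipe,
}

/// Hypothetical, trimmed-down view of a step: only the fields that
/// matter for type inference.
#[derive(Default)]
pub struct StepFields<'a> {
    pub explicit_type: Option<&'a str>,
    pub recipe: Option<&'a str>,
    pub agent: Option<&'a str>,
    pub prompt: Option<&'a str>,
    pub command: Option<&'a str>,
}

pub fn infer_step_type(step: &StepFields) -> StepType {
    // An explicit, recognised type always takes precedence.
    match step.explicit_type {
        Some("bash") => return StepType::Bash,
        Some("agent") => return StepType::Agent,
        Some("recipe") => return StepType::Recipe,
        // Assumption: anything else falls through to inference.
        _ => {}
    }
    if step.recipe.is_some() {
        StepType::Recipe
    } else if step.agent.is_some() {
        StepType::Agent
    } else if step.prompt.is_some() && step.command.is_none() {
        StepType::Agent
    } else {
        StepType::Bash
    }
}
```

Keeping the rules in one ordered chain makes the precedence easy to audit: each branch corresponds to one line of the list above.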
Template Syntax
Variables are substituted using {{variable_name}} syntax. Variable names may
contain letters, digits, underscores, hyphens, and dots.
context:
  project: my-app
  branch: main
steps:
  - id: greet
    command: echo "Building {{project}} on {{branch}}"
Dot Notation
Nested context values are accessed with dot notation:
context:
  deploy:
    target: production
    region: us-east-1
steps:
  - id: deploy
    command: ./deploy.sh --target {{deploy.target}} --region {{deploy.region}}
Shell Escaping
When templates are rendered for shell commands, values are shell-escaped
automatically via shell_escape to prevent injection. Undefined variables
resolve to an empty string.
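A minimal sketch of the substitution behaviour described above, using only the standard library. The names `render` and `shell_quote` are hypothetical. Two simplifications are assumed: the context is a flat map (a dotted name such as deploy.target is looked up as a single key), and `shell_quote` is an always-quote stand-in for the shell_escape crate rather than its actual algorithm.

```rust
use std::collections::HashMap;

/// Stand-in for shell escaping: wrap the value in single quotes and
/// rewrite each embedded single quote as '\''.
pub fn shell_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', "'\\''"))
}

/// Replace every {{name}} with its context value.
/// Undefined names render as an empty string.
pub fn render(template: &str, ctx: &HashMap<&str, &str>, for_shell: bool) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                let value = ctx.get(name).copied().unwrap_or("");
                if for_shell {
                    out.push_str(&shell_quote(value));
                } else {
                    out.push_str(value);
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the text verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

With shell rendering enabled, an undefined variable becomes two single quotes rather than nothing, which matches the gotcha noted in the condition language reference.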
Condition Syntax
The condition field accepts an expression that is evaluated against the
current context. Steps with a falsy condition are skipped.
See conditions.md for the full reference. Supported operators and built-in functions include:
- Comparisons: ==, !=, <, <=, >, >=
- Logical: and, or, not
- Membership: in, not in
- Functions: int(), str(), len(), bool(), float(), min(), max()
- Methods: strip(), lower(), upper(), startswith(), endswith(), replace(), split(), join(), count(), find()
- id: deploy
  condition: "branch == 'main' and tests_passed == 'true'"
  command: ./deploy.sh
JSON Extraction (parse_json)
When parse_json: true, the runner attempts to extract structured JSON from step
output using three strategies in order:
- Direct parse — the entire trimmed output is valid JSON.
- Markdown fence extraction — JSON inside ```json ... ``` fences.
- Balanced bracket detection — locates the first {…} or […] block with proper depth tracking, string awareness, and escape handling.
If all strategies fail a warning is logged and the raw output is stored.
- id: get-config
  command: curl -s https://api.example.com/config
  output: api_config
  parse_json: true
- id: use-config
  command: echo "Region is {{api_config.region}}"
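The third strategy, balanced bracket detection, can be sketched as follows. This is an illustrative reconstruction, not the runner's code: the function names are hypothetical, and the sketch only locates a candidate block. A real implementation would still need to hand the candidate to a JSON parser to confirm it is valid.

```rust
/// Return the first balanced {...} or [...] block in `text`, if any.
pub fn first_balanced_block(text: &str) -> Option<&str> {
    let bytes = text.as_bytes();
    let mut start = 0;
    while start < bytes.len() {
        if bytes[start] == b'{' || bytes[start] == b'[' {
            if let Some(end) = scan_block(bytes, start) {
                // Both indices sit on ASCII bytes, so slicing is safe.
                return Some(&text[start..=end]);
            }
        }
        start += 1;
    }
    None
}

/// Scan from the opening bracket at `start` and return the index of the
/// matching closer, or None if the brackets never balance.
fn scan_block(bytes: &[u8], start: usize) -> Option<usize> {
    let mut stack: Vec<u8> = Vec::new(); // expected closers (depth tracking)
    let mut in_string = false; // string awareness
    let mut escaped = false; // escape handling inside strings
    for (i, &b) in bytes.iter().enumerate().skip(start) {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => stack.push(b'}'),
            b'[' => stack.push(b']'),
            b'}' | b']' => {
                if stack.pop() != Some(b) {
                    return None; // mismatched closer
                }
                if stack.is_empty() {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None
}
```

Tracking string state matters because a closing brace inside a JSON string value must not end the block early.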
Strict Mode (parse_json_required)
By default, JSON extraction failure degrades the step (status becomes degraded)
but the recipe continues. Set parse_json_required: true to make extraction
failure a hard error that stops the recipe.
- id: must-be-json
  command: curl -s https://api.example.com/data
  parse_json: true
  parse_json_required: true # fails the recipe if output isn't valid JSON
  output: api_data
| parse_json_required | On extraction failure |
|---|---|
| false (default) | Step marked degraded, raw output stored, recipe continues |
| true | Step marked failed, recipe stops immediately |
Sub-Recipe Recovery (recovery_on_failure)
When a sub-recipe step fails, set recovery_on_failure: true to trigger an
agentic recovery attempt. The runner sends the failure details to an agent,
which attempts to complete the remaining work.
- id: deploy
  recipe: deploy-to-staging
  recovery_on_failure: true # agent attempts recovery if deploy fails
If the agent’s recovery output contains “STATUS: COMPLETE” or “recovered”, the step is marked as recovered and the recipe continues. Otherwise, the original failure propagates.
Model Override (model)
Agent steps can override the default model using the model field. The value
is passed to the adapter, which maps it to a specific model identifier.
- id: quick-check
  agent: reviewer
  prompt: "Quick lint check on {{file_path}}"
  model: haiku # fast, cheap model for simple tasks
- id: deep-review
  agent: reviewer
  prompt: "Thorough security review of {{file_path}}"
  model: sonnet # more capable model for complex analysis
Complete Examples
1. Simple Bash-Only Recipe
name: build-and-test
version: "1.0"
description: Build the project and run tests
author: dev-team
tags: [ci, build]
context:
  build_mode: release
steps:
  - id: clean
    command: cargo clean
  - id: build
    command: cargo build --{{build_mode}}
  - id: test
    command: cargo test --{{build_mode}}
    output: test_results
  - id: report
    command: echo "Tests complete. Results {{test_results}}"
2. Agent-Based Workflow
name: code-review-workflow
version: "1.0"
description: Automated code review with AI agents
context:
  target_branch: main
steps:
  - id: get-diff
    command: git diff {{target_branch}} --stat
    output: diff_summary
  - id: review
    agent: code-reviewer
    prompt: |
      Review the following changes against {{target_branch}}:
      {{diff_summary}}
      Focus on correctness, security, and performance.
    output: review_result
    parse_json: true
  - id: check-approved
    condition: "review_result.approved == true"
    command: echo "Review passed"
  - id: request-changes
    condition: "review_result.approved != true"
    command: echo "Changes requested — see review_result.comments"
3. Sub-Recipe Composition
name: full-pipeline
version: "2.0"
description: End-to-end pipeline composing smaller recipes
context:
  environment: staging
steps:
  - id: lint
    recipe: lint-check
  - id: build
    recipe: build-project
    context:
      build_mode: release
      target: "{{environment}}"
  - id: deploy
    recipe: deploy-service
    context:
      env: "{{environment}}"
      version: "{{build.version}}"
    condition: "environment != 'local'"
4. Recipe with Hooks, Tags, and Recursion Limits
name: guarded-pipeline
version: "1.0"
description: Pipeline with lifecycle hooks and safety limits
author: platform-team
tags: [production, safe]
recursion:
  max_depth: 3
  max_total_steps: 50
hooks:
  pre_step: echo "[$(date -Iseconds)] Starting step"
  post_step: echo "[$(date -Iseconds)] Finished step"
  on_error: |
    echo "FAILED — sending alert"
    curl -s -X POST https://alerts.example.com/hook \
      -d '{"step": "failed", "recipe": "guarded-pipeline"}'
context:
  notify: true
steps:
  - id: preflight
    command: ./scripts/preflight-check.sh
  - id: migrate
    command: ./scripts/migrate.sh
    when_tags: [database]
  - id: deploy
    command: ./scripts/deploy.sh
    when_tags: [deploy]
  - id: smoke-test
    command: ./scripts/smoke-test.sh
    timeout: 120
    when_tags: [deploy]
  - id: notify
    condition: "notify == 'true'"
    command: echo "Pipeline complete"
5. Recipe with continue_on_error and Conditions
name: resilient-checks
version: "1.0"
description: Run multiple checks, collecting results even on failures
context:
  strict: false
steps:
  - id: lint
    command: cargo clippy -- -D warnings
    output: lint_result
    continue_on_error: true
  - id: test
    command: cargo test 2>&1
    output: test_result
    continue_on_error: true
  - id: audit
    command: cargo audit
    output: audit_result
    continue_on_error: true
  - id: gate
    condition: "strict == 'true'"
    command: |
      echo "Lint: {{lint_result}}"
      echo "Test: {{test_result}}"
      echo "Audit: {{audit_result}}"
      # Fail the pipeline in strict mode if any check failed
      exit 1
  - id: summary
    condition: "strict != 'true'"
    command: |
      echo "=== Check Summary ==="
      echo "Lint: {{lint_result}}"
      echo "Test: {{test_result}}"
      echo "Audit: {{audit_result}}"
      echo "Non-strict mode — pipeline continues"
CLI Reference
Complete reference for the recipe-runner-rs command-line interface.
Synopsis
recipe-runner-rs [OPTIONS] [RECIPE] [COMMAND]
recipe-runner-rs list [OPTIONS]
Subcommands
list
Discover and display all available recipes found in the configured search directories.
recipe-runner-rs list
recipe-runner-rs list --recipe-dir ./custom-recipes
recipe-runner-rs list --recipe-dir ./team-recipes --recipe-dir ./personal-recipes
Global Options
-C, --working-dir <DIR>
Set the working directory for recipe execution.
Default: . (current directory)
# Run a recipe from a different directory
recipe-runner-rs deploy.yaml --working-dir /home/user/my-project
# Short form
recipe-runner-rs deploy.yaml -C /home/user/my-project
# Combine with other options
recipe-runner-rs build.yaml -C ../other-repo --dry-run
-R, --recipe-dir <DIR>
Add a directory to the recipe search path. Can be specified multiple times to search across several directories.
# Single directory
recipe-runner-rs my-recipe --recipe-dir ./recipes
# Multiple directories (searched in order)
recipe-runner-rs my-recipe \
--recipe-dir ./project-recipes \
--recipe-dir ~/.config/recipes \
--recipe-dir /opt/shared-recipes
# Short form
recipe-runner-rs my-recipe -R ./recipes -R ../shared
# Combine with list to discover recipes across directories
recipe-runner-rs list -R ./recipes -R /opt/shared-recipes
--set <KEY=VALUE>
Override a context variable. Can be specified multiple times to set several variables. Values are automatically typed using smart parsing (see Smart Context Value Parsing).
# String value
recipe-runner-rs deploy.yaml --set environment=production
# Integer value (auto-detected)
recipe-runner-rs scale.yaml --set replicas=5
# Float value (auto-detected)
recipe-runner-rs tune.yaml --set ratio=0.75
# Boolean value (auto-detected)
recipe-runner-rs build.yaml --set verbose=true
# JSON value (auto-detected)
recipe-runner-rs config.yaml --set data='{"host": "localhost", "port": 8080}'
# Multiple overrides
recipe-runner-rs deploy.yaml \
--set environment=production \
--set replicas=3 \
--set debug=false \
--set version=2.1.0
--dry-run
Parse and validate the recipe without executing any steps. Useful for checking recipe correctness before committing to a run.
recipe-runner-rs deploy.yaml --dry-run
# Combine with --set to validate context overrides
recipe-runner-rs deploy.yaml --dry-run --set environment=staging
# Combine with --progress to see what steps would run
recipe-runner-rs deploy.yaml --dry-run --progress
--no-auto-stage
Disable automatic git staging of file changes made during recipe execution.
recipe-runner-rs codegen.yaml --no-auto-stage
# Useful when you want to review changes before staging
recipe-runner-rs refactor.yaml --no-auto-stage -C /path/to/repo
--validate-only
Parse and validate the recipe, print any warnings, then exit. Does not execute any steps. More thorough than --dry-run as it focuses on surfacing validation warnings.
recipe-runner-rs deploy.yaml --validate-only
# Validate a recipe in a specific directory
recipe-runner-rs my-recipe --validate-only -R ./recipes
# Validate with context overrides to check for missing variables
recipe-runner-rs deploy.yaml --validate-only --set environment=production
--explain
Show the structure of a recipe without executing it. Displays the recipe name, version, and each step with its conditions, agents, and commands.
recipe-runner-rs deploy.yaml --explain
# Explain a recipe found via search path
recipe-runner-rs my-recipe --explain -R ./recipes
Example output:
Recipe: deploy
Version: 1.2.0
Steps:
1. build
Agent: builder
Command: cargo build --release
2. test
Condition: when context.run_tests == true
Agent: tester
Command: cargo test
3. deploy
Agent: deployer
Command: ./scripts/deploy.sh
--progress
Print step progress events to stderr. Emits events when each step starts and completes, useful for monitoring long-running recipes.
recipe-runner-rs deploy.yaml --progress
# Capture progress separately from output
recipe-runner-rs deploy.yaml --progress 2>progress.log
# Combine with JSON output for machine-readable progress + results
recipe-runner-rs deploy.yaml --progress --output-format json
Example stderr output:
[step:start] build (1/3)
[step:complete] build (1/3) — ok
[step:start] test (2/3)
[step:complete] test (2/3) — ok
[step:start] deploy (3/3)
[step:complete] deploy (3/3) — ok
--include-tags <TAGS>
Comma-separated list of tags. Only steps whose when_tags match at least one of the specified tags will run. All other steps are skipped.
# Run only steps tagged "frontend"
recipe-runner-rs build.yaml --include-tags frontend
# Run steps tagged "test" or "lint"
recipe-runner-rs ci.yaml --include-tags test,lint
# Combine with --explain to preview filtered steps
recipe-runner-rs ci.yaml --include-tags test --explain
--exclude-tags <TAGS>
Comma-separated list of tags. Steps whose when_tags match any of the specified tags will be skipped.
# Skip slow integration tests
recipe-runner-rs ci.yaml --exclude-tags slow
# Skip multiple categories
recipe-runner-rs full-pipeline.yaml --exclude-tags slow,experimental,deprecated
# Include some, exclude others
recipe-runner-rs ci.yaml --include-tags test --exclude-tags slow
--audit-dir <DIR>
Directory where JSONL audit log files are written. Each recipe run produces one audit log file.
# Write audit logs to a directory
recipe-runner-rs deploy.yaml --audit-dir ./audit-logs
# Combine with other options for a fully audited production run
recipe-runner-rs deploy.yaml \
--audit-dir /var/log/recipe-runner \
--set environment=production \
--progress
--output-format <FORMAT>
Control the output format. Available formats:
| Format | Description |
|---|---|
| text | Human-readable output (default) |
| json | Machine-readable JSON output |
# Default text output
recipe-runner-rs deploy.yaml
# JSON output for scripting / CI pipelines
recipe-runner-rs deploy.yaml --output-format json
# Pipe JSON output to jq
recipe-runner-rs deploy.yaml --output-format json | jq '.steps[] | select(.status == "failed")'
# JSON output with progress on stderr
recipe-runner-rs deploy.yaml --output-format json --progress 2>/dev/null
Exit Codes
| Code | Meaning | Description |
|---|---|---|
| 0 | Success | Recipe completed successfully; all steps passed |
| 1 | Failure | Recipe failed; at least one step failed during execution |
| 2 | Parse/validation error | Invalid YAML syntax, unknown fields, or other validation errors |
# Check exit code in scripts (save $? once; every [ ... ] test overwrites it)
recipe-runner-rs deploy.yaml
status=$?
if [ "$status" -eq 0 ]; then
  echo "Deploy succeeded"
elif [ "$status" -eq 1 ]; then
  echo "Deploy failed — check step output"
elif [ "$status" -eq 2 ]; then
  echo "Recipe is invalid — check YAML syntax"
fi
# Use && / || for simple chaining
recipe-runner-rs build.yaml && recipe-runner-rs deploy.yaml
# Validate before running
recipe-runner-rs deploy.yaml --validate-only && recipe-runner-rs deploy.yaml
Smart Context Value Parsing (--set)
When using --set KEY=VALUE, the runner automatically determines the value type by attempting each parse strategy in order:
| Priority | Type | Detection | Example |
|---|---|---|---|
| 1 | JSON | Valid JSON object/array | --set data='{"key": "val"}' |
| 2 | Boolean | Literal true or false | --set verbose=true |
| 3 | Integer | Digits only (with optional sign) | --set count=5 |
| 4 | Float | Numeric with decimal point | --set ratio=0.5 |
| 5 | String | Everything else (fallback) | --set name=hello |
# JSON — parsed as a structured object
recipe-runner-rs setup.yaml --set config='{"host": "localhost", "port": 8080}'
recipe-runner-rs setup.yaml --set tags='["web", "api"]'
# Boolean — parsed as bool
recipe-runner-rs build.yaml --set release=true
recipe-runner-rs build.yaml --set skip_tests=false
# Integer — parsed as i64
recipe-runner-rs scale.yaml --set workers=8
recipe-runner-rs scale.yaml --set retries=0
# Float — parsed as f64
recipe-runner-rs tune.yaml --set threshold=0.95
recipe-runner-rs tune.yaml --set learning_rate=0.001
# String — fallback for everything else
recipe-runner-rs deploy.yaml --set branch=main
recipe-runner-rs deploy.yaml --set message="deploy to production"
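The priority order in the table can be sketched as a chain of parse attempts. This is a standard-library-only illustration with hypothetical names (`ContextValue`, `parse_value`, `parse_set_arg`). One deliberate simplification: because the Rust standard library has no JSON parser, the JSON stage here is a crude bracket check that keeps the raw text, where a real implementation would validate and parse it.

```rust
#[derive(Debug, PartialEq)]
pub enum ContextValue {
    Json(String), // raw JSON text; not validated in this sketch
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(String),
}

/// Split a KEY=VALUE argument at the first '=' and type the value.
pub fn parse_set_arg(arg: &str) -> Option<(String, ContextValue)> {
    let (key, raw) = arg.split_once('=')?;
    Some((key.to_string(), parse_value(raw)))
}

/// Try each strategy in priority order; fall back to a plain string.
pub fn parse_value(raw: &str) -> ContextValue {
    let t = raw.trim();
    // 1. JSON object/array (heuristic only: matching outer brackets).
    let looks_like_json = (t.starts_with('{') && t.ends_with('}'))
        || (t.starts_with('[') && t.ends_with(']'));
    if looks_like_json {
        return ContextValue::Json(t.to_string());
    }
    // 2. Boolean literals.
    match t {
        "true" => return ContextValue::Bool(true),
        "false" => return ContextValue::Bool(false),
        _ => {}
    }
    // 3. Integer: digits with an optional sign.
    if let Ok(n) = t.parse::<i64>() {
        return ContextValue::Int(n);
    }
    // 4. Float: numeric with a decimal point.
    if t.contains('.') {
        if let Ok(x) = t.parse::<f64>() {
            return ContextValue::Float(x);
        }
    }
    // 5. Everything else stays a string.
    ContextValue::Str(raw.to_string())
}
```

Under these rules a value such as 2.1.0 contains a decimal point but does not parse as a float, so it falls through to the string case.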
Environment Variables
RECIPE_RUNNER_RECIPE_DIRS
Additional recipe search directories, separated by colons. These directories are searched in addition to any specified via --recipe-dir.
# Set via environment
export RECIPE_RUNNER_RECIPE_DIRS="/opt/recipes:/home/user/.config/recipes"
recipe-runner-rs my-recipe
# Inline for a single invocation
RECIPE_RUNNER_RECIPE_DIRS=./recipes recipe-runner-rs list
# Combine with --recipe-dir (both are searched)
export RECIPE_RUNNER_RECIPE_DIRS="/opt/shared-recipes"
recipe-runner-rs my-recipe --recipe-dir ./local-recipes
Usage Examples
Basic Usage
# Run a recipe by file path
recipe-runner-rs ./recipes/build.yaml
# Run a recipe by name (searched in recipe directories)
recipe-runner-rs build
# List all discoverable recipes
recipe-runner-rs list
CI/CD Pipeline
# Validate, then run with JSON output and auditing
recipe-runner-rs deploy.yaml --validate-only \
&& recipe-runner-rs deploy.yaml \
--set environment=production \
--set version="$(git describe --tags)" \
--output-format json \
--audit-dir /var/log/deploys \
--progress
Development Workflow
# Preview what a recipe will do
recipe-runner-rs refactor.yaml --explain
# Dry-run with overrides to test logic
recipe-runner-rs refactor.yaml --dry-run \
--set target_module=auth \
--set aggressive=true
# Run without auto-staging to review changes manually
recipe-runner-rs refactor.yaml \
--set target_module=auth \
--no-auto-stage
Selective Step Execution
# Run only unit tests
recipe-runner-rs ci.yaml --include-tags unit
# Run everything except slow and integration tests
recipe-runner-rs ci.yaml --exclude-tags slow,integration
# Explain which steps match the filter
recipe-runner-rs ci.yaml --include-tags unit --explain
Multi-Directory Recipe Management
# Search across project, team, and global recipes
recipe-runner-rs list \
-R ./recipes \
-R ~/team-recipes \
-R /opt/global-recipes
# Or use the environment variable ($HOME rather than ~, which the shell does not expand inside double quotes)
export RECIPE_RUNNER_RECIPE_DIRS="./recipes:$HOME/team-recipes:/opt/global-recipes"
recipe-runner-rs list
Scripting and Automation
# Capture JSON output for downstream processing
output=$(recipe-runner-rs analyze.yaml --output-format json)
echo "$output" | jq '.summary'
# Run with full observability
recipe-runner-rs deploy.yaml \
--output-format json \
--progress \
--audit-dir ./audit \
--set environment=production \
2>progress.log \
1>result.json
Condition Language Reference
The recipe runner’s condition evaluator is a hand-rolled tokenizer + recursive-descent parser implemented in src/context.rs. Conditions are expressions evaluated to determine if a step should execute. If the condition evaluates to truthy, the step runs; otherwise it’s skipped.
If evaluation itself fails (e.g., a syntax error), the step is marked Failed — not skipped.
Truthiness
| Type | Truthy | Falsy |
|---|---|---|
| Boolean | true | false |
| Number | Any non-zero (e.g., 1, -3.14) | 0, 0.0 |
| String | Non-empty (e.g., "hello") | Empty string "" |
| Array | Non-empty | Empty [] |
| Object | Non-empty | Empty {} |
| Null | — | Always falsy |
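The truthiness table maps directly onto a match over a JSON-like value type. The `Value` enum below is a hypothetical stand-in for whatever JSON value type the runner uses internally; only the rules themselves come from the table.

```rust
/// Hypothetical JSON-like value type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Apply the truthiness rules from the table above.
pub fn is_truthy(v: &Value) -> bool {
    match v {
        Value::Null => false,
        Value::Bool(b) => *b,
        Value::Number(n) => *n != 0.0,
        Value::Str(s) => !s.is_empty(),
        Value::Array(items) => !items.is_empty(),
        Value::Object(fields) => !fields.is_empty(),
    }
}
```

Note that a non-empty string is truthy regardless of its content, so the string "false" is truthy under these rules.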
Operators
Listed by precedence, lowest to highest:
| Precedence | Operator | Kind | Description |
|---|---|---|---|
| 1 (lowest) | or | Logical | Short-circuit logical OR |
| 2 | and | Logical | Short-circuit logical AND |
| 3 | not | Unary | Logical negation (prefix) |
| 4 (highest) | == | Comparison | Equality (with type coercion) |
| | != | Comparison | Inequality |
| | < | Comparison | Less than |
| | <= | Comparison | Less than or equal |
| | > | Comparison | Greater than |
| | >= | Comparison | Greater than or equal |
| | in | Membership | Substring or array membership |
| | not in | Membership | Negated membership (parsed as one token) |
Type coercion in comparisons
- Equality (==, !=): Same types compare directly. Mixed types fall back to comparing string representations (so 5 == "5" is true).
- Ordering (<, <=, >, >=): Number–Number is numeric. String–String is lexicographic. String–Number attempts to parse the string as f64 then compares numerically. All other combinations are incomparable (condition evaluates as falsy).
- Membership (in, not in): Against a string, checks substring containment. Against an array, checks element equality via values_equal. Against any other type, evaluates as falsy.
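The ordering rules can be sketched with a comparison that returns no result for incomparable pairs. The `Value` enum and the `compare` and `less_than` helpers are hypothetical names used only for this illustration; the sketch covers ordering only, not equality or membership.

```rust
use std::cmp::Ordering;

/// Hypothetical value type, reduced to the variants relevant here.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
}

/// Order two values, or return None when they are incomparable.
pub fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        // Number with Number: numeric comparison.
        (Value::Number(x), Value::Number(y)) => x.partial_cmp(y),
        // String with String: lexicographic comparison.
        (Value::Str(x), Value::Str(y)) => Some(x.cmp(y)),
        // String with Number: parse the string as f64, then compare.
        (Value::Str(s), Value::Number(n)) => s.parse::<f64>().ok()?.partial_cmp(n),
        (Value::Number(n), Value::Str(s)) => n.partial_cmp(&s.parse::<f64>().ok()?),
        // Everything else is incomparable.
        _ => None,
    }
}

/// An incomparable pair makes the condition falsy rather than an error.
pub fn less_than(a: &Value, b: &Value) -> bool {
    compare(a, b) == Some(Ordering::Less)
}
```

One consequence worth noticing: two strings compare lexicographically even when both look numeric, so "10" sorts before "9" unless one side is converted with int() or float() first.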
Literals
| Type | Syntax | Notes |
|---|---|---|
| String | "hello" or 'world' | Single or double quotes. Backslash escapes supported (\', \"). |
| Number | 42, 3.14, -7 | All parsed and stored as f64. |
| Boolean | true, True, false, False | Case-sensitive to these exact forms. |
| None | none | Not a keyword — it’s an unknown identifier that resolves to Null. |
Identifiers
Identifiers are alphanumeric names (plus underscores) that look up values in the recipe context.
| Form | Example | Behavior |
|---|---|---|
| Simple | my_var | Looks up my_var in the top-level context. |
| Dot-notation | result.status | Nested lookup: context["result"]["status"]. |
| Unknown | undefined_var | Resolves to Null (falsy). No error raised. |
Dot-notation in identifiers is resolved during parsing — each segment walks one level deeper into nested JSON values. If any segment is missing, the whole expression resolves to Null.
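A minimal sketch of the nested lookup, assuming a JSON-like value type. The `Value` enum, `lookup`, and `sample_context` are hypothetical names for illustration; the behaviour shown (walk one level per segment, return Null on any miss) follows the description above.

```rust
/// Hypothetical JSON-like value type, reduced for this example.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Object(Vec<(String, Value)>),
}

/// Resolve a dotted path such as "result.status" against `root`.
/// Any missing segment, or a non-object in the middle, yields Null.
pub fn lookup(root: &Value, path: &str) -> Value {
    let mut current = root;
    for segment in path.split('.') {
        let next = match current {
            Value::Object(fields) => fields
                .iter()
                .find(|(k, _)| k == segment)
                .map(|(_, v)| v),
            _ => None,
        };
        match next {
            Some(v) => current = v,
            None => return Value::Null,
        }
    }
    current.clone()
}

/// Example context: { "result": { "status": "ok" }, "count": 3 }
pub fn sample_context() -> Value {
    Value::Object(vec![
        (
            "result".to_string(),
            Value::Object(vec![("status".to_string(), Value::Str("ok".to_string()))]),
        ),
        ("count".to_string(), Value::Number(3.0)),
    ])
}
```

Returning Null instead of an error is what lets a condition such as result.status == 'ok' be skipped quietly when an earlier step never produced result.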
Function Calls
Only whitelisted function names are allowed. Calling an unknown function is an error.
| Function | Signature | Description |
|---|---|---|
| int(value) | 1 arg | Convert to integer (i64). Strings are parsed, bools → 0/1, else 0. |
| float(value) | 1 arg | Convert to f64. Strings are parsed, bools → 0.0/1.0, else 0.0. |
| str(value) | 1 arg | Convert to string. Null → "". Numbers use serde’s to_string(). |
| bool(value) | 1 arg | Convert to boolean using the truthiness rules above. |
| len(value) | 1 arg | Length of string (bytes), array, or object. Other types return 0. |
| min(a, b, ...) | 2+ args | Minimum of values (uses ordering comparison). Requires at least 2 args. |
| max(a, b, ...) | 2+ args | Maximum of values (uses ordering comparison). Requires at least 2 args. |
Method Calls
Methods use .method(args) syntax and can only be called on string values. Calling a method on a non-string is an error. Only whitelisted method names are allowed.
| Method | Returns | Description |
|---|---|---|
| .strip() | String | Trim whitespace from both ends. |
| .lstrip() | String | Trim whitespace from the left (start). |
| .rstrip() | String | Trim whitespace from the right (end). |
| .lower() | String | Convert to lowercase. |
| .upper() | String | Convert to uppercase. |
| .title() | String | Title-case each whitespace-separated word. |
| .startswith(prefix) | Boolean | True if string starts with prefix. |
| .endswith(suffix) | Boolean | True if string ends with suffix. |
| .replace(old, new) | String | Replace all occurrences of old with new. |
| .split(sep) | Array | Split by sep. If no arg, splits on whitespace. |
| .join(arr) | String | Join array elements with the string as separator. |
| .count(sub) | Number | Count non-overlapping occurrences of sub. |
| .find(sub) | Number | Index of first occurrence of sub. Returns -1 if not found. |
Methods can be chained: name.strip().lower().
Safety Features
- Whitelist-only execution — Only the functions and methods listed above are allowed. Unknown names produce an error, not silent null.
- Dunder blocking — Any expression containing __ (e.g., __class__, __import__) is rejected before parsing even begins.
- No assignment, no side effects — The expression language is pure; it can only read context values and compute results.
- Unknown identifiers are null — Referencing a variable that doesn’t exist returns Null (falsy) rather than raising an error. This is intentional for optional-variable patterns.
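The first two safety checks are simple enough to sketch directly. The function names and error messages below are hypothetical; the function whitelist is copied from the Function Calls table, and the real evaluator in src/context.rs may structure these checks differently.

```rust
/// Function names from the Function Calls table.
const ALLOWED_FUNCTIONS: [&str; 7] = ["int", "float", "str", "bool", "len", "min", "max"];

/// Reject any expression containing a double underscore before parsing.
pub fn precheck(expr: &str) -> Result<(), String> {
    if expr.contains("__") {
        return Err("expression contains '__' and is rejected".to_string());
    }
    Ok(())
}

/// Allow only whitelisted function names; anything else is an error,
/// not a silent null.
pub fn check_function(name: &str) -> Result<(), String> {
    if ALLOWED_FUNCTIONS.contains(&name) {
        Ok(())
    } else {
        Err(format!("unknown function: {}", name))
    }
}
```

Running the substring check on the raw text is coarse, since it also rejects a harmless string literal that happens to contain two underscores, but it means no suspicious name ever reaches the parser.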
Important Gotchas
1. All numbers are f64
Numbers are parsed and stored as f64 internally. This means str(42) produces "42.0", not "42". If you need the integer string representation, store it as a string in the context instead of using str() on a numeric literal.
2. shell_escape::escape() wraps values in single quotes
When using render_shell() for template expansion, empty strings become '' (two single quotes), not the empty string. This is correct for shell safety but may surprise you in conditions that check the rendered result.
3. Unknown identifiers are null by design
This is a feature, not a bug. It allows patterns like condition: "optional_var" to work — if optional_var isn’t set, the condition is falsy and the step is skipped without error.
4. not in is a single operator
The tokenizer uses lookahead to parse not in as one token (NotIn), distinct from a standalone not followed by in. This means not in always means “not contained in”, never “negation of the result of in” — though the result is the same.
5. Boolean keywords are case-sensitive
Only true/True and false/False are recognized. TRUE, FALSE, tRue, etc. are treated as regular identifiers and will resolve to Null.
6. none is not a keyword
There is no none or None literal. Writing none creates an identifier lookup that (typically) resolves to Null because no context variable named none exists. This works in practice but is not guaranteed if someone sets a context variable called none.
Examples
Basic truthiness
# Truthy if 'analysis' is set and non-empty in context
condition: "analysis"
# Always true
condition: "true"
# Always false
condition: "false"
String comparison
# Exact match
condition: "status == 'success'"
# Not equal
condition: "status != 'error'"
# Case-insensitive comparison via method
condition: "status.lower() == 'success'"
Numeric comparison
# Greater than
condition: "count > 0"
# Compound range check
condition: "count > 0 and count < 10"
# With function conversion
condition: "int(exit_code) == 0"
Logical operators
# Negation
condition: "not skip_tests"
# AND
condition: "has_tests and not skip_tests"
# OR
condition: "use_cache or force_rebuild"
# Combined with parentheses
condition: "(status == 'success' or status == 'partial') and not skip"
Membership tests
# Substring containment
condition: "'error' in output"
# Negated containment
condition: "'error' not in output"
# Array membership (roles is an array in context)
condition: "'admin' in roles"
Function calls
# Length check
condition: "len(items) > 0"
# Type conversion
condition: "int(retry_count) < 3"
# Boolean conversion
condition: "bool(result)"
# Min/max
condition: "max(score_a, score_b) >= 80"
Method calls
# String prefix check
condition: "name.startswith('test_')"
# String suffix check
condition: "filename.endswith('.py')"
# Chained methods
condition: "input.strip().lower() == 'yes'"
# Replace and check
condition: "path.replace('\\', '/').startswith('/home')"
# Split and check length
condition: "len(csv_line.split(',')) > 3"
# Find (returns index or -1)
condition: "message.find('WARNING') >= 0"
# Count occurrences
condition: "log_output.count('ERROR') == 0"
Nested context access
# Dot-notation for nested values
condition: "result.status == 'ok'"
# Deep nesting
condition: "response.data.count > 0"
Optional variable patterns
# Skip step if variable isn't set (resolves to null → falsy)
condition: "optional_feature"
# Guard with default-like logic
condition: "config.verbose and len(debug_output) > 0"
Cross-type equality
# Number-string coercion: this is true if exit_code is 0 (the number)
condition: "exit_code == '0'"
# But be careful: str(42) gives "42.0", not "42"
# So this does NOT work as expected:
# condition: "str(count) == '42'" # produces "42.0" == "42" → false
Tutorial Examples
Progressive tutorials that teach one recipe runner feature at a time. Each tutorial is a self-contained YAML recipe you can run directly.
Source: examples/tutorials/
Tutorials
| # | Recipe | Feature | Run it |
|---|---|---|---|
| 01 | hello-world | Simplest recipe — one bash step | recipe-runner-rs examples/tutorials/01-hello-world.yaml |
| 02 | variables | Template {{variables}} and context | recipe-runner-rs examples/tutorials/02-variables.yaml |
| 03 | conditions | Conditional step execution | recipe-runner-rs examples/tutorials/03-conditions.yaml |
| 04 | multi-step-pipeline | Sequential steps with output chaining | recipe-runner-rs examples/tutorials/04-multi-step-pipeline.yaml |
| 05 | working-directories | Per-step working_dir | recipe-runner-rs examples/tutorials/05-working-directories.yaml |
| 06 | parse-json | JSON extraction from output | recipe-runner-rs examples/tutorials/06-parse-json.yaml |
| 07 | error-handling | continue_on_error | recipe-runner-rs examples/tutorials/07-error-handling.yaml |
| 08 | hooks | Pre/post/on_error hooks | recipe-runner-rs examples/tutorials/08-hooks.yaml |
| 09 | tags | when_tags + --include-tags | recipe-runner-rs examples/tutorials/09-tags.yaml --include-tags fast |
| 10 | parallel-groups | parallel_group concurrent execution | recipe-runner-rs examples/tutorials/10-parallel-groups.yaml |
| 11 | extends | Recipe inheritance via extends | recipe-runner-rs examples/tutorials/11-extends.yaml |
| 12 | recursion-limits | recursion config | recipe-runner-rs examples/tutorials/12-recursion-limits.yaml |
| 13 | timeouts | Step-level timeout | recipe-runner-rs examples/tutorials/13-timeouts.yaml |
| 14 | dry-run | --dry-run mode | recipe-runner-rs examples/tutorials/14-dry-run.yaml --dry-run |
Recommended Order
Start with 01-hello-world and work through sequentially. Each tutorial builds on concepts from previous ones.
Workflow Pattern Examples
Real-world workflow patterns that show how to compose recipe runner features for common development scenarios.
Source: examples/patterns/
Patterns
| Pattern | Recipe | Description |
|---|---|---|
| CI Pipeline | ci-pipeline.yaml | Gated build pipeline: checkout → deps → lint → test → build → package. Each step gates on prior success. |
| Code Review | code-review.yaml | Automated review: git diff → agent analysis → issue detection → review comments. |
| Deploy Pipeline | deploy-pipeline.yaml | Full deployment: pre-flight → build → integration test → staging → smoke test → promote. |
| Investigation | investigation.yaml | Systematic research: scope → explore (find/grep) → analyze → synthesize → document. |
| Migration | migration.yaml | Fail-fast migration: backup → validate → migrate → smoke test → verify. |
| Multi-Agent Consensus | multi-agent-consensus.yaml | Multiple agents analyze independently → synthesize votes → apply decision. |
| Quality Audit | quality-audit.yaml | Audit loop: lint → analyze → fix → re-lint → verify improvement. |
| Self-Improvement | self-improvement.yaml | Closed loop: eval → analyze errors → research → apply → re-eval → compare. |
Combining Patterns
Patterns compose via sub-recipe steps, hooks, tags, and parallel groups. Here’s a full deployment recipe that chains three patterns together — CI first, then review, then deploy — with quality audit as a gate between stages:
name: "ship-release"
description: "CI → Review → Quality Gate → Deploy"
version: "1.0"
context:
  repo_path: "."
  environment: "staging"
hooks:
  on_error: "echo 'Pipeline failed at step: $STEP_ID' >> pipeline.log"
steps:
  # ── Stage 1: Build & Test (sub-recipe) ──
  - id: "ci"
    recipe: "ci-pipeline"
    context:
      repo_path: "{{repo_path}}"
    output: "ci_result"
  # ── Stage 2: Parallel code reviews ──
  - id: "security-review"
    agent: "amplihack:security"
    parallel_group: "reviews"
    prompt: "Review {{repo_path}} for security vulnerabilities."
    output: "security_findings"
  - id: "architecture-review"
    agent: "amplihack:architect"
    parallel_group: "reviews"
    prompt: "Review {{repo_path}} for architectural issues."
    output: "arch_findings"
  # ── Stage 3: Quality gate (sub-recipe, conditional) ──
  - id: "quality-gate"
    recipe: "quality-audit"
    condition: "ci_result and 'PASS' in ci_result"
    context:
      repo_path: "{{repo_path}}"
    output: "audit_result"
  # ── Stage 4: Deploy (tagged: only runs with --include-tags release) ──
  - id: "deploy"
    recipe: "deploy-pipeline"
    when_tags: ["release"]
    condition: "'PASS' in audit_result"
    context:
      repo_path: "{{repo_path}}"
      environment: "{{environment}}"
    output: "deploy_result"
  # ── Notification ──
  - id: "notify"
    command: |
      echo "Release pipeline complete."
      echo "CI: {{ci_result}}"
      echo "Audit: {{audit_result}}"
      echo "Deploy: {{deploy_result}}"
This recipe demonstrates:
- Sub-recipes (the recipe field): CI, quality audit, and deploy each run as self-contained workflows
- Parallel groups (the parallel_group field): security and architecture reviews run concurrently
- Conditional gates (the condition field): quality audit only runs if CI passed; deploy only if audit passed
- Tag filtering (the when_tags field): deploy step only executes when --include-tags release is passed
- Error hooks (hooks.on_error): logs which step failed for post-mortem
- Output chaining: each stage’s result flows into the next stage’s conditions
Testing & Edge-Case Recipes
Recipes designed to exercise specific recipe runner features and edge cases. Useful as regression tests and as references for condition syntax.
Source: recipes/testing/
Recipes
| Recipe | What It Tests |
|---|---|
| all-condition-operators | Every comparison and boolean operator: ==, !=, <, <=, >, >=, and, or, not, in, not in |
| all-functions | All whitelisted functions: int(), str(), len(), bool(), float(), min(), max() |
| all-methods | All whitelisted string methods: strip(), lstrip(), rstrip(), lower(), upper(), title(), startswith(), endswith(), replace(), split(), join(), count(), find() |
| output-chaining | Step output stored in context and referenced by subsequent steps via {{variable}} |
| json-extraction-strategies | All 3 JSON extraction strategies: direct parse, markdown fence, balanced braces |
| step-type-inference | Automatic step type detection: bash (command), agent (agent field), recipe (recipe field), agent (prompt-only) |
| continue-on-error-chain | continue_on_error: true allowing subsequent steps to run after failures |
| nested-context | Dot-notation access to nested context values: {{config.database.host}} |
| large-context | Many context variables and long values to test template rendering at scale |
| empty-and-edge-cases | Empty strings, missing variables, whitespace-only values, special characters |
Production Recipes
These recipes ship with amplihack and demonstrate real-world workflow patterns at scale.
Source: amplifier-bundle/recipes/
Development Workflows
| Recipe | Description |
|---|---|
| default-workflow | Complete development lifecycle: requirements → design → implement → test → merge |
| verification-workflow | Lightweight workflow for trivial changes: config edits, doc updates, single-file fixes |
| qa-workflow | Minimal workflow for simple questions and informational requests |
| investigation-workflow | Systematic investigation with parallel agent deployment |
| guide | Interactive guide to amplihack features |
Quality & Reliability
| Recipe | Description |
|---|---|
| quality-audit-cycle | Iterative audit loop: lint → analyze → fix → re-lint → verify improvement |
| self-improvement-loop | Closed-loop eval improvement: eval → analyze → research → improve → re-eval → compare |
| domain-agent-eval | Evaluate domain agents: eval harness + teaching evaluation + combined report |
| long-horizon-memory-eval | 1000-turn memory stress test with self-improvement loop |
| sdk-comparison | Run L1-L12 eval on all 4 SDKs and generate comparative report |
Multi-Agent Decision Making
| Recipe | Description |
|---|---|
| consensus-workflow | Multi-agent consensus at critical decision points with structured checkpoints |
| debate-workflow | Multi-agent structured debate for complex decisions requiring diverse perspectives |
| n-version-workflow | N-version programming: generate multiple independent implementations, pick best |
| cascade-workflow | 3-level fallback cascade: primary → secondary → tertiary |
Orchestration
| Recipe | Description |
|---|---|
| smart-orchestrator | Task classifier + goal-seeking loop with up to 3 execution rounds |
| auto-workflow | Autonomous multi-turn workflow — continues until task complete or max iterations |
Migration
| Recipe | Description |
|---|---|
| oxidizer-workflow | Automated Python-to-Rust migration with quality audit cycles and degradation checks |
Architecture — amplihack-recipe-runner
Rust implementation of the amplihack recipe runner. Parses YAML recipe files, evaluates conditions in a sandboxed expression language, and executes steps (bash commands, AI agent prompts, or nested sub-recipes) through a pluggable adapter layer.
Module Dependency Diagram
┌─────────────────────────────────────────────────────────────────────┐
│ main.rs (CLI) │
│ clap args → parse → build runner → execute → format output │
└──────┬──────────┬───────────┬──────────┬───────────┬───────────────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
parser.rs runner.rs discovery.rs adapters/ models.rs
│ │ │ │ │ cli_subprocess.rs
│ │ │ │ │ │
│ │ │ └───────┘ │
│ │ │ │
│ ▼ ▼ │
│ context.rs agent_resolver.rs │
│ │ │
└───────┴────────────────────────────┘
models.rs (shared types)
graph TD
main[main.rs — CLI] --> parser[parser.rs]
main --> runner[runner.rs]
main --> discovery[discovery.rs]
main --> cli_sub[cli_subprocess.rs]
lib[lib.rs — Public API] --> parser
lib --> runner
lib --> discovery
runner --> context[context.rs]
runner --> agent_resolver[agent_resolver.rs]
runner --> discovery
runner --> adapters[adapters/mod.rs — Adapter trait]
cli_sub --> adapters
parser --> models[models.rs]
runner --> models
context --> models
discovery --> models
main --> models
lib --> models
Module Roles at a Glance
| Module | Responsibility |
|---|---|
| main.rs | CLI interface (clap), subcommands, output formatting |
| lib.rs | Public library API for embedding |
| models.rs | Shared data types (Recipe, Step, StepResult, …) |
| parser.rs | YAML deserialization, validation, typo detection |
| context.rs | Template rendering, sandboxed condition evaluation |
| runner.rs | Orchestration: hooks, conditions, audit, recursion |
| agent_resolver.rs | Agent reference → markdown file resolution |
| discovery.rs | Multi-directory recipe discovery and manifest sync |
| adapters/mod.rs | Adapter trait definition |
| adapters/cli_subprocess.rs | Subprocess execution for bash and agent steps |
Data Flow
YAML file
│
▼
┌──────────┐ file size check ┌────────────┐
│ parser.rs │ ──────────────────► │ serde_yaml │
└──────────┘ MAX_YAML_SIZE 1MB │ deserialize │
│ └─────┬──────┘
│ validate: name, steps, │
│ unique IDs, field typos ▼
│ Recipe (models.rs)
▼
┌──────────┐ merge recipe.context
│ runner.rs │ + user overrides (--set)
└──────────┘
│
│ for each step:
│ 1. Tag filter (when_tags vs active/exclude)
│ 2. Condition evaluation (context.evaluate)
│ 3. Template rendering (context.render / render_shell)
│ 4. Dispatch: Bash │ Agent │ Sub-Recipe
│ 5. Optional JSON parse of output
│ 6. Store output in context
│ 7. Write JSONL audit entry
│ 8. Run post_step / on_error hook
│
▼
RecipeResult
├── success: bool
├── step_results: Vec<StepResult>
├── context: final variable state
└── duration: wall-clock time
Parse Phase
- RecipeParser::parse_file reads the file and rejects anything over 1 MB (YAML bomb protection).
- serde_yaml deserializes into Recipe. Step fields like command, agent, prompt, and recipe determine the implicit StepType via Step::effective_type().
- Structural validation: name must be non-empty, at least one step required, step IDs must be unique.
- validate_with_yaml inspects raw YAML keys and reports unknown fields using edit-distance typo detection (e.g., “comand” → did you mean “command”?).
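The edit-distance typo detection described above can be sketched as follows. This is a minimal illustration using the classic Levenshtein distance; the helper name suggest_field, the max_distance threshold, and the list of known fields are assumptions made for the example, not the parser's actual API.

```rust
/// Levenshtein edit distance between two strings, computed over chars.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // curr[0] = i: deleting i chars turns a prefix of `a` into the empty string.
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            curr[j] = substitution.min(prev[j] + 1).min(curr[j - 1] + 1);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical helper: the closest known field within `max_distance` edits, if any.
fn suggest_field<'a>(unknown: &str, known: &[&'a str], max_distance: usize) -> Option<&'a str> {
    known
        .iter()
        .map(|k| (*k, edit_distance(unknown, k)))
        .filter(|(_, d)| *d <= max_distance)
        .min_by_key(|(_, d)| *d)
        .map(|(k, _)| k)
}
```

With known fields such as command, agent, and prompt, calling suggest_field("comand", &known, 2) yields Some("command"), which is enough to build a "did you mean" warning.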
Execute Phase
RecipeRunner::execute merges the recipe’s context map with any user-supplied
--set KEY=VALUE overrides, then iterates steps sequentially:
- Tag filter: should_skip_by_tags checks when_tags against active_tags / exclude_tags.
- Condition: RecipeContext::evaluate runs a sandboxed boolean expression (see Safety Model).
- Dispatch: routes to execute_bash_step, execute_agent_step, or execute_sub_recipe on the adapter.
- Output capture: if parse_json is set, the runner tries three extraction strategies (direct parse → markdown fence → balanced brackets), with an optional retry that re-prompts the agent for JSON-only output.
- Context update: step output is stored under step.output (or step.id) in the context for downstream templates.
- Hooks: pre_step runs before dispatch, post_step after success, on_error after failure. Hook commands are rendered through the context.
- Audit: each step result is appended to a JSONL file (<audit_dir>/<recipe>_<timestamp>.jsonl).
Core Types (models.rs)
Step
struct Step {
    id: String,
    command: Option<String>,          // Bash step
    agent: Option<String>,            // Agent reference
    prompt: Option<String>,           // Agent prompt
    recipe: Option<String>,           // Sub-recipe name
    output: Option<String>,           // Context variable for result
    condition: Option<String>,        // Boolean expression
    parse_json: Option<bool>,         // Auto-parse output as JSON
    mode: Option<String>,             // Execution mode
    working_dir: Option<String>,      // Override cwd
    timeout: Option<u64>,             // Seconds
    auto_stage: Option<bool>,         // git add -A after agent steps
    continue_on_error: Option<bool>,  // Don't fail-fast
    when_tags: Option<Vec<String>>,   // Tag-based filtering
    parallel_group: Option<String>,   // Concurrent step grouping (fully implemented)
    sub_context: Option<HashMap<…>>,  // Context overrides for sub-recipe
}
Step::effective_type() infers the step type from which fields are present:
recipe → Recipe, agent/prompt → Agent, command → Bash.
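A minimal sketch of that inference, using a trimmed stand-in for Step that keeps only the four relevant fields. The check order (recipe first, then agent or prompt, then command) follows the listing above; falling back to Bash when no field is set is an assumption of this sketch, since the source does not say what happens in that case.

```rust
#[derive(Debug, PartialEq)]
enum StepType {
    Bash,
    Agent,
    Recipe,
}

/// Trimmed stand-in for the real Step: only the fields that drive type inference.
#[derive(Default)]
struct Step {
    command: Option<String>,
    agent: Option<String>,
    prompt: Option<String>,
    recipe: Option<String>,
}

impl Step {
    fn effective_type(&self) -> StepType {
        if self.recipe.is_some() {
            StepType::Recipe
        } else if self.agent.is_some() || self.prompt.is_some() {
            // A prompt without an agent still counts as an agent step.
            StepType::Agent
        } else {
            // Assumption: a command, or nothing at all, is treated as Bash.
            StepType::Bash
        }
    }
}
```

A step that sets only prompt is therefore inferred as an agent step, matching the "agent (prompt-only)" case covered by the step-type-inference test recipe.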
Recipe
struct Recipe {
    name: String,
    version: Option<String>,
    description: Option<String>,
    author: Option<String>,
    tags: Option<Vec<String>>,
    context: Option<HashMap<String, Value>>,
    steps: Vec<Step>,
    recursion: Option<RecursionConfig>,  // max_depth (6), max_total_steps (200)
    hooks: Option<RecipeHooks>,          // pre_step, post_step, on_error
    extends: Option<String>,             // Parent recipe (inheritance)
}
Result Types
struct StepResult {
    step_id: String,
    status: StepStatus,  // Pending | Running | Completed | Skipped | Failed
    output: Option<String>,
    error: Option<String>,
    duration: Duration,
}

struct RecipeResult {
    recipe_name: String,
    success: bool,
    step_results: Vec<StepResult>,
    context: HashMap<String, Value>,  // Final state (skipped in JSON serialization)
    duration: Duration,
}
CLI Interface (main.rs)
recipe-runner-rs [OPTIONS] [RECIPE] [COMMAND]
Commands:
  list                       List discovered recipes
Arguments:
  [RECIPE]                   Path to a .yaml recipe file
Options:
  -C, --working-dir <DIR>    Working directory (default: ".")
  -R, --recipe-dir <DIR>     Additional recipe search directories (repeatable)
      --set <KEY=VALUE>      Context variable overrides (repeatable)
      --dry-run              Log steps without executing
      --validate-only        Parse and validate, then exit
      --explain              Print step plan without executing
      --progress             Emit progress to stderr (StderrListener)
      --include-tags <TAGS>  Only run steps matching these tags (comma-separated)
      --exclude-tags <TAGS>  Skip steps matching these tags (comma-separated)
      --audit-dir <DIR>      Directory for JSONL audit logs
      --output-format <FMT>  Output format: text (default) or json
--set values are auto-typed: JSON objects/arrays are parsed as-is, true/false
become booleans, numeric strings become numbers, everything else stays a string.
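The auto-typing rule can be sketched roughly as below. The ContextValue enum and parse_set_arg function are hypothetical names for this example. The sketch covers only the boolean, number, and string cases; the JSON object/array branch is left out because it needs a JSON parser, and storing numbers as f64 is an assumption.

```rust
/// Hypothetical value type for this sketch (the real runner uses a JSON Value).
#[derive(Debug, PartialEq)]
enum ContextValue {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Split KEY=VALUE at the first '=' and infer the value's type.
fn parse_set_arg(arg: &str) -> Option<(String, ContextValue)> {
    let (key, raw) = arg.split_once('=')?;
    if key.is_empty() {
        return None;
    }
    let value = match raw {
        "true" => ContextValue::Bool(true),
        "false" => ContextValue::Bool(false),
        _ => match raw.parse::<f64>() {
            // f64 parsing also accepts "inf" and "NaN"; keep those as plain text.
            Ok(n) if n.is_finite() => ContextValue::Number(n),
            _ => ContextValue::Text(raw.to_string()),
        },
    };
    Some((key.to_string(), value))
}
```

Splitting at the first '=' only means a value may itself contain '=', so an argument like url=a=b keeps a=b as the value.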
Adapter Pattern
The Adapter trait decouples the runner from any specific execution backend:
trait Adapter {
    fn execute_agent_step(
        &self, prompt: &str, agent_name: &str,
        system_prompt: Option<&str>, mode: Option<&str>,
        working_dir: Option<&str>, model: Option<&str>,
    ) -> Result<String>;

    fn execute_bash_step(
        &self, command: &str, working_dir: Option<&str>,
        timeout: Option<u64>,
    ) -> Result<String>;

    fn is_available(&self) -> bool;
    fn name(&self) -> &str;
}
CLISubprocessAdapter
The production adapter spawns subprocesses:
- Bash steps: /bin/bash -c <command>, optionally wrapped with timeout.
- Agent steps: claude -p <prompt> in an isolated temp directory. A NON_INTERACTIVE_FOOTER (“Proceed autonomously. Do not ask questions.”) is appended to prevent the nested Claude session from hanging on prompts.
Timeout enforcement: A background heartbeat thread monitors the deadline.
It logs progress every 2 seconds. On expiry it sends SIGTERM, waits 5 seconds,
then escalates to SIGKILL.
Environment propagation: build_child_env() forwards session-tracking
variables (AMPLIHACK_SESSION_DEPTH, AMPLIHACK_TREE_ID, AMPLIHACK_MAX_DEPTH,
AMPLIHACK_MAX_SESSIONS) and strips CLAUDECODE to prevent nested session
confusion.
Execution Flow
Lifecycle of a Recipe Run
CLI args
│
├─ --validate-only ──► parse + validate ──► print warnings ──► exit
├─ --explain ─────────► parse ──► print step plan ──► exit
│
▼
RecipeRunner::execute(recipe, user_context)
│
├─ Check recursion limits (depth ≤ max_depth, total_steps ≤ max_total_steps)
├─ Merge recipe.context + user_context
├─ Open JSONL audit log (if --audit-dir set)
│
│ ┌─── for each step ──────────────────────────────────────────┐
│ │ │
│ │ 1. should_skip_by_tags(step) ──► skip if filtered out │
│ │ 2. run_hook(pre_step) │
│ │ 3. evaluate condition ──► Skipped if false │
│ │ 4. render templates in command/prompt │
│ │ 5. dispatch: │
│ │ ├─ Bash → adapter.execute_bash_step() │
│ │ ├─ Agent → resolve agent, adapter.execute_agent_step() │
│ │ └─ Recipe → execute_sub_recipe() (recursive) │
│ │ 6. parse JSON output (if parse_json, with retry) │
│ │ 7. store output in context │
│ │ 8. maybe_auto_stage (git add -A for agent steps) │
│ │ 9. run_hook(post_step) or run_hook(on_error) │
│ │ 10. write JSONL audit entry │
│ │ 11. fail-fast unless continue_on_error │
│ │ │
│ └─────────────────────────────────────────────────────────────┘
│
▼
RecipeResult { success, step_results, context, duration }
Sub-Recipe Execution
When a step has step_type: Recipe:
- The runner searches for the recipe file using discovery::find_recipe across recipe_search_dirs, then falls back to a direct path relative to working_dir.
- Recursion depth is checked against RecursionConfig::max_depth (default 6). total_steps is checked against max_total_steps (default 200).
- The sub-recipe’s context inherits from the parent context, merged with any sub_context overrides defined on the step.
- A new execute_with_depth(recipe, context, depth + 1) call runs the sub-recipe. Depth and total-step counters are tracked via Cell<u32>.
- After execution, the sub-recipe’s final context is propagated back into the parent context.
Hooks
Defined in RecipeHooks:
hooks:
pre_step: "echo 'Starting step {{step_id}}'"
post_step: "echo 'Completed step {{step_id}}'"
on_error: "notify-send 'Step {{step_id}} failed'"
Hooks are shell commands rendered through the context. pre_step runs before
every step dispatch. post_step runs after a successful step. on_error runs
after a failed step. Hook failures are logged but do not abort the recipe.
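The hook ordering described here, combined with fail-fast and continue_on_error, can be illustrated with a small simulation. This is only a sketch: the Step stand-in, its succeeds flag (which replaces real dispatch), and run_with_hooks are invented for the example, and hooks are recorded as strings rather than executed.

```rust
/// Simplified stand-in for a recipe step: only the fields this sketch needs.
struct Step {
    id: &'static str,
    succeeds: bool, // stands in for the real dispatch result
    continue_on_error: bool,
}

/// Returns the hook invocations in the order they would fire.
fn run_with_hooks(steps: &[Step]) -> Vec<String> {
    let mut fired = Vec::new();
    for step in steps {
        // pre_step runs before every dispatch.
        fired.push(format!("pre_step:{}", step.id));
        if step.succeeds {
            fired.push(format!("post_step:{}", step.id));
        } else {
            fired.push(format!("on_error:{}", step.id));
            if !step.continue_on_error {
                break; // fail-fast: remaining steps never start
            }
        }
    }
    fired
}
```

For steps a (succeeds), b (fails, continue_on_error), c (fails), and d, the recorded sequence ends at on_error:c, and d never gets a pre_step.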
Execution Listeners
The ExecutionListener trait provides real-time progress callbacks:
trait ExecutionListener {
    fn on_step_start(&self, step_id: &str, step_type: &str);
    fn on_step_complete(&self, result: &StepResult);
    fn on_output(&self, step_id: &str, line: &str);
}
| Implementation | Behavior |
|---|---|
| NullListener | No-op (default) |
| StderrListener | Emits progress emojis and timing to stderr |
Activated with --progress.
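For embedding, a custom listener might look like the sketch below. RecordingListener is a hypothetical implementation that collects events in memory (handy in tests). The StepResult here is a simplified stand-in with a succeeded flag instead of the real status enum, and the trait is repeated so the example is self-contained. Because the callbacks take &self, the event buffer sits behind a RefCell.

```rust
use std::cell::RefCell;
use std::time::Duration;

/// Simplified stand-in for the real StepResult.
struct StepResult {
    step_id: String,
    succeeded: bool,
    duration: Duration,
}

trait ExecutionListener {
    fn on_step_start(&self, step_id: &str, step_type: &str);
    fn on_step_complete(&self, result: &StepResult);
    fn on_output(&self, step_id: &str, line: &str);
}

/// Hypothetical listener that records every callback as a line of text.
#[derive(Default)]
struct RecordingListener {
    events: RefCell<Vec<String>>,
}

impl ExecutionListener for RecordingListener {
    fn on_step_start(&self, step_id: &str, step_type: &str) {
        self.events.borrow_mut().push(format!("start {step_id} ({step_type})"));
    }

    fn on_step_complete(&self, result: &StepResult) {
        let status = if result.succeeded { "ok" } else { "failed" };
        self.events.borrow_mut().push(format!(
            "done {} {} in {}ms",
            result.step_id,
            status,
            result.duration.as_millis()
        ));
    }

    fn on_output(&self, step_id: &str, line: &str) {
        self.events.borrow_mut().push(format!("out {step_id}: {line}"));
    }
}
```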
Safety Model
Condition Evaluator (context.rs)
The condition evaluator is a hand-written recursive descent parser that
evaluates boolean expressions over recipe context variables. It does not
call eval() or execute arbitrary code.
Supported syntax:
status == "ok" and (retries < 3 or force == true)
len(items) > 0
name.startswith("test_")
value not in "blocked,disabled"
Operator precedence (low → high): or, and, not, comparison
(==, !=, <, <=, >, >=, in, not in).
Security constraints:
| Rule | Rationale |
|---|---|
| No __ (dunder) access | Blocks dunder attribute introspection |
| Whitelisted functions only | int, str, len, bool, float, min, max |
| Whitelisted methods only | strip, lower, upper, startswith, endswith, replace, split, join, count, find, and variants |
| No assignment operators | Expressions are read-only |
| No function definitions | Grammar does not support fn, def, lambda |
The tokenizer produces typed tokens (String, Number, Ident, Eq,
And, Or, …) and the parser consumes them with lookahead. Unrecognized
tokens produce a parse error rather than silent misbehavior.
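To make the precedence rules concrete, here is a deliberately reduced recursive descent evaluator. It is a sketch, not the evaluator in context.rs: it handles only or, and, not, parentheses, true/false, and identifiers looked up in a map of booleans, with whitespace-based tokenizing. Comparisons, functions, and methods are omitted. Treating an unknown identifier as false mirrors the documented "unset resolves to null, which is falsy" behavior.

```rust
use std::collections::HashMap;

struct Parser<'a> {
    tokens: Vec<String>,
    pos: usize,
    vars: &'a HashMap<String, bool>,
}

/// Whitespace tokenizer; parentheses are padded so they become their own tokens.
fn tokenize(input: &str) -> Vec<String> {
    input
        .replace('(', " ( ")
        .replace(')', " ) ")
        .split_whitespace()
        .map(String::from)
        .collect()
}

impl Parser<'_> {
    fn peek(&self) -> Option<&str> {
        self.tokens.get(self.pos).map(String::as_str)
    }

    fn eat(&mut self, expected: &str) -> bool {
        if self.peek() == Some(expected) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    // Lowest precedence: or
    fn or_expr(&mut self) -> Result<bool, String> {
        let mut value = self.and_expr()?;
        while self.eat("or") {
            let rhs = self.and_expr()?;
            value = value || rhs;
        }
        Ok(value)
    }

    // Binds tighter than or
    fn and_expr(&mut self) -> Result<bool, String> {
        let mut value = self.not_expr()?;
        while self.eat("and") {
            let rhs = self.not_expr()?;
            value = value && rhs;
        }
        Ok(value)
    }

    // Binds tighter than and
    fn not_expr(&mut self) -> Result<bool, String> {
        if self.eat("not") {
            Ok(!self.not_expr()?)
        } else {
            self.primary()
        }
    }

    fn primary(&mut self) -> Result<bool, String> {
        let token = self.peek().ok_or("unexpected end of expression")?.to_string();
        self.pos += 1;
        match token.as_str() {
            "true" => Ok(true),
            "false" => Ok(false),
            "(" => {
                let value = self.or_expr()?;
                if self.eat(")") {
                    Ok(value)
                } else {
                    Err("expected ')'".to_string())
                }
            }
            ")" | "and" | "or" => Err(format!("unexpected token '{token}'")),
            // Unknown identifiers are falsy, like an unset context variable.
            name => Ok(*self.vars.get(name).unwrap_or(&false)),
        }
    }
}

fn evaluate(input: &str, vars: &HashMap<String, bool>) -> Result<bool, String> {
    let mut parser = Parser { tokens: tokenize(input), pos: 0, vars };
    let value = parser.or_expr()?;
    if parser.pos == parser.tokens.len() {
        Ok(value)
    } else {
        Err("trailing tokens after expression".to_string())
    }
}
```

With a = true and b = c = false, "a or b and c" evaluates to true because and binds tighter than or, whereas "(a or b) and c" evaluates to false.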
Template Rendering
RecipeContext::render replaces {{var}} placeholders with values from the
context. Dot-notation ({{obj.nested.key}}) traverses into JSON objects.
Missing variables render as empty strings.
RecipeContext::render_shell does the same but shell-escapes every substituted
value to prevent injection in bash commands.
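The difference between the two rendering modes can be sketched as follows. This is an illustration under simplifying assumptions: the context is a flat map of strings (no dot-notation traversal), and shell escaping uses POSIX single-quote wrapping, which may differ from what render_shell actually does. The function names and the escape flag are made up for the example.

```rust
use std::collections::HashMap;

/// Wrap a value in single quotes, escaping embedded single quotes POSIX-style.
fn shell_escape(value: &str) -> String {
    format!("'{}'", value.replace('\'', r"'\''"))
}

/// Replace each {{name}} with its context value; missing names render as "".
/// When `escape` is true, every substituted value is shell-escaped.
fn render(template: &str, ctx: &HashMap<String, String>, escape: bool) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                let value = ctx.get(name).map(String::as_str).unwrap_or("");
                if escape {
                    out.push_str(&shell_escape(value));
                } else {
                    out.push_str(value);
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remaining text verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

With escaping on, a value such as "hi; rm -rf /" substituted into "echo {{greeting}}" becomes a single quoted argument rather than a second command.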
Agent Resolver Path Safety (agent_resolver.rs)
Agent references use a namespaced format (namespace:category:name or
namespace:name). Each segment is validated against:
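use std::sync::LazyLock;
use regex::Regex;

// Compiled once on first use: Regex::new is not const, so it cannot initialize a plain static directly.
static SAFE_NAME_RE: LazyLock<Regex> =
    LazyLock::new(|| Regex::new(r"^[a-zA-Z0-9_-]+$").expect("valid regex"));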
This rejects /, .., and any characters that could enable path traversal.
As defense-in-depth, after resolving the file path, the resolver canonicalizes both the candidate path and the search base directory, then verifies the resolved path is a child of the search base. This defends against symlink attacks.
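The segment check can be approximated without the regex crate, as in the sketch below. The function names are hypothetical, and the character test is a hand-written equivalent of the ^[a-zA-Z0-9_-]+$ pattern. The canonicalization step is not shown; it could be built on std::fs::canonicalize followed by Path::starts_with.

```rust
/// True if the segment is non-empty and contains only ASCII letters, digits, '_' or '-'.
fn is_safe_segment(segment: &str) -> bool {
    !segment.is_empty()
        && segment
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Hypothetical parser for namespace:name or namespace:category:name references.
/// Returns the validated segments, or None if the shape or any segment is unsafe.
fn parse_agent_ref(reference: &str) -> Option<Vec<&str>> {
    let segments: Vec<&str> = reference.split(':').collect();
    let valid_len = segments.len() == 2 || segments.len() == 3;
    if valid_len && segments.iter().all(|s| is_safe_segment(s)) {
        Some(segments)
    } else {
        None
    }
}
```

A reference like amplihack:../etc/passwd is rejected because its second segment contains '.' and '/', so the traversal attempt never reaches the filesystem lookup.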
Parser Protections (parser.rs)
- File size limit: 1 MB (MAX_YAML_SIZE_BYTES). Prevents YAML bombs and memory exhaustion.
- Structural validation: Rejects empty names, zero-step recipes, and duplicate step IDs.
- Field typo detection: Unknown top-level and step-level fields trigger warnings. Edit-distance matching suggests corrections.
Subprocess Isolation (cli_subprocess.rs)
- Agent steps execute in a fresh temporary directory that is cleaned up on drop.
- CLAUDECODE is stripped from the child environment to prevent the nested Claude process from attaching to the parent’s session.
- Session depth tracking (AMPLIHACK_SESSION_DEPTH) prevents runaway recursive spawning.
Interior Mutability Pattern
The runner tracks recursion state with std::cell::Cell<u32>:
struct RecipeRunner<A: Adapter> {
    // ...
    depth: Cell<u32>,
    total_steps: Cell<u32>,
    // ...
}
Why Cell?
RecipeRunner::execute takes &self (shared reference) because the runner is
logically immutable during a run — the adapter, working directory, tag filters,
and listener never change. But recursion tracking requires mutating two counters.
Cell<u32> provides interior mutability for Copy types without runtime
borrow-checking overhead (no RefCell needed). The runner is single-threaded,
so Cell is sufficient and zero-cost.
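A stripped-down sketch of the pattern: the Runner type below is a stand-in with only the counters, and its execute method simulates a recipe that nests a given number of sub-recipes with one step per level. The names and the exact point where limits are checked are assumptions for illustration; the real runner does considerably more.

```rust
use std::cell::Cell;

/// Stand-in runner: only the recursion bookkeeping, no adapter or steps.
struct Runner {
    max_depth: u32,
    depth: Cell<u32>,
    total_steps: Cell<u32>,
}

impl Runner {
    fn new(max_depth: u32) -> Self {
        Runner { max_depth, depth: Cell::new(0), total_steps: Cell::new(0) }
    }

    /// Simulate a recipe containing `nesting` levels of sub-recipes.
    /// Takes &self, yet still updates the counters through Cell.
    fn execute(&self, nesting: u32) -> Result<(), String> {
        if self.depth.get() > self.max_depth {
            return Err(format!(
                "recursion depth {} exceeds max_depth {}",
                self.depth.get(),
                self.max_depth
            ));
        }
        self.total_steps.set(self.total_steps.get() + 1);
        if nesting > 0 {
            self.depth.set(self.depth.get() + 1);
            let result = self.execute(nesting - 1);
            // Restore the depth on the way out, whether or not the sub-recipe failed.
            self.depth.set(self.depth.get() - 1);
            result?;
        }
        Ok(())
    }
}
```

Because Cell::get and Cell::set copy the u32 in and out, no borrow is held across the recursive call, which is why the nested execute(&self, ...) compiles without RefCell.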
Usage in Recursion
execute(&self, recipe, context)
│
├─ self.depth.get() checked against max_depth
├─ self.total_steps.get() checked against max_total_steps
│
└─ execute_sub_recipe(&self, step, ctx)
│
├─ self.depth.set(self.depth.get() + 1)
├─ execute_with_depth(&self, sub_recipe, ctx, new_depth)
└─ self.total_steps.set(self.total_steps.get() + sub_step_count)
The RecursionConfig defaults (max_depth: 6, max_total_steps: 200) can be
overridden per-recipe in the YAML:
recursion:
  max_depth: 3
  max_total_steps: 50
Recipe Discovery (discovery.rs)
Search Directories (default order)
1. ~/.amplihack/.claude/recipes
2. ./amplifier-bundle/recipes
3. ./src/amplihack/amplifier-bundle/recipes
4. ./.claude/recipes
Additional directories can be added with -R <dir> (repeatable).
Discovery Functions
| Function | Purpose |
|---|---|
| discover_recipes | Scan directories, return HashMap<name, RecipeInfo> |
| list_recipes | Sorted Vec<RecipeInfo> for display |
| find_recipe | Locate a single recipe by name → Option<PathBuf> |
| verify_global_installation | Check that default dirs exist and contain recipes |
Manifest & Upstream Sync
update_manifest writes _recipe_manifest.json — a map of filenames to their
SHA-256 hashes (first 16 hex chars). check_upstream_changes diffs the current
directory state against the manifest and reports new, modified, or deleted
files.
sync_upstream adds a git remote, fetches, and diffs local recipes against the
upstream branch, returning a summary of changes.
JSONL Audit Log
When --audit-dir is set, each recipe run produces a file:
<audit-dir>/<recipe-name>_<ISO-timestamp>.jsonl
Each line is a JSON object:
{"step_id": "build", "status": "Completed", "duration_ms": 1423, "error": null, "output_len": 256}
Audit logs enable post-hoc analysis of recipe execution without cluttering stdout.
JSON Output Extraction
When parse_json: true is set on a step, the runner extracts structured JSON
from potentially noisy output using three strategies (tried in order):
- Direct parse: serde_json::from_str(output). Works when the output is pure JSON.
- Markdown fence: extracts content between ```json and ``` delimiters. Common in LLM output.
- Balanced brackets: finds the first { or [ and matches it to its closing counterpart, counting nesting depth.
If all three fail, the runner optionally retries the agent step with a JSON-only reminder appended to the prompt, then re-applies the extraction pipeline.
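The second and third strategies can be sketched with plain string scanning, as below. This is an illustration rather than the runner's code: the function names are invented, the functions return a candidate substring that a JSON parser such as serde_json would still have to validate, and the bracket scan treats { and [ interchangeably instead of checking that each closer matches its opener. Skipping brackets inside string literals is an assumption of this sketch.

```rust
/// Strategy 2: the text between a json-tagged fence and the next closing fence.
fn extract_fenced(output: &str) -> Option<&str> {
    // Three backticks, built at runtime so this listing stays fence-safe.
    let fence = "`".repeat(3);
    let opener = format!("{fence}json");
    let start = output.find(opener.as_str())? + opener.len();
    let end = start + output[start..].find(fence.as_str())?;
    Some(output[start..end].trim())
}

/// Strategy 3: the first balanced {...} or [...] span, ignoring brackets inside strings.
fn extract_balanced(output: &str) -> Option<&str> {
    let start = output.find(|c: char| c == '{' || c == '[')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (offset, c) in output[start..].char_indices() {
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' | '[' => depth += 1,
            '}' | ']' => {
                // depth is at least 1 here: the scan starts on an opener
                // and returns as soon as the count drops back to zero.
                depth -= 1;
                if depth == 0 {
                    return Some(&output[start..start + offset + c.len_utf8()]);
                }
            }
            _ => {}
        }
    }
    None // never balanced
}

/// Try the fence first, then fall back to bracket matching.
fn extract_json_candidate(output: &str) -> Option<&str> {
    extract_fenced(output).or_else(|| extract_balanced(output))
}
```

Returning a borrowed slice avoids copying the output; the caller can pass the candidate to the JSON parser and fall through to the retry path when parsing fails.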