amplihack Recipe Runner

A code-enforced workflow execution engine that reads declarative YAML recipe files and executes them step by step, running shell commands and AI agents. Unlike prompt-based workflow instructions, which a model can interpret loosely or skip, the Recipe Runner drives the execution loop in compiled Rust code, so a model cannot skip a step.

Feature Highlights

Comprehensive test suite covering unit, integration, recipe, example, and property-based testing
Parallel step execution, tag filtering, JSONL audit logs
Recipe composition via extends, pre/post/on_error hooks
Safe condition language with recursive descent parser

Quick Start

# Build
cargo build --release

# Run a recipe
recipe-runner-rs path/to/recipe.yaml

# With context overrides
recipe-runner-rs recipe.yaml --set task_description="Add auth" --set repo_path="."

# Dry run
recipe-runner-rs recipe.yaml --dry-run

See the Quick Start guide for a more detailed walkthrough.

Quick Start

Get up and running with the amplihack Recipe Runner in minutes.

Install

# Clone the repository
git clone https://github.com/rysweet/amplihack-recipe-runner.git
cd amplihack-recipe-runner

# Build in release mode
cargo build --release

# The binary is at target/release/recipe-runner-rs
# Optionally copy it to your PATH:
cp target/release/recipe-runner-rs ~/.local/bin/

Your First Recipe

Create a file called hello.yaml:

name: "hello-world"
description: "A minimal recipe to verify your setup"
version: "1.0.0"
context:
  greeting: "Hello from the Recipe Runner!"
steps:
  - id: "greet"
    command: "echo '{{greeting}}'"

Run It

recipe-runner-rs hello.yaml

You should see the greeting printed to stdout.

Override Context

Pass --set to override context variables at runtime:

recipe-runner-rs hello.yaml --set greeting="Howdy, partner!"

Dry Run

Use --dry-run to see what would execute without actually running anything:

recipe-runner-rs hello.yaml --dry-run

Using Agent Steps

Agent steps invoke an AI agent instead of a shell command:

name: "analyze-project"
description: "Analyze a codebase with an AI agent"
version: "1.0.0"
context:
  repo_path: "."
steps:
  - id: "analyze"
    agent: "amplihack:core:architect"
    prompt: "Analyze the project at {{repo_path}} and summarize its structure"
    output: "analysis"
    parse_json: true

  - id: "report"
    command: "echo 'Analysis complete'"
    condition: "analysis"

Next Steps

YAML Recipe Format — Full schema reference
CLI Reference — All flags and subcommands
Condition Language — Conditional step execution
Architecture — How it works under the hood

YAML Recipe Format Reference

Complete schema reference for amplihack recipe runner YAML files.

Top-Level Fields

Field	Type	Required	Default	Description
`name`	string	yes	—	Recipe name
`version`	string	no	`"1.0"`	Semantic version
`description`	string	no	`""`	Human-readable description
`author`	string	no	`""`	Author name
`tags`	list of strings	no	`[]`	Recipe tags for categorisation
`context`	map	no	`{}`	Default variable values for templates
`extends`	string	no	—	Parent recipe name (for inheritance). Note: only single-level inheritance is supported; extended recipes cannot themselves use `extends`.
`recursion`	RecursionConfig	no	see below	Sub-recipe recursion limits
`hooks`	RecipeHooks	no	—	Lifecycle hooks
`steps`	list of Step	yes	—	Ordered list of steps to execute

RecursionConfig

Controls sub-recipe nesting limits.

Field	Type	Default	Description
`max_depth`	int	`6`	Maximum sub-recipe recursion depth
`max_total_steps`	int	`200`	Maximum total steps across all sub-recipes

RecipeHooks

Shell commands executed at lifecycle boundaries.

Field	Type	Description
`pre_step`	string	Shell command to run before each step
`post_step`	string	Shell command to run after each step
`on_error`	string	Shell command to run on step failure

Hook commands receive context variables via template substitution.

Step Fields

Field	Type	Required	Default	Description
`id`	string	yes	—	Unique step identifier
`type`	string	no	inferred	`"bash"`, `"agent"`, or `"recipe"` (see inference rules)
`command`	string	no	—	Shell command (bash steps)
`agent`	string	no	—	Agent reference (agent steps)
`prompt`	string	no	—	Prompt template (agent steps)
`output`	string	no	—	Variable name to store step output in context
`condition`	string	no	—	Expression that must be truthy to execute
`parse_json`	bool	no	`false`	Extract JSON from step output
`parse_json_required`	bool	no	`false`	Fail the step if JSON extraction fails (see below)
`mode`	string	no	—	Execution mode
`working_dir`	string	no	—	Override working directory for this step
`timeout`	int	no	—	Step timeout in seconds
`auto_stage`	bool	no	`true`	Git auto-stage after agent steps
`model`	string	no	—	Model override for agent steps (e.g., “haiku”, “sonnet”)
`recipe`	string	no	—	Sub-recipe name (recipe steps)
`recovery_on_failure`	bool	no	`false`	Attempt agentic recovery if sub-recipe fails (see below)
`context`	map	no	—	Context overrides passed to sub-recipe
`continue_on_error`	bool	no	`false`	Continue execution if this step fails
`when_tags`	list of strings	no	`[]`	Step only runs when these tags match active tag filters
`parallel_group`	string	no	—	Group name for parallel execution

Note: A step-level context map supplies step-specific variables that are passed to the sub-recipe as overrides. It uses the same YAML key, context:, as the top-level field.

Type Inference Rules

When type is omitted, the effective step type is inferred in this order:

recipe field present → recipe type
agent field present → agent type
prompt present without command → agent type
Otherwise → bash type (default)

An explicit type value always takes precedence.

# Inferred as bash (has command, no agent/recipe/prompt)
- id: build
  command: cargo build --release

# Inferred as agent (agent field present)
- id: review
  agent: code-reviewer
  prompt: "Review {{file}}"

# Inferred as agent (prompt without command)
- id: summarise
  prompt: "Summarise the changes in {{diff}}"

# Inferred as recipe (recipe field present)
- id: deploy
  recipe: deploy-production
  context:
    env: staging

# Explicit type overrides inference
- id: special
  type: bash
  prompt: "This prompt is ignored because type is bash"
  command: echo "explicit wins"
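The inference order above can be sketched in a few lines of Python. This is an illustration of the documented rules only, not the runner's Rust implementation; the function name infer_step_type is hypothetical, and the step is assumed to be a plain mapping loaded from YAML.

```python
from typing import Any, Mapping


def infer_step_type(step: Mapping[str, Any]) -> str:
    """Return the effective step type using the documented inference order."""
    # An explicit type always takes precedence over inference.
    if step.get("type"):
        return step["type"]
    # 1. recipe field present -> recipe
    if "recipe" in step:
        return "recipe"
    # 2. agent field present -> agent
    if "agent" in step:
        return "agent"
    # 3. prompt present without command -> agent
    if "prompt" in step and "command" not in step:
        return "agent"
    # 4. otherwise -> bash (default)
    return "bash"
```

Because the checks run top to bottom, a step with both recipe and agent fields would be treated as a recipe step under this reading.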

Template Syntax

Variables are substituted using {{variable_name}} syntax. Variable names may contain letters, digits, underscores, hyphens, and dots.

context:
  project: my-app
  branch: main

steps:
  - id: greet
    command: echo "Building {{project}} on {{branch}}"

Dot Notation

Nested context values are accessed with dot notation:

context:
  deploy:
    target: production
    region: us-east-1

steps:
  - id: deploy
    command: ./deploy.sh --target {{deploy.target}} --region {{deploy.region}}

Shell Escaping

When templates are rendered for shell commands, values are shell-escaped automatically via shell_escape to prevent injection. Undefined variables resolve to an empty string.
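As a rough illustration of template substitution, dot notation, and shell escaping together, here is a small Python sketch. It is not the runner's code: Python's shlex.quote stands in for the Rust shell_escape crate, the function names are hypothetical, and only string values are exercised because the rendering of other types is not specified here.

```python
import re
import shlex
from typing import Any, Mapping

# Variable names may contain letters, digits, underscores, hyphens, and dots.
PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9_.-]+)\}\}")


def lookup(context: Mapping[str, Any], dotted: str) -> str:
    """Walk one level per dot-separated segment; missing values become ''."""
    current: Any = context
    for segment in dotted.split("."):
        if not isinstance(current, Mapping) or segment not in current:
            return ""  # undefined variables resolve to an empty string
        current = current[segment]
    return str(current)


def render(template: str, context: Mapping[str, Any]) -> str:
    """Plain substitution, as used for non-shell text (assumption)."""
    return PLACEHOLDER.sub(lambda m: lookup(context, m.group(1)), template)


def render_shell(template: str, context: Mapping[str, Any]) -> str:
    """Substitution with each value shell-escaped before insertion."""
    return PLACEHOLDER.sub(
        lambda m: shlex.quote(lookup(context, m.group(1))), template
    )
```

With this sketch, a value such as "hi; rm -rf /" is inserted as a single quoted word rather than as a second command, and an undefined variable is inserted as '' (two single quotes) in shell commands.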

Condition Syntax

The condition field accepts an expression that is evaluated against the current context. Steps with a falsy condition are skipped.

See conditions.md for the full reference. Supported operators and built-in functions include:

Comparisons: ==, !=, <, <=, >, >=
Logical: and, or, not
Membership: in, not in
Functions: int(), str(), len(), bool(), float(), min(), max()
Methods: strip(), lower(), upper(), startswith(), endswith(), replace(), split(), join(), count(), find()

- id: deploy
  condition: "branch == 'main' and tests_passed == 'true'"
  command: ./deploy.sh

JSON Extraction (`parse_json`)

When parse_json: true, the runner attempts to extract structured JSON from step output using three strategies in order:

Direct parse — the entire trimmed output is valid JSON.
Markdown fence extraction — JSON inside ```json ... ``` fences.
Balanced bracket detection — locates the first {…} or […] block with proper depth tracking, string awareness, and escape handling.

If all three strategies fail, a warning is logged, the raw output is stored unchanged, and the step is marked degraded (see Strict Mode below).

- id: get-config
  command: curl -s https://api.example.com/config
  output: api_config
  parse_json: true

- id: use-config
  command: echo "Region is {{api_config.region}}"
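The three extraction strategies can be approximated in Python as follows. This is a sketch of the described order, not the runner's implementation; all function names are hypothetical, and details such as how mismatched bracket types are handled are assumptions.

```python
import json

TICKS = "`" * 3  # built indirectly so this example contains no literal fence


def try_parse(text):
    """Return (True, value) if text is valid JSON, else (False, None)."""
    try:
        return True, json.loads(text)
    except ValueError:
        return False, None


def fenced_json(text):
    """Return the body of the first json fence, or None if there is none."""
    opener = TICKS + "json"
    start = text.find(opener)
    if start == -1:
        return None
    body_start = start + len(opener)
    end = text.find(TICKS, body_start)
    return None if end == -1 else text[body_start:end]


def first_balanced_block(text):
    """Return the first balanced {...} or [...] span, ignoring brackets in strings."""
    start, depth = None, 0
    in_string = escaped = False
    for i, ch in enumerate(text):
        if start is None:
            if ch in "{[":
                start, depth = i, 1
            continue
        if in_string:
            if escaped:
                escaped = False      # this character was escaped; skip it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None


def extract_json(output):
    """Try direct parse, then fence, then balanced block; fall back to raw output."""
    for candidate in (output.strip(), fenced_json(output), first_balanced_block(output)):
        if candidate is None:
            continue
        ok, value = try_parse(candidate)
        if ok:
            return True, value
    return False, output  # caller would log a warning and store the raw text
```

The boolean in the return value corresponds to the choice the runner has to make afterwards: store parsed JSON, or keep the raw output and mark the step accordingly.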

Strict Mode (`parse_json_required`)

By default, JSON extraction failure degrades the step (status becomes degraded) but the recipe continues. Set parse_json_required: true to make extraction failure a hard error that stops the recipe.

- id: must-be-json
  command: curl -s https://api.example.com/data
  parse_json: true
  parse_json_required: true  # fails the recipe if output isn't valid JSON
  output: api_data

`parse_json_required`	On extraction failure
`false` (default)	Step marked `degraded`, raw output stored, recipe continues
`true`	Step marked `failed`, recipe stops immediately

Sub-Recipe Recovery (`recovery_on_failure`)

When a sub-recipe step fails, set recovery_on_failure: true to trigger an agentic recovery attempt. The runner sends the failure details to an agent, which attempts to complete the remaining work.

- id: deploy
  recipe: deploy-to-staging
  recovery_on_failure: true  # agent attempts recovery if deploy fails

If the agent’s recovery output contains “STATUS: COMPLETE” or “recovered”, the step is marked as recovered and the recipe continues. Otherwise, the original failure propagates.

Model Override (`model`)

Agent steps can override the default model using the model field. The value is passed to the adapter, which maps it to a specific model identifier.

- id: quick-check
  agent: reviewer
  prompt: "Quick lint check on {{file_path}}"
  model: haiku  # fast, cheap model for simple tasks

- id: deep-review
  agent: reviewer
  prompt: "Thorough security review of {{file_path}}"
  model: sonnet  # more capable model for complex analysis

Complete Examples

1. Simple Bash-Only Recipe

name: build-and-test
version: "1.0"
description: Build the project and run tests
author: dev-team
tags: [ci, build]

context:
  build_mode: release

steps:
  - id: clean
    command: cargo clean

  - id: build
    command: cargo build --{{build_mode}}

  - id: test
    command: cargo test --{{build_mode}}
    output: test_results

  - id: report
    command: echo "Tests complete. Results {{test_results}}"

2. Agent-Based Workflow

name: code-review-workflow
version: "1.0"
description: Automated code review with AI agents

context:
  target_branch: main

steps:
  - id: get-diff
    command: git diff {{target_branch}} --stat
    output: diff_summary

  - id: review
    agent: code-reviewer
    prompt: |
      Review the following changes against {{target_branch}}:
      {{diff_summary}}
      Focus on correctness, security, and performance.
    output: review_result
    parse_json: true

  - id: check-approved
    condition: "review_result.approved == true"
    command: echo "Review passed"

  - id: request-changes
    condition: "review_result.approved != true"
    command: echo "Changes requested — see review_result.comments"

3. Sub-Recipe Composition

name: full-pipeline
version: "2.0"
description: End-to-end pipeline composing smaller recipes

context:
  environment: staging

steps:
  - id: lint
    recipe: lint-check

  - id: build
    recipe: build-project
    context:
      build_mode: release
      target: "{{environment}}"

  - id: deploy
    recipe: deploy-service
    context:
      env: "{{environment}}"
      version: "{{build.version}}"
    condition: "environment != 'local'"

4. Recipe with Hooks, Tags, and Recursion Limits

name: guarded-pipeline
version: "1.0"
description: Pipeline with lifecycle hooks and safety limits
author: platform-team
tags: [production, safe]

recursion:
  max_depth: 3
  max_total_steps: 50

hooks:
  pre_step: echo "[$(date -Iseconds)] Starting step"
  post_step: echo "[$(date -Iseconds)] Finished step"
  on_error: |
    echo "FAILED — sending alert"
    curl -s -X POST https://alerts.example.com/hook \
      -d '{"step": "failed", "recipe": "guarded-pipeline"}'

context:
  notify: true

steps:
  - id: preflight
    command: ./scripts/preflight-check.sh

  - id: migrate
    command: ./scripts/migrate.sh
    when_tags: [database]

  - id: deploy
    command: ./scripts/deploy.sh
    when_tags: [deploy]

  - id: smoke-test
    command: ./scripts/smoke-test.sh
    timeout: 120
    when_tags: [deploy]

  - id: notify
    condition: "notify == 'true'"
    command: echo "Pipeline complete"

5. Recipe with `continue_on_error` and Conditions

name: resilient-checks
version: "1.0"
description: Run multiple checks, collecting results even on failures

context:
  strict: false

steps:
  - id: lint
    command: cargo clippy -- -D warnings
    output: lint_result
    continue_on_error: true

  - id: test
    command: cargo test 2>&1
    output: test_result
    continue_on_error: true

  - id: audit
    command: cargo audit
    output: audit_result
    continue_on_error: true

  - id: gate
    condition: "strict == 'true'"
    command: |
      echo "Lint: {{lint_result}}"
      echo "Test: {{test_result}}"
      echo "Audit: {{audit_result}}"
      # In strict mode this step always fails the pipeline after printing the results
      exit 1

  - id: summary
    condition: "strict != 'true'"
    command: |
      echo "=== Check Summary ==="
      echo "Lint:  {{lint_result}}"
      echo "Test:  {{test_result}}"
      echo "Audit: {{audit_result}}"
      echo "Non-strict mode — pipeline continues"

CLI Reference

Complete reference for the recipe-runner-rs command-line interface.

Synopsis

recipe-runner-rs [OPTIONS] [RECIPE] [COMMAND]
recipe-runner-rs list [OPTIONS]

Subcommands

`list`

Discover and display all available recipes found in the configured search directories.

recipe-runner-rs list
recipe-runner-rs list --recipe-dir ./custom-recipes
recipe-runner-rs list --recipe-dir ./team-recipes --recipe-dir ./personal-recipes

Global Options

`-C, --working-dir <DIR>`

Set the working directory for recipe execution.

Default: . (current directory)

# Run a recipe from a different directory
recipe-runner-rs deploy.yaml --working-dir /home/user/my-project

# Short form
recipe-runner-rs deploy.yaml -C /home/user/my-project

# Combine with other options
recipe-runner-rs build.yaml -C ../other-repo --dry-run

`-R, --recipe-dir <DIR>`

Add a directory to the recipe search path. Can be specified multiple times to search across several directories.

# Single directory
recipe-runner-rs my-recipe --recipe-dir ./recipes

# Multiple directories (searched in order)
recipe-runner-rs my-recipe \
  --recipe-dir ./project-recipes \
  --recipe-dir ~/.config/recipes \
  --recipe-dir /opt/shared-recipes

# Short form
recipe-runner-rs my-recipe -R ./recipes -R ../shared

# Combine with list to discover recipes across directories
recipe-runner-rs list -R ./recipes -R /opt/shared-recipes

`--set <KEY=VALUE>`

Override a context variable. Can be specified multiple times to set several variables. Values are automatically typed using smart parsing (see Smart Context Value Parsing).

# String value
recipe-runner-rs deploy.yaml --set environment=production

# Integer value (auto-detected)
recipe-runner-rs scale.yaml --set replicas=5

# Float value (auto-detected)
recipe-runner-rs tune.yaml --set ratio=0.75

# Boolean value (auto-detected)
recipe-runner-rs build.yaml --set verbose=true

# JSON value (auto-detected)
recipe-runner-rs config.yaml --set data='{"host": "localhost", "port": 8080}'

# Multiple overrides
recipe-runner-rs deploy.yaml \
  --set environment=production \
  --set replicas=3 \
  --set debug=false \
  --set version=2.1.0

`--dry-run`

Parse and validate the recipe without executing any steps. Useful for checking recipe correctness before committing to a run.

recipe-runner-rs deploy.yaml --dry-run

# Combine with --set to validate context overrides
recipe-runner-rs deploy.yaml --dry-run --set environment=staging

# Combine with --progress to see what steps would run
recipe-runner-rs deploy.yaml --dry-run --progress

`--no-auto-stage`

Disable automatic git staging of file changes made during recipe execution.

recipe-runner-rs codegen.yaml --no-auto-stage

# Useful when you want to review changes before staging
recipe-runner-rs refactor.yaml --no-auto-stage -C /path/to/repo

`--validate-only`

Parse and validate the recipe, print any warnings, then exit. Does not execute any steps. More thorough than --dry-run as it focuses on surfacing validation warnings.

recipe-runner-rs deploy.yaml --validate-only

# Validate a recipe in a specific directory
recipe-runner-rs my-recipe --validate-only -R ./recipes

# Validate with context overrides to check for missing variables
recipe-runner-rs deploy.yaml --validate-only --set environment=production

`--explain`

Show the structure of a recipe without executing it. Displays the recipe name, version, and each step with its conditions, agents, and commands.

recipe-runner-rs deploy.yaml --explain

# Explain a recipe found via search path
recipe-runner-rs my-recipe --explain -R ./recipes

Example output:

Recipe: deploy
Version: 1.2.0

Steps:
  1. build
     Agent: builder
     Command: cargo build --release
  2. test
     Condition: when context.run_tests == true
     Agent: tester
     Command: cargo test
  3. deploy
     Agent: deployer
     Command: ./scripts/deploy.sh

`--progress`

Print step progress events to stderr. Emits events when each step starts and completes, useful for monitoring long-running recipes.

recipe-runner-rs deploy.yaml --progress

# Capture progress separately from output
recipe-runner-rs deploy.yaml --progress 2>progress.log

# Combine with JSON output for machine-readable progress + results
recipe-runner-rs deploy.yaml --progress --output-format json

Example stderr output:

[step:start] build (1/3)
[step:complete] build (1/3) — ok
[step:start] test (2/3)
[step:complete] test (2/3) — ok
[step:start] deploy (3/3)
[step:complete] deploy (3/3) — ok

`--include-tags <TAGS>`

Comma-separated list of tags. Only steps whose when_tags match at least one of the specified tags will run. All other steps are skipped.

# Run only steps tagged "frontend"
recipe-runner-rs build.yaml --include-tags frontend

# Run steps tagged "test" or "lint"
recipe-runner-rs ci.yaml --include-tags test,lint

# Combine with --explain to preview filtered steps
recipe-runner-rs ci.yaml --include-tags test --explain

`--exclude-tags <TAGS>`

Comma-separated list of tags. Steps whose when_tags match any of the specified tags will be skipped.

# Skip slow integration tests
recipe-runner-rs ci.yaml --exclude-tags slow

# Skip multiple categories
recipe-runner-rs full-pipeline.yaml --exclude-tags slow,experimental,deprecated

# Include some, exclude others
recipe-runner-rs ci.yaml --include-tags test --exclude-tags slow
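A minimal Python sketch of how the two filters could combine, reading the descriptions above literally. The function name step_selected is hypothetical, and two points are assumptions rather than documented behaviour: that an exclude match wins when a step matches both filters, and that a step with no when_tags is skipped whenever --include-tags is given.

```python
from typing import Iterable


def step_selected(
    when_tags: Iterable[str],
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> bool:
    """Return True if a step with these when_tags would run under the filters."""
    tags = set(when_tags)
    # Any overlap with the exclude list skips the step.
    if tags & set(exclude):
        return False
    include_set = set(include)
    # With an include list, at least one tag has to match.
    if include_set and not (tags & include_set):
        return False
    return True
```

If untagged steps matter in a recipe, it is worth confirming their behaviour with --explain before relying on this reading.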

`--audit-dir <DIR>`

Directory where JSONL audit log files are written. Each recipe run produces one audit log file.

# Write audit logs to a directory
recipe-runner-rs deploy.yaml --audit-dir ./audit-logs

# Combine with other options for a fully audited production run
recipe-runner-rs deploy.yaml \
  --audit-dir /var/log/recipe-runner \
  --set environment=production \
  --progress

`--output-format <FORMAT>`

Control the output format. Available formats:

Format	Description
`text`	Human-readable output (default)
`json`	Machine-readable JSON output

# Default text output
recipe-runner-rs deploy.yaml

# JSON output for scripting / CI pipelines
recipe-runner-rs deploy.yaml --output-format json

# Pipe JSON output to jq
recipe-runner-rs deploy.yaml --output-format json | jq '.steps[] | select(.status == "failed")'

# JSON output only; progress events on stderr are discarded
recipe-runner-rs deploy.yaml --output-format json --progress 2>/dev/null

Exit Codes

Code	Meaning	Description
`0`	Success	Recipe completed successfully; all steps passed
`1`	Failure	Recipe failed; at least one step failed during execution
`2`	Parse/validation error	Invalid YAML syntax, unknown fields, or other validation errors

# Check exit code in scripts
recipe-runner-rs deploy.yaml
status=$?  # save it first: every later command, including [ ], overwrites $?
if [ "$status" -eq 0 ]; then
  echo "Deploy succeeded"
elif [ "$status" -eq 1 ]; then
  echo "Deploy failed: check step output"
elif [ "$status" -eq 2 ]; then
  echo "Recipe is invalid: check YAML syntax"
fi

# Use && / || for simple chaining
recipe-runner-rs build.yaml && recipe-runner-rs deploy.yaml

# Validate before running
recipe-runner-rs deploy.yaml --validate-only && recipe-runner-rs deploy.yaml
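For callers that drive the runner from a script rather than a shell, the exit-code table can be mapped in Python roughly like this. The helper names are hypothetical; only the binary name and the three documented exit codes come from this reference, and run_recipe assumes recipe-runner-rs is on PATH.

```python
import subprocess

# Meanings taken from the exit-code table above.
EXIT_MEANINGS = {
    0: "success",
    1: "step failure",
    2: "parse or validation error",
}


def describe_exit_code(code: int) -> str:
    """Translate a runner exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_recipe(recipe: str, *extra_args: str) -> str:
    """Run a recipe and describe the outcome (requires the binary on PATH)."""
    completed = subprocess.run(["recipe-runner-rs", recipe, *extra_args])
    return describe_exit_code(completed.returncode)
```

Reading returncode from the completed process avoids the shell pitfall of $? being overwritten by a later command.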

Smart Context Value Parsing (`--set`)

When using --set KEY=VALUE, the runner automatically determines the value type by attempting each parse strategy in order:

Priority	Type	Detection	Example
1	JSON	Valid JSON object/array	`--set data='{"key": "val"}'`
2	Boolean	Literal `true` or `false`	`--set verbose=true`
3	Integer	Digits only (with optional sign)	`--set count=5`
4	Float	Numeric with decimal point	`--set ratio=0.5`
5	String	Everything else (fallback)	`--set name=hello`

# JSON — parsed as a structured object
recipe-runner-rs setup.yaml --set config='{"host": "localhost", "port": 8080}'
recipe-runner-rs setup.yaml --set tags='["web", "api"]'

# Boolean — parsed as bool
recipe-runner-rs build.yaml --set release=true
recipe-runner-rs build.yaml --set skip_tests=false

# Integer — parsed as i64
recipe-runner-rs scale.yaml --set workers=8
recipe-runner-rs scale.yaml --set retries=0

# Float — parsed as f64
recipe-runner-rs tune.yaml --set threshold=0.95
recipe-runner-rs tune.yaml --set learning_rate=0.001

# String — fallback for everything else
recipe-runner-rs deploy.yaml --set branch=main
recipe-runner-rs deploy.yaml --set message="deploy to production"
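The priority order in the table can be sketched in Python as below. This is an approximation for illustration: the function names are hypothetical, and the exact patterns accepted for integers and floats (for example, whether exponent notation counts as a float) are assumptions.

```python
import json
import re

INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?(\d+\.\d*|\d*\.\d+)")


def parse_set_value(raw: str):
    """Type a --set value by trying each strategy in priority order."""
    # 1. JSON object or array
    if raw.lstrip()[:1] in ("{", "["):
        try:
            return json.loads(raw)
        except ValueError:
            pass  # not valid JSON; fall through to the other strategies
    # 2. Boolean literals
    if raw == "true":
        return True
    if raw == "false":
        return False
    # 3. Integer: digits with an optional sign
    if INTEGER.fullmatch(raw):
        return int(raw)
    # 4. Float: numeric with a decimal point
    if FLOAT.fullmatch(raw):
        return float(raw)
    # 5. String fallback
    return raw


def parse_set_arg(arg: str):
    """Split KEY=VALUE at the first '=' and type the value."""
    key, _, value = arg.partition("=")
    return key, parse_set_value(value)
```

Under this sketch a value such as 2.1.0 stays a string, because it matches neither the integer nor the float pattern.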

Environment Variables

`RECIPE_RUNNER_RECIPE_DIRS`

Additional recipe search directories, separated by colons. These directories are searched in addition to any specified via --recipe-dir.

# Set via environment
export RECIPE_RUNNER_RECIPE_DIRS="/opt/recipes:/home/user/.config/recipes"
recipe-runner-rs my-recipe

# Inline for a single invocation
RECIPE_RUNNER_RECIPE_DIRS=./recipes recipe-runner-rs list

# Combine with --recipe-dir (both are searched)
export RECIPE_RUNNER_RECIPE_DIRS="/opt/shared-recipes"
recipe-runner-rs my-recipe --recipe-dir ./local-recipes
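One way to picture how the two sources combine is the Python sketch below. The function name is hypothetical, and the ordering (command-line directories first, then the environment variable) is an assumption; this reference only states that both are searched.

```python
import os
from typing import Iterable, List, Mapping, Optional


def recipe_search_dirs(
    cli_dirs: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Combine --recipe-dir values with RECIPE_RUNNER_RECIPE_DIRS entries."""
    env = os.environ if environ is None else environ
    raw = env.get("RECIPE_RUNNER_RECIPE_DIRS", "")
    # Entries are colon-separated; empty entries are dropped.
    env_dirs = [entry for entry in raw.split(":") if entry]
    # Assumed order: explicit command-line directories before environment ones.
    return list(cli_dirs) + env_dirs
```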

Usage Examples

Basic Usage

# Run a recipe by file path
recipe-runner-rs ./recipes/build.yaml

# Run a recipe by name (searched in recipe directories)
recipe-runner-rs build

# List all discoverable recipes
recipe-runner-rs list

CI/CD Pipeline

# Validate, then run with JSON output and auditing
recipe-runner-rs deploy.yaml --validate-only \
  && recipe-runner-rs deploy.yaml \
    --set environment=production \
    --set version="$(git describe --tags)" \
    --output-format json \
    --audit-dir /var/log/deploys \
    --progress

Development Workflow

# Preview what a recipe will do
recipe-runner-rs refactor.yaml --explain

# Dry-run with overrides to test logic
recipe-runner-rs refactor.yaml --dry-run \
  --set target_module=auth \
  --set aggressive=true

# Run without auto-staging to review changes manually
recipe-runner-rs refactor.yaml \
  --set target_module=auth \
  --no-auto-stage

Selective Step Execution

# Run only unit tests
recipe-runner-rs ci.yaml --include-tags unit

# Run everything except slow and integration tests
recipe-runner-rs ci.yaml --exclude-tags slow,integration

# Explain which steps match the filter
recipe-runner-rs ci.yaml --include-tags unit --explain

Multi-Directory Recipe Management

# Search across project, team, and global recipes
recipe-runner-rs list \
  -R ./recipes \
  -R ~/team-recipes \
  -R /opt/global-recipes

# Or use the environment variable ($HOME rather than ~, which the shell does not expand inside quotes)
export RECIPE_RUNNER_RECIPE_DIRS="./recipes:$HOME/team-recipes:/opt/global-recipes"
recipe-runner-rs list

Scripting and Automation

# Capture JSON output for downstream processing
output=$(recipe-runner-rs analyze.yaml --output-format json)
echo "$output" | jq '.summary'

# Run with full observability
recipe-runner-rs deploy.yaml \
  --output-format json \
  --progress \
  --audit-dir ./audit \
  --set environment=production \
  2>progress.log \
  1>result.json

Condition Language Reference

The recipe runner’s condition evaluator is a hand-rolled tokenizer + recursive-descent parser implemented in src/context.rs. Conditions are expressions evaluated to determine if a step should execute. If the condition evaluates to truthy, the step runs; otherwise it’s skipped.

If evaluation itself fails (e.g., a syntax error), the step is marked Failed — not skipped.

Truthiness

Type	Truthy	Falsy
Boolean	`true`	`false`
Number	Any non-zero (e.g., `1`, `-3.14`)	`0`, `0.0`
String	Non-empty (e.g., `"hello"`)	Empty string `""`
Array	Non-empty	Empty `[]`
Object	Non-empty	Empty `{}`
Null	—	Always falsy
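The truthiness table translates directly into a small Python function over JSON-like values. This is an illustrative sketch, not the evaluator in src/context.rs, and is_truthy is a hypothetical name.

```python
from typing import Any


def is_truthy(value: Any) -> bool:
    """Apply the truthiness table to a JSON-like value."""
    if value is None:
        return False                 # Null is always falsy
    if isinstance(value, bool):
        return value                 # checked before numbers: bool is an int in Python
    if isinstance(value, (int, float)):
        return value != 0            # 0 and 0.0 are falsy
    if isinstance(value, (str, list, dict)):
        return len(value) > 0        # empty string, array, object are falsy
    return False
```

One consequence worth noticing, if the table is read literally: the string "false" is non-empty and therefore truthy, so a variable holding that string needs an explicit comparison such as flag == 'true' rather than a bare identifier.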

Operators

Listed by precedence, lowest to highest:

Precedence	Operator	Kind	Description
1 (lowest)	`or`	Logical	Short-circuit logical OR
2	`and`	Logical	Short-circuit logical AND
3	`not`	Unary	Logical negation (prefix)
4 (highest)	`==`	Comparison	Equality (with type coercion)
	`!=`	Comparison	Inequality
	`<`	Comparison	Less than
	`<=`	Comparison	Less than or equal
	`>`	Comparison	Greater than
	`>=`	Comparison	Greater than or equal
	`in`	Membership	Substring or array membership
	`not in`	Membership	Negated membership (parsed as one token)

Type coercion in comparisons

Equality (==, !=): Same types compare directly. Mixed types fall back to comparing string representations (so 5 == "5" is true).
Ordering (<, <=, >, >=): Number–Number is numeric. String–String is lexicographic. String–Number attempts to parse the string as f64 then compares numerically. All other combinations are incomparable (condition evaluates as falsy).
Membership (in, not in): Against a string, checks substring containment. Against an array, checks element equality via values_equal. Against any other type, evaluates as falsy.
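The coercion rules above can be modelled in Python as follows. This is a sketch under assumptions: json.dumps stands in for the string representation used in mixed-type equality, a non-string needle is treated as not contained in a string, and the helper names (other than values_equal, which is named above) are hypothetical.

```python
import json
from typing import Any, Optional


def _is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _text(value: Any) -> str:
    """String representation for mixed-type equality (assumed JSON-style)."""
    return value if isinstance(value, str) else json.dumps(value)


def values_equal(a: Any, b: Any) -> bool:
    if _is_number(a) and _is_number(b):
        return float(a) == float(b)
    if type(a) is type(b):
        return a == b                      # same types compare directly
    return _text(a) == _text(b)            # mixed types compare as strings


def _as_number(value: Any) -> Optional[float]:
    if _is_number(value):
        return float(value)
    if isinstance(value, str):
        try:
            return float(value)            # String-Number: parse the string
        except ValueError:
            return None
    return None


def compare(a: Any, b: Any) -> Optional[int]:
    """Return -1, 0, or 1; None means incomparable (condition is falsy)."""
    if isinstance(a, str) and isinstance(b, str):
        return (a > b) - (a < b)           # lexicographic
    x, y = _as_number(a), _as_number(b)
    if x is None or y is None:
        return None
    return (x > y) - (x < y)


def contains(needle: Any, haystack: Any) -> bool:
    if isinstance(haystack, str):
        return isinstance(needle, str) and needle in haystack
    if isinstance(haystack, list):
        return any(values_equal(needle, item) for item in haystack)
    return False                           # any other type: falsy
```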

Literals

Type	Syntax	Notes
String	`"hello"` or `'world'`	Single or double quotes. Backslash escapes supported (`\'`, `\"`).
Number	`42`, `3.14`, `-7`	All parsed and stored as `f64`.
Boolean	`true`, `True`, `false`, `False`	Case-sensitive to these exact forms.
None	`none`	Not a keyword — it’s an unknown identifier that resolves to `Null`.

Identifiers

Identifiers are alphanumeric names (plus underscores) that look up values in the recipe context.

Form	Example	Behavior
Simple	`my_var`	Looks up `my_var` in the top-level context.
Dot-notation	`result.status`	Nested lookup: `context["result"]["status"]`.
Unknown	`undefined_var`	Resolves to `Null` (falsy). No error raised.

Dot-notation in identifiers is resolved during parsing — each segment walks one level deeper into nested JSON values. If any segment is missing, the whole expression resolves to Null.

Function Calls

Only whitelisted function names are allowed. Calling an unknown function is an error.

Function	Signature	Description
`int(value)`	1 arg	Convert to integer (i64). Strings are parsed, bools → 0/1, else 0.
`float(value)`	1 arg	Convert to f64. Strings are parsed, bools → 0.0/1.0, else 0.0.
`str(value)`	1 arg	Convert to string. Null → `""`. Numbers use serde’s `to_string()`.
`bool(value)`	1 arg	Convert to boolean using the truthiness rules above.
`len(value)`	1 arg	Length of string (bytes), array, or object. Other types return 0.
`min(a, b, ...)`	2+ args	Minimum of values (uses ordering comparison). Requires at least 2 args.
`max(a, b, ...)`	2+ args	Maximum of values (uses ordering comparison). Requires at least 2 args.

Method Calls

Methods use .method(args) syntax and can only be called on string values. Calling a method on a non-string is an error. Only whitelisted method names are allowed.

Method	Returns	Description
`.strip()`	String	Trim whitespace from both ends.
`.lstrip()`	String	Trim whitespace from the left (start).
`.rstrip()`	String	Trim whitespace from the right (end).
`.lower()`	String	Convert to lowercase.
`.upper()`	String	Convert to uppercase.
`.title()`	String	Title-case each whitespace-separated word.
`.startswith(prefix)`	Boolean	True if string starts with `prefix`.
`.endswith(suffix)`	Boolean	True if string ends with `suffix`.
`.replace(old, new)`	String	Replace all occurrences of `old` with `new`.
`.split(sep)`	Array	Split by `sep`. If no arg, splits on whitespace.
`.join(arr)`	String	Join array elements with the string as separator.
`.count(sub)`	Number	Count non-overlapping occurrences of `sub`.
`.find(sub)`	Number	Index of first occurrence of `sub`. Returns `-1` if not found.

Methods can be chained: name.strip().lower().

Safety Features

Whitelist-only execution — Only the functions and methods listed above are allowed. Unknown names produce an error, not silent null.
Dunder blocking — Any expression containing __ (e.g., __class__, __import__) is rejected before parsing even begins.
No assignment, no side effects — The expression language is pure; it can only read context values and compute results.
Unknown identifiers are null — Referencing a variable that doesn’t exist returns Null (falsy) rather than raising an error. This is intentional for optional-variable patterns.
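The first two safety features can be illustrated with a short Python sketch. It is not the evaluator's code: the names are hypothetical, and Python's own string methods stand in for the whitelisted ones, so edge cases (for example, how title() treats punctuation) may differ from the runner.

```python
ALLOWED_FUNCTIONS = {"int", "float", "str", "bool", "len", "min", "max"}
ALLOWED_METHODS = {
    "strip", "lstrip", "rstrip", "lower", "upper", "title", "startswith",
    "endswith", "replace", "split", "join", "count", "find",
}


class ConditionError(ValueError):
    """Raised when an expression breaks a safety rule."""


def precheck(expression: str) -> None:
    """Reject any expression containing a double underscore before parsing."""
    if "__" in expression:
        raise ConditionError("double underscores are not allowed")


def check_function_name(name: str) -> None:
    """Unknown function names are an error, not a silent null."""
    if name not in ALLOWED_FUNCTIONS:
        raise ConditionError(f"unknown function: {name}")


def call_method(target, name: str, *args):
    """Dispatch a whitelisted method; methods exist only on strings."""
    if name not in ALLOWED_METHODS:
        raise ConditionError(f"unknown method: {name}")
    if not isinstance(target, str):
        raise ConditionError(f"cannot call .{name}() on a non-string value")
    return getattr(target, name)(*args)
```

Checking the name against a fixed set before dispatch is what keeps attribute access from reaching anything outside the listed methods.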

Important Gotchas

1. All numbers are f64

Numbers are parsed and stored as f64 internally. This means str(42) produces "42.0", not "42". If you need the integer string representation, store it as a string in the context instead of using str() on a numeric literal.

2. `shell_escape::escape()` wraps values in single quotes

When using render_shell() for template expansion, empty strings become '' (two single quotes), not the empty string. This is correct for shell safety but may surprise you in conditions that check the rendered result.

3. Unknown identifiers are null by design

This is a feature, not a bug. It allows patterns like condition: "optional_var" to work — if optional_var isn’t set, the condition is falsy and the step is skipped without error.

4. `not in` is a single operator

The tokenizer uses lookahead to parse not in as one token (NotIn), distinct from a standalone not followed by in. As a result, a not in b is evaluated as a single membership test rather than as not (a in b), although both forms give the same result.
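The lookahead idea can be shown with a deliberately simplified Python tokenizer. It is a sketch only: the regular expression handles quoted strings, identifiers, and single punctuation characters, not the full condition grammar, and the token spelling "not in" is a stand-in for the NotIn token.

```python
import re
from typing import List

# Quoted strings, identifiers (with dots), or any single non-space character.
RAW_TOKEN = re.compile(r"'[^']*'|\"[^\"]*\"|[A-Za-z_][A-Za-z0-9_.]*|\S")


def tokenize(expression: str) -> List[str]:
    """Split an expression into tokens, merging 'not' + 'in' into one token."""
    raw = RAW_TOKEN.findall(expression)
    tokens: List[str] = []
    i = 0
    while i < len(raw):
        # Lookahead: 'not' immediately followed by 'in' is a single operator.
        if raw[i] == "not" and i + 1 < len(raw) and raw[i + 1] == "in":
            tokens.append("not in")
            i += 2
        else:
            tokens.append(raw[i])
            i += 1
    return tokens
```

A standalone not, as in "not skip_tests", is left untouched because the next token is not in.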

5. Boolean keywords are case-sensitive

Only true/True and false/False are recognized. TRUE, FALSE, tRue, etc. are treated as regular identifiers and will resolve to Null.

6. `none` is not a keyword

There is no none or None literal. Writing none creates an identifier lookup that (typically) resolves to Null because no context variable named none exists. This works in practice but is not guaranteed if someone sets a context variable called none.

Examples

Basic truthiness

# Truthy if 'analysis' is set and non-empty in context
condition: "analysis"

# Always true
condition: "true"

# Always false
condition: "false"

String comparison

# Exact match
condition: "status == 'success'"

# Not equal
condition: "status != 'error'"

# Case-insensitive comparison via method
condition: "status.lower() == 'success'"

Numeric comparison

# Greater than
condition: "count > 0"

# Compound range check
condition: "count > 0 and count < 10"

# With function conversion
condition: "int(exit_code) == 0"

Logical operators

# Negation
condition: "not skip_tests"

# AND
condition: "has_tests and not skip_tests"

# OR
condition: "use_cache or force_rebuild"

# Combined with parentheses
condition: "(status == 'success' or status == 'partial') and not skip"

Membership tests

# Substring containment
condition: "'error' in output"

# Negated containment
condition: "'error' not in output"

# Array membership (roles is an array in context)
condition: "'admin' in roles"

Function calls

# Length check
condition: "len(items) > 0"

# Type conversion
condition: "int(retry_count) < 3"

# Boolean conversion
condition: "bool(result)"

# Min/max
condition: "max(score_a, score_b) >= 80"

Method calls

# String prefix check
condition: "name.startswith('test_')"

# String suffix check
condition: "filename.endswith('.py')"

# Chained methods
condition: "input.strip().lower() == 'yes'"

# Replace and check
condition: "path.replace('\\', '/').startswith('/home')"

# Split and check length
condition: "len(csv_line.split(',')) > 3"

# Find (returns index or -1)
condition: "message.find('WARNING') >= 0"

# Count occurrences
condition: "log_output.count('ERROR') == 0"

Nested context access

# Dot-notation for nested values
condition: "result.status == 'ok'"

# Deep nesting
condition: "response.data.count > 0"

Optional variable patterns

# Skip step if variable isn't set (resolves to null → falsy)
condition: "optional_feature"

# Guard with default-like logic
condition: "config.verbose and len(debug_output) > 0"

Cross-type equality

# Number-string coercion: this is true if exit_code is 0 (the number)
condition: "exit_code == '0'"

# But be careful: str(42) gives "42.0", not "42"
# So this does NOT work as expected:
#   condition: "str(count) == '42'"    # produces "42.0" == "42" → false
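The two behaviors above can be reproduced in a minimal sketch. It assumes numbers are stored as `f64`, that `==` parses the string side when the types differ, and that `str()` uses Rust's debug formatting for floats (which keeps the trailing `.0`). The `Value` enum, `loose_eq`, and `to_str` are illustrative names, not the runner's API.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Str(String),
}

/// Assumed `==` semantics: a number and a string are equal when the
/// string parses to the same number; otherwise compare like with like.
fn loose_eq(a: &Value, b: &Value) -> bool {
    match (a, b) {
        (Value::Number(n), Value::Str(s)) | (Value::Str(s), Value::Number(n)) => {
            s.trim().parse::<f64>().map_or(false, |parsed| parsed == *n)
        }
        _ => a == b,
    }
}

/// Assumed `str()` semantics: floats are printed with `{:?}`,
/// so 42 becomes "42.0" rather than "42".
fn to_str(v: &Value) -> String {
    match v {
        Value::Number(n) => format!("{:?}", n),
        Value::Str(s) => s.clone(),
    }
}
```

Under these assumptions `exit_code == '0'` holds for the number 0, while `str(count) == '42'` compares two strings, `"42.0"` and `"42"`, and fails. Comparing `int(count) == 42` or `count == '42'` avoids the string round trip.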

Tutorial Examples

Progressive tutorials that teach one recipe runner feature at a time. Each tutorial is a self-contained YAML recipe you can run directly.

Source: examples/tutorials/

Tutorials

#	Recipe	Feature	Run it
01	hello-world	Simplest recipe — one bash step	`recipe-runner-rs examples/tutorials/01-hello-world.yaml`
02	variables	Template `{{variables}}` and context	`recipe-runner-rs examples/tutorials/02-variables.yaml`
03	conditions	Conditional step execution	`recipe-runner-rs examples/tutorials/03-conditions.yaml`
04	multi-step-pipeline	Sequential steps with output chaining	`recipe-runner-rs examples/tutorials/04-multi-step-pipeline.yaml`
05	working-directories	Per-step `working_dir`	`recipe-runner-rs examples/tutorials/05-working-directories.yaml`
06	parse-json	JSON extraction from output	`recipe-runner-rs examples/tutorials/06-parse-json.yaml`
07	error-handling	`continue_on_error`	`recipe-runner-rs examples/tutorials/07-error-handling.yaml`
08	hooks	Pre/post/on_error hooks	`recipe-runner-rs examples/tutorials/08-hooks.yaml`
09	tags	`when_tags` + `--include-tags`	`recipe-runner-rs examples/tutorials/09-tags.yaml --include-tags fast`
10	parallel-groups	`parallel_group` concurrent execution	`recipe-runner-rs examples/tutorials/10-parallel-groups.yaml`
11	extends	Recipe inheritance via `extends`	`recipe-runner-rs examples/tutorials/11-extends.yaml`
12	recursion-limits	`recursion` config	`recipe-runner-rs examples/tutorials/12-recursion-limits.yaml`
13	timeouts	Step-level `timeout`	`recipe-runner-rs examples/tutorials/13-timeouts.yaml`
14	dry-run	`--dry-run` mode	`recipe-runner-rs examples/tutorials/14-dry-run.yaml --dry-run`

Recommended Order

Start with 01-hello-world and work through sequentially. Each tutorial builds on concepts from previous ones.

Workflow Pattern Examples

Real-world workflow patterns that show how to compose recipe runner features for common development scenarios.

Source: examples/patterns/

Patterns

Pattern	Recipe	Description
CI Pipeline	ci-pipeline.yaml	Gated build pipeline: checkout → deps → lint → test → build → package. Each step gates on prior success.
Code Review	code-review.yaml	Automated review: git diff → agent analysis → issue detection → review comments.
Deploy Pipeline	deploy-pipeline.yaml	Full deployment: pre-flight → build → integration test → staging → smoke test → promote.
Investigation	investigation.yaml	Systematic research: scope → explore (find/grep) → analyze → synthesize → document.
Migration	migration.yaml	Fail-fast migration: backup → validate → migrate → smoke test → verify.
Multi-Agent Consensus	multi-agent-consensus.yaml	Multiple agents analyze independently → synthesize votes → apply decision.
Quality Audit	quality-audit.yaml	Audit loop: lint → analyze → fix → re-lint → verify improvement.
Self-Improvement	self-improvement.yaml	Closed loop: eval → analyze errors → research → apply → re-eval → compare.

Combining Patterns

Patterns compose via sub-recipe steps, hooks, tags, and parallel groups. Here’s a full deployment recipe that chains three patterns together — CI first, then review, then deploy — with quality audit as a gate between stages:

name: "ship-release"
description: "CI → Review → Quality Gate → Deploy"
version: "1.0"

context:
  repo_path: "."
  environment: "staging"

hooks:
  on_error: "echo 'Pipeline failed at step: {{step_id}}' >> pipeline.log"

steps:
  # ── Stage 1: Build & Test (sub-recipe) ──
  - id: "ci"
    recipe: "ci-pipeline"
    context:
      repo_path: "{{repo_path}}"
    output: "ci_result"

  # ── Stage 2: Parallel code reviews ──
  - id: "security-review"
    agent: "amplihack:security"
    parallel_group: "reviews"
    prompt: "Review {{repo_path}} for security vulnerabilities."
    output: "security_findings"

  - id: "architecture-review"
    agent: "amplihack:architect"
    parallel_group: "reviews"
    prompt: "Review {{repo_path}} for architectural issues."
    output: "arch_findings"

  # ── Stage 3: Quality gate (sub-recipe, conditional) ──
  - id: "quality-gate"
    recipe: "quality-audit"
    condition: "ci_result and 'PASS' in ci_result"
    context:
      repo_path: "{{repo_path}}"
    output: "audit_result"

  # ── Stage 4: Deploy (tagged — only runs with --include-tags release) ──
  - id: "deploy"
    recipe: "deploy-pipeline"
    when_tags: ["release"]
    condition: "'PASS' in audit_result"
    context:
      repo_path: "{{repo_path}}"
      environment: "{{environment}}"
    output: "deploy_result"

  # ── Notification ──
  - id: "notify"
    command: |
      echo "Release pipeline complete."
      echo "CI: {{ci_result}}"
      echo "Audit: {{audit_result}}"
      echo "Deploy: {{deploy_result}}"

This recipe demonstrates:

Sub-recipes (recipe:) — CI, quality audit, and deploy each run as self-contained workflows
Parallel groups (parallel_group:) — security and architecture reviews run concurrently
Conditional gates (condition:) — quality audit only runs if CI passed; deploy only if audit passed
Tag filtering (when_tags:) — deploy step only executes when --include-tags release is passed
Error hooks (hooks.on_error:) — logs which step failed for post-mortem
Output chaining — each stage’s result flows into the next stage’s conditions

Testing & Edge-Case Recipes

Recipes designed to exercise specific recipe runner features and edge cases. Useful as regression tests and as references for condition syntax.

Source: recipes/testing/

Recipes

Recipe	What It Tests
all-condition-operators	Every comparison and boolean operator: `==`, `!=`, `<`, `<=`, `>`, `>=`, `and`, `or`, `not`, `in`, `not in`
all-functions	All whitelisted functions: `int()`, `str()`, `len()`, `bool()`, `float()`, `min()`, `max()`
all-methods	All whitelisted string methods: `strip()`, `lstrip()`, `rstrip()`, `lower()`, `upper()`, `title()`, `startswith()`, `endswith()`, `replace()`, `split()`, `join()`, `count()`, `find()`
output-chaining	Step output stored in context and referenced by subsequent steps via `{{variable}}`
json-extraction-strategies	All 3 JSON extraction strategies: direct parse, markdown fence, balanced brackets
step-type-inference	Automatic step type detection: bash (command), agent (agent field), recipe (recipe field), agent (prompt-only)
continue-on-error-chain	`continue_on_error: true` allowing subsequent steps to run after failures
nested-context	Dot-notation access to nested context values: `{{config.database.host}}`
large-context	Many context variables and long values to test template rendering at scale
empty-and-edge-cases	Empty strings, missing variables, whitespace-only values, special characters

Production Recipes

These recipes ship with amplihack and demonstrate real-world workflow patterns at scale.

Source: amplifier-bundle/recipes/

Development Workflows

Recipe	Description
default-workflow	Complete development lifecycle: requirements → design → implement → test → merge
verification-workflow	Lightweight workflow for trivial changes: config edits, doc updates, single-file fixes
qa-workflow	Minimal workflow for simple questions and informational requests
investigation-workflow	Systematic investigation with parallel agent deployment
guide	Interactive guide to amplihack features

Quality & Reliability

Recipe	Description
quality-audit-cycle	Iterative audit loop: lint → analyze → fix → re-lint → verify improvement
self-improvement-loop	Closed-loop eval improvement: eval → analyze → research → improve → re-eval → compare
domain-agent-eval	Evaluate domain agents: eval harness + teaching evaluation + combined report
long-horizon-memory-eval	1000-turn memory stress test with self-improvement loop
sdk-comparison	Run L1-L12 eval on all 4 SDKs and generate comparative report

Multi-Agent Decision Making

Recipe	Description
consensus-workflow	Multi-agent consensus at critical decision points with structured checkpoints
debate-workflow	Multi-agent structured debate for complex decisions requiring diverse perspectives
n-version-workflow	N-version programming: generate multiple independent implementations, pick best
cascade-workflow	3-level fallback cascade: primary → secondary → tertiary

Orchestration

Recipe	Description
smart-orchestrator	Task classifier + goal-seeking loop with up to 3 execution rounds
auto-workflow	Autonomous multi-turn workflow — continues until task complete or max iterations

Migration

Recipe	Description
oxidizer-workflow	Automated Python-to-Rust migration with quality audit cycles and degradation checks

Architecture — amplihack-recipe-runner

Rust implementation of the amplihack recipe runner. Parses YAML recipe files, evaluates conditions in a sandboxed expression language, and executes steps (bash commands, AI agent prompts, or nested sub-recipes) through a pluggable adapter layer.

Module Dependency Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                           main.rs (CLI)                            │
│  clap args → parse → build runner → execute → format output        │
└──────┬──────────┬───────────┬──────────┬───────────┬───────────────┘
       │          │           │          │           │
       ▼          ▼           ▼          ▼           ▼
   parser.rs  runner.rs  discovery.rs  adapters/  models.rs
       │       │  │  │       │        cli_subprocess.rs
       │       │  │  │       │              │
       │       │  │  └───────┘              │
       │       │  │                         │
       │       ▼  ▼                         │
       │  context.rs  agent_resolver.rs     │
       │       │                            │
       └───────┴────────────────────────────┘
                  models.rs (shared types)

graph TD
    main[main.rs — CLI] --> parser[parser.rs]
    main --> runner[runner.rs]
    main --> discovery[discovery.rs]
    main --> cli_sub[cli_subprocess.rs]

    lib[lib.rs — Public API] --> parser
    lib --> runner
    lib --> discovery

    runner --> context[context.rs]
    runner --> agent_resolver[agent_resolver.rs]
    runner --> discovery
    runner --> adapters[adapters/mod.rs — Adapter trait]

    cli_sub --> adapters

    parser --> models[models.rs]
    runner --> models
    context --> models
    discovery --> models
    main --> models
    lib --> models

Module Roles at a Glance

Module	Responsibility
`main.rs`	CLI interface (clap), subcommands, output formatting
`lib.rs`	Public library API for embedding
`models.rs`	Shared data types (Recipe, Step, StepResult, …)
`parser.rs`	YAML deserialization, validation, typo detection
`context.rs`	Template rendering, sandboxed condition evaluation
`runner.rs`	Orchestration: hooks, conditions, audit, recursion
`agent_resolver.rs`	Agent reference → markdown file resolution
`discovery.rs`	Multi-directory recipe discovery and manifest sync
`adapters/mod.rs`	`Adapter` trait definition
`adapters/cli_subprocess.rs`	Subprocess execution for bash and agent steps

Data Flow

  YAML file
      │
      ▼
 ┌──────────┐   file size check    ┌────────────┐
 │ parser.rs │ ──────────────────► │ serde_yaml  │
 └──────────┘   MAX_YAML_SIZE 1MB  │ deserialize │
      │                            └─────┬──────┘
      │  validate: name, steps,          │
      │  unique IDs, field typos         ▼
      │                           Recipe (models.rs)
      ▼
 ┌──────────┐   merge recipe.context
 │ runner.rs │   + user overrides (--set)
 └──────────┘
      │
      │  for each step:
      │    1. Tag filter (when_tags vs active/exclude)
      │    2. Condition evaluation (context.evaluate)
      │    3. Template rendering (context.render / render_shell)
      │    4. Dispatch: Bash │ Agent │ Sub-Recipe
      │    5. Optional JSON parse of output
      │    6. Store output in context
      │    7. Write JSONL audit entry
      │    8. Run post_step / on_error hook
      │
      ▼
 RecipeResult
   ├── success: bool
   ├── step_results: Vec<StepResult>
   ├── context: final variable state
   └── duration: wall-clock time

Parse Phase

RecipeParser::parse_file reads the file and rejects anything over 1 MB (YAML bomb protection).
serde_yaml deserializes into Recipe. Step fields like command, agent, prompt, and recipe determine the implicit StepType via Step::effective_type().
Structural validation: name must be non-empty, at least one step required, step IDs must be unique.
validate_with_yaml inspects raw YAML keys and reports unknown fields using edit-distance typo detection (e.g., “comand” → did you mean “command”?).
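The typo detection described above can be sketched with a plain Levenshtein distance. The `edit_distance` and `suggest` functions and the cutoff of 2 edits are assumptions made for this example; the document does not state which distance metric or threshold the parser uses.

```rust
/// Levenshtein distance between two strings, computed row by row.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost) // substitute (or match)
                .min(prev[j + 1] + 1)   // delete from `a`
                .min(cur[j] + 1);       // insert into `a`
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Suggest the closest known field name, if it is within 2 edits
/// (the threshold is an assumption for this sketch).
fn suggest<'a>(unknown: &str, known: &[&'a str]) -> Option<&'a str> {
    known
        .iter()
        .map(|k| (edit_distance(unknown, k), *k))
        .filter(|(d, _)| *d <= 2)
        .min_by_key(|(d, _)| *d)
        .map(|(_, k)| k)
}
```

With a field list that contains `command`, the misspelling `comand` is one insertion away, so it would be reported with that suggestion.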

Execute Phase

RecipeRunner::execute merges the recipe’s context map with any user-supplied --set KEY=VALUE overrides, then iterates steps sequentially:

Tag filter — should_skip_by_tags checks when_tags against active_tags / exclude_tags.
Condition — RecipeContext::evaluate runs a sandboxed boolean expression (see Safety Model).
Dispatch — routes to execute_bash_step, execute_agent_step, or execute_sub_recipe on the adapter.
Output capture — if parse_json is set, the runner tries three extraction strategies (direct parse → markdown fence → balanced brackets), with an optional retry that re-prompts the agent for JSON-only output.
Context update — step output is stored under step.output (or step.id) in the context for downstream templates.
Hooks — pre_step runs before dispatch, post_step after success, on_error after failure. Hook commands are rendered through the context.
Audit — each step result is appended to a JSONL file (<audit_dir>/<recipe>_<timestamp>.jsonl).

Core Types (models.rs)

Step

struct Step {
    id:                String,
    command:           Option<String>,       // Bash step
    agent:             Option<String>,       // Agent reference
    prompt:            Option<String>,       // Agent prompt
    recipe:            Option<String>,       // Sub-recipe name
    output:            Option<String>,       // Context variable for result
    condition:         Option<String>,       // Boolean expression
    parse_json:        Option<bool>,         // Auto-parse output as JSON
    mode:              Option<String>,       // Execution mode
    working_dir:       Option<String>,       // Override cwd
    timeout:           Option<u64>,          // Seconds
    auto_stage:        Option<bool>,         // git add -A after agent steps
    continue_on_error: Option<bool>,         // Don't fail-fast
    when_tags:         Option<Vec<String>>,  // Tag-based filtering
    parallel_group:    Option<String>,       // Concurrent step grouping (fully implemented)
    sub_context:       Option<HashMap<…>>,   // Context overrides for sub-recipe
}

Step::effective_type() infers the step type from which fields are present: recipe → Recipe, agent/prompt → Agent, command → Bash.
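The inference order can be written out as a short sketch. It uses a trimmed-down `Step` with only the four relevant fields, and it returns `None` when no field is set; that fallback is an assumption of this example, since the document does not say what the real method does in that case.

```rust
#[derive(Debug, PartialEq)]
enum StepType {
    Bash,
    Agent,
    Recipe,
}

/// Reduced version of `Step`, keeping only the fields that drive inference.
#[derive(Default)]
struct Step {
    command: Option<String>,
    agent: Option<String>,
    prompt: Option<String>,
    recipe: Option<String>,
}

impl Step {
    /// Check fields in the documented order: recipe, then agent/prompt, then command.
    fn effective_type(&self) -> Option<StepType> {
        if self.recipe.is_some() {
            Some(StepType::Recipe)
        } else if self.agent.is_some() || self.prompt.is_some() {
            Some(StepType::Agent)
        } else if self.command.is_some() {
            Some(StepType::Bash)
        } else {
            None // assumption: nothing to infer from
        }
    }
}
```

The order matters when more than one field is present: a step with both `recipe` and `command` is treated as a sub-recipe, and a prompt-only step is treated as an agent step.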

Recipe

struct Recipe {
    name:        String,
    version:     Option<String>,
    description: Option<String>,
    author:      Option<String>,
    tags:        Option<Vec<String>>,
    context:     Option<HashMap<String, Value>>,
    steps:       Vec<Step>,
    recursion:   Option<RecursionConfig>,   // max_depth (6), max_total_steps (200)
    hooks:       Option<RecipeHooks>,       // pre_step, post_step, on_error
    extends:     Option<String>,            // Parent recipe (inheritance)
}

Result Types

struct StepResult {
    step_id:  String,
    status:   StepStatus,   // Pending | Running | Completed | Skipped | Failed
    output:   Option<String>,
    error:    Option<String>,
    duration: Duration,
}

struct RecipeResult {
    recipe_name:  String,
    success:      bool,
    step_results: Vec<StepResult>,
    context:      HashMap<String, Value>,   // Final state (skipped in JSON serialization)
    duration:     Duration,
}

CLI Interface (main.rs)

recipe-runner-rs [OPTIONS] [RECIPE] [COMMAND]

Commands:
  list   List discovered recipes

Arguments:
  [RECIPE]   Path to a .yaml recipe file

Options:
  -C, --working-dir <DIR>        Working directory (default: ".")
  -R, --recipe-dir <DIR>         Additional recipe search directories (repeatable)
      --set <KEY=VALUE>           Context variable overrides (repeatable)
      --dry-run                   Log steps without executing
      --validate-only             Parse and validate, then exit
      --explain                   Print step plan without executing
      --progress                  Emit progress to stderr (StderrListener)
      --include-tags <TAGS>       Only run steps matching these tags (comma-separated)
      --exclude-tags <TAGS>       Skip steps matching these tags (comma-separated)
      --audit-dir <DIR>           Directory for JSONL audit logs
      --output-format <FMT>      Output format: text (default) or json

--set values are auto-typed: JSON objects/arrays are parsed as-is, true/false become booleans, numeric strings become numbers, everything else stays a string.
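A rough sketch of the auto-typing rule follows. `SetValue` and `parse_set_arg` are hypothetical names; to stay within the standard library, JSON objects and arrays are detected by their first character and kept as raw text, whereas a real implementation would hand them to a JSON parser. Treating non-finite spellings such as `inf` as plain strings is also an assumption of this example.

```rust
#[derive(Debug, PartialEq)]
enum SetValue {
    Json(String), // raw object/array text; parsing is out of scope here
    Bool(bool),
    Number(f64),
    Str(String),
}

/// Split `KEY=VALUE` at the first `=` and guess the value's type.
fn parse_set_arg(arg: &str) -> Option<(String, SetValue)> {
    let (key, raw) = arg.split_once('=')?;
    let value = if raw.starts_with('{') || raw.starts_with('[') {
        SetValue::Json(raw.to_string())
    } else if raw == "true" || raw == "false" {
        SetValue::Bool(raw == "true")
    } else if let Some(n) = raw.parse::<f64>().ok().filter(|n| n.is_finite()) {
        SetValue::Number(n)
    } else {
        SetValue::Str(raw.to_string())
    };
    Some((key.to_string(), value))
}
```

One practical consequence of auto-typing: `--set version=1.0` arrives as a number, so a condition that expects the string `'1.0'` depends on the cross-type equality rules of the condition language.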

Adapter Pattern

The Adapter trait decouples the runner from any specific execution backend:

trait Adapter {
    fn execute_agent_step(
        &self, prompt: &str, agent_name: &str,
        system_prompt: Option<&str>, mode: Option<&str>,
        working_dir: Option<&str>, model: Option<&str>,
    ) -> Result<String>;

    fn execute_bash_step(
        &self, command: &str, working_dir: Option<&str>,
        timeout: Option<u64>,
    ) -> Result<String>;

    fn is_available(&self) -> bool;
    fn name(&self) -> &str;
}
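Because the runner only talks to the trait, a test double can stand in for the subprocess backend. The sketch below restates the trait with a concrete `String` error type (an assumption, since the document does not show which `Result` alias is in scope) and adds a hypothetical `RecordingAdapter` that logs calls instead of spawning anything.

```rust
use std::cell::RefCell;

trait Adapter {
    fn execute_agent_step(
        &self, prompt: &str, agent_name: &str,
        system_prompt: Option<&str>, mode: Option<&str>,
        working_dir: Option<&str>, model: Option<&str>,
    ) -> Result<String, String>;

    fn execute_bash_step(
        &self, command: &str, working_dir: Option<&str>,
        timeout: Option<u64>,
    ) -> Result<String, String>;

    fn is_available(&self) -> bool;
    fn name(&self) -> &str;
}

/// Hypothetical test double: records every call and returns a canned reply.
#[derive(Default)]
struct RecordingAdapter {
    // RefCell because the trait methods take `&self`.
    calls: RefCell<Vec<String>>,
}

impl Adapter for RecordingAdapter {
    fn execute_agent_step(
        &self, prompt: &str, agent_name: &str,
        _system_prompt: Option<&str>, _mode: Option<&str>,
        _working_dir: Option<&str>, _model: Option<&str>,
    ) -> Result<String, String> {
        self.calls.borrow_mut().push(format!("agent {}: {}", agent_name, prompt));
        Ok("ok".to_string())
    }

    fn execute_bash_step(
        &self, command: &str, _working_dir: Option<&str>,
        _timeout: Option<u64>,
    ) -> Result<String, String> {
        self.calls.borrow_mut().push(format!("bash: {}", command));
        Ok("ok".to_string())
    }

    fn is_available(&self) -> bool {
        true
    }

    fn name(&self) -> &str {
        "recording"
    }
}
```

A double like this makes it possible to assert on step ordering and rendered commands without a shell or an agent CLI being installed.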

CLISubprocessAdapter

The production adapter spawns subprocesses:

Bash steps — /bin/bash -c <command>, optionally wrapped with timeout.
Agent steps — claude -p <prompt> in an isolated temp directory. A NON_INTERACTIVE_FOOTER (“Proceed autonomously. Do not ask questions.”) is appended to prevent the nested Claude session from hanging on prompts.

Timeout enforcement: A background heartbeat thread monitors the deadline. It logs progress every 2 seconds. On expiry it sends SIGTERM, waits 5 seconds, then escalates to SIGKILL.
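The escalation sequence can be approximated with the standard library alone. This is a simplified, Unix-only sketch: it polls from the calling thread rather than using a heartbeat thread, sends SIGTERM through the external `kill` utility because `std` has no direct API for it, and uses `/bin/sh` for portability. `Outcome`, `wait_until`, and `run_with_timeout` are hypothetical names.

```rust
use std::process::{Child, Command, ExitStatus, Stdio};
use std::thread::sleep;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum Outcome {
    Exited(bool), // true when the exit status was success
    TimedOut,
}

/// Poll `child` until it exits or `limit` elapses.
fn wait_until(child: &mut Child, limit: Duration) -> std::io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + limit;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
        sleep(Duration::from_millis(20));
    }
}

/// Run `command`, then escalate SIGTERM -> grace period -> SIGKILL on expiry.
fn run_with_timeout(command: &str, timeout: Duration, grace: Duration) -> std::io::Result<Outcome> {
    let mut child = Command::new("/bin/sh")
        .arg("-c")
        .arg(command)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    if let Some(status) = wait_until(&mut child, timeout)? {
        return Ok(Outcome::Exited(status.success()));
    }
    // Deadline passed: request a clean shutdown first.
    let _ = Command::new("kill").arg("-TERM").arg(child.id().to_string()).status();
    if wait_until(&mut child, grace)?.is_none() {
        child.kill()?; // SIGKILL on Unix
        child.wait()?;
    }
    Ok(Outcome::TimedOut)
}
```

The grace period gives well-behaved tools a chance to flush output and remove temporary files before the hard kill.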

Environment propagation: build_child_env() forwards session-tracking variables (AMPLIHACK_SESSION_DEPTH, AMPLIHACK_TREE_ID, AMPLIHACK_MAX_DEPTH, AMPLIHACK_MAX_SESSIONS) and strips CLAUDECODE to prevent nested session confusion.

Execution Flow

Lifecycle of a Recipe Run

CLI args
  │
  ├─ --validate-only ──► parse + validate ──► print warnings ──► exit
  ├─ --explain ─────────► parse ──► print step plan ──► exit
  │
  ▼
RecipeRunner::execute(recipe, user_context)
  │
  ├─ Check recursion limits (depth ≤ max_depth, total_steps ≤ max_total_steps)
  ├─ Merge recipe.context + user_context
  ├─ Open JSONL audit log (if --audit-dir set)
  │
  │  ┌─── for each step ──────────────────────────────────────────┐
  │  │                                                             │
  │  │  1. should_skip_by_tags(step) ──► skip if filtered out      │
  │  │  2. run_hook(pre_step)                                      │
  │  │  3. evaluate condition ──► Skipped if false                 │
  │  │  4. render templates in command/prompt                      │
  │  │  5. dispatch:                                               │
  │  │     ├─ Bash  → adapter.execute_bash_step()                  │
  │  │     ├─ Agent → resolve agent, adapter.execute_agent_step()  │
  │  │     └─ Recipe → execute_sub_recipe() (recursive)            │
  │  │  6. parse JSON output (if parse_json, with retry)           │
  │  │  7. store output in context                                 │
  │  │  8. maybe_auto_stage (git add -A for agent steps)           │
  │  │  9. run_hook(post_step) or run_hook(on_error)               │
  │  │ 10. write JSONL audit entry                                 │
  │  │ 11. fail-fast unless continue_on_error                      │
  │  │                                                             │
  │  └─────────────────────────────────────────────────────────────┘
  │
  ▼
RecipeResult { success, step_results, context, duration }

Sub-Recipe Execution

When a step's effective type is Recipe:

The runner searches for the recipe file using discovery::find_recipe across recipe_search_dirs, then falls back to a direct path relative to working_dir.
Recursion depth is checked against RecursionConfig::max_depth (default 6). total_steps is checked against max_total_steps (default 200).
The sub-recipe’s context inherits from the parent context, merged with any sub_context overrides defined on the step.
A new execute_with_depth(recipe, context, depth + 1) call runs the sub-recipe. Depth and total-step counters are tracked via Cell<u32>.
After execution, the sub-recipe’s final context is propagated back into the parent context.

Hooks

Defined in RecipeHooks:

hooks:
  pre_step: "echo 'Starting step {{step_id}}'"
  post_step: "echo 'Completed step {{step_id}}'"
  on_error: "notify-send 'Step {{step_id}} failed'"

Hooks are shell commands rendered through the context. pre_step runs before every step dispatch. post_step runs after a successful step. on_error runs after a failed step. Hook failures are logged but do not abort the recipe.

Execution Listeners

The ExecutionListener trait provides real-time progress callbacks:

trait ExecutionListener {
    fn on_step_start(&self, step_id: &str, step_type: &str);
    fn on_step_complete(&self, result: &StepResult);
    fn on_output(&self, step_id: &str, line: &str);
}

Implementation	Behavior
`NullListener`	No-op (default)
`StderrListener`	Emits progress emojis and timing to stderr

Activated with --progress.

Safety Model

Condition Evaluator (context.rs)

The condition evaluator is a hand-written recursive descent parser that evaluates boolean expressions over recipe context variables. It does not call eval() or execute arbitrary code.

Supported syntax:

status == "ok" and (retries < 3 or force == true)
len(items) > 0
name.startswith("test_")
value not in "blocked,disabled"

Operator precedence (low → high): or, and, not, comparison (==, !=, <, <=, >, >=, in, not in).

Security constraints:

Rule	Rationale
No `__` (dunder) access	Blocks dunder attribute introspection
Whitelisted functions only	`int`, `str`, `len`, `bool`, `float`, `min`, `max`
Whitelisted methods only	`strip`, `lstrip`, `rstrip`, `lower`, `upper`, `title`, `startswith`, `endswith`, `replace`, `split`, `join`, `count`, `find`
No assignment operators	Expressions are read-only
No function definitions	Grammar does not support `fn`, `def`, `lambda`

The tokenizer produces typed tokens (String, Number, Ident, Eq, And, Or, …) and the parser consumes them with lookahead. Unrecognized tokens produce a parse error rather than silent misbehavior.
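To show how the precedence ladder maps onto a recursive descent parser, here is a deliberately small sketch. It is not the runner's parser: it handles only integers, `true`/`false`, parentheses, the six comparison operators, and `not`/`and`/`or`; tokens must be separated by whitespace; and integers double as truth values (0 is falsy). Each precedence level gets one method, and lower-precedence methods call higher-precedence ones.

```rust
struct Parser<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    fn advance(&mut self) -> Option<&'a str> {
        let tok = self.peek();
        self.pos += 1;
        tok
    }

    // Lowest precedence: `or`.
    fn or_expr(&mut self) -> Result<i64, String> {
        let mut left = self.and_expr()?;
        while self.peek() == Some("or") {
            self.advance();
            let right = self.and_expr()?;
            left = (left != 0 || right != 0) as i64;
        }
        Ok(left)
    }

    fn and_expr(&mut self) -> Result<i64, String> {
        let mut left = self.not_expr()?;
        while self.peek() == Some("and") {
            self.advance();
            let right = self.not_expr()?;
            left = (left != 0 && right != 0) as i64;
        }
        Ok(left)
    }

    fn not_expr(&mut self) -> Result<i64, String> {
        if self.peek() == Some("not") {
            self.advance();
            return Ok((self.not_expr()? == 0) as i64);
        }
        self.comparison()
    }

    // Highest operator precedence: a single optional comparison.
    fn comparison(&mut self) -> Result<i64, String> {
        let left = self.atom()?;
        let op = match self.peek() {
            Some(op @ ("==" | "!=" | "<" | "<=" | ">" | ">=")) => op,
            _ => return Ok(left),
        };
        self.advance();
        let right = self.atom()?;
        let result = match op {
            "==" => left == right,
            "!=" => left != right,
            "<" => left < right,
            "<=" => left <= right,
            ">" => left > right,
            _ => left >= right,
        };
        Ok(result as i64)
    }

    fn atom(&mut self) -> Result<i64, String> {
        match self.advance() {
            Some("(") => {
                let value = self.or_expr()?;
                match self.advance() {
                    Some(")") => Ok(value),
                    other => Err(format!("expected ')', found {:?}", other)),
                }
            }
            Some("true") | Some("True") => Ok(1),
            Some("false") | Some("False") => Ok(0),
            Some(tok) => tok.parse::<i64>().map_err(|_| format!("unexpected token {:?}", tok)),
            None => Err("unexpected end of input".to_string()),
        }
    }
}

/// Evaluate a whitespace-separated expression; leftover tokens are an error.
fn evaluate(src: &str) -> Result<bool, String> {
    let spaced = src.replace('(', " ( ").replace(')', " ) ");
    let mut parser = Parser { tokens: spaced.split_whitespace().collect(), pos: 0 };
    let value = parser.or_expr()?;
    if parser.pos != parser.tokens.len() {
        return Err(format!("trailing input at token {}", parser.pos));
    }
    Ok(value != 0)
}
```

In this sketch `true or false and false` evaluates to true because `and` binds tighter than `or`, and `not 1 == 2` negates the comparison rather than the number, matching the precedence order listed above.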

Template Rendering

RecipeContext::render replaces {{var}} placeholders with values from the context. Dot-notation ({{obj.nested.key}}) traverses into JSON objects. Missing variables render as empty strings.

RecipeContext::render_shell does the same but shell-escapes every substituted value to prevent injection in bash commands.
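The difference between the two renderers can be sketched as one substitution loop with a pluggable escaping function. The `Value` enum, `lookup`, `render_with`, and the single-quote escaping strategy are assumptions made for this example; the document does not describe how the runner escapes values internally.

```rust
use std::collections::HashMap;

enum Value {
    Str(String),
    Map(HashMap<String, Value>),
}

/// Resolve a dotted path such as `config.database.host`.
fn lookup<'a>(ctx: &'a HashMap<String, Value>, path: &str) -> Option<&'a str> {
    let mut parts = path.split('.');
    let mut current = ctx.get(parts.next()?)?;
    for part in parts {
        match current {
            Value::Map(m) => current = m.get(part)?,
            Value::Str(_) => return None,
        }
    }
    match current {
        Value::Str(s) => Some(s),
        Value::Map(_) => None,
    }
}

fn identity(s: &str) -> String {
    s.to_string()
}

/// Wrap in single quotes, closing and reopening around embedded quotes.
fn shell_escape(s: &str) -> String {
    format!("'{}'", s.replace('\'', r"'\''"))
}

/// Replace each `{{name}}` with its (escaped) value; missing names become "".
fn render_with(template: &str, ctx: &HashMap<String, Value>, escape: fn(&str) -> String) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                out.push_str(&escape(lookup(ctx, name).unwrap_or("")));
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the text as-is.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

fn render(template: &str, ctx: &HashMap<String, Value>) -> String {
    render_with(template, ctx, identity)
}

fn render_shell(template: &str, ctx: &HashMap<String, Value>) -> String {
    render_with(template, ctx, shell_escape)
}
```

With this scheme a value like `it's; rm -rf /` reaches the shell as one quoted word, so the `;` cannot start a second command.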

Agent Resolver Path Safety (agent_resolver.rs)

Agent references use a namespaced format (namespace:category:name or namespace:name). Each segment is validated against:

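use regex::Regex;
use std::sync::LazyLock;

// Compiled once on first use; Regex::new cannot be called directly in a static initializer.
static SAFE_NAME_RE: LazyLock<Regex> =
    LazyLock::new(|| Regex::new(r"^[a-zA-Z0-9_-]+$").unwrap());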

This rejects /, .., and any characters that could enable path traversal.

As defense-in-depth, after resolving the file path, the resolver canonicalizes both the candidate path and the search base directory, then verifies the resolved path is a child of the search base. This defends against symlink attacks.
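Both layers can be sketched without external crates. The character check below is equivalent to the regular expression shown above; `parse_agent_ref` and `is_within` are hypothetical helper names, and rejecting references with fewer than two or more than three segments is an assumption based on the two documented formats.

```rust
use std::path::Path;

/// Same character set as ^[a-zA-Z0-9_-]+$.
fn is_safe_segment(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Split `namespace:name` or `namespace:category:name` and validate each part.
fn parse_agent_ref(reference: &str) -> Option<Vec<&str>> {
    let parts: Vec<&str> = reference.split(':').collect();
    if (parts.len() == 2 || parts.len() == 3) && parts.iter().all(|p| is_safe_segment(p)) {
        Some(parts)
    } else {
        None
    }
}

/// Second layer: after symlinks are resolved, the candidate must still
/// live under the search base. Both paths must exist for canonicalize().
fn is_within(base: &Path, candidate: &Path) -> std::io::Result<bool> {
    let base = base.canonicalize()?;
    let candidate = candidate.canonicalize()?;
    Ok(candidate.starts_with(&base))
}
```

The first check stops `..` and `/` before any path is built; the second catches a symlink inside the agents directory that points somewhere else.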

Parser Protections (parser.rs)

File size limit: 1 MB (MAX_YAML_SIZE_BYTES). Prevents YAML bombs and memory exhaustion.
Structural validation: Rejects empty names, zero-step recipes, and duplicate step IDs.
Field typo detection: Unknown top-level and step-level fields trigger warnings. Edit-distance matching suggests corrections.

Subprocess Isolation (cli_subprocess.rs)

Agent steps execute in a fresh temporary directory that is cleaned up on drop.
CLAUDECODE is stripped from the child environment to prevent the nested Claude process from attaching to the parent’s session.
Session depth tracking (AMPLIHACK_SESSION_DEPTH) prevents runaway recursive spawning.

Interior Mutability Pattern

The runner tracks recursion state with std::cell::Cell<u32>:

struct RecipeRunner<A: Adapter> {
    // ...
    depth:       Cell<u32>,
    total_steps: Cell<u32>,
    // ...
}

Why Cell?

RecipeRunner::execute takes &self (shared reference) because the runner is logically immutable during a run — the adapter, working directory, tag filters, and listener never change. But recursion tracking requires mutating two counters.

Cell<u32> provides interior mutability for Copy types without runtime borrow-checking overhead (no RefCell needed). The runner is single-threaded, so Cell is sufficient and zero-cost.

Usage in Recursion

execute(&self, recipe, context)
    │
    ├─ self.depth.get() checked against max_depth
    ├─ self.total_steps.get() checked against max_total_steps
    │
    └─ execute_sub_recipe(&self, step, ctx)
         │
         ├─ self.depth.set(self.depth.get() + 1)
         ├─ execute_with_depth(&self, sub_recipe, ctx, new_depth)
         └─ self.total_steps.set(self.total_steps.get() + sub_step_count)

The RecursionConfig defaults (max_depth: 6, max_total_steps: 200) can be overridden per-recipe in the YAML:

recursion:
  max_depth: 3
  max_total_steps: 50
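The pattern can be tried in isolation with a small guard type. `RecursionGuard` and its `enter` method are hypothetical and exist only to demonstrate how `Cell<u32>` lets counters change behind a shared `&self`; the limit checks mirror the `depth ≤ max_depth` and `total_steps ≤ max_total_steps` rules described earlier.

```rust
use std::cell::Cell;

struct RecursionGuard {
    max_depth: u32,
    max_total_steps: u32,
    depth: Cell<u32>,
    total_steps: Cell<u32>,
}

impl RecursionGuard {
    fn new(max_depth: u32, max_total_steps: u32) -> Self {
        RecursionGuard {
            max_depth,
            max_total_steps,
            depth: Cell::new(0),
            total_steps: Cell::new(0),
        }
    }

    /// Run `body` one level deeper. Takes `&self`, so nested calls can
    /// share the same guard without a mutable borrow.
    fn enter<T>(&self, step_count: u32, body: impl FnOnce() -> T) -> Result<T, String> {
        let depth = self.depth.get() + 1;
        let total = self.total_steps.get() + step_count;
        if depth > self.max_depth {
            return Err(format!("max_depth {} exceeded", self.max_depth));
        }
        if total > self.max_total_steps {
            return Err(format!("max_total_steps {} exceeded", self.max_total_steps));
        }
        self.depth.set(depth);
        self.total_steps.set(total);
        let out = body();
        // Depth unwinds on return; the step total keeps accumulating.
        self.depth.set(depth - 1);
        Ok(out)
    }
}
```

Nested closures can all capture the same `&RecursionGuard`, which would not compile if `enter` required `&mut self`.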

Recipe Discovery (discovery.rs)

Search Directories (default order)

~/.amplihack/.claude/recipes
./amplifier-bundle/recipes
./src/amplihack/amplifier-bundle/recipes
./.claude/recipes

Additional directories can be added with -R <dir> (repeatable).

Discovery Functions

Function	Purpose
`discover_recipes`	Scan directories, return `HashMap<name, RecipeInfo>`
`list_recipes`	Sorted `Vec<RecipeInfo>` for display
`find_recipe`	Locate a single recipe by name → `Option<PathBuf>`
`verify_global_installation`	Check that default dirs exist and contain recipes

Manifest & Upstream Sync

update_manifest writes _recipe_manifest.json — a map of filenames to their SHA-256 hashes (first 16 hex chars). check_upstream_changes diffs the current directory state against the manifest and reports new, modified, or deleted files.

sync_upstream adds a git remote, fetches, and diffs local recipes against the upstream branch, returning a summary of changes.

JSONL Audit Log

When --audit-dir is set, each recipe run produces a file:

<audit-dir>/<recipe-name>_<ISO-timestamp>.jsonl

Each line is a JSON object:

{"step_id": "build", "status": "Completed", "duration_ms": 1423, "error": null, "output_len": 256}

Audit logs enable post-hoc analysis of recipe execution without cluttering stdout.

JSON Output Extraction

When parse_json: true is set on a step, the runner extracts structured JSON from potentially noisy output using three strategies (tried in order):

Direct parse — serde_json::from_str(output). Works when the output is pure JSON.
Markdown fence — extracts content between ```json and ``` delimiters. Common in LLM output.
Balanced brackets — finds the first { or [ and matches it to its closing counterpart, counting nesting depth.

If all three fail, the runner optionally retries the agent step with a JSON-only reminder appended to the prompt, then re-applies the extraction pipeline.
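The fence and balanced-bracket strategies can be sketched with string scanning alone. This example only locates a candidate substring and does not validate it; the real pipeline parses with `serde_json`, so the first branch of `extract_json_candidate` is a stand-in for the direct parse. All function names here are hypothetical, and the fence marker is built with `repeat` purely to keep the listing readable.

```rust
/// Strategy 2: text between a json-tagged fence and the next fence.
fn from_fence(output: &str) -> Option<&str> {
    let fence = "`".repeat(3);
    let open = format!("{}json", fence);
    let start = output.find(open.as_str())? + open.len();
    let end = start + output[start..].find(fence.as_str())?;
    Some(output[start..end].trim())
}

/// Strategy 3: first `{` or `[` up to its matching closer, counting
/// nesting depth and ignoring brackets inside string literals.
fn from_balanced(output: &str) -> Option<&str> {
    let start = output.find(|c: char| c == '{' || c == '[')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (i, c) in output[start..].char_indices() {
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' | '[' => depth += 1,
            '}' | ']' => {
                depth -= 1;
                if depth == 0 {
                    return Some(&output[start..start + i + c.len_utf8()]);
                }
            }
            _ => {}
        }
    }
    None
}

/// Try the strategies in the documented order.
fn extract_json_candidate(output: &str) -> Option<&str> {
    let trimmed = output.trim();
    let looks_whole = (trimmed.starts_with('{') && trimmed.ends_with('}'))
        || (trimmed.starts_with('[') && trimmed.ends_with(']'));
    if looks_whole {
        return Some(trimmed); // stand-in for strategy 1 (direct parse)
    }
    from_fence(output).or_else(|| from_balanced(output))
}
```

Tracking string state matters for agent output such as `{"msg": "a } b"}`, where a naive depth counter would stop at the brace inside the string.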
