Architecture — amplihack-recipe-runner
Rust implementation of the amplihack recipe runner. Parses YAML recipe files, evaluates conditions in a sandboxed expression language, and executes steps (bash commands, AI agent prompts, or nested sub-recipes) through a pluggable adapter layer.
Module Dependency Diagram
┌─────────────────────────────────────────────────────────────────────┐
│ main.rs (CLI) │
│ clap args → parse → build runner → execute → format output │
└──────┬──────────┬───────────┬──────────┬───────────┬───────────────┘
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
parser.rs runner.rs discovery.rs adapters/ models.rs
│ │ │ │ │ cli_subprocess.rs
│ │ │ │ │ │
│ │ │ └───────┘ │
│ │ │ │
│ ▼ ▼ │
│ context.rs agent_resolver.rs │
│ │ │
└───────┴────────────────────────────┘
models.rs (shared types)
graph TD
main[main.rs — CLI] --> parser[parser.rs]
main --> runner[runner.rs]
main --> discovery[discovery.rs]
main --> cli_sub[cli_subprocess.rs]
lib[lib.rs — Public API] --> parser
lib --> runner
lib --> discovery
runner --> context[context.rs]
runner --> agent_resolver[agent_resolver.rs]
runner --> discovery
runner --> adapters[adapters/mod.rs — Adapter trait]
cli_sub --> adapters
parser --> models[models.rs]
runner --> models
context --> models
discovery --> models
main --> models
lib --> models
Module Roles at a Glance
| Module | Responsibility |
|---|---|
main.rs | CLI interface (clap), subcommands, output formatting |
lib.rs | Public library API for embedding |
models.rs | Shared data types (Recipe, Step, StepResult, …) |
parser.rs | YAML deserialization, validation, typo detection |
context.rs | Template rendering, sandboxed condition evaluation |
runner.rs | Orchestration: hooks, conditions, audit, recursion |
agent_resolver.rs | Agent reference → markdown file resolution |
discovery.rs | Multi-directory recipe discovery and manifest sync |
adapters/mod.rs | Adapter trait definition |
adapters/cli_subprocess.rs | Subprocess execution for bash and agent steps |
Data Flow
YAML file
│
▼
┌──────────┐ file size check ┌────────────┐
│ parser.rs │ ──────────────────► │ serde_yaml │
└──────────┘ MAX_YAML_SIZE 1MB │ deserialize │
│ └─────┬──────┘
│ validate: name, steps, │
│ unique IDs, field typos ▼
│ Recipe (models.rs)
▼
┌──────────┐ merge recipe.context
│ runner.rs │ + user overrides (--set)
└──────────┘
│
│ for each step:
│ 1. Tag filter (when_tags vs active/exclude)
│ 2. Condition evaluation (context.evaluate)
│ 3. Template rendering (context.render / render_shell)
│ 4. Dispatch: Bash │ Agent │ Sub-Recipe
│ 5. Optional JSON parse of output
│ 6. Store output in context
│ 7. Write JSONL audit entry
│ 8. Run post_step / on_error hook
│
▼
RecipeResult
├── success: bool
├── step_results: Vec<StepResult>
├── context: final variable state
└── duration: wall-clock time
Parse Phase
RecipeParser::parse_filereads the file and rejects anything over 1 MB (YAML bomb protection).serde_yamldeserializes intoRecipe. Step fields likecommand,agent,prompt, andrecipedetermine the implicitStepTypeviaStep::effective_type().- Structural validation: name must be non-empty, at least one step required, step IDs must be unique.
validate_with_yamlinspects raw YAML keys and reports unknown fields using edit-distance typo detection (e.g., “comand” → did you mean “command”?).
Execute Phase
RecipeRunner::execute merges the recipe’s context map with any user-supplied
--set KEY=VALUE overrides, then iterates steps sequentially:
- Tag filter —
should_skip_by_tagscheckswhen_tagsagainstactive_tags/exclude_tags. - Condition —
RecipeContext::evaluateruns a sandboxed boolean expression (see Safety Model). - Dispatch — routes to
execute_bash_step,execute_agent_step, orexecute_sub_recipeon the adapter. - Output capture — if
parse_jsonis set, the runner tries three extraction strategies (direct parse → markdown fence → balanced brackets), with an optional retry that re-prompts the agent for JSON-only output. - Context update — step output is stored under
step.output(orstep.id) in the context for downstream templates. - Hooks —
pre_stepruns before dispatch,post_stepafter success,on_errorafter failure. Hook commands are rendered through the context. - Audit — each step result is appended to a JSONL file
(
<audit_dir>/<recipe>_<timestamp>.jsonl).
Core Types (models.rs)
Step
#![allow(unused)]
fn main() {
struct Step {
id: String,
command: Option<String>, // Bash step
agent: Option<String>, // Agent reference
prompt: Option<String>, // Agent prompt
recipe: Option<String>, // Sub-recipe name
output: Option<String>, // Context variable for result
condition: Option<String>, // Boolean expression
parse_json: Option<bool>, // Auto-parse output as JSON
mode: Option<String>, // Execution mode
working_dir: Option<String>, // Override cwd
timeout: Option<u64>, // Seconds
auto_stage: Option<bool>, // git add -A after agent steps
continue_on_error: Option<bool>, // Don't fail-fast
when_tags: Option<Vec<String>>, // Tag-based filtering
parallel_group: Option<String>, // Concurrent step grouping (fully implemented)
sub_context: Option<HashMap<…>>, // Context overrides for sub-recipe
}
}
Step::effective_type() infers the step type from which fields are present:
recipe → Recipe, agent/prompt → Agent, command → Bash.
Recipe
#![allow(unused)]
fn main() {
struct Recipe {
name: String,
version: Option<String>,
description: Option<String>,
author: Option<String>,
tags: Option<Vec<String>>,
context: Option<HashMap<String, Value>>,
steps: Vec<Step>,
recursion: Option<RecursionConfig>, // max_depth (6), max_total_steps (200)
hooks: Option<RecipeHooks>, // pre_step, post_step, on_error
extends: Option<String>, // Parent recipe (inheritance)
}
}
Result Types
#![allow(unused)]
fn main() {
struct StepResult {
step_id: String,
status: StepStatus, // Pending | Running | Completed | Skipped | Failed
output: Option<String>,
error: Option<String>,
duration: Duration,
}
struct RecipeResult {
recipe_name: String,
success: bool,
step_results: Vec<StepResult>,
context: HashMap<String, Value>, // Final state (skipped in JSON serialization)
duration: Duration,
}
}
CLI Interface (main.rs)
recipe-runner-rs [OPTIONS] [RECIPE] [COMMAND]
Commands:
list List discovered recipes
Arguments:
[RECIPE] Path to a .yaml recipe file
Options:
-C, --working-dir <DIR> Working directory (default: ".")
-R, --recipe-dir <DIR> Additional recipe search directories (repeatable)
--set <KEY=VALUE> Context variable overrides (repeatable)
--dry-run Log steps without executing
--validate-only Parse and validate, then exit
--explain Print step plan without executing
--progress Emit progress to stderr (StderrListener)
--include-tags <TAGS> Only run steps matching these tags (comma-separated)
--exclude-tags <TAGS> Skip steps matching these tags (comma-separated)
--audit-dir <DIR> Directory for JSONL audit logs
--output-format <FMT> Output format: text (default) or json
--set values are auto-typed: JSON objects/arrays are parsed as-is, true/false
become booleans, numeric strings become numbers, everything else stays a string.
Adapter Pattern
The Adapter trait decouples the runner from any specific execution backend:
#![allow(unused)]
fn main() {
trait Adapter {
fn execute_agent_step(
&self, prompt: &str, agent_name: &str,
system_prompt: Option<&str>, mode: Option<&str>,
working_dir: Option<&str>, model: Option<&str>,
) -> Result<String>;
fn execute_bash_step(
&self, command: &str, working_dir: Option<&str>,
timeout: Option<u64>,
) -> Result<String>;
fn is_available(&self) -> bool;
fn name(&self) -> &str;
}
}
CLISubprocessAdapter
The production adapter spawns subprocesses:
- Bash steps —
/bin/bash -c <command>, optionally wrapped withtimeout. - Agent steps —
claude -p <prompt>in an isolated temp directory. ANON_INTERACTIVE_FOOTER(“Proceed autonomously. Do not ask questions.”) is appended to prevent the nested Claude session from hanging on prompts.
Timeout enforcement: A background heartbeat thread monitors the deadline.
It logs progress every 2 seconds. On expiry it sends SIGTERM, waits 5 seconds,
then escalates to SIGKILL.
Environment propagation: build_child_env() forwards session-tracking
variables (AMPLIHACK_SESSION_DEPTH, AMPLIHACK_TREE_ID, AMPLIHACK_MAX_DEPTH,
AMPLIHACK_MAX_SESSIONS) and strips CLAUDECODE to prevent nested session
confusion.
Execution Flow
Lifecycle of a Recipe Run
CLI args
│
├─ --validate-only ──► parse + validate ──► print warnings ──► exit
├─ --explain ─────────► parse ──► print step plan ──► exit
│
▼
RecipeRunner::execute(recipe, user_context)
│
├─ Check recursion limits (depth ≤ max_depth, total_steps ≤ max_total_steps)
├─ Merge recipe.context + user_context
├─ Open JSONL audit log (if --audit-dir set)
│
│ ┌─── for each step ──────────────────────────────────────────┐
│ │ │
│ │ 1. should_skip_by_tags(step) ──► skip if filtered out │
│ │ 2. run_hook(pre_step) │
│ │ 3. evaluate condition ──► Skipped if false │
│ │ 4. render templates in command/prompt │
│ │ 5. dispatch: │
│ │ ├─ Bash → adapter.execute_bash_step() │
│ │ ├─ Agent → resolve agent, adapter.execute_agent_step() │
│ │ └─ Recipe → execute_sub_recipe() (recursive) │
│ │ 6. parse JSON output (if parse_json, with retry) │
│ │ 7. store output in context │
│ │ 8. maybe_auto_stage (git add -A for agent steps) │
│ │ 9. run_hook(post_step) or run_hook(on_error) │
│ │ 10. write JSONL audit entry │
│ │ 11. fail-fast unless continue_on_error │
│ │ │
│ └─────────────────────────────────────────────────────────────┘
│
▼
RecipeResult { success, step_results, context, duration }
Sub-Recipe Execution
When a step has step_type: Recipe:
- The runner searches for the recipe file using
discovery::find_recipeacrossrecipe_search_dirs, then falls back to a direct path relative toworking_dir. - Recursion depth is checked against
RecursionConfig::max_depth(default 6).total_stepsis checked againstmax_total_steps(default 200). - The sub-recipe’s context inherits from the parent context, merged with any
sub_contextoverrides defined on the step. - A new
execute_with_depth(recipe, context, depth + 1)call runs the sub-recipe. Depth and total-step counters are tracked viaCell<u32>. - After execution, the sub-recipe’s final context is propagated back into the parent context.
Hooks
Defined in RecipeHooks:
hooks:
pre_step: "echo 'Starting step {{step_id}}'"
post_step: "echo 'Completed step {{step_id}}'"
on_error: "notify-send 'Step {{step_id}} failed'"
Hooks are shell commands rendered through the context. pre_step runs before
every step dispatch. post_step runs after a successful step. on_error runs
after a failed step. Hook failures are logged but do not abort the recipe.
Execution Listeners
The ExecutionListener trait provides real-time progress callbacks:
#![allow(unused)]
fn main() {
trait ExecutionListener {
fn on_step_start(&self, step_id: &str, step_type: &str);
fn on_step_complete(&self, result: &StepResult);
fn on_output(&self, step_id: &str, line: &str);
}
}
| Implementation | Behavior |
|---|---|
NullListener | No-op (default) |
StderrListener | Emits progress emojis and timing to stderr |
Activated with --progress.
Safety Model
Condition Evaluator (context.rs)
The condition evaluator is a hand-written recursive descent parser that
evaluates boolean expressions over recipe context variables. It does not
call eval() or execute arbitrary code.
Supported syntax:
status == "ok" and (retries < 3 or force == true)
len(items) > 0
name.startswith("test_")
value not in "blocked,disabled"
Operator precedence (low → high): or, and, not, comparison
(==, !=, <, <=, >, >=, in, not in).
Security constraints:
| Rule | Rationale |
|---|---|
No __ (dunder) access | Blocks dunder attribute introspection |
| Whitelisted functions only | int, str, len, bool, float, min, max |
| Whitelisted methods only | strip, lower, upper, startswith, endswith, replace, split, join, count, find, and variants |
| No assignment operators | Expressions are read-only |
| No function definitions | Grammar does not support fn, def, lambda |
The tokenizer produces typed tokens (String, Number, Ident, Eq,
And, Or, …) and the parser consumes them with lookahead. Unrecognized
tokens produce a parse error rather than silent misbehavior.
Template Rendering
RecipeContext::render replaces {{var}} placeholders with values from the
context. Dot-notation ({{obj.nested.key}}) traverses into JSON objects.
Missing variables render as empty strings.
RecipeContext::render_shell does the same but shell-escapes every substituted
value to prevent injection in bash commands.
Agent Resolver Path Safety (agent_resolver.rs)
Agent references use a namespaced format (namespace:category:name or
namespace:name). Each segment is validated against:
#![allow(unused)]
fn main() {
static SAFE_NAME_RE: Regex = Regex::new(r"^[a-zA-Z0-9_-]+$");
}
This rejects /, .., and any characters that could enable path traversal.
As defense-in-depth, after resolving the file path, the resolver canonicalizes both the candidate path and the search base directory, then verifies the resolved path is a child of the search base. This defends against symlink attacks.
Parser Protections (parser.rs)
- File size limit: 1 MB (
MAX_YAML_SIZE_BYTES). Prevents YAML bombs and memory exhaustion. - Structural validation: Rejects empty names, zero-step recipes, and duplicate step IDs.
- Field typo detection: Unknown top-level and step-level fields trigger warnings. Edit-distance matching suggests corrections.
Subprocess Isolation (cli_subprocess.rs)
- Agent steps execute in a fresh temporary directory that is cleaned up on drop.
CLAUDECODEis stripped from the child environment to prevent the nested Claude process from attaching to the parent’s session.- Session depth tracking (
AMPLIHACK_SESSION_DEPTH) prevents runaway recursive spawning.
Interior Mutability Pattern
The runner tracks recursion state with std::cell::Cell<u32>:
#![allow(unused)]
fn main() {
struct RecipeRunner<A: Adapter> {
// ...
depth: Cell<u32>,
total_steps: Cell<u32>,
// ...
}
}
Why Cell?
RecipeRunner::execute takes &self (shared reference) because the runner is
logically immutable during a run — the adapter, working directory, tag filters,
and listener never change. But recursion tracking requires mutating two counters.
Cell<u32> provides interior mutability for Copy types without runtime
borrow-checking overhead (no RefCell needed). The runner is single-threaded,
so Cell is sufficient and zero-cost.
Usage in Recursion
execute(&self, recipe, context)
│
├─ self.depth.get() checked against max_depth
├─ self.total_steps.get() checked against max_total_steps
│
└─ execute_sub_recipe(&self, step, ctx)
│
├─ self.depth.set(self.depth.get() + 1)
├─ execute_with_depth(&self, sub_recipe, ctx, new_depth)
└─ self.total_steps.set(self.total_steps.get() + sub_step_count)
The RecursionConfig defaults (max_depth: 6, max_total_steps: 200) can be
overridden per-recipe in the YAML:
recursion:
max_depth: 3
max_total_steps: 50
Recipe Discovery (discovery.rs)
Search Directories (default order)
~/.amplihack/.claude/recipes./amplifier-bundle/recipes./src/amplihack/amplifier-bundle/recipes./.claude/recipes
Additional directories can be added with -R <dir> (repeatable).
Discovery Functions
| Function | Purpose |
|---|---|
discover_recipes | Scan directories, return HashMap<name, RecipeInfo> |
list_recipes | Sorted Vec<RecipeInfo> for display |
find_recipe | Locate a single recipe by name → Option<PathBuf> |
verify_global_installation | Check that default dirs exist and contain recipes |
Manifest & Upstream Sync
update_manifest writes _recipe_manifest.json — a map of filenames to their
SHA-256 hashes (first 16 hex chars). check_upstream_changes diffs the current
directory state against the manifest and reports new, modified, or deleted
files.
sync_upstream adds a git remote, fetches, and diffs local recipes against the
upstream branch, returning a summary of changes.
JSONL Audit Log
When --audit-dir is set, each recipe run produces a file:
<audit-dir>/<recipe-name>_<ISO-timestamp>.jsonl
Each line is a JSON object:
{"step_id": "build", "status": "Completed", "duration_ms": 1423, "error": null, "output_len": 256}
Audit logs enable post-hoc analysis of recipe execution without cluttering stdout.
JSON Output Extraction
When parse_json: true is set on a step, the runner extracts structured JSON
from potentially noisy output using three strategies (tried in order):
- Direct parse —
serde_json::from_str(output). Works when the output is pure JSON. - Markdown fence — extracts content between
```jsonand```delimiters. Common in LLM output. - Balanced brackets — finds the first
{or[and matches it to its closing counterpart, counting nesting depth.
If all three fail, the runner optionally retries the agent step with a JSON-only reminder appended to the prompt, then re-applies the extraction pipeline.