Code Visualizer Skill

Purpose

Automatically generate and maintain visual code flow diagrams across multiple programming languages. The skill auto-detects which languages are present in a target path, analyzes each one with a dedicated analyzer, and emits one mermaid diagram per language plus an optional combined high-level view. It also detects when committed diagrams are stale relative to the source they describe.

What's New in 2.0.0

Multi-language support: Python, TypeScript/JavaScript, Rust, and Go.
Language dispatcher: Detects languages by file extension and routes to per-language analyzers.
Language-blind renderer: A single mermaid renderer consumes a normalized graph; the renderer never inspects language semantics.
One diagram per language plus an optional --combined view that places each language in its own mermaid subgraph.
Generalized staleness: Walks all source files matching detected languages' extensions and compares max-mtime against the diagram mtime.
Brick-style architecture: Each language analyzer is a self-contained module that exposes a single normalize() function. No shared inheritance.

Supported Languages

Language	Extensions	Analyzer	Parser	Notes
Python	`.py`	`python_analyzer`	`ast`	Extracts `import` and `from … import …`.
TypeScript/JavaScript	`.ts`, `.tsx`, `.js`, `.jsx`, `.mjs`, `.cjs`	`ts_analyzer`	regex	Extracts `import … from`, `require(...)`, dynamic `import(...)`.
Rust	`.rs`	`rust_analyzer`	regex	Extracts `use crate::…`, `use super::…`, `mod …`.
Go	`.go`	`go_analyzer`	regex	Extracts single and grouped `import` declarations.

Languages outside this table are skipped silently. See Extending below to add new ones.
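
As a rough illustration of the regex-based extraction listed for TypeScript/JavaScript, here is a minimal sketch. The patterns and the `extract_specifiers` helper are assumptions made for this example and are not taken from `ts_analyzer`; they only handle statements that fit on one line.

```python
import re

# Assumed single-line patterns; the shipped ts_analyzer may use different ones.
PATTERNS = [
    ("import", re.compile(
        r"""^[ \t]*import\b[^'"\n]*\bfrom[ \t]*['"]([^'"\n]+)['"]""", re.MULTILINE)),
    ("require", re.compile(r"""\brequire\([ \t]*['"]([^'"\n]+)['"][ \t]*\)""")),
    ("dynamic_import", re.compile(r"""\bimport\([ \t]*['"]([^'"\n]+)['"][ \t]*\)""")),
]


def extract_specifiers(source: str) -> list[tuple[str, str]]:
    """Return (kind, specifier) pairs, grouped by pattern in PATTERNS order."""
    found: list[tuple[str, str]] = []
    for kind, pattern in PATTERNS:
        found.extend((kind, match.group(1)) for match in pattern.finditer(source))
    return found


sample = (
    'import { a } from "./utils";\n'
    "const fs = require('fs');\n"
    'const lazy = await import("./lazy");\n'
)
specifiers = extract_specifiers(sample)
```

Because every pattern stops at a newline, an `import { ... } from "..."` statement spread over several lines is missed by this sketch, which is the kind of gap a regex heuristic accepts in exchange for speed.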

Architecture

amplifier-bundle/skills/code-visualizer/
├── SKILL.md
├── README.md
└── scripts/
    ├── __init__.py
    ├── graph.py              # Normalized data contract (Node, Edge, Graph)
    ├── python_analyzer.py    # normalize(paths) -> Graph
    ├── ts_analyzer.py        # normalize(paths) -> Graph
    ├── rust_analyzer.py      # normalize(paths) -> Graph
    ├── go_analyzer.py        # normalize(paths) -> Graph
    ├── dispatcher.py         # detect languages, route, return dict[lang, Graph]
    ├── mermaid_renderer.py   # render(graph) / render_combined(graphs)
    ├── staleness.py          # is_stale(target, diagram, languages)
    └── visualizer.py         # CLI entry point

Data Contract (`graph.py`)

from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    id: str           # mermaid-safe identifier
    label: str        # human-readable label (e.g. "src/auth/oauth.py")
    language: str     # "python" | "typescript" | "rust" | "go"
    file_path: str    # absolute path on disk

@dataclass(frozen=True)
class Edge:
    src: str          # Node.id of source
    dst: str          # Node.id of destination
    kind: str         # "import" | "require" | "use" | "mod" | "dynamic_import"

@dataclass(frozen=True)
class Graph:
    language: str
    nodes: tuple[Node, ...]
    edges: tuple[Edge, ...]

Analyzers may import these dataclasses but must not inherit from any shared class. The data contract is the only coupling.
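
To make the contract concrete, the sketch below repeats the three dataclasses so that it runs on its own and builds a graph for a hypothetical two-file project. The `/repo/...` paths are invented for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Node:
    id: str
    label: str
    language: str
    file_path: str


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    kind: str


@dataclass(frozen=True)
class Graph:
    language: str
    nodes: tuple[Node, ...]
    edges: tuple[Edge, ...]


# Hypothetical project: src/api.py imports src/auth.py.
api = Node("src_api_py", "src/api.py", "python", "/repo/src/api.py")
auth = Node("src_auth_py", "src/auth.py", "python", "/repo/src/auth.py")
graph = Graph("python", nodes=(api, auth), edges=(Edge(api.id, auth.id, "import"),))

# Frozen dataclasses reject attribute assignment after construction.
try:
    api.label = "changed"
except FrozenInstanceError:
    mutation_rejected = True
else:
    mutation_rejected = False
```

Since every field is a string or a tuple of frozen dataclasses, a `Graph` built this way is also hashable, so it can be passed between the dispatcher and the renderer without defensive copies.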

Per-Language Analyzers

Each analyzer is a self-contained brick exposing exactly one entry point:

def normalize(paths: Iterable[Path]) -> Graph: ...

The function:

Reads each file with encoding="utf-8", errors="ignore".
Skips files larger than ~5 MB.
Wraps parsing in try/except and skips files that fail to parse.
Returns a Graph whose language field matches the analyzer.
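
The read, size-cap, and skip-on-failure steps above can be sketched for the Python case as follows. `extract_imports` and `MAX_BYTES` are hypothetical names, and the exact cap value is an assumption based on the "~5 MB" figure; the shipped analyzer returns a `Graph` rather than a list of names.

```python
import ast
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # assumed value for the "~5 MB" cap


def extract_imports(path: Path) -> list[str]:
    """Return imported module names, or [] when the file is skipped."""
    try:
        if path.stat().st_size > MAX_BYTES:
            return []  # oversized file: skip
        source = path.read_text(encoding="utf-8", errors="ignore")
        tree = ast.parse(source)
    except (OSError, SyntaxError, ValueError):
        return []  # unreadable or unparsable file: skip
    found: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports keep their leading dots via node.level.
            found.append("." * node.level + (node.module or ""))
    return found
```

Returning an empty list for a skipped file keeps one bad file from aborting the whole run, which matches the skip-on-failure behavior described above.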

Dispatcher

The dispatcher uses a registry that maps language name → extensions + module name (string). It loads analyzers lazily via importlib.import_module so adding a new language never requires touching the dispatcher's import statements.

from scripts.dispatcher import analyze

graphs: dict[str, Graph] = analyze(target_path)
# {"python": Graph(...), "typescript": Graph(...)}

The dispatcher:

Walks target_path with os.walk(..., followlinks=False).
Skips IGNORE_DIRS (.git, node_modules, .venv, venv, __pycache__, dist, build, target, .mypy_cache, .pytest_cache, .tox).
Buckets files by extension into language groups.
Calls each language's normalize() with its file list.
Returns a dict[language_name, Graph] for languages that produced any files.
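
The walk-and-bucket steps can be sketched as below. `bucket_files` and `EXT_TO_LANG` are hypothetical names chosen for this example; the real dispatcher keeps extensions in its registry and goes on to call each analyzer, which this sketch leaves out.

```python
import os
from pathlib import Path

IGNORE_DIRS = {".git", "node_modules", ".venv", "venv", "__pycache__", "dist",
               "build", "target", ".mypy_cache", ".pytest_cache", ".tox"}

# Extension -> language, flattened from the Supported Languages table.
EXT_TO_LANG = {
    ".py": "python",
    ".ts": "typescript", ".tsx": "typescript", ".js": "typescript",
    ".jsx": "typescript", ".mjs": "typescript", ".cjs": "typescript",
    ".rs": "rust",
    ".go": "go",
}


def bucket_files(target: Path) -> dict[str, list[Path]]:
    """Group source files under target by detected language."""
    buckets: dict[str, list[Path]] = {}
    for root, dirs, files in os.walk(target, followlinks=False):
        # Prune in place so os.walk never descends into ignored directories.
        dirs[:] = sorted(d for d in dirs if d not in IGNORE_DIRS)
        for name in sorted(files):
            lang = EXT_TO_LANG.get(Path(name).suffix)
            if lang is not None:
                buckets.setdefault(lang, []).append(Path(root) / name)
    return buckets
```

Only languages with at least one matching file get a key, which mirrors the rule that a language counts as detected when a single file matches.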

Mermaid Renderer

The renderer is language-blind:

from scripts.mermaid_renderer import render, render_combined

per_language: str = render(graph)            # one diagram for one language
combined: str = render_combined(graphs)      # one diagram, one subgraph/lang

Node IDs are sanitized ([^A-Za-z0-9_] -> _) and labels with quotes are escaped to prevent diagram-syntax injection.
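
A minimal sketch of the sanitization step follows. The ID rule is the one stated above; replacing quotes with Mermaid's `#quot;` entity code is an assumption about how labels are escaped, and the simplified `render(labels, edges)` signature is hypothetical (the shipped `render` takes a `Graph`).

```python
import re


def sanitize_id(raw: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def escape_label(label: str) -> str:
    """Assumed escaping: swap double quotes for Mermaid's #quot; entity."""
    return label.replace('"', "#quot;")


def render(labels: list[str], edges: list[tuple[str, str]]) -> str:
    """Render a flowchart from file labels and (src, dst) label pairs."""
    lines = ["flowchart TD"]
    for label in labels:
        lines.append(f'    {sanitize_id(label)}["{escape_label(label)}"]')
    for src, dst in edges:
        lines.append(f"    {sanitize_id(src)} --> {sanitize_id(dst)}")
    return "\n".join(lines)


diagram = render(["src/api.py", "src/auth.py"], [("src/api.py", "src/auth.py")])
```

One caveat of this simple rule: two distinct paths such as `a/b.py` and `a_b.py` sanitize to the same ID, so a renderer built this way would need a disambiguation step if such pairs occur.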

Staleness Detection

from pathlib import Path

from scripts.staleness import is_stale

stale = is_stale(
    target_path=Path("src/"),
    diagram_path=Path("docs/architecture-python.mmd"),
    languages=["python"],
)

Returns True if any source file with a matching language extension has an mtime newer than that of diagram_path. This generalizes the Python-only check from 1.x.
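
The max-mtime comparison can be sketched as below. `is_stale_sketch` is a hypothetical stand-in that takes extensions directly instead of language names, does not skip IGNORE_DIRS, and treats a missing diagram as stale; all three choices are assumptions of this example.

```python
from pathlib import Path


def is_stale_sketch(target_path: Path, diagram_path: Path, extensions: set[str]) -> bool:
    """Return True when any matching source file is newer than the diagram."""
    if not diagram_path.exists():
        return True  # assumption: a missing diagram counts as stale
    newest = max(
        (
            p.stat().st_mtime
            for p in target_path.rglob("*")
            if p.is_file() and p.suffix in extensions
        ),
        default=None,  # no matching sources at all
    )
    return newest is not None and newest > diagram_path.stat().st_mtime
```

Comparing only the newest source mtime keeps the check cheap, at the cost of reporting stale after edits that do not change any import.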

CLI

The skill ships a single executable: scripts/visualizer.py.

python visualizer.py <path> [--output DIR] [--basename NAME]
                            [--check-staleness] [--combined]

Flag	Default	Purpose
`<path>`	required	Directory to analyze. Must exist and be a directory.
`--output DIR`	`./diagrams`	Output directory for `.mmd` files.
`--basename NAME`	`architecture`	Filename stem. Validated against `^[A-Za-z0-9._-]+$`.
`--check-staleness`	off	Print staleness report for existing diagrams; exit non-zero if stale.
`--combined`	off	Also write `<basename>-combined.mmd` containing all languages.
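
The flag table maps naturally onto `argparse`. The sketch below is one way to wire it up, not the contents of `visualizer.py`; `build_parser` and `basename_type` are hypothetical names.

```python
import argparse
import re

# fullmatch() makes the ^...$ anchors implicit and rejects a trailing newline.
BASENAME_RE = re.compile(r"[A-Za-z0-9._-]+")


def basename_type(value: str) -> str:
    """Validate --basename against the documented character set."""
    if not BASENAME_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"invalid basename: {value!r}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="visualizer.py")
    parser.add_argument("path", help="directory to analyze")
    parser.add_argument("--output", default="./diagrams", metavar="DIR")
    parser.add_argument("--basename", default="architecture",
                        type=basename_type, metavar="NAME")
    parser.add_argument("--check-staleness", action="store_true")
    parser.add_argument("--combined", action="store_true")
    return parser


args = build_parser().parse_args(["src", "--combined"])
```

Validating the basename in the `type=` callback means a bad value is rejected with a usage error before any file is touched.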

Output Files

File	Contents
`<basename>-python.mmd`	Mermaid diagram for Python modules and their imports.
`<basename>-typescript.mmd`	Mermaid diagram for TS/JS files and their imports.
`<basename>-rust.mmd`	Mermaid diagram for Rust modules and `use` edges.
`<basename>-go.mmd`	Mermaid diagram for Go packages and `import` edges.
`<basename>-combined.mmd` (with `--combined`)	One diagram with one `subgraph` per detected language.

Files are only written for languages that were actually detected.

Quick Start

Generate diagrams for a polyglot repo

python amplifier-bundle/skills/code-visualizer/scripts/visualizer.py . \
    --output docs/diagrams --combined

Output (for this repo, which contains Python and JS):

docs/diagrams/architecture-python.mmd
docs/diagrams/architecture-typescript.mmd
docs/diagrams/architecture-combined.mmd

Check freshness in CI

python amplifier-bundle/skills/code-visualizer/scripts/visualizer.py src/ \
    --output docs/diagrams --check-staleness
# exits 1 if any per-language diagram is older than its source set

Generate for a single language

Provide a path that only contains files of one language; the dispatcher will detect a single language and emit a single .mmd:

python visualizer.py src/auth/      # Python-only -> architecture-python.mmd

Auto-Detection Rules

The dispatcher walks <path>, skipping IGNORE_DIRS and symlinks.
Files are bucketed by extension into one of the supported languages.
A language is "detected" if at least one file matches.
Each detected language is analyzed independently.
With --combined, the renderer composes one mermaid diagram with one subgraph per detected language. Cross-language edges are not inferred in the MVP.

Example Output

For a repo with:

src/api.py importing src/auth.py
web/index.ts importing web/utils.ts

architecture-python.mmd:

flowchart TD
    src_api_py["src/api.py"]
    src_auth_py["src/auth.py"]
    src_api_py --> src_auth_py

architecture-typescript.mmd:

flowchart TD
    web_index_ts["web/index.ts"]
    web_utils_ts["web/utils.ts"]
    web_index_ts --> web_utils_ts

architecture-combined.mmd:

flowchart TD
    subgraph python ["python"]
        src_api_py["src/api.py"]
        src_auth_py["src/auth.py"]
        src_api_py --> src_auth_py
    end
    subgraph typescript ["typescript"]
        web_index_ts["web/index.ts"]
        web_utils_ts["web/utils.ts"]
        web_index_ts --> web_utils_ts
    end

Note: the renderer emits the subgraph <id> ["<label>"] form (space between id and bracketed label), which is the Mermaid-documented syntax accepted across recent Mermaid versions. test_mermaid_renderer.py pins the exact emitted form.
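
For reference, the combined layout shown above can be produced by a loop like the following sketch. The input shape (a dict of language to node and edge lists, with IDs already sanitized) is an assumption of this example; the shipped `render_combined` consumes `Graph` objects.

```python
def render_combined(
    graphs: dict[str, tuple[list[tuple[str, str]], list[tuple[str, str]]]],
) -> str:
    """Render one flowchart with one subgraph per language.

    Each value is (nodes, edges): nodes are (id, label) pairs and
    edges are (src_id, dst_id) pairs.
    """
    lines = ["flowchart TD"]
    for language, (nodes, edges) in graphs.items():
        # Space between the id and the bracketed label, as noted above.
        lines.append(f'    subgraph {language} ["{language}"]')
        for node_id, label in nodes:
            lines.append(f'        {node_id}["{label}"]')
        for src, dst in edges:
            lines.append(f"        {src} --> {dst}")
        lines.append("    end")
    return "\n".join(lines)


combined = render_combined({
    "python": ([("src_api_py", "src/api.py"), ("src_auth_py", "src/auth.py")],
               [("src_api_py", "src_auth_py")]),
    "typescript": ([("web_index_ts", "web/index.ts"), ("web_utils_ts", "web/utils.ts")],
                   [("web_index_ts", "web_utils_ts")]),
})
```

Edges stay inside their own subgraph here, consistent with cross-language edges not being inferred.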

Extending: Adding a New Language

The skill follows the brick philosophy: a new language is a new self-contained module. There is no base class to subclass.

Create scripts/<lang>_analyzer.py with the entry point:

from collections.abc import Iterable
from pathlib import Path
from graph import Edge, Graph, Node  # sibling import; works under `python visualizer.py`

def normalize(paths: Iterable[Path]) -> Graph:
    nodes: list[Node] = []
    edges: list[Edge] = []
    for p in paths:
        # parse file, append nodes/edges
        ...
    return Graph(language="<lang>", nodes=tuple(nodes), edges=tuple(edges))

Register the language in scripts/dispatcher.py:

LANGUAGES = {
    "python":     {"exts": {".py"},                          "module": "python_analyzer"},
    "typescript": {"exts": {".ts", ".tsx", ".js", ".jsx",
                            ".mjs", ".cjs"},                 "module": "ts_analyzer"},
    "rust":       {"exts": {".rs"},                          "module": "rust_analyzer"},
    "go":         {"exts": {".go"},                          "module": "go_analyzer"},
    # add here:
    "<lang>":     {"exts": {".ext"},                         "module": "<lang>_analyzer"},
}

Add tests/test_<lang>_analyzer.py with tmp_path fixtures asserting nodes and edges produced by representative source snippets.
Update the Supported Languages table above.

That's it. The renderer, dispatcher routing, staleness detector, and CLI all work without further changes because they consume the language-blind Graph data contract.
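
The reason a registry entry is enough can be seen in a sketch of the lazy lookup. `load_normalize` is a hypothetical helper, and the registry is trimmed to two entries; the point is that the module name stays a string until `importlib.import_module` resolves it.

```python
import importlib
from collections.abc import Callable

# Trimmed copy of the registry shape shown above.
LANGUAGES = {
    "python": {"exts": {".py"}, "module": "python_analyzer"},
    "go": {"exts": {".go"}, "module": "go_analyzer"},
}


def load_normalize(language: str, registry: dict = LANGUAGES) -> Callable:
    """Import the analyzer module on first use and return its normalize()."""
    module_name = registry[language]["module"]
    # The module must be importable from sys.path at call time.
    module = importlib.import_module(module_name)
    return module.normalize
```

An analyzer whose language never appears in the target path is never imported, so a broken or missing module for an unused language does not affect the run.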

Testing

Tests live under amplifier-bundle/skills/code-visualizer/tests/ and run via pytest. The skill registers its tests/ directory in the repo's pytest.ini testpaths so CI picks them up automatically.

Test files:

File	Purpose
`test_python_analyzer.py`	AST-driven import extraction; verifies edges for `import`/`from`.
`test_ts_analyzer.py`	`import`/`require`/dynamic `import()`; type-only and relative paths.
`test_dispatcher.py`	Mixed-language fixture; verifies correct routing per extension.
`test_mermaid_renderer.py`	Empty graphs, non-empty graphs, ID/label sanitization.
`test_staleness.py`	Mtime comparison across multiple language extensions.
`test_smoke_repo.py`	Runs dispatcher against the repo root; asserts non-empty mermaid for both Python and TypeScript/JavaScript.

Run only the skill's tests:

pytest amplifier-bundle/skills/code-visualizer/tests -q

Security Considerations

No code execution: Analyzers only parse source. No exec, eval, or subprocess calls are made on analyzed files.
Path validation: <path> and --output are resolved with Path.resolve() and rejected if non-existent or non-directory.
Filename validation: --basename must match ^[A-Za-z0-9._-]+$.
Symlink safety: os.walk(..., followlinks=False) never descends into symlinked directories, which, together with IGNORE_DIRS, prevents traversal loops and escapes from the target tree.
Bounded reads: Per-file size cap (~5 MB); UTF-8 decode with errors="ignore".
Bounded regex: Anchored, no nested quantifiers; protects against ReDoS.
Mermaid sanitization: Node IDs strip non-[A-Za-z0-9_]; labels with embedded quotes are escaped.
Stdlib-only: Zero third-party runtime dependencies; no supply-chain surface.
Output containment: Writes are constrained to the resolved --output directory; source content is never logged.
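
One way to enforce the output-containment rule is to resolve the final path and compare its parent with the resolved output directory. `contained_path` is a hypothetical helper written for this example, not a function from the skill.

```python
from pathlib import Path


def contained_path(output_dir: Path, filename: str) -> Path:
    """Return output_dir/filename, refusing anything that resolves elsewhere."""
    root = output_dir.resolve()
    candidate = (root / filename).resolve()
    # After resolution, "../x" or an absolute filename lands outside root.
    if candidate.parent != root:
        raise ValueError(f"refusing to write outside {root}: {filename!r}")
    return candidate
```

Checking the resolved parent rather than the raw string catches `..` segments and absolute paths with a single comparison.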

Limitations

Static heuristics: Regex-based extraction for TS/JS/Rust/Go misses some edge syntax (TS type-only imports across multiple lines, Rust nested use {a, b::c}, Go cgo blocks). Documented per analyzer in source.
No call graphs: Edges are import/use only. Runtime/dynamic imports beyond import("...")/__import__ are not modeled.
External imports: Rendered as ghost target nodes inline; not resolved to real files.
Combined view: Cross-language edges are out of MVP scope.
Shell scripts: Not first-class; .sh files are ignored.
Compiler-grade accuracy: Not a goal. The skill optimizes for "useful diagram in seconds" over "perfect AST."

Philosophy Alignment

Principle	How v2.0 follows it
Ruthless Simplicity	Stdlib-only; regex over tree-sitter; max-mtime over semantic diff.
Zero-BS	Real parsers (`ast` for Python, regex for others). Limitations documented honestly.
Modular Design	Each analyzer is a brick with a single `normalize()` stud. No inheritance.
Brick Composition	Renderer/dispatcher/staleness are independent bricks reusing only the data contract.

Migration from 1.x

The 1.x skill was Python-only. Migration notes (verify against your actual 1.x integration before relying on them):

Diagrams previously named <basename>.mmd are now <basename>-python.mmd. Update any references in README.md / ARCHITECTURE.md.
Staleness reports now include a per-language breakdown. CI scripts that parsed the old single-line output should be updated to handle multiple languages.
Any direct Python helper used in 1.x is superseded by dispatcher.analyze(path) returning a dict[language, Graph]. Callers that only want Python can use dispatcher.analyze(path)["python"].

Remember

The skill automates what developers forget across all four supported languages: keeping diagrams in sync with code. It's not a compiler; it's a fast, honest, multi-language snapshot.