Handle Exceptions from WikiGR Components
This guide covers the exceptions that can propagate from KnowledgeGraphAgent, CypherRAG,
and the build scripts, and shows how to catch them correctly.
Quick Reference
| Component | Method | Exceptions That Propagate |
|---|---|---|
| KnowledgeGraphAgent | query() | APIConnectionError, APIStatusError, APITimeoutError, RuntimeError |
| KnowledgeGraphAgent | _identify_seed_articles() | APIConnectionError, APIStatusError, APITimeoutError, ValueError |
| CypherRAG | generate_cypher() | APIConnectionError, APIStatusError, APITimeoutError, ValueError, json.JSONDecodeError |
| LLMSeedResearcher | URL fetch loop | requests.RequestException |
| Build scripts | process_url() | requests.RequestException, json.JSONDecodeError |
| Build scripts | build_pack() DB_PATH guard | ValueError (misconfigured DB_PATH) |
Programming bugs (AttributeError, TypeError, KeyError) are not caught — they propagate
directly. Fix the bug; do not add a broad except Exception around these calls.
KnowledgeGraphAgent
Anthropic API errors
API errors from synthesis or seed identification propagate as one of three Anthropic exception types. Catch all three together:
from anthropic import APIConnectionError, APIStatusError, APITimeoutError
from wikigr.agent.kg_agent import KnowledgeGraphAgent

agent = KnowledgeGraphAgent(db_path="data/packs/go-expert/pack.db")
try:
    result = agent.query("What is goroutine scheduling?")
except (APIConnectionError, APITimeoutError) as e:
    # Transient: retry after a brief wait
    print(f"API temporarily unavailable: {e}")
except APIStatusError as e:
    # Permanent for this request (e.g. 400 Bad Request, 401 Unauthorized)
    print(f"API rejected the request: {e.status_code} {e.message}")
    raise
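The transient/permanent split lends itself to a small retry helper. The following is a minimal sketch, not part of WikiGR: call_with_retry and its parameters are hypothetical names, and it assumes the wrapped call is safe to repeat.

```python
import time


def call_with_retry(fn, transient, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); retry only on the exception types listed in `transient`.

    Hypothetical helper for illustration. Any exception not in `transient`
    propagates immediately, which keeps programming bugs visible.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except transient:
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the agent from this section, a call might look like call_with_retry(lambda: agent.query(question), transient=(APIConnectionError, APITimeoutError)). APIStatusError is deliberately left out of the tuple so that it propagates on the first failure.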
Database errors
LadybugDB errors surface as RuntimeError. These indicate a corrupt database, a missing pack file,
or a schema mismatch:
try:
    result = agent.query("What is goroutine scheduling?")
except RuntimeError as e:
    print(f"Database error (check pack integrity): {e}")
    raise
Embedding / vector pipeline errors
Vector search failures raise RuntimeError or OSError (e.g. the embedding model files are
missing or the vector index is corrupt):
try:
    result = agent.query("...")
except (RuntimeError, OSError) as e:
    print(f"Vector pipeline error: {e}")
    raise
Typical caller pattern
For most application code, catching API errors and re-raising the rest is sufficient:
from anthropic import APIConnectionError, APIStatusError, APITimeoutError

# Illustrative wrapper (the function name is hypothetical) so the returns are valid
def answer(agent, user_question):
    try:
        result = agent.query(user_question)
        return result["answer"]
    except (APIConnectionError, APITimeoutError):
        return "The AI service is temporarily unavailable. Please try again."
    except APIStatusError as e:
        if e.status_code == 401:
            raise RuntimeError("Invalid ANTHROPIC_API_KEY") from e
        raise
CypherRAG
CypherRAG.generate_cypher() raises exceptions rather than returning a generic fallback query.
Callers that previously relied on a silent passthrough must now handle failure explicitly.
Empty API response
import json
import logging

from anthropic import APIConnectionError, APIStatusError, APITimeoutError
from wikigr.agent.cypher_rag import CypherRAG

logger = logging.getLogger(__name__)

# rag is an existing CypherRAG instance; question is the user's query string
try:
    plan = rag.generate_cypher(question)
except json.JSONDecodeError as e:
    # Malformed JSON in Claude response. This handler comes first because
    # json.JSONDecodeError is a subclass of ValueError.
    logger.warning("Cypher generation returned unparseable JSON: %s", e)
    plan = None
except ValueError as e:
    # Empty response from Claude: treat as generation failure
    logger.warning("Cypher generation returned empty response: %s", e)
    plan = None
except (APIConnectionError, APIStatusError, APITimeoutError):
    raise  # let API errors propagate
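Handler order matters when both ValueError and json.JSONDecodeError are caught, because json.JSONDecodeError is a subclass of ValueError and Python uses the first matching except clause. The sketch below demonstrates this with a stand-in parser; parse_plan is a hypothetical helper, not WikiGR code.

```python
import json


def parse_plan(raw):
    """Classify a raw model response (hypothetical helper for illustration)."""
    try:
        if not raw.strip():
            raise ValueError("empty response")
        return "ok", json.loads(raw)
    except json.JSONDecodeError:
        # Must be listed before ValueError, its parent class; otherwise
        # the ValueError handler would catch malformed JSON as well.
        return "bad-json", None
    except ValueError:
        return "empty", None
```

If the two handlers were swapped, malformed JSON would be reported as an empty response and the JSONDecodeError branch would never run.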
Pattern retrieval failure
Pattern retrieval (FewShotManager.find_similar_examples) raises RuntimeError or OSError
when the pattern store is unavailable. generate_cypher() handles this internally — if pattern
retrieval fails, it proceeds with no patterns rather than aborting. The caller only sees the
generation-level exceptions listed above.
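The degrade-to-no-patterns behaviour described here can be sketched as follows. This is an assumption about the shape of the logic, not the actual generate_cypher() source; retrieve_patterns and its find_similar parameter are hypothetical names standing in for the FewShotManager call.

```python
def retrieve_patterns(find_similar, question):
    """Return few-shot examples, or an empty list if the pattern store fails.

    Hypothetical sketch: only RuntimeError and OSError are treated as
    "store unavailable"; anything else (e.g. a TypeError) still propagates.
    """
    try:
        return list(find_similar(question))
    except (RuntimeError, OSError):
        # Pattern store unavailable: continue generation without examples.
        return []
```

Keeping the except clause narrow means a programming bug inside the retrieval code still surfaces instead of silently producing an empty pattern list.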
LLM Seed Researcher
LLMSeedResearcher URL fetching catches only requests.RequestException. All other exceptions
propagate immediately.
from requests import RequestException
from wikigr.packs.seed_researcher import LLMSeedResearcher

researcher = LLMSeedResearcher(domain="go.dev", topic="Go programming language")
try:
    urls = researcher.discover_urls(max_urls=50)
except RequestException as e:
    # Network failure during source discovery
    print(f"Could not reach {e.request.url if e.request else 'remote source'}: {e}")
Build Scripts (scripts/build_*_pack.py)
Each build script's process_url() function catches only network and JSON errors. LadybugDB
errors, embedding failures, and other unexpected errors now abort the build with a visible
traceback rather than logging a warning and moving on to the next URL.
This is intentional: a corrupt partial database write is worse than a build that fails fast.
# Inside process_url(), exceptions that are caught (per-URL, non-fatal):
#   requests.RequestException: network timeout, DNS failure, etc.
#   json.JSONDecodeError: malformed JSON in API response
# Exceptions that are NOT caught (fatal, abort the build):
#   RuntimeError: LadybugDB database error
#   OSError: embedding model file missing
#   AttributeError, TypeError: programming bug in extraction code
If a build aborts with a LadybugDB RuntimeError, the database may be in a partial state. Delete
the pack.db file and re-run the build from scratch.
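The caught/uncaught split can be illustrated with a small loop. This is a simplified sketch under stated assumptions: the real scripts do the catching inside process_url(), whereas here it is moved to a hypothetical process_all() wrapper, and the recoverable exception types are passed in so the example does not depend on requests being installed.

```python
import logging

logger = logging.getLogger(__name__)


def process_all(urls, process_url, recoverable):
    """Run process_url on each URL (hypothetical helper for illustration).

    Exceptions in `recoverable` skip the current URL; everything else
    propagates and aborts the whole run.
    """
    skipped = []
    for url in urls:
        try:
            process_url(url)
        except recoverable as exc:
            logger.warning("Skipping %s: %s", url, exc)
            skipped.append(url)
    return skipped
```

In a build script the tuple would presumably be (requests.RequestException, json.JSONDecodeError). A RuntimeError from the database layer is not in that tuple, so it stops the loop on the first occurrence.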
DB_PATH Safety Guard
Before shutil.rmtree() is called on the existing database, every build script checks that
DB_PATH is inside the expected data/packs/ tree:
if not str(DB_PATH).startswith("data/packs/"):
    raise ValueError(f"Unsafe DB_PATH: {DB_PATH}")
This ValueError is not caught by process_url(). It propagates to main(), which logs
it and exits with a non-zero status code. The build does not proceed if DB_PATH is
misconfigured.
The guard relies on relative paths. All build scripts set DB_PATH as:
PACK_DIR = Path("data/packs/<pack-name>")
DB_PATH = PACK_DIR / "pack.db"
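Because these are relative Path objects, str(DB_PATH) begins with "data/packs/" exactly.
The check is purely textual: it does not resolve DB_PATH against the current working directory.
It therefore rejects an absolute path or a path outside data/packs/, but it cannot detect a
script launched from a directory other than the repository root. Run build scripts from the
repository root so that the relative path points at the intended tree.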
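For a guard that does not depend on how the path is spelled, one option is to resolve both paths against the filesystem before comparing them. The sketch below is an illustrative alternative, not the check the build scripts use; ensure_inside and allowed_root are hypothetical names.

```python
from pathlib import Path


def ensure_inside(db_path, allowed_root):
    """Raise ValueError unless db_path resolves to a location under allowed_root.

    Hypothetical stricter variant of the string-prefix guard. Both paths are
    resolved first, so ".." segments and symlinks are taken into account.
    """
    resolved = Path(db_path).resolve()
    root = Path(allowed_root).resolve()
    # `parents` lists every ancestor directory of the resolved path.
    if root not in resolved.parents:
        raise ValueError(f"Unsafe DB_PATH: {db_path}")
    return resolved
```

A relative allowed_root is still resolved against the current working directory, so a caller that wants protection against being launched from the wrong directory would need to derive the root from a fixed location, such as the script's own file path.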
What You Should NOT Do
Broad except Exception
Do not add a broad except Exception around calls to WikiGR components. Broad handlers hide
programming bugs and make debugging harder:
# BAD: swallows AttributeError, TypeError, KeyError (programming bugs)
try:
    result = agent.query(question)
except Exception as e:
    return f"Error: {e}"

# GOOD: catch only the recoverable cases
try:
    result = agent.query(question)
except (APIConnectionError, APITimeoutError):
    return "Temporarily unavailable."
Relying on silent fallbacks (removed)
Two silent fallbacks that existed in earlier versions have been removed:
- _fallback_seed_extraction: word-splitting was not an acceptable degradation for semantic seed identification. API errors from _identify_seed_articles now propagate.
- _safe_fallback in CypherRAG: a generic MATCH query is not a valid fallback for a failed Cypher generation step. generate_cypher() now raises ValueError or json.JSONDecodeError on failure.
If your code relied on these methods always returning a dict, update it to handle the exceptions described above.
Exception Domain Summary
Exception domains, plus programming bugs that propagate unhandled:
| Domain | Exception Types | When Raised |
|---|---|---|
| Anthropic API | APIConnectionError, APIStatusError, APITimeoutError | Synthesis, seed identification, Cypher generation |
| LadybugDB database | RuntimeError | Any conn.execute() call |
| Embedding / ML | RuntimeError, OSError | Vector ops, model loading, pipeline init |
| Seed researcher | requests.RequestException | HTTP fetch, DNS, timeout |
| Build scripts (JSON) | json.JSONDecodeError | Malformed API response in URL loop |
| Build scripts (path guard) | ValueError | DB_PATH outside data/packs/ before shutil.rmtree |
| Programming bugs | AttributeError, TypeError, KeyError | Always; fix the bug, do not catch |
See Also
- Exception Types Reference - full list of exception types and which module raises them
- KG Agent API - query() return value and query_type values
- Security: Exception Handling - design rationale and security implications