Token Sanitization Guide¶
Quick Start¶
Prevent token exposure in logs, errors, and debug output with automatic sanitization.
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
# Sanitize error messages before logging
error_msg = "Authentication failed with token: gho_abc123xyz"
safe_msg = sanitizer.sanitize(error_msg)
print(safe_msg)
# Output: "Authentication failed with token: [REDACTED-GITHUB-TOKEN]"
What Gets Sanitized¶
TokenSanitizer detects and redacts these token types:
| Token Type | Pattern | Redaction Marker |
|---|---|---|
| GitHub tokens | gho_*, ghp_*, ghs_*, ghu_*, ghr_* | [REDACTED-GITHUB-TOKEN] |
| OpenAI keys | sk-*, sk-proj-* | [REDACTED-OPENAI-KEY] |
| Anthropic keys | sk-ant-* | [REDACTED-ANTHROPIC-KEY] |
| Bearer tokens | Bearer <token> | [REDACTED-BEARER-TOKEN] |
| JWT tokens | eyJ*.eyJ*.* | [REDACTED-JWT-TOKEN] |
| Azure keys | azure-key-* | [REDACTED-AZURE-KEY] |
| Azure connections | DefaultEndpointsProtocol=... | [REDACTED-AZURE-CONNECTION] |
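Several token types can appear in the same string; each match is replaced with its own marker. A minimal sketch (the expected output assumes the markers in the table above):
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
# One message containing both a GitHub token and an OpenAI key
raw = "Auth failed for gho_abcdef123456, falling back to sk-proj-abcdef123456"
print(sanitizer.sanitize(raw))
# Expected output (per the table above):
# "Auth failed for [REDACTED-GITHUB-TOKEN], falling back to [REDACTED-OPENAI-KEY]"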
Common Use Cases¶
Sanitizing API Errors¶
When API calls fail, error messages often contain authentication tokens:
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
try:
    # API call that might fail
    response = github_api.chat_completion(token="gho_abc123xyz")
except Exception as e:
    # Sanitize before logging
    safe_error = sanitizer.sanitize(str(e))
    logger.error(f"API call failed: {safe_error}")
The logged message now contains [REDACTED-GITHUB-TOKEN] in place of the raw token.
Sanitizing Configuration Dumps¶
When debugging configuration, sanitize before printing:
from amplihack.proxy.security import TokenSanitizer
config = {
    "github_token": "gho_1234567890abcdefghij",
    "openai_key": "sk-proj-abc123xyz",
    "endpoint": "https://api.github.com",
}
sanitizer = TokenSanitizer()
safe_config = sanitizer.sanitize(config)
print(safe_config)
# Output: {'github_token': '[REDACTED-GITHUB-TOKEN]', 'openai_key': '[REDACTED-OPENAI-KEY]', 'endpoint': 'https://api.github.com'}
Sanitizing Log Files¶
Process existing log files to remove tokens:
from amplihack.proxy.security import TokenSanitizer
from pathlib import Path
sanitizer = TokenSanitizer()
log_file = Path("debug.log")
# Read, sanitize, and overwrite
content = log_file.read_text()
sanitized = sanitizer.sanitize(content)
log_file.write_text(sanitized)
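To sanitize a whole directory of logs, the same read/sanitize/write loop can be applied per file. A small sketch, assuming a hypothetical logs/ directory:
from pathlib import Path
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
# Sanitize every .log file under a hypothetical logs/ directory
for log_file in Path("logs").glob("*.log"):
    content = log_file.read_text()
    if sanitizer.contains_token(content):
        log_file.write_text(sanitizer.sanitize(content))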
Checking for Tokens Before Logging¶
Conditionally sanitize only when tokens are detected:
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
message = "Debug info: connection established"
if sanitizer.contains_token(message):
    message = sanitizer.sanitize(message)
logger.debug(message)
Integration Examples¶
FastAPI Error Handler¶
Sanitize errors before returning to clients:
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from amplihack.proxy.security import TokenSanitizer
app = FastAPI()
sanitizer = TokenSanitizer()
@app.exception_handler(Exception)
async def sanitize_errors(request: Request, exc: Exception):
    error_detail = str(exc)
    safe_detail = sanitizer.sanitize(error_detail)
    return JSONResponse(
        status_code=500,
        content={"detail": safe_detail},
    )
Logging Wrapper¶
Create a logging wrapper that auto-sanitizes:
import logging
from amplihack.proxy.security import TokenSanitizer
class SanitizingLogger:
    def __init__(self, name: str):
        self.logger = logging.getLogger(name)
        self.sanitizer = TokenSanitizer()

    def debug(self, msg: str, *args, **kwargs):
        safe_msg = self.sanitizer.sanitize(msg)
        self.logger.debug(safe_msg, *args, **kwargs)

    def error(self, msg: str, *args, **kwargs):
        safe_msg = self.sanitizer.sanitize(msg)
        self.logger.error(safe_msg, *args, **kwargs)
# Usage
logger = SanitizingLogger(__name__)
logger.debug(f"Token: {github_token}") # Automatically sanitized
Request/Response Interceptor¶
Sanitize all HTTP traffic logs:
from amplihack.proxy.security import TokenSanitizer
import httpx
sanitizer = TokenSanitizer()
async def log_request(request: httpx.Request):
    # Sanitize headers before logging
    safe_headers = sanitizer.sanitize(dict(request.headers))
    print(f"Request headers: {safe_headers}")

async def log_response(response: httpx.Response):
    # In an httpx event hook the body is not downloaded yet, so read it first
    await response.aread()
    # Sanitize response body
    safe_body = sanitizer.sanitize(response.text)
    print(f"Response: {safe_body}")
Performance¶
TokenSanitizer is optimized for production use:
- Simple strings: < 1ms per sanitization
- Small dicts: < 1ms per sanitization
- 1000 strings: < 1 second total
- Compiled regex: Patterns compiled once at initialization
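These numbers depend on hardware and token density, so it is worth measuring on your own data. A quick micro-benchmark sketch:
import timeit
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
sample = "Authentication failed with token: gho_abc123xyz"
# Time 1000 sanitizations of a short string, reusing a single instance
elapsed = timeit.timeit(lambda: sanitizer.sanitize(sample), number=1000)
print(f"1000 sanitizations took {elapsed:.3f}s")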
Performance Tips¶
- Reuse instances: Create one TokenSanitizer and reuse it
- Check before sanitizing: Use contains_token() to skip clean data
- Batch processing: Sanitize in batches for large datasets (see the sketch below)
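The three tips combined, as a small sketch:
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()  # created once and reused for every message

def sanitize_batch(messages):
    # Skip the full regex pass for messages that contain no tokens
    return [
        sanitizer.sanitize(m) if sanitizer.contains_token(m) else m
        for m in messages
    ]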
Troubleshooting¶
False Positives¶
If safe text is being redacted, check pattern lengths:
Problem: Short strings like "sk-short" should not be redacted.
Solution: Patterns require at least 6 characters after the prefix, so prefix-only and very short strings are left untouched.
# These are NOT detected as tokens (too short)
safe_texts = [
    "gho_",      # Prefix only
    "sk-",       # Prefix only
    "sk-short",  # Too short (< 6 chars after prefix)
]
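A quick check that these strings really pass through undetected (continuing with the safe_texts list above and assuming the 6-character rule):
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
for text in safe_texts:
    assert not sanitizer.contains_token(text), f"unexpected match: {text}"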
False Negatives¶
If real tokens aren't detected, verify token format:
Problem: A real token is not being redacted.
Solution: Check that the token matches one of the known formats below.
# Supported GitHub token prefixes
valid_prefixes = ["gho_", "ghp_", "ghs_", "ghu_", "ghr_"]
# Supported OpenAI key formats
valid_openai = ["sk-", "sk-proj-"]
# Supported Anthropic format
valid_anthropic = ["sk-ant-"]
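Before relying on redaction for a particular credential, it is easy to probe whether its format is covered. A small sketch with a made-up value in a supported format:
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()
candidate = "ghp_1234567890abcdefghij"  # made-up value in a supported format
if not sanitizer.contains_token(candidate):
    print("This format is not covered by the built-in patterns")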
Performance Issues¶
If sanitization is slow:
- Profile token density: If most strings are clean, gate sanitization with contains_token()
- Reduce nested depth: Very deep nesting (10+ levels) impacts performance
- Batch processing: Process in chunks for huge datasets (see the sketch below)
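For very large log files, processing line by line keeps memory use flat. A sketch, assuming tokens never span line breaks (true for typical line-oriented logs):
from pathlib import Path
from amplihack.proxy.security import TokenSanitizer
sanitizer = TokenSanitizer()

def sanitize_large_log(src: Path, dst: Path) -> None:
    # Stream line by line so the whole file is never held in memory
    with src.open() as infile, dst.open("w") as outfile:
        for line in infile:
            outfile.write(sanitizer.sanitize(line))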
Best Practices¶
- Sanitize at boundaries: Sanitize data when it crosses trust boundaries (logging, errors, API responses)
- Don't sanitize business logic: Only sanitize for output/logging, not internal processing
- Use in exception handlers: Always sanitize exceptions before displaying them
- Test with real tokens: Use actual token formats in tests, with pragma comments (see the sketch below)
- Check before expensive operations: Use contains_token() before deep sanitization
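For the "test with real tokens" practice, a minimal pytest-style sketch; the pragma comment shown is the detect-secrets convention, so adjust it to whatever secret scanner your project uses:
from amplihack.proxy.security import TokenSanitizer

def test_github_token_is_redacted():
    sanitizer = TokenSanitizer()
    fake_token = "gho_1234567890abcdefghij"  # pragma: allowlist secret
    result = sanitizer.sanitize(f"auth failed: {fake_token}")
    assert "gho_" not in result
    assert "[REDACTED-GITHUB-TOKEN]" in result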
Related Documentation¶
- Security API Reference - Complete API documentation
- Security Testing Guide - How to test security features
- Security README - Security overview
Remember: TokenSanitizer protects against accidental token exposure. It's not a replacement for proper secret management, secure storage, or encryption.