Passthrough Mode¶
Passthrough mode is a feature that allows the proxy to start with the Anthropic API and automatically switch to Azure OpenAI fallback when encountering 429 rate limit errors.
Overview¶
When passthrough mode is enabled, the proxy will:
- Start with Anthropic API: All Claude model requests are first sent to the official Anthropic API
- Monitor for 429 errors: When the Anthropic API returns a 429 rate limit error, the proxy records the failure
- Switch to Azure fallback: After a configurable number of failures, subsequent requests are automatically routed to Azure OpenAI
- Automatic recovery: After a timeout period without failures, the proxy switches back to Anthropic API
Configuration¶
Environment Variables¶
| Variable | Default | Description |
|---|---|---|
PASSTHROUGH_MODE | false | Enable passthrough mode |
PASSTHROUGH_FALLBACK_ENABLED | true | Enable Azure OpenAI fallback |
PASSTHROUGH_MAX_RETRIES | 3 | Maximum retries before giving up |
PASSTHROUGH_RETRY_DELAY | 1.0 | Delay between retries (seconds) |
PASSTHROUGH_FALLBACK_AFTER_FAILURES | 2 | Number of 429 errors before switching to fallback |
Required API Keys¶
For passthrough mode to work, you need:
- Anthropic API Key:
ANTHROPIC_API_KEY - Azure OpenAI Configuration (for fallback):
AZURE_OPENAI_API_KEYAZURE_OPENAI_ENDPOINTAZURE_OPENAI_API_VERSION(optional, defaults to2024-02-01)
Model Mappings¶
Configure how Claude models map to your Azure deployments:
| Claude Model | Environment Variable | Default Azure Model |
|---|---|---|
claude-3-5-sonnet-20241022 | AZURE_CLAUDE_SONNET_DEPLOYMENT | gpt-4 |
claude-3-5-haiku-20241022 | AZURE_CLAUDE_HAIKU_DEPLOYMENT | gpt-4o-mini |
claude-3-opus-20240229 | AZURE_CLAUDE_OPUS_DEPLOYMENT | gpt-4 |
claude-3-sonnet-20240229 | AZURE_CLAUDE_SONNET_DEPLOYMENT | gpt-4 |
claude-3-haiku-20240307 | AZURE_CLAUDE_HAIKU_DEPLOYMENT | gpt-4o-mini |
Quick Start¶
- Copy the example configuration:
- Edit the configuration:
- Add your Anthropic API key
- Add your Azure OpenAI endpoint and API key
-
Configure the deployment mappings
-
Start the proxy:
- Test the proxy:
How It Works¶
Normal Operation¶
During Rate Limiting¶
Request Flow¶
- Request Received: Client sends request to proxy
- Model Detection: Proxy detects Claude model in request
- Passthrough Check: If passthrough mode enabled and Claude model detected
- Primary Attempt: Try Anthropic API first
- Error Handling: If 429 error received, record failure
- Fallback Decision: If failure count exceeds threshold, use Azure
- Response Conversion: Convert Azure response back to Anthropic format
Failure Tracking¶
The proxy maintains a failure counter that:
- Increments on each 429 error from Anthropic
- Resets to zero on successful Anthropic responses
- Triggers fallback mode when threshold is reached
- Resets after a timeout period (5 minutes default)
Monitoring¶
Status Endpoint¶
Check the proxy status at GET /status:
{
"proxy_active": true,
"passthrough_mode": true,
"passthrough_status": {
"passthrough_enabled": true,
"fallback_enabled": true,
"anthropic_configured": true,
"azure_configured": true,
"failure_count": 0,
"using_fallback": false,
"last_failure_time": 0,
"max_retries": 3,
"fallback_after_failures": 2
}
}
Logging¶
The proxy logs important events:
- Passthrough mode activation
- API switching (Anthropic ↔ Azure)
- Failure tracking
- Configuration validation
Limitations¶
- Streaming: Streaming responses are not yet implemented for passthrough mode
- Tool Calls: Complex tool interactions may have slight formatting differences between APIs
- Model Features: Some Claude-specific features may not be available through Azure mappings
Troubleshooting¶
Common Issues¶
- Invalid API Keys:
- Missing Azure Configuration:
- Deployment Not Found:
- Check your Azure deployment names
- Ensure deployment mappings are correctly configured
Testing¶
Run the test suite to verify your configuration:
For integration testing:
Example Configuration¶
See .env.passthrough.example for a complete configuration example.
Security Notes¶
- API keys are never logged or exposed in error messages
- All communication with APIs uses HTTPS
- Sensitive configuration is sanitized in logs
- No API keys are stored persistently