# OpenAI Responses API Implementation

## Overview
This document describes the implementation of OpenAI Responses API support in the claude-code-proxy integration for Azure OpenAI services.
## Background
The original PR was intended to support the new OpenAI Responses API (https://platform.openai.com/docs/guides/migrate-to-responses) instead of the traditional Chat Completions API. The Responses API provides a different request/response format optimized for structured conversations.
## Implementation Details

### Request Format

The Responses API uses a simplified request format:
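The request example itself appears to have been lost in the export of this document; based on the endpoint's request model (`model` and `input` fields) and the curl invocation under Manual Testing, a minimal request body looks like:

```json
{
  "model": "gpt-5-codex",
  "input": "tell me what agents you have available"
}
```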
### Response Format

The API returns responses in the OpenAI Responses format:
```json
{
  "id": "resp_...",
  "object": "response",
  "created_at": 1759611575,
  "status": "completed",
  "model": "gpt-5-codex",
  "output": [
    {
      "id": "msg_...",
      "type": "message",
      "status": "completed",
      "content": [...],
      "role": "assistant"
    }
  ],
  "usage": {
    "input_tokens": 17,
    "output_tokens": 58,
    "total_tokens": 75
  }
}
```
## Changes Made

### 1. FastAPI Endpoint Addition

Added a new `/openai/responses` endpoint to the claude-code-proxy FastAPI server:

- File: `server/fastapi.py` (in cached claude-code-proxy instances)
- Endpoint: `POST /openai/responses`
- Request Model: `OpenAIResponsesRequest` with `model` and `input` fields
- Integration: Uses LiteLLM for Azure OpenAI backend processing
### 2. JSON Serialization Fix

Fixed a critical bug in error handling that was causing "Internal Server Error" responses:

- Issue: `TypeError` when serializing Response objects in error details
- Fix: Added JSON serializability checking with fallback to string conversion
- Impact: Proper error responses instead of crashes
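The fix described above amounts to a serializability guard; a sketch of such a helper (the function name is illustrative, not the one used in the proxy):

```python
import json


def safe_detail(obj):
    """Return obj unchanged if it is JSON-serializable; otherwise fall
    back to its string form.

    This mirrors the bug described above: error details sometimes held
    Response objects, which json.dumps cannot encode, so the error
    handler itself raised a TypeError and surfaced as a 500.
    """
    try:
        json.dumps(obj)
        return obj
    except (TypeError, ValueError):
        return str(obj)
```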
### 3. Enhanced Logging

Improved proxy startup logging to show log file locations, like other Claude Code components.
## Testing

### Automated Tests

Created `tests/test_openai_responses_api.py` with:

- Endpoint existence validation
- Response format validation
- Integration test framework
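A response-format check of the kind listed above could be sketched as follows; the helper name and the exact set of required keys are assumptions based on the response shape shown earlier in this document:

```python
def has_responses_shape(payload: dict) -> bool:
    """Minimal structural check against the OpenAI Responses format:
    top-level id/object/status/model/output/usage fields, with
    object == "response" and output given as a list of items."""
    required = {"id", "object", "status", "model", "output", "usage"}
    if not required <= payload.keys():
        return False
    return payload["object"] == "response" and isinstance(payload["output"], list)
```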
### Manual Testing

Verified working with curl:

```shell
curl -X POST http://localhost:8082/openai/responses \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-codex", "input": "tell me what agents you have available"}'
```
### Azure Integration

Tested with real Azure OpenAI configuration:

- Endpoint: `ai-adapt-oai-eastus2.openai.azure.com`
- Model: `gpt-5-codex`
- API Version: `2025-04-01-preview`
## Current Status

- ✅ WORKING: OpenAI Responses API endpoint implemented and tested
- ✅ WORKING: JSON serialization crashes fixed
- ✅ WORKING: Azure OpenAI integration with correct model mappings
- ✅ WORKING: Enhanced logging and error reporting
## Architecture Notes
The claude-code-proxy is an external dependency installed via UVX. The implementation modifies the cached UVX instances to provide the Responses API support. For production deployment, these changes would need to be:
- Submitted upstream to claude-code-proxy project, OR
- Maintained as a fork with custom changes, OR
- Integrated directly into the amplihack proxy management system
## Dependencies

- `uvx` for claude-code-proxy installation
- `litellm` for Azure OpenAI backend
- `fastapi` for endpoint handling
- `httpx` for async HTTP client functionality
## Configuration

Uses standard Azure OpenAI environment variables:

- `OPENAI_API_KEY` / `AZURE_OPENAI_KEY`
- `OPENAI_BASE_URL` with Responses API endpoint
- `AZURE_API_VERSION`
- Model mappings (`BIG_MODEL`, `MIDDLE_MODEL`, `SMALL_MODEL`)
The implementation automatically detects Responses API endpoints and uses the appropriate request/response format.
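The detection logic itself is not shown in this document; one plausible heuristic, purely illustrative and possibly different from what the proxy actually does, is to inspect the configured base URL:

```python
def is_responses_endpoint(base_url: str) -> bool:
    """Illustrative heuristic: treat a base URL whose path ends in
    /responses as an OpenAI Responses API endpoint. The proxy's real
    detection may rely on other signals (config flags, API version)."""
    return base_url.rstrip("/").endswith("/responses")
```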