Drop-in replacement for OpenAI’s Chat Completions API. Works with any client that speaks the OpenAI protocol.

List models

GET /v1/models
Returns the active model and a synthetic sorat-agent entry.
Response
{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "created": 1700000000, "owned_by": "provider"},
    {"id": "sorat-agent", "object": "model", "created": 1700000000, "owned_by": "sorat"}
  ]
}
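
As a minimal sketch of consuming this response, the snippet below pulls the model IDs out of a body shaped like the sample above. The helper name `model_ids` is hypothetical and not part of the API; it assumes the body is valid JSON with a `data` array.

```python
import json


def model_ids(response_body: str) -> list[str]:
    """Return the `id` of every entry in a /v1/models response body."""
    payload = json.loads(response_body)
    # A missing "data" key is treated as an empty list rather than an error.
    return [entry["id"] for entry in payload.get("data", [])]


# Sample body copied from the response shown above.
sample = """
{
  "object": "list",
  "data": [
    {"id": "gpt-4o", "object": "model", "created": 1700000000, "owned_by": "provider"},
    {"id": "sorat-agent", "object": "model", "created": 1700000000, "owned_by": "sorat"}
  ]
}
"""

print(model_ids(sample))  # ['gpt-4o', 'sorat-agent']
```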

Create chat completion

POST /v1/chat/completions

Headers

Header          Description
X-Session-ID    Optional session UUID. Auto-creates a session if not provided.
Authorization   Bearer <token>

Request body

{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Hello, what can you do?"}
  ],
  "stream": true,
  "temperature": 0.7,
  "max_tokens": 4096
}

Non-streaming response

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "I can help you with..."},
    "finish_reason": "stop"
  }]
}
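
A small sketch of reading the assistant's reply from a non-streaming response. The function name `assistant_text` is hypothetical; it assumes the reply lives in the first entry of `choices`, as in the sample above, and returns an empty string when that entry is absent.

```python
import json


def assistant_text(response_body: str) -> str:
    """Return the assistant message content from a chat.completion body."""
    payload = json.loads(response_body)
    choices = payload.get("choices", [])
    if not choices:
        return ""
    # "content" may be missing or null; normalise both cases to "".
    return choices[0].get("message", {}).get("content") or ""


# Sample body copied from the non-streaming response shown above.
sample = """
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "I can help you with..."},
    "finish_reason": "stop"
  }]
}
"""

print(assistant_text(sample))  # I can help you with...
```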

Streaming response (SSE)

When stream: true, the response is a Server-Sent Events stream:
data: {"choices":[{"delta":{"content":"I can"},"index":0}]}

data: {"choices":[{"delta":{"content":" help"},"index":0}]}

data: {"choices":[{"delta":{"tool_calls":[{"function":{"name":"web_search"}}]},"index":0}]}

data: [DONE]
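
The stream above can be consumed with a few lines of client code. The following is a simplified sketch, not a full SSE parser: it assumes every event is a single `data:` line, as in the sample, and ignores other SSE fields. The names `iter_sse_payloads` and `collect_text` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `delta.content` fragments of a stream."""
    parts = []
    for chunk in iter_sse_payloads(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)  # tool-call deltas carry no content
    return "".join(parts)


# Lines copied from the sample stream above.
stream = [
    'data: {"choices":[{"delta":{"content":"I can"},"index":0}]}',
    "",
    'data: {"choices":[{"delta":{"content":" help"},"index":0}]}',
    "",
    'data: {"choices":[{"delta":{"tool_calls":[{"function":{"name":"web_search"}}]},"index":0}]}',
    "",
    "data: [DONE]",
]

print(collect_text(stream))  # I can help
```

In practice the `lines` argument would be the decoded lines of the HTTP response body; any iterable of strings works.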

Session management

  • If X-Session-ID is provided, messages are appended to that session
  • If not provided, a new session is created automatically
  • Sessions are persisted to disk (~/.sorat/sessions/)
  • Session titles are generated asynchronously after the first response
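
From the client side, the first two rules above come down to whether the `X-Session-ID` header is sent. The sketch below builds a request with Python's standard `urllib`; the helper `build_chat_request` is hypothetical, and the base URL and `"default"` model name are taken from the curl examples on this page and may differ in your setup.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080"  # assumption: same host as the curl examples


def build_chat_request(
    token: str,
    user_text: str,
    session_id: Optional[str] = None,
    model: str = "default",
) -> urllib.request.Request:
    """Build a non-streaming chat completion request.

    When session_id is None the header is omitted, so the server
    creates a new session; otherwise messages go to that session.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if session_id is not None:
        headers["X-Session-ID"] = session_id
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers=headers,
        method="POST",
    )


# Building the request does not touch the network.
# To send it: urllib.request.urlopen(req)
req = build_chat_request("<token>", "What time is it?", session_id="my-session-123")
print(req.get_method(), req.full_url)
```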

Example with curl

# Non-streaming
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "What time is it?"}],
    "stream": false
  }'

# Streaming
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -H "X-Session-ID: my-session-123" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Search the web for Go 1.24 release notes"}],
    "stream": true
  }'

Tool calls

The agent can make multiple tool calls per request (up to max_iterations, default 30). Tool calls and their results are handled internally by the ADK runner; you receive only the final text response. In streaming mode, tool call names (and, where present, their arguments) also appear in the delta:
{"choices":[{"delta":{"tool_calls":[{"function":{"name":"run_command","arguments":"{\"command\":\"ls\"}"}}]},"index":0}]}
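
A client that wants to show tool activity can pick these entries out of each chunk. The sketch below is one way to do that; the name `tool_calls_in_chunk` is hypothetical. It assumes `arguments` is a JSON-encoded string as in the sample, and falls back to the raw string if it does not parse (for example, if arguments were to arrive as partial fragments).

```python
import json
from typing import Any


def tool_calls_in_chunk(chunk_json: str) -> list[tuple[str, Any]]:
    """Return (name, arguments) pairs for every tool call in one chunk.

    arguments is None when absent, a decoded object when the string is
    valid JSON, and the raw string otherwise.
    """
    chunk = json.loads(chunk_json)
    calls = []
    for choice in chunk.get("choices", []):
        for call in choice.get("delta", {}).get("tool_calls") or []:
            fn = call.get("function", {})
            raw_args = fn.get("arguments")
            if raw_args is None:
                args = None
            else:
                try:
                    args = json.loads(raw_args)
                except json.JSONDecodeError:
                    args = raw_args  # keep unparsable text as-is
            calls.append((fn.get("name", ""), args))
    return calls


# Chunk copied from the sample above (raw string keeps the JSON escapes).
chunk = r'{"choices":[{"delta":{"tool_calls":[{"function":{"name":"run_command","arguments":"{\"command\":\"ls\"}"}}]},"index":0}]}'

print(tool_calls_in_chunk(chunk))  # [('run_command', {'command': 'ls'})]
```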