API Reference
Complete reference for the DeployAI API. All endpoints are OpenAI-compatible.
Base URL
https://api.deployai.dev/v1
Authentication
All API requests require an API key passed in the Authorization header as a Bearer token.
Header
Authorization: Bearer sk-your-api-key
Get your API key from the dashboard. Keys start with sk-.
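In code, the header can be attached like this (a minimal Python sketch; reading the key from an environment variable rather than hard-coding it):

```python
import os

# Read the key from the environment; "sk-your-api-key" is a placeholder.
API_KEY = os.environ.get("DEPLOYAI_API_KEY", "sk-your-api-key")

# Every request to the API carries these headers.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```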
Chat Completions
POST
/v1/chat/completions
Creates a chat completion. This is the primary endpoint for generating AI responses.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID in provider/model format |
| messages | array | Yes | Array of message objects with role and content |
| stream | boolean | No | Enable streaming via SSE. Default: false |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling (0-1). Default: 1 |
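The table above maps directly onto the JSON request body. A short Python sketch that assembles and range-checks a payload (the `build_chat_request` helper is illustrative, not part of any SDK):

```python
import json

def build_chat_request(model, messages, stream=False,
                       temperature=1.0, max_tokens=None, top_p=1.0):
    """Assemble a /v1/chat/completions request body.

    Only `model` and `messages` are required; optional parameters are
    included only when they differ from the API defaults.
    """
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    body = {"model": model, "messages": messages}
    if stream:
        body["stream"] = True
    if temperature != 1.0:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if top_p != 1.0:
        body["top_p"] = top_p
    return body

payload = build_chat_request(
    "openai/gpt-4o",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=1000,
)
print(json.dumps(payload))
```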
Example Request
cURL
curl https://api.deployai.dev/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEPLOYAI_API_KEY" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 1000
}'
Example Response
JSON
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1708000000,
"model": "openai/gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 9,
"total_tokens": 29
}
}
List Models
GET
/v1/models
Returns a list of all available models.
cURL
curl https://api.deployai.dev/v1/models \
-H "Authorization: Bearer $DEPLOYAI_API_KEY"
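The same call in Python, using only the standard library (the `{"data": [...]}` response shape is assumed to follow the usual OpenAI list convention; the filter helper is illustrative):

```python
import json
import urllib.request

def list_models(api_key):
    """Fetch all available models from /v1/models."""
    req = urllib.request.Request(
        "https://api.deployai.dev/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

def models_by_provider(models, provider):
    """Keep only model entries whose id starts with 'provider/'."""
    return [m for m in models if m["id"].startswith(provider + "/")]

# Demonstration with a canned response (no network call):
sample = [{"id": "openai/gpt-4o"}, {"id": "anthropic/claude-3.5-sonnet"}]
openai_models = models_by_provider(sample, "openai")
```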
Available Models
Use the provider/model-name format when specifying a model.
| Model ID | Provider | Context |
|---|---|---|
| openai/gpt-4o | OpenAI | 128k |
| openai/o3-mini | OpenAI | 128k |
| anthropic/claude-3.5-sonnet | Anthropic | 200k |
| google/gemini-2.0-flash | Google | 1M |
| mistralai/mistral-large | Mistral AI | 128k |
| meta/llama-3.1-405b | Meta | 128k |
| deepseek/deepseek-r1 | DeepSeek | 64k |
See the full list on the Models page.
Error Codes
| Code | Description |
|---|---|
| 400 | Bad Request — Invalid parameters |
| 401 | Unauthorized — Invalid or missing API key |
| 403 | Forbidden — Insufficient permissions |
| 404 | Not Found — Model or endpoint not found |
| 429 | Too Many Requests — Rate limit exceeded |
| 500 | Internal Server Error — Something went wrong on our end |
| 503 | Service Unavailable — Provider temporarily unavailable |
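A common client-side pattern is to retry 429 and transient 5xx responses with exponential backoff. A sketch of such a policy (the specific delays and attempt cap are illustrative, not prescribed by the API):

```python
import random

# Status codes from the table above that are worth retrying.
RETRYABLE = {429, 500, 503}

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff with jitter: base * 2^attempt, capped, plus 0-1s."""
    return min(cap, base * (2 ** attempt)) + random.random()

def should_retry(status_code, attempt, max_attempts=5):
    """Retry only retryable codes, up to max_attempts tries."""
    return status_code in RETRYABLE and attempt < max_attempts
```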
Rate Limits
Rate limits depend on your plan. The following headers are included in every response:
| Header | Description |
|---|---|
| x-ratelimit-limit | Maximum requests allowed per minute |
| x-ratelimit-remaining | Remaining requests in current window |
| x-ratelimit-reset | Unix timestamp when the rate limit resets |
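These headers let a client throttle itself before hitting a 429. A small sketch that reads them (header names from the table above; the `now` parameter stands in for `time.time()`):

```python
import time

def rate_limit_status(headers, now=None):
    """Return (remaining requests, seconds until the window resets)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("x-ratelimit-remaining", 0))
    reset_at = int(headers.get("x-ratelimit-reset", now))
    return remaining, max(0.0, reset_at - now)

# Example: no requests left, window resets 12 seconds from 'now'.
remaining, wait = rate_limit_status(
    {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "1708000012"},
    now=1708000000,
)
```

When `remaining` reaches 0, sleeping for `wait` seconds before the next request avoids a 429.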