Documentation

DeployAI Docs

Everything you need to integrate AI models into your applications. From quickstart guides to full API reference.

Quickstart

Get your first API response in under 5 minutes.

1. Get your API key

Sign up at deployai.com/sign-up and grab your API key from the dashboard. Your key starts with the prefix sk-.

2. Install the SDK (optional)

DeployAI is OpenAI-compatible, so you can use any OpenAI client library. Or use our API directly with any HTTP client.

npm: npm install openai
pip: pip install openai
3. Make your first request

TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.deployai.dev/v1",
  apiKey: process.env.DEPLOYAI_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "anthropic/claude-3.5-sonnet",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

console.log(completion.choices[0].message.content);

That's it!

You're now routing AI requests through DeployAI. Explore the API Reference for all available endpoints, or browse available models.
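Step 2 notes that DeployAI works with any HTTP client, not just OpenAI SDKs. As a minimal sketch of that, the quickstart request can be made with plain fetch; the helper names buildRequest and ask are illustrative, not part of any SDK:

```typescript
// Build the Chat Completions payload from the quickstart.
// buildRequest and ask are illustrative names, not part of any SDK.
function buildRequest(prompt: string) {
  return {
    model: "anthropic/claude-3.5-sonnet",
    messages: [{ role: "user", content: prompt }],
  };
}

// POST the payload to the endpoint from the quickstart and return the reply text.
async function ask(prompt: string): Promise<string> {
  const res = await fetch("https://api.deployai.dev/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DEPLOYAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildRequest(prompt)),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const data = await res.json();
  // The response body mirrors OpenAI's shape, so the reply text
  // lives at choices[0].message.content, as in the SDK example.
  return data.choices[0].message.content;
}
```

Calling ask("Hello!") then behaves like the SDK call above, with no dependency beyond a fetch-capable runtime.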

Core Concepts

Key ideas to understand how DeployAI works.

OpenAI Compatibility

DeployAI implements the OpenAI Chat Completions API format. This means any tool, SDK, or library that works with OpenAI will work with DeployAI — just change the baseURL to https://api.deployai.dev/v1. This includes the Vercel AI SDK, LangChain, LlamaIndex, and thousands more.

Model Routing

Specify the model using the provider/model-name format (e.g., anthropic/claude-3.5-sonnet). DeployAI routes your request to the correct provider automatically, handling authentication, retries, and failover.
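The convention is easy to reason about: everything before the first slash is the provider, and the rest is the model name (which can itself contain dots and dashes). A tiny illustrative parser, assuming only that convention (parseModelId is a hypothetical helper, not part of any SDK):

```typescript
// Split a provider/model-name identifier at the FIRST slash, so model
// names containing dots or dashes survive intact. Hypothetical helper,
// shown only to illustrate the naming convention.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash === -1) {
    throw new Error(`expected provider/model-name, got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, parseModelId("anthropic/claude-3.5-sonnet") yields { provider: "anthropic", model: "claude-3.5-sonnet" }. In practice you never do this split yourself; DeployAI routes on the full string.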

Streaming

All models support streaming via Server-Sent Events (SSE). Set stream: true in your request to receive tokens as they're generated. This works identically to OpenAI's streaming format.
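With the OpenAI SDK, setting stream: true and iterating with for await is all you need. To show what is on the wire, here is a hedged sketch that consumes the SSE stream directly with fetch; it assumes OpenAI's conventional framing (lines of `data: {json}`, terminated by `data: [DONE]`), which this section says DeployAI matches. parseSseLine and streamChat are illustrative names:

```typescript
// Extract the text delta from one SSE line. Lines look like
//   data: {"choices":[{"delta":{"content":"..."}}]}
// and the stream ends with "data: [DONE]". Illustrative helper based on
// OpenAI's streaming convention.
function parseSseLine(line: string): string {
  if (!line.startsWith("data: ")) return "";
  const payload = line.slice("data: ".length);
  if (payload === "[DONE]") return "";
  const chunk = JSON.parse(payload);
  return chunk.choices?.[0]?.delta?.content ?? "";
}

// Stream a completion and print tokens as they arrive.
async function streamChat(prompt: string): Promise<void> {
  const res = await fetch("https://api.deployai.dev/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.DEPLOYAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "anthropic/claude-3.5-sonnet",
      messages: [{ role: "user", content: prompt }],
      stream: true, // ask for Server-Sent Events instead of one JSON body
    }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any partial line for the next read
    for (const line of lines) process.stdout.write(parseSseLine(line));
  }
}
```

In application code, prefer the SDK's for await loop; this sketch exists to make the underlying SSE format concrete.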

Rate Limits & Usage

Rate limits vary by plan. Free-tier users get generous limits for experimentation. Monitor your usage, costs, and rate-limit status from the dashboard. Response headers include rate-limit information for programmatic monitoring.
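For programmatic monitoring, those headers can be read off any response. The exact header names are not given here, so the sketch below assumes the conventional x-ratelimit-* family; check the dashboard or API reference for the names DeployAI actually returns. rateLimitInfo is an illustrative helper:

```typescript
// Pull rate-limit headers off a fetch Response's Headers object.
// The x-ratelimit-* names are illustrative assumptions, not confirmed
// by these docs; substitute the real header names from the API reference.
function rateLimitInfo(headers: Headers): {
  limit: string | null;
  remaining: string | null;
  reset: string | null;
} {
  return {
    limit: headers.get("x-ratelimit-limit"),
    remaining: headers.get("x-ratelimit-remaining"),
    reset: headers.get("x-ratelimit-reset"),
  };
}
```

After const res = await fetch(...), call rateLimitInfo(res.headers) and back off before remaining reaches zero.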