# AI
Void provides a typed AI client powered by Cloudflare's AI Gateway. It supports both Workers AI models and third-party providers such as OpenAI, Anthropic, and Google through bring-your-own-key credentials. Import `ai` from `void/ai` and run inference directly from your route handlers. Usage is metered through Void.
```ts
import { ai } from 'void/ai';
```

## Basic Usage
Call `ai.run()` with a model name and inputs. Model names and input types are fully typed from `@cloudflare/workers-types`.
```ts
import { defineHandler } from 'void';
import { ai } from 'void/ai';

export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  const result = await ai.run('@cf/meta/llama-3.1-8b-instruct', {
    messages: [{ role: 'user', content: prompt }],
  });
  return c.json(result);
});
```

You can use any model available on Workers AI, including text generation, image classification, image-to-text, text-to-image, embeddings, translation, and more. Models that return binary data, such as generated images, are returned as a `Blob` from `ai.run()`.
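To serve such a binary result, the `Blob` can be wrapped in a `Response` for the client. A minimal sketch, assuming an image model resolves to a PNG `Blob` as described above; `toImageResponse` is a hypothetical helper, not part of Void:

```ts
// Hypothetical helper: wrap a binary model result in an HTTP response.
// Assumes the Blob came from an image model, hence the image content type.
function toImageResponse(blob: Blob, contentType = 'image/png'): Response {
  return new Response(blob, {
    headers: {
      'content-type': contentType,
      'cache-control': 'no-store', // generated images should not be cached
    },
  });
}
```

In a route handler this would look like `return toImageResponse(await ai.run('@cf/black-forest-labs/flux-1-schnell', { prompt }))`, under the assumption above that image models resolve to a `Blob`.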
## Streaming
Use `ai.stream()` to get a streaming response with SSE headers that you can return directly from a route handler:
```ts
import { defineHandler } from 'void';
import { ai } from 'void/ai';

export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  return ai.stream('@cf/meta/llama-3.1-8b-instruct', {
    messages: [{ role: 'user', content: prompt }],
  });
});
```

`ai.stream()` calls `ai.run()` with `stream: true` and wraps the result in a `Response` with `content-type: text/event-stream` and `cache-control: no-cache` headers.
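On the client, the event-stream body can be read incrementally. A sketch of extracting `data:` payloads from a chunk of SSE text; the helper is illustrative (the payload shape depends on the model, and a production client would also buffer partial lines across chunks):

```ts
// Illustrative SSE parser: pull the payload out of each `data:` line in a
// text/event-stream chunk, skipping blanks and a `[DONE]` sentinel if the
// provider sends one.
function parseSseChunk(chunk: string): string[] {
  return chunk
    .split('\n')
    .filter((line) => line.startsWith('data:'))
    .map((line) => line.slice('data:'.length).trim())
    .filter((payload) => payload.length > 0 && payload !== '[DONE]');
}

// Usage with fetch (browser or Node 18+):
// const res = await fetch('/api/chat', { method: 'POST', body: JSON.stringify({ prompt }) });
// const reader = res.body!.getReader();
// const decoder = new TextDecoder();
// while (true) {
//   const { done, value } = await reader.read();
//   if (done) break;
//   for (const payload of parseSseChunk(decoder.decode(value))) {
//     // handle each event payload
//   }
// }
```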
## Listing Models
Use `ai.models()` to list available models:
```ts
const models = await ai.models();

// Filter by task
const textModels = await ai.models({ task: 'Text Generation' });
```

## Markdown Conversion
Use `ai.toMarkdown()` to convert documents to markdown:
```ts
const result = await ai.toMarkdown([{ name: 'document.pdf', blob: pdfBytes }]);
```

## Local Development
AI requires Void credentials for local development. Run `void auth login` and `void project link` (or follow the interactive setup during `vite dev`) to connect your project.
Once linked, credentials are injected automatically as worker bindings. You do not need to configure them by hand. All inference traffic goes through the Void AI proxy over HTTPS, so usage is tracked and metered the same way it is in production.
## Usage Limits
Workers AI usage is metered in neurons, Cloudflare's unit of inference cost. Usage resets at the start of each billing cycle.
| Plan | Included neurons | At limit |
|---|---|---|
| Free | 100,000/month | Blocked until the billing cycle resets |
| Solo | 300,000/month | Overage billed |
| Pro | 500,000/month | Overage billed |
On the free tier, AI requests return a 429 error once the limit is reached. On paid tiers, usage beyond the included allowance is tracked as overage on your monthly bill.
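If you want to degrade gracefully on the free tier, a handler can translate the limit error into a friendly response. This is a sketch under an assumption: the failed call is presumed to surface an error object carrying the HTTP status from the proxy, which the text above does not specify:

```ts
// Hypothetical guard: map a quota failure onto a client-facing 429 response.
// The `status` field on the error is an assumption, not documented behavior.
function limitResponse(err: { status?: number }): Response | null {
  if (err.status === 429) {
    return new Response(
      JSON.stringify({ error: 'AI quota exhausted for this billing cycle' }),
      { status: 429, headers: { 'content-type': 'application/json' } },
    );
  }
  return null; // not a quota error; rethrow upstream
}
```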
## Third-Party Providers (BYOK)
You can use models from OpenAI, Anthropic, Google, and other providers by passing a provider/model identifier. Bring your own API key, add it as a project secret, and Void handles the rest. Third-party inference is billed directly by the provider, and Void does not mark up API calls. Requests still route through AI Gateway so usage shows up on the dashboard.
### Usage
```ts
import { defineHandler } from 'void';
import { ai } from 'void/ai';

export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  const result = await ai.run('openai/gpt-4o', {
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 512,
  });
  return c.json(result);
});
```

The same `ai.run()` and `ai.stream()` methods work for both Workers AI and third-party models. The framework detects the provider from the model string and routes the request through AI Gateway automatically.
The options follow each provider's conventions. All third-party providers use the OpenAI-compatible chat completions format, including `messages`, `max_tokens`, and `temperature`. TypeScript narrows the input and return types based on the model string: Workers AI models get per-model typed inputs from `@cloudflare/workers-types`, while third-party models such as `"provider/model"` get `ChatCompletionInputs` and `ChatCompletionResponse`.
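The narrowing rule can be sketched with a conditional type on the model string. The type names here are simplified stand-ins for illustration, not the framework's real declarations; only the `@cf/` prefix rule comes from the behavior described above:

```ts
// Simplified stand-ins; the real per-model types come from
// @cloudflare/workers-types and are far more specific.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type WorkersAiTextInputs = { messages: ChatMessage[] };
type ChatCompletionInputs = WorkersAiTextInputs & {
  max_tokens?: number;
  temperature?: number;
};

// Workers AI model names start with '@cf/'; everything else is provider/model.
type InputsFor<M extends string> = M extends `@cf/${string}`
  ? WorkersAiTextInputs
  : ChatCompletionInputs;

// The same split, applied at runtime:
function isWorkersAiModel(model: string): boolean {
  return model.startsWith('@cf/');
}
```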
### Vision
Third-party chat models that accept image inputs can receive OpenAI-compatible multimodal message content:
```ts
const result = await ai.run('openai/gpt-4o', {
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        {
          type: 'image_url',
          image_url: { url: 'data:image/png;base64,...', detail: 'high' },
        },
      ],
    },
  ],
});
```

### Image Generation
Use `ai.image()` for image-generation providers or Workers AI image models. It returns a `Response`, so route handlers can return it directly or parse it as JSON depending on the provider:
```ts
export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  return ai.image('openai/gpt-image-1.5', {
    prompt,
    size: '1024x1024',
    response_format: 'b64_json',
  });
});
```

For Workers AI text-to-image models, pass the model name and input shape from Cloudflare's model docs:
```ts
export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  return ai.image('@cf/black-forest-labs/flux-1-schnell', { prompt });
});
```

For image edits, pass the file-like input and select the edit endpoint. Void serializes the file through the proxy and forwards it to AI Gateway as multipart form data:
```ts
export const POST = defineHandler(async (c) => {
  const body = await c.req.parseBody();
  return ai.image(
    'openai/gpt-image-1.5',
    {
      prompt: String(body.prompt),
      image: body.image as Blob,
    },
    { endpoint: 'images/edits' },
  );
});
```

### Provider Key Convention
Each provider requires an API key set as a project secret. The env var name is automatically derived from the provider prefix:
| Provider prefix | Env var |
|---|---|
| `openai` | `OPENAI_API_KEY` |
| `anthropic` | `ANTHROPIC_API_KEY` |
| `google-ai-studio` | `GOOGLE_API_KEY` |
| `deepseek` | `DEEPSEEK_API_KEY` |
| `groq` | `GROQ_API_KEY` |
| `mistral` | `MISTRAL_API_KEY` |
| `grok` | `GROK_API_KEY` |
| `openrouter` | `OPENROUTER_API_KEY` |
| `perplexity` | `PERPLEXITY_API_KEY` |
| `cohere` | `COHERE_API_KEY` |
| `cerebras` | `CEREBRAS_API_KEY` |
| `huggingface` | `HUGGINGFACE_API_KEY` |
| `replicate` | `REPLICATE_API_KEY` |
| `baseten` | `BASETEN_API_KEY` |
| `cartesia` | `CARTESIA_API_KEY` |
| `deepgram` | `DEEPGRAM_API_KEY` |
| `elevenlabs` | `ELEVENLABS_API_KEY` |
| `fal` | `FAL_API_KEY` |
| `ideogram` | `IDEOGRAM_API_KEY` |
| `parallel` | `PARALLEL_API_KEY` |
All AI Gateway providers that accept a Bearer API key are supported. Providers with non-standard auth (Amazon Bedrock, Azure OpenAI, Google Vertex) are not yet supported.
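The naming convention in the table above is almost entirely mechanical: uppercase the prefix and append `_API_KEY`, with `google-ai-studio` as the one irregular case. A sketch of that rule as a helper; this is illustrative, not Void's actual implementation:

```ts
// Sketch of the provider-prefix → env-var convention from the table above.
// Only google-ai-studio deviates from the mechanical uppercase rule.
const IRREGULAR: Record<string, string> = {
  'google-ai-studio': 'GOOGLE_API_KEY',
};

function providerEnvVar(model: string): string {
  const prefix = model.split('/')[0];
  return IRREGULAR[prefix] ?? `${prefix.toUpperCase().replace(/-/g, '_')}_API_KEY`;
}
```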
For production, add your API key as a project secret:

```sh
void secrets set OPENAI_API_KEY sk-...
```

For local development, add it to `.env.local` in your project root:

```sh
OPENAI_API_KEY=sk-...
```

If the key is missing at runtime, `ai.run()` throws a descriptive error telling you which env var to set.
### Streaming with Third-Party Models
`ai.stream()` works the same way. The response is always SSE regardless of provider:
```ts
export const POST = defineHandler(async (c) => {
  const { prompt } = await c.req.json();
  return ai.stream('anthropic/claude-sonnet-4-20250514', {
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 512,
  });
});
```