Replicate HTTP API

Replicate provides a unified HTTP API for running community and official AI models (text, image, audio, video, and more). The core resource is a "prediction". Developers create predictions via POST /v1/predictions or the model-specific and deployment-specific endpoints. Requests are asynchronous by default, with a synchronous mode (Prefer: wait header), SSE streaming, and webhooks also supported. Authentication uses Bearer tokens, and the base URL is https://api.replicate.com/v1. Rate limits are about 600 requests/min for creating predictions and about 3000 requests/min for other endpoints. Pricing is set per model (time-based or token-based).

Base URL
https://api.replicate.com/v1
Authentication
Bearer token / Authorization: Bearer $REPLICATE_API_TOKEN
Official SDKs
Python, Node.js
🔑 Getting an API key
Prerequisites: A valid account is required. Verify your email and add billing to raise quota and rate limits. Tokens grant account-level access, so do not expose them.
Link: https://replicate.com/account/api-tokens
Steps: Sign up or sign in to Replicate, go to the "API tokens" page, create a new token (shown once), copy and store it, then configure it as an environment variable.
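
As one way to follow the last step, here is a minimal Python sketch that reads the token from the environment instead of hard-coding it. The helper name auth_headers is hypothetical and not part of any Replicate SDK; only the REPLICATE_API_TOKEN variable name and the Bearer scheme come from the text above.

```python
import os


def auth_headers(env=None):
    """Build request headers from the REPLICATE_API_TOKEN environment variable.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError("REPLICATE_API_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Keeping the token out of source code also keeps it out of version control, which matters because the token grants account-level access.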

Supported models

API endpoints

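POST /v1/predictions

Create a prediction by running a specific community model version. Requests are asynchronous by default; send the Prefer: wait header for synchronous mode. Returns a prediction object with status and output.

Authentication: Yes
Rate limit: 600/min
Billing: Per model (by run time or by input/output tokens); the request itself is not priced separately.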
{
  "version": "5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
  "input": {
    "prompt": "A photo of a bear riding a bicycle over the moon"
  }
}
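
The request body above can be wrapped in an HTTP request as in the following sketch, which uses only the Python standard library. The function name build_create_request is hypothetical; the URL, the version and input fields, and the Prefer: wait header are taken from this page. The function only builds the request and sends nothing.

```python
import json
import urllib.request


def build_create_request(token, version, model_input, wait=False):
    """Build (but do not send) a POST /v1/predictions request."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if wait:
        # Ask for synchronous mode; without this header the call is async.
        headers["Prefer"] = "wait"
    return urllib.request.Request(
        "https://api.replicate.com/v1/predictions",
        data=body,
        headers=headers,
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen and decode the JSON response, which should be the prediction object described above.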

GET /v1/predictions/{id}

Get the current state and result of a specific prediction (including output, logs, and timing).

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; actual cost depends on the model used to create the prediction.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/predictions/{id}
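
In async mode this endpoint is typically polled until the prediction finishes. The sketch below shows one such loop. It assumes the terminal status values are "succeeded", "failed", and "canceled", which this page does not list, and it takes a caller-supplied fetch function (hypothetical) that performs the GET and returns the decoded JSON.

```python
import time

# Assumed terminal states; check them against the prediction object you receive.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_prediction(fetch, prediction_id, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll fetch(prediction_id) until the prediction reaches a terminal status.

    `fetch` is any callable returning the prediction as a dict.
    """
    for _ in range(max_polls):
        prediction = fetch(prediction_id)
        if prediction.get("status") in TERMINAL_STATUSES:
            return prediction
        sleep(interval)  # stay well under the per-minute rate limit
    raise TimeoutError(f"prediction {prediction_id} did not finish after {max_polls} polls")
```

Injecting fetch and sleep keeps the loop independent of any HTTP library and lets it be tested without network access.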

POST /v1/predictions/{id}/cancel

Cancel a prediction that has not yet completed. Completed predictions cannot be canceled.

Authentication: Yes
Rate limit: 3000/min
Billing: Free operation; cost is still determined by the model usage already incurred.
curl -s -X POST -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/predictions/{id}/cancel

GET /v1/predictions

List the predictions you have created through the website and the API. Results are paginated, with up to 100 records per page by default.

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; cost depends on the actual prediction runs.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/predictions
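
Because the list is paginated, a client usually walks the pages in a loop. The sketch below assumes each page is a JSON object with a results array and a next field holding the URL of the following page (or null on the last page); this page does not spell out that layout, so treat it as an assumption. The fetch_page callable is hypothetical and stands in for an authenticated GET.

```python
def iter_predictions(fetch_page, first_url="https://api.replicate.com/v1/predictions", max_pages=10):
    """Yield predictions page by page, following the assumed `next` cursor URL.

    `fetch_page(url)` must return the decoded JSON page as a dict.
    `max_pages` caps the walk so a long history does not exhaust the rate limit.
    """
    url = first_url
    pages = 0
    while url and pages < max_pages:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")  # None or missing ends the loop
        pages += 1
```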

POST /v1/models/{owner}/{name}/predictions

Create a prediction for an official model without specifying a version ID; official models manage their own versioning policy.

Authentication: Yes
Rate limit: 600/min
Billing: Per model (by run time or by tokens).
{
  "input": {
    "prompt": "A cozy cabin in the woods"
  }
}

POST /v1/deployments/{owner}/{name}/predictions

Create a prediction on a specified deployment. This is useful in production, where a stable model configuration and quotas are needed.

Authentication: Yes
Rate limit: 600/min
Billing: Follows the billing rules of the model bound to the deployment.
curl -s -X POST -H 'Prefer: wait' -H "Authorization: Bearer $REPLICATE_API_TOKEN" -H 'Content-Type: application/json' -d '{
  "input": { "prompt": "A photo of a bear riding a bicycle over the moon" }
}' https://api.replicate.com/v1/deployments/{owner}/{name}/predictions
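
The three creation endpoints on this page differ only in their path and in whether the body carries a version field. The following sketch makes that choice explicit. The function name create_target and the "owner/name" string convention are assumptions made for illustration.

```python
from urllib.parse import quote

API_BASE = "https://api.replicate.com/v1"


def create_target(model_input, version=None, model=None, deployment=None):
    """Return (url, json_body) for exactly one of version, model, or deployment.

    `model` and `deployment` are "owner/name" strings (an illustrative convention).
    """
    chosen = [value for value in (version, model, deployment) if value is not None]
    if len(chosen) != 1:
        raise ValueError("pass exactly one of version, model, or deployment")
    if version is not None:
        # Community model: the version ID travels in the request body.
        return f"{API_BASE}/predictions", {"version": version, "input": model_input}
    kind = "models" if model is not None else "deployments"
    owner, name = (model or deployment).split("/", 1)
    path = f"{kind}/{quote(owner, safe='')}/{quote(name, safe='')}/predictions"
    # Official models and deployments: no version field in the body.
    return f"{API_BASE}/{path}", {"input": model_input}
```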

DELETE /v1/deployments/{owner}/{name}

Delete the specified deployment. Returns 204 No Content on success.

Authentication: Yes
Rate limit: 3000/min
Billing: Free operation; incurs no inference cost.
curl -s -X DELETE -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/deployments/{owner}/{name}

GET /v1/webhooks/default/secret

Get the default webhook secret. Returns a JSON object with a key property.

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; used to verify webhook signatures.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/webhooks/default/secret
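
This page does not describe how the secret is used to check a signature, so the sketch below rests on several assumptions to verify against the official webhook documentation: the key looks like whsec_ followed by a base64 string; each delivery carries webhook-id, webhook-timestamp, and webhook-signature headers; the signed content is "id.timestamp.body"; and the signature header is a space-separated list of "v1,<base64 HMAC-SHA256>" entries. The function name verify_webhook is hypothetical.

```python
import base64
import hashlib
import hmac


def verify_webhook(secret_key, webhook_id, webhook_timestamp, body, signature_header):
    """Return True if any signature in the header matches the raw request body.

    `body` must be the raw request bytes, exactly as received.
    All format details here are assumptions, as noted in the text above.
    """
    # Assumed key format: "whsec_<base64>"; only the base64 part is the HMAC key.
    encoded = secret_key.split("_", 1)[1] if secret_key.startswith("whsec_") else secret_key
    key = base64.b64decode(encoded)
    signed = f"{webhook_id}.{webhook_timestamp}.".encode("utf-8") + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())
    # Assumed header format: "v1,<sig> v1,<sig> ..."; strip the version prefix.
    candidates = [part.split(",", 1)[-1].encode("utf-8") for part in signature_header.split()]
    # compare_digest avoids leaking information through timing differences.
    return any(hmac.compare_digest(expected, candidate) for candidate in candidates)
```

A production check would usually also reject deliveries whose timestamp is too old, to limit replay attacks.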

GET /v1/search

Search public models, collections, and docs (Beta). Supports a query string and a result limit.

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; incurs no inference cost.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" 'https://api.replicate.com/v1/search?query=flux&models_limit=20'
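
When the search text comes from user input it needs URL encoding, which the hand-written query string above sidesteps. A small sketch, using the query and models_limit parameters from the curl example; the function name search_url is hypothetical.

```python
from urllib.parse import urlencode


def search_url(query, models_limit=None):
    """Build the search URL with a properly encoded query string."""
    params = {"query": query}
    if models_limit is not None:
        params["models_limit"] = models_limit
    return "https://api.replicate.com/v1/search?" + urlencode(params)
```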

GET /v1/models/{owner}/{name}

Get model details, including available versions and metadata. The version's openapi_schema field helps validate inputs.

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; used to discover model and version information.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/models/{owner}/{name}
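
As a sketch of how openapi_schema can help validate inputs before creating a prediction, the function below compares an input dict with the schema's declared properties. It assumes the model response nests the schema at latest_version.openapi_schema.components.schemas.Input, with properties and required keys. That layout is not described on this page, so confirm it against a real response. The function name check_input is hypothetical.

```python
def check_input(model, model_input):
    """Return (missing, unknown) input keys relative to the model's Input schema.

    `model` is the decoded JSON from GET /v1/models/{owner}/{name}.
    The nesting below is an assumption about the response layout.
    """
    version = model.get("latest_version") or {}
    openapi = version.get("openapi_schema") or {}
    schema = openapi.get("components", {}).get("schemas", {}).get("Input", {})
    properties = schema.get("properties", {})
    required = schema.get("required", [])
    missing = [key for key in required if key not in model_input]
    unknown = [key for key in model_input if key not in properties]
    return missing, unknown
```

This is a key-level check only; it does not validate types or value ranges, which a full JSON Schema validator would cover.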

GET /v1/models/{owner}/{name}/versions

List versions for the specified model.

Authentication: Yes
Rate limit: 3000/min
Billing: Free to query; used to choose which version to run.
curl -s -H "Authorization: Bearer $REPLICATE_API_TOKEN" https://api.replicate.com/v1/models/{owner}/{name}/versions