Hugging Face

Hugging Face is a leading open-source and open-science AI platform. Its core includes the Hugging Face Hub for models/datasets/Spaces, inference and hosting (Inference Providers and Inference Endpoints), and a rich ecosystem of open-source libraries (Transformers, Diffusers, Datasets, Tokenizers, Accelerate, PEFT, TRL, Safetensors, Transformers.js, smolagents, Text Generation Inference, etc.). The platform supports text, image, audio, video and 3D modalities with unified access via Python, JavaScript and REST/OpenAI-compatible endpoints.

Base URL
https://router.huggingface.co/v1
Authentication
Bearer token, sent as `Authorization: Bearer hf_XXXXXXXXXXXXXXXXXXXX`
Official SDKs
Python, JavaScript
🔑 Getting an API Key
Prerequisite: For dedicated Inference Endpoints, add a valid payment method on your account or organization billing page.
Where: https://huggingface.co/settings/tokens
Notes: Generate a User Access Token on the Hugging Face tokens page (a fine-grained token with the ‘Make calls to Inference Providers’ permission is recommended) to call Inference Providers, the Hub API, and Inference Endpoints.
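
As a minimal sketch of the Bearer scheme described above, the helper below builds the `Authorization` header from a token. The function name `auth_headers` is hypothetical, and reading the token from an `HF_TOKEN` environment variable is an assumed convention (it mirrors the `$HF_TOKEN` variable used in the curl samples on this page).

```python
import os
from typing import Dict, Optional


def auth_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers that carry an access token as a Bearer credential."""
    # Assumption: fall back to the HF_TOKEN environment variable when no
    # token is passed explicitly.
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise ValueError("No token given and HF_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}
```

Keeping the token in the environment rather than in source code avoids committing it to version control by accident.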

Supported Models

API Endpoints

GET /api/models

Paginated list of models on the Hub. Accepts query parameters such as `search`, `author`, `filter`, `sort`, `direction`, `limit`, `full`, and `config`.

Authentication: No
Rate limit: 60/min
Billing: Free to list; calls are not billed
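
The query parameters above can be assembled into a request URL as in the following sketch. The host `https://huggingface.co` is an assumption (the listing only gives the relative path), and `models_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumption: the relative /api paths in this listing are served from this host.
HUB_BASE = "https://huggingface.co"


def models_url(**params) -> str:
    """Build a GET /api/models URL, dropping parameters left as None."""
    query = {key: value for key, value in params.items() if value is not None}
    url = f"{HUB_BASE}/api/models"
    return f"{url}?{urlencode(query)}" if query else url


# Example usage (performs a network call, so it is left commented out):
# import json, urllib.request
# with urllib.request.urlopen(models_url(search="bert", limit=5)) as resp:
#     models = json.load(resp)
```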
GET /api/models/{repo_id}

Returns detailed information for a specific model repo. Supports `revision`.

Authentication: No
Rate limit: 60/min
Billing: Free to query; calls are not billed
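
A sketch of how the detail URL might be built, including an optional revision. The listing only says that `revision` is supported; addressing it as a `/revision/{revision}` path segment is an assumption based on how the `huggingface_hub` client appears to form this URL, and should be checked against the official docs. The host and the name `model_info_url` are likewise assumptions.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: the relative /api paths in this listing are served from this host.
HUB_BASE = "https://huggingface.co"


def model_info_url(repo_id: str, revision: Optional[str] = None) -> str:
    """Build the URL for GET /api/models/{repo_id}, optionally pinned to a revision."""
    # Keep the slash between namespace and repo name (e.g. "org/name").
    url = f"{HUB_BASE}/api/models/{quote(repo_id, safe='/')}"
    if revision is not None:
        # Assumed layout: the revision is a single, fully escaped path segment.
        url += f"/revision/{quote(revision, safe='')}"
    return url
```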
GET /api/datasets

Paginated list of datasets on the Hub, with filter and sort parameters similar to those for models.

Authentication: No
Rate limit: 60/min
Billing: Free to list; calls are not billed
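
The listing does not say how pages are chained. The sketch below assumes the next page is advertised through an HTTP `Link` response header with `rel="next"` (the Web Linking convention); treat that as an assumption to verify against the Hub API docs. `next_page_url` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches entries of the form: <URL>; rel="name"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="?([^";,]+)"?')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None if absent."""
    if not link_header:
        return None
    for url, rel in _LINK_RE.findall(link_header):
        if rel == "next":
            return url
    return None
```

A client would request the first page, read the `Link` header from the response, and keep following `next_page_url(...)` until it returns `None`.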
GET /api/spaces

Paginated list of Spaces applications on the Hub.

Authentication: No
Rate limit: 60/min
Billing: Free to list; calls are not billed
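
Every endpoint on this page is listed with a 60/min rate limit. One way to stay under such a limit on the client side is a sliding-window limiter, sketched below. The class and its parameters are hypothetical, and how the server actually enforces the limit is not stated in the listing.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def record(self) -> None:
        """Register that a call was just made."""
        self._stamps.append(self.clock())
```

Before each request, a caller would sleep for `wait_time()` seconds and then call `record()`.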
POST https://router.huggingface.co/v1/chat/completions

OpenAI-compatible chat completions. `model` requires a provider suffix or `:auto`/`:fastest`/`:cheapest` policy; supports `image_url` in messages. Officially covers Chat Completions only.

Authentication: Yes
Rate limit: 60/min
Billing: Billed at the provider's pass-through unit price; pay-as-you-go once monthly credits are used up
{
  "model": "deepseek-ai/DeepSeek-R1:fastest",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}
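
The JSON body above can be wrapped in a complete HTTP request as in this standard-library sketch. `build_chat_request` is a hypothetical helper; it only constructs the request and does not send it.

```python
import json
import urllib.request
from typing import Optional

ROUTER_BASE = "https://router.huggingface.co/v1"


def build_chat_request(token: str, model: str, user_message: str,
                       policy: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions route."""
    if policy:
        # Append a provider name or routing policy, e.g. "fastest".
        model = f"{model}:{policy}"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{ROUTER_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example usage (performs a network call, so it is left commented out):
# req = build_chat_request("hf_...", "deepseek-ai/DeepSeek-R1", "Hello!", "fastest")
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
```

Because the route is described as OpenAI-compatible, the reply is expected to follow the OpenAI chat completion response shape, though that is worth confirming per provider.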
POST https://router.huggingface.co/v1/responses

(Where applicable) OpenAI-compatible Responses API for advanced event streaming, structured outputs, and tool calling (e.g., the gpt-oss family).

Authentication: Yes
Rate limit: 60/min
Billing: Billed at the provider's pass-through unit price; pay-as-you-go once monthly credits are used up
{
  "model": "openai/gpt-oss-120b:cerebras",
  "input": "Tell me a fun fact about the Eiffel Tower."
}
POST https://api-inference.huggingface.co/models/{model_id}

Serverless Inference API for running inference on a specified model. The request and response formats depend on the task (text generation, image generation, ASR, etc.).

Authentication: Yes
Rate limit: 60/min
Billing: Billed by execution time; performance varies with the model and resources
curl -s -X POST -H "Authorization: Bearer $HF_TOKEN" -H 'Content-Type: application/json' -d '{"inputs": "Hello!"}' https://api-inference.huggingface.co/models/{model_id}
POST https://{your-endpoint-subdomain}.endpoints.huggingface.cloud/v1/chat/completions

OpenAI-compatible Chat Completions on dedicated Inference Endpoints (TGI). Requires an `hf_...` token authorized for the endpoint.

Authentication: Yes
Rate limit: 60/min
Billing: Billed per compute instance hour (with autoscaling)
{
  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": "Explain transformers in one sentence."
    }
  ]
}
POST https://api-inference.huggingface.co/models/black-forest-labs/FLUX.1-dev

Text-to-image generation example. The request is a text prompt; the response is the image as binary data or base64.

Authentication: Yes
Rate limit: 60/min
Billing: Billed by execution time or by provider
{
  "inputs": "A steampunk airship in the clouds"
}
POST https://api-inference.huggingface.co/models/openai/whisper-large-v3

Automatic speech recognition (ASR) example. Upload an audio file as raw bytes and receive the transcribed text.

Authentication: Yes
Rate limit: 60/min
Billing: Billed by execution time or by provider
curl -s -H "Authorization: Bearer $HF_TOKEN" -H 'Content-Type: audio/wav' --data-binary @sample.wav https://api-inference.huggingface.co/models/openai/whisper-large-v3
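
A rough Python counterpart to the curl command above, using only the standard library. `build_asr_request` is a hypothetical helper that constructs the upload request without sending it; the audio bytes go in the body unchanged, as with `--data-binary`.

```python
import urllib.request


def build_asr_request(token: str, audio_bytes: bytes,
                      model_id: str = "openai/whisper-large-v3",
                      content_type: str = "audio/wav") -> urllib.request.Request:
    """Build (but do not send) a POST request that uploads raw audio bytes."""
    return urllib.request.Request(
        f"https://api-inference.huggingface.co/models/{model_id}",
        data=audio_bytes,  # raw bytes, not JSON
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
        method="POST",
    )


# Example usage (performs a network call, so it is left commented out):
# with open("sample.wav", "rb") as f:
#     req = build_asr_request("hf_...", f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```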
GET /.well-known/openapi.json

OpenAPI specification for the Hub API, usable to generate clients and browse all endpoints.

Authentication: No
Rate limit: 60/min
Billing: Free to query; calls are not billed