Google AI for Developers — Gemini API

The Gemini API from Google AI is a multimodal generative AI service that supports understanding and generation across text, images, video, audio, and PDFs. It provides structured outputs, function calling, context caching, batch processing, embeddings, and token counting, which makes it suitable for chat agents, content generation, retrieval-augmented generation (RAG), agent tool use, and large-scale pipelines.

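Base URL: https://generativelanguage.googleapis.com
Authentication: API key, sent as the key query parameter (as in the request examples here) or in the x-goog-api-key header; OAuth 2.0 Bearer tokens (Authorization: Bearer) are also accepted
Official SDKs: JavaScript, Python, Android (Kotlin), Swift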
🔑 Getting an API key
Prerequisites: A Google account. To raise rate limits or use the paid tier, enable Cloud Billing on your Cloud project.
Where: https://aistudio.google.com/api-keys
Steps: Sign in to Google AI Studio, open 'API Keys', and click 'Create API key' to generate a key (shown only once). Free and paid plans are managed per project.

Supported models

API endpoints

GET /v1beta/models

List the available Gemini models (stable and preview). The response includes model names, supported modalities, context window, and capability flags.

Authentication: Yes
Rate limit: 10 requests/min
Billing: pay-as-you-go (per million tokens, input and output billed separately); free tier available
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY"
}
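The list call above can be sketched in Python with only the standard library. The helper names (models_url, model_names, list_models) are illustrative, and the sketch assumes the response carries a top-level models array whose entries have a name field, plus an optional pageSize query parameter for paging.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional

BASE_URL = "https://generativelanguage.googleapis.com"


def models_url(api_key: str, page_size: Optional[int] = None) -> str:
    """Build the GET /v1beta/models URL, passing the key as a query parameter."""
    params = {"key": api_key}
    if page_size is not None:
        params["pageSize"] = str(page_size)  # assumed optional paging parameter
    return f"{BASE_URL}/v1beta/models?{urllib.parse.urlencode(params)}"


def model_names(payload: dict) -> List[str]:
    """Pull the name of every entry out of the (assumed) 'models' array."""
    return [m["name"] for m in payload.get("models", [])]


def list_models(api_key: str) -> List[str]:
    """Call the endpoint and return the model names (performs a network request)."""
    with urllib.request.urlopen(models_url(api_key)) as resp:
        return model_names(json.load(resp))
```

Calling list_models with a valid key should return a list of model names; the URL builder and the parser are kept separate so they can be checked without network access.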
GET /v1beta/models/{model}

Get detailed information about a specific model, such as supported input/output types, token limits, and features.

Authentication: Yes
Rate limit: 10 requests/min
Billing: pay-as-you-go (per million tokens, input and output billed separately); free tier available
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash?key=YOUR_API_KEY"
}
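One way to use the model details is to read the token limits before sending a large prompt. The sketch below assumes the response exposes inputTokenLimit, outputTokenLimit, and supportedGenerationMethods fields; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"


def model_url(model: str, api_key: str) -> str:
    """Build the GET /v1beta/models/{model} URL."""
    query = urllib.parse.urlencode({"key": api_key})
    return f"{BASE_URL}/v1beta/models/{model}?{query}"


def fetch_model(model: str, api_key: str) -> dict:
    """Fetch the model description (performs a network request)."""
    with urllib.request.urlopen(model_url(model, api_key)) as resp:
        return json.load(resp)


def summarize_limits(info: dict) -> dict:
    """Reduce the model description to the fields used for capacity planning."""
    return {
        "name": info.get("name"),
        "input_token_limit": info.get("inputTokenLimit"),    # assumed field name
        "output_token_limit": info.get("outputTokenLimit"),  # assumed field name
        "methods": info.get("supportedGenerationMethods", []),
    }


def fits_input(info: dict, prompt_tokens: int) -> bool:
    """True when the prompt size is within the reported input limit."""
    limit = info.get("inputTokenLimit")
    return limit is not None and prompt_tokens <= limit
```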
POST /v1beta/models/{model}:generateContent

Generate content (text and multimodal). Supports structured outputs, function calling, context caching, and tool use.

Authentication: Yes
Rate limit: 10 requests/min
Billing: pay-as-you-go (per million tokens, input and output billed separately); batch processing can reduce cost
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY",
  "body": {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Summarize the core capabilities of the Gemini API in three points."
          }
        ]
      }
    ]
  }
}
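The request above can be sent from Python roughly as follows. This is a minimal sketch, not an official client: the helper names are illustrative, and extract_text assumes the response text lives under candidates[0].content.parts[].text.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"


def build_generate_request(model: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request with a single user turn, mirroring the JSON example."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    query = urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url=f"{BASE_URL}/v1beta/models/{model}:generateContent?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response: dict) -> str:
    """Concatenate the text parts of the first candidate (assumed response shape)."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts)


def generate(model: str, api_key: str, prompt: str) -> str:
    """Send the request and return the generated text (performs a network request)."""
    with urllib.request.urlopen(build_generate_request(model, api_key, prompt)) as resp:
        return extract_text(json.load(resp))
```

A call such as generate("gemini-2.5-flash", api_key, "Summarize the core capabilities of the Gemini API in three points.") would return the model's answer as one string.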
POST /v1beta/models/{model}:streamGenerateContent

Generate content with a server-streamed response, suitable for low-latency output.

Authentication: Yes
Rate limit: 10 requests/min
Billing: pay-as-you-go (per million tokens, input and output billed separately)
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?key=YOUR_API_KEY",
  "body": {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Please stream the result step by step."
          }
        ]
      }
    ]
  }
}
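Consuming the stream takes a little more work than a plain POST. The sketch below assumes the alt=sse query parameter is added so the server replies with Server-Sent Events (one "data: {json}" line per chunk), and that each chunk has the same candidates shape as a non-streaming response. All helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, Iterator

BASE_URL = "https://generativelanguage.googleapis.com"


def iter_sse_json(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield the JSON payload of every 'data:' line in an SSE byte stream."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                yield json.loads(payload)


def chunk_text(chunk: dict) -> str:
    """Text carried by one streamed chunk (assumed candidates[0].content.parts[].text)."""
    candidates = chunk.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts)


def stream_text(model: str, api_key: str, prompt: str) -> Iterator[str]:
    """Yield text fragments as they arrive (performs a network request)."""
    query = urllib.parse.urlencode({"alt": "sse", "key": api_key})
    req = urllib.request.Request(
        url=f"{BASE_URL}/v1beta/models/{model}:streamGenerateContent?{query}",
        data=json.dumps({"contents": [{"role": "user", "parts": [{"text": prompt}]}]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for chunk in iter_sse_json(resp):  # the response object iterates line by line
            yield chunk_text(chunk)
```

Printing each fragment with print(fragment, end="", flush=True) gives the incremental display that streaming is meant for.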
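POST /v1beta/models/{model}:countTokens

Count tokens for prompts and candidates to estimate cost and quota usage.

Authentication: Yes
Rate limit: 10 requests/min
Billing: token-counting requests incur no token charges but do count against quota; generation and embedding are pay-as-you-go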
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:countTokens?key=YOUR_API_KEY",
  "body": {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "How many tokens does this text contain?"
          }
        ]
      }
    ]
  }
}
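Since billing is quoted per million tokens, a token count converts directly into a cost estimate. The sketch below assumes the response reports the count in a totalTokens field; price_per_million is a placeholder the caller supplies from the current price list, not a real price.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"


def build_count_request(model: str, api_key: str, text: str) -> urllib.request.Request:
    """Build the POST request for :countTokens with a single user turn."""
    body = {"contents": [{"role": "user", "parts": [{"text": text}]}]}
    query = urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url=f"{BASE_URL}/v1beta/models/{model}:countTokens?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def total_tokens(response: dict) -> int:
    """Read the token count from the response (assumed field: totalTokens)."""
    return int(response.get("totalTokens", 0))


def estimate_cost(tokens: int, price_per_million: float) -> float:
    """Cost = tokens / 1,000,000 * price per million tokens (price is caller-supplied)."""
    return tokens / 1_000_000 * price_per_million
```

For example, 500,000 input tokens at a hypothetical price of 2.0 per million tokens would cost 500000 / 1000000 * 2.0 = 1.0 in the same currency unit.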
POST /v1beta/models/{model}:embedContent

Generate an embedding vector for a single content item, for use in RAG and semantic search.

Authentication: Yes
Rate limit: 100 requests/min
Billing: pay-as-you-go (per million tokens); pricing varies by embedding model
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=YOUR_API_KEY",
  "body": {
    "content": {
      "parts": [
        {
          "text": "A document chunk for retrieval-augmented generation"
        }
      ]
    }
  }
}
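For semantic search, the returned vector is typically compared with stored document vectors by cosine similarity. The following sketch assumes the response nests the vector under embedding.values; the ranking helper and its names are illustrative rather than part of the API.

```python
import math
from typing import Dict, List, Sequence, Tuple


def embedding_values(response: dict) -> List[float]:
    """Extract the vector from an embedContent response (assumed shape: embedding.values)."""
    return list(response["embedding"]["values"])


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query: Sequence[float], docs: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a RAG pipeline the top-ranked document chunks would then be placed into the prompt sent to generateContent.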
POST /v1beta/models/{model}:batchEmbedContents

Generate embeddings for multiple items in a single request to improve throughput and reduce overall cost.

Authentication: Yes
Rate limit: 100 requests/min
Billing: pay-as-you-go (per million tokens); batching offers a cost advantage
{
  "url": "https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=YOUR_API_KEY",
  "body": {
    "requests": [
      {
        "model": "models/embedding-001",
        "content": {
          "parts": [
            {
              "text": "Document A"
            }
          ]
        }
      },
      {
        "model": "models/embedding-001",
        "content": {
          "parts": [
            {
              "text": "Document B"
            }
          ]
        }
      }
    ]
  }
}
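Building the batch body by hand gets tedious for many documents, so a small helper can generate it from a list of strings. This sketch sets a model field on every entry, on the assumption that the API expects it there, and assumes the response returns an embeddings array with one values list per input, in order. The chunk size passed to chunked is a hypothetical parameter, not a documented limit.

```python
from typing import Iterator, List


def build_batch_body(model: str, texts: List[str]) -> dict:
    """Build a batchEmbedContents body with one request entry per text."""
    return {
        "requests": [
            {
                "model": f"models/{model}",  # assumed to be required on each entry
                "content": {"parts": [{"text": text}]},
            }
            for text in texts
        ]
    }


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Split a list into consecutive groups of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_vectors(response: dict) -> List[List[float]]:
    """Extract the vectors from the response (assumed shape: embeddings[].values)."""
    return [list(e["values"]) for e in response.get("embeddings", [])]
```

A large corpus could be processed as one request body per group, for example build_batch_body("embedding-001", group) for each group yielded by chunked(texts, size), with the group size chosen to respect the rate limit and any request-size limits.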