Google Chat Completions
Call the Gemini family of models through an OpenAI-compatible interface.
Unified Interface
Google models are called through the same /v1/chat/completions endpoint; the request and response formats are identical to OpenAI's.
Request
POST https://api.clawdrouter.com/v1/chat/completions
Request Headers
| Name | Required | Type | Description |
|---|---|---|---|
| Authorization | Yes | string | Bearer YOUR_API_KEY |
| Content-Type | Yes | string | application/json |
| Request-Id | No | string | A unique business identifier generated by the client system |
Request Body
The core parameters match OpenAI Chat Completions. Google models additionally support the following parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | Model identifier; see the model list |
| messages | array | Yes | — | List of conversation messages |
| stream | boolean | No | false | Whether to enable streaming output |
| reasoning_effort | string | No | — | Thinking-effort control: low / medium / high. Controls the reasoning depth of Gemini thinking models |
| thinking | object | No | — | Thinking-mode control. Set {"type": "enabled", "budget_tokens": 0} to disable thinking |
| stream_options | object | No | — | Streaming options. {"include_usage": true} includes token usage in the streaming response |
For the full parameter reference, see OpenAI Chat Completions parameters.
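As a sketch, the parameters in the table above can be assembled into a request body like the following. The helper name and its defaults are illustrative, not part of the API; only the field names and values come from the table:

```python
def build_request(prompt, reasoning_effort=None, disable_thinking=False, stream=False):
    """Assemble a /v1/chat/completions request body (illustrative sketch)."""
    body = {
        "model": "gemini-2.5-flash",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    if stream:
        # Include token usage in the final streaming chunk.
        body["stream_options"] = {"include_usage": True}
    if reasoning_effort is not None:
        # "low" / "medium" / "high"
        body["reasoning_effort"] = reasoning_effort
    if disable_thinking:
        # A thinking budget of 0 tokens switches thinking off.
        body["thinking"] = {"type": "enabled", "budget_tokens": 0}
    return body
```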
Request Examples
Example 1: streaming request (with thinking)
```shell
curl https://api.clawdrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Introduce quantum computing"}
    ],
    "stream": true,
    "stream_options": {"include_usage": true},
    "reasoning_effort": "low"
  }'
```
Streaming response (thinking content is returned in the reasoning_content field):
```
data: {
  "id": "gCglaaHuCby-694Pv5bo0Q0",
  "created": 1764042881,
  "model": "gemini-2.5-flash",
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant",
        "reasoning_content": "**Initiating Conceptual Breakdown**\n\nI'm starting by dissecting the prompt...",
        "content": null
      },
      "finish_reason": null
    }
  ]
}

data: {
  "id": "gCglaaHuCby-694Pv5bo0Q0",
  "created": 1764042881,
  "model": "gemini-2.5-flash",
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "Quantum computing is a new computing paradigm that processes information using the principles of quantum mechanics..."
      },
      "finish_reason": null
    }
  ]
}

data: [DONE]
```
Thinking content
Gemini 2.5 series models support a "thinking" feature. In streaming responses, the model's thinking process is returned in the delta.reasoning_content field, while the final reply is returned in the delta.content field.
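A stream consumer can route the two fields into separate buffers. The sketch below operates on plain chunk dicts shaped like the streaming example above (the function name is illustrative):

```python
def split_stream(chunks):
    """Collect thinking text and reply text from streamed chunk dicts.

    Sketch only: assumes each chunk follows the shape shown above,
    with optional delta.reasoning_content and delta.content fields.
    """
    reasoning, answer = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning_content"):
                reasoning.append(delta["reasoning_content"])
            if delta.get("content"):
                answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)
```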
Example 2: non-streaming request
```shell
curl https://api.clawdrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Introduce quantum computing"}
    ],
    "stream": false,
    "reasoning_effort": "low"
  }'
```
Response:
```json
{
  "id": "Oiwlacz2CYq4694P3eOo8Q0",
  "created": 1764043828,
  "model": "gemini-2.5-flash",
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Quantum computing is a new computing paradigm that processes information using the principles of quantum mechanics...",
        "role": "assistant"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 1200,
    "total_tokens": 1205
  }
}
```
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for this completion |
| object | string | Always chat.completion (non-streaming) or chat.completion.chunk (streaming) |
| created | integer | Unix timestamp of creation |
| model | string | The model version actually used |
| choices | array | List of completion results |
| choices[].message | object | The message generated by the model (non-streaming) |
| choices[].delta | object | Streaming incremental content; may contain content and reasoning_content |
| choices[].finish_reason | string | Stop reason: stop (normal completion) / length (length limit reached) / tool_calls (tool invocation) |
| usage | object | Token usage statistics (streaming requires stream_options.include_usage: true) |
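With include_usage set, the usage object typically arrives only on the final streaming chunk. A sketch of pulling it out of a sequence of chunk dicts (function name is illustrative; field names come from the table above):

```python
def final_usage(chunks):
    """Return the usage dict from a stream of chunk dicts, or None.

    Sketch: with stream_options.include_usage set, intermediate chunks
    carry a null usage; only the last chunk holds the real counts.
    """
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```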
Example 3: disabling thinking mode
If you do not need the model's thinking process, disable it with the thinking parameter:
```shell
curl https://api.clawdrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Introduce quantum computing"}
    ],
    "stream": true,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 0
    },
    "stream_options": {"include_usage": true}
  }'
```
With thinking disabled, the response no longer contains the reasoning_content field; the model outputs its reply directly.
Using the Python SDK
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.clawdrouter.com/v1",
)

# Basic call
completion = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Introduce quantum computing"}
    ],
)
print(completion.choices[0].message.content)

# Streaming call
stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Introduce quantum computing"}
    ],
    stream=True,
    extra_body={"reasoning_effort": "low"},
)
for chunk in stream:
    # Guard against chunks with an empty choices list
    # (e.g. the final usage chunk when include_usage is enabled).
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Using the Node.js SDK
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.clawdrouter.com/v1",
});

async function main() {
  // Basic call
  const completion = await client.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [
      { role: "user", content: "Introduce quantum computing" },
    ],
  });
  console.log(completion.choices[0].message.content);
}

main();
```
Streaming call (with thinking)
```javascript
async function streamWithThinking() {
  const stream = await client.chat.completions.create({
    model: "gemini-2.5-flash",
    messages: [
      { role: "user", content: "Introduce quantum computing" },
    ],
    stream: true,
    stream_options: { include_usage: true },
    reasoning_effort: "low",
  });

  for await (const chunk of stream) {
    // The final usage chunk has an empty choices array, hence the optional chaining.
    const delta = chunk.choices[0]?.delta;
    if (delta?.reasoning_content) {
      process.stdout.write(`[thinking] ${delta.reasoning_content}`);
    }
    if (delta?.content) {
      process.stdout.write(delta.content);
    }
  }
}

streamWithThinking();
```