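API Reference

海鲸AI API Overview

The 海鲸AI API is designed to be highly compatible with the OpenAI Chat API while being optimized for multi-model use. Interfaces across models and providers are normalized, so you can call any supported AI model through one consistent request and response schema, which reduces both the learning curve and the integration effort.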

Request Format

Chat Completions Request

Send a POST request to the /v2/chat/completions endpoint to call an AI model.
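
As a rough illustration of such a call, the sketch below assembles the URL and options you might pass to `fetch`. The base URL `https://api.example.com` and the Bearer-token header are assumptions (this section specifies neither), and `buildChatRequest` is a hypothetical helper, not part of any SDK.

```typescript
// Hypothetical values: this section does not specify the base URL or the auth scheme.
const BASE_URL = 'https://api.example.com';

type ChatMessage = { role: 'user' | 'assistant' | 'system'; content: string };

type HttpCall = {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
};

// Assemble the URL and options for a POST to /v2/chat/completions.
function buildChatRequest(apiKey: string, messages: ChatMessage[], model?: string): HttpCall {
  // Leave `model` out when it is not given, so the user's default model applies.
  const payload = model === undefined ? { messages } : { model, messages };
  return {
    url: `${BASE_URL}/v2/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload),
    },
  };
}

const call = buildChatRequest(
  'YOUR_API_KEY',
  [{ role: 'user', content: 'Hello' }],
  'openai/gpt-3.5-turbo',
);
// Usage: const res = await fetch(call.url, call.init); const data = await res.json();
```

Keeping request assembly separate from the network call makes the payload easy to inspect or log before anything is sent.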

The complete TypeScript type definitions below show all available request parameters:

TypeScript

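// Request parameter type definitions (subtypes are defined below)
type Request = {
  // Either messages or prompt must be provided
  messages?: Message[];
  prompt?: string;

  // Model name; if omitted, the user's default model is used
  model?: string;

  // Force the model to produce a specific output format (e.g. JSON)
  response_format?: { type: 'json_object' };

  stop?: string | string[]; // Stop sequences; generation halts when one is encountered
  stream?: boolean; // Whether to enable streaming output

  // Core LLM parameters
  max_tokens?: number; // Maximum number of tokens to generate, range: [1, context_length)
  temperature?: number; // Sampling temperature, controls output randomness, range: [0, 2]

  // Tool calling
  // For providers that support the OpenAI interface, tools are passed through as-is
  // For providers with a custom interface, the properties are transformed and mapped
  // Otherwise, tools are converted into a YAML template and the model returns an assistant message
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Random seed (integer), used to make output reproducible
  top_p?: number; // Nucleus sampling, range: (0, 1]
  top_k?: number; // Top-K sampling, range: [1, Infinity); note: not supported by OpenAI models
  frequency_penalty?: number; // Frequency penalty, reduces repeated content, range: [-2, 2]
  presence_penalty?: number; // Presence penalty, encourages new topics, range: [-2, 2]
  repetition_penalty?: number; // Repetition penalty, range: (0, 2]
  logit_bias?: { [key: number]: number }; // Token bias, adjusts the probability of specific tokens
  top_logprobs?: number; // Return log probabilities of the top N tokens (integer)
  min_p?: number; // Minimum probability threshold, range: [0, 1]
  top_a?: number; // Top-A sampling, range: [0, 1]

  // Predicted output (reduces latency)
  prediction?: { type: 'content'; content: string };

  // 海鲸AI-specific parameters
  transforms?: string[]; // Prompt transforms
  models?: string[]; // List of fallback models
  route?: 'fallback'; // Routing strategy: fall back automatically on failure
  provider?: ProviderPreferences; // Provider preference settings
  user?: string; // End-user identifier, used to help prevent abuse
};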

// Subtype definitions

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // Image URL or base64-encoded data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      content: string | ContentPart[]; // ContentPart[] is only for the "user" role
      name?: string; // Optional name; some models prepend it to the message
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'   // Do not use tools
  | 'auto'   // Choose automatically
  | {
      type: 'function';
      function: {
        name: string; // Name of the function to call
      };
    };
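
To make the tool-calling types above more concrete, here is a minimal sketch of one round trip: declare a tool, then answer the assistant's tool calls with `role: 'tool'` messages. It assumes a tool call's `function` field carries a `name` and a JSON-encoded `arguments` string, as in the OpenAI Chat API (this document does not define that shape). The `get_weather` tool and the `answerToolCalls` helper are hypothetical.

```typescript
// Assumption: a tool call's `function` holds `name` plus a JSON-encoded `arguments`
// string, as in the OpenAI Chat API. `get_weather` is a made-up tool.
type ToolDef = {
  type: 'function';
  function: { name: string; description?: string; parameters: object };
};
type AssistantToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
};
type ToolMessage = { role: 'tool'; content: string; tool_call_id: string; name?: string };

const weatherTool: ToolDef = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
};

// Local implementations, keyed by tool name.
const handlers: Record<string, (args: { city: string }) => string> = {
  get_weather: (args) => `Sunny in ${args.city}`,
};

// Turn each tool call from the assistant into a `tool` message for the follow-up request.
function answerToolCalls(calls: AssistantToolCall[]): ToolMessage[] {
  return calls.map((c): ToolMessage => {
    const handler = handlers[c.function.name];
    const content = handler
      ? handler(JSON.parse(c.function.arguments))
      : `Unknown tool: ${c.function.name}`;
    return { role: 'tool', content, tool_call_id: c.id, name: c.function.name };
  });
}

// Request body that offers the tool and lets the model decide whether to call it.
const toolRequestBody = {
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
  tools: [weatherTool],
  tool_choice: 'auto',
};
```

The resulting tool messages would be appended to the conversation, after the assistant message that requested them, and sent in the next request.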

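About the response_format Parameter

The response_format parameter ensures that you receive a structured JSON response from the LLM.

Support: this parameter is currently supported by only some models, including OpenAI models and Nitro models. Before using it, check on the 海鲸AI model list page that your chosen model supports it.

Tip: to route requests only to providers that support this parameter, set require_parameters to true in your provider preferences.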
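
A minimal sketch of how these two settings might be combined in a request body, together with a defensive parser for the returned content. Only `require_parameters` comes from the text above; placing it under the `provider` field is an assumption based on the request type, and `parseJsonContent` is a hypothetical helper.

```typescript
// Request body asking for JSON output and routing only to providers that honor it.
// Assumption: `require_parameters` lives under the `provider` field of the request.
const jsonRequest = {
  model: 'openai/gpt-3.5-turbo',
  messages: [
    { role: 'system', content: 'Reply with a JSON object containing a "city" field.' },
    { role: 'user', content: 'Where is the Eiffel Tower?' },
  ],
  response_format: { type: 'json_object' as const },
  provider: { require_parameters: true },
};

// Parse the assistant's content defensively: the text can still be invalid JSON,
// for example when generation is cut off by max_tokens.
function parseJsonContent(content: string | null): Record<string, unknown> | null {
  if (content === null) return null;
  try {
    const value: unknown = JSON.parse(content);
    return typeof value === 'object' && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}
```

Returning `null` instead of throwing lets the caller decide whether to retry, fall back to plain text, or surface an error.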

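Response Format

Chat Completions Response

海鲸AI normalizes the response format across all models and providers so that it follows the OpenAI Chat API specification.

Normalization notes:

choices is always an array, even when the model returns a single result
For streaming requests, each choice contains a delta property
For non-streaming requests, each choice contains a message property
This uniform format lets you handle responses from different models with the same code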
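
The rules above can be sketched as a single helper that reads the text from a choice regardless of its variant. The `text` branch covers prompt-style (non-chat) choices, which appear in the response types of this document; `choiceText` and `firstText` are hypothetical names.

```typescript
// Minimal shapes for the kinds of choice described in this document.
type AnyChoice = {
  message?: { content: string | null }; // non-streaming chat
  delta?: { content: string | null }; // streaming chat
  text?: string; // prompt (non-chat) completion
};

// Return the text carried by a choice, whichever variant it is.
function choiceText(choice: AnyChoice): string {
  if (choice.message) return choice.message.content ?? '';
  if (choice.delta) return choice.delta.content ?? '';
  return choice.text ?? '';
}

// `choices` is always an array, so the first result is always choices[0].
function firstText(response: { choices: AnyChoice[] }): string {
  const first = response.choices[0];
  return first ? choiceText(first) : '';
}
```

Because every provider's response arrives in the same shape, this one helper should be enough for all models.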

The complete TypeScript response type definitions follow:

TypeScript

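// Response type definitions (subtypes are defined below)
type Response = {
  id: string;
  // The choice type depends on the stream parameter and the input type (messages or prompt)
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string; // The model actually used
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // System fingerprint (if the provider supports it)

  // Non-streaming requests always return usage
  // Streaming requests return a usage object at the end, with choices as an empty array
  usage?: ResponseUsage;
};

// Token usage statistics (passed through when the provider returns them; otherwise computed with the GPT-4 tokenizer)
type ResponseUsage = {
  prompt_tokens: number; // Tokens consumed by the prompt (including images and tool calls)
  completion_tokens: number; // Tokens consumed by the generated content
  total_tokens: number; // Total number of tokens
};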

// Subtype definitions

type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null; // Normalized finish reason
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  delta: {
    content: string | null; // Incremental content
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // Error code
  message: string; // Error message
  metadata?: Record<string, unknown>; // Additional error details, such as provider information or the raw error
};

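type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

// FunctionCall is referenced above but not defined in the original document.
// Assumption: it follows the OpenAI Chat API shape.
type FunctionCall = {
  name: string;
  arguments: string; // JSON-encoded arguments
};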
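
For streaming requests, the types above imply that text arrives piece by piece in `delta.content` and that the final chunk carries `usage` with an empty `choices` array. The sketch below folds a list of already-parsed chunks into a final result. How chunks are read off the wire is not covered in this section, so it is left out, and `collectStream` is a hypothetical helper.

```typescript
type Usage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };
type Chunk = {
  choices: { finish_reason: string | null; delta: { content: string | null } }[];
  usage?: Usage;
};
type StreamResult = { text: string; finishReason: string | null; usage: Usage | undefined };

// Fold already-parsed stream chunks into the final text, finish reason, and usage.
function collectStream(chunks: Chunk[]): StreamResult {
  let text = '';
  let finishReason: string | null = null;
  let usage: Usage | undefined;
  for (const chunk of chunks) {
    // The last chunk carries `usage` and an empty `choices` array.
    if (chunk.usage) usage = chunk.usage;
    const choice = chunk.choices[0];
    if (!choice) continue;
    text += choice.delta.content ?? '';
    if (choice.finish_reason !== null) finishReason = choice.finish_reason;
  }
  return { text, finishReason, usage };
}
```

Only the first choice is accumulated here; a request that yields several choices per chunk would need one accumulator per choice.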

Response Example

{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "message": {  // "delta" for streaming requests
        "role": "assistant",
        "content": "Hello!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 4,
    "total_tokens": 14
  },
  "model": "openai/gpt-3.5-turbo"
}

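Finish Reason

海鲸AI normalizes the finish_reason of every model to one of the following values:

tool_calls - the model called a tool
stop - completed normally (the model ended the response on its own)
length - the maximum token limit was reached
content_filter - content filtering was triggered
error - an error occurred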
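
One way to act on these values is a small dispatch function, sketched below. The returned action names are illustrative only, and treating `null` as "still generating" is an assumption based on the nullable `finish_reason` in the response types.

```typescript
// Map a normalized finish_reason to a next step. The action names are illustrative only.
function nextAction(reason: string | null): string {
  switch (reason) {
    case 'tool_calls':
      return 'run-tools'; // execute the requested tools, then send the results back
    case 'stop':
      return 'done'; // the model finished on its own
    case 'length':
      return 'truncated'; // consider raising max_tokens or asking the model to continue
    case 'content_filter':
      return 'blocked'; // the output was filtered
    case 'error':
      return 'failed'; // inspect the choice's error field
    case null:
      return 'in-progress'; // assumed: a streaming chunk before the final one
    default:
      return 'unknown'; // defensive branch for values outside the documented set
  }
}
```

Keeping a default branch means a value outside the documented list is surfaced rather than silently ignored.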
