
OpenAI #

The OpenAI Model Function allows Flink SQL to call the OpenAI API to perform inference tasks.

Overview #

This function calls a remote OpenAI model service from Flink SQL to perform prediction/inference tasks. The following task types are currently supported:

  • Chat Completions: Generates a model response from a list of conversation messages.
  • Embeddings: Produces a vector representation of a given input that can easily be consumed by machine learning models and algorithms downstream.

Usage Example #

The following example creates a chat completions model and uses it to predict sentiment labels for movie comments.

First, create the chat completions model with the following SQL statement:

CREATE MODEL ai_analyze_sentiment
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider'='openai',
    'endpoint'='https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model'='gpt-3.5-turbo',
    'system-prompt' = 'Classify the text below into one of the following labels: [positive, negative, neutral, mixed]. Output only the label.'
);

Assume the data below is stored in a table named movie_comment, and the prediction results should be written to a table named print_sink:

CREATE TEMPORARY VIEW movie_comment(id, movie_name,  user_comment, actual_label)
AS VALUES
  (1, '好东西', '最爱小孩子猜声音那段,算得上看过的电影里相当浪漫的叙事了。很温和也很有爱。', 'positive');

CREATE TEMPORARY TABLE print_sink(
  id BIGINT,
  movie_name VARCHAR,
  predict_label VARCHAR,
  actual_label VARCHAR
) WITH (
  'connector' = 'print'
);

Then the following SQL statement can be used to predict sentiment labels for the movie comments:

INSERT INTO print_sink
SELECT id, movie_name, content AS predict_label, actual_label
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_analyze_sentiment,
  DESCRIPTOR(user_comment));
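
For quick inspection during development, the same prediction can also be issued as an ad-hoc query instead of an INSERT; the following sketch simply selects from ML_PREDICT directly, reusing the model and view defined above:

SELECT id, movie_name, content AS predict_label, actual_label
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_analyze_sentiment,
  DESCRIPTOR(user_comment));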

Model Options #

Common Options #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| api-key | (none) | String | OpenAI API key for authentication. |
| context-overflow-action | truncated-tail | Enum | Action to take when the context overflows. Possible values:<br>"truncated-tail": Truncates exceeded tokens from the tail of the context.<br>"truncated-tail-log": Truncates exceeded tokens from the tail of the context and records a truncation log.<br>"truncated-head": Truncates exceeded tokens from the head of the context.<br>"truncated-head-log": Truncates exceeded tokens from the head of the context and records a truncation log.<br>"skipped": Skips the input row.<br>"skipped-log": Skips the input row and records a skipping log. |
| endpoint | (none) | String | Full URL of the OpenAI API endpoint, e.g., https://api.openai.com/v1/chat/completions or https://api.openai.com/v1/embeddings. |
| error-handling-strategy | RETRY | Enum | Strategy for handling errors during model requests. Possible values:<br>"RETRY": Retries sending the request.<br>"FAILOVER": Throws an exception and fails the Flink job.<br>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log. |
| max-context-size | (none) | Integer | Maximum number of tokens in the context. The configured context-overflow-action is triggered if this threshold is exceeded. |
| model | (none) | String | Model name, e.g., gpt-3.5-turbo, text-embedding-ada-002. |
| retry-fallback-strategy | FAILOVER | Enum | Fallback strategy to employ when the retry attempts are exhausted. Only applies when error-handling-strategy is set to RETRY. Possible values:<br>"FAILOVER": Throws an exception and fails the Flink job.<br>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log. |
| retry-num | 100 | Integer | Number of retries for OpenAI client requests. |
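
To illustrate how the common options compose, the sketch below extends the earlier chat model with explicit retry and context handling. The model name ai_analyze_sentiment_robust and all option values are illustrative assumptions, not recommended settings:

CREATE MODEL ai_analyze_sentiment_robust
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model' = 'gpt-3.5-turbo',
    'system-prompt' = 'Classify the text below into one of the following labels: [positive, negative, neutral, mixed]. Output only the label.',
    -- Common options: retry failed requests, silently skip a row after 3
    -- failed attempts, and truncate (with a log entry) inputs whose context
    -- exceeds 4096 tokens.
    'error-handling-strategy' = 'RETRY',
    'retry-num' = '3',
    'retry-fallback-strategy' = 'IGNORE',
    'max-context-size' = '4096',
    'context-overflow-action' = 'truncated-tail-log'
);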

Chat Completions #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| max-tokens | (none) | Long | The maximum number of tokens that can be generated in the chat completion. |
| n | (none) | Long | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. |
| presence-penalty | (none) | Double | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| response-format | (none) | Enum | The format of the response. Possible values: "text", "json_object". |
| seed | (none) | Long | If specified, the model platform will make a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed. |
| stop | (none) | String | A CSV list of strings to pass as stop sequences to the model. |
| system-prompt | "You are a helpful assistant." | String | The system message of a chat. |
| temperature | (none) | Double | Controls the randomness or "creativity" of the output. Typical values are between 0.0 and 1.0. |
| top-p | (none) | Double | The probability cutoff for token selection. Usually, either temperature or top-p is specified, but not both. |
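
The sketch below combines several of the chat-specific options on a single model definition; the model name ai_summarize_comment, the prompt, and all option values are hypothetical and chosen only for illustration (note that temperature and top-p are usually not set together):

CREATE MODEL ai_summarize_comment
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model' = 'gpt-3.5-turbo',
    'system-prompt' = 'Summarize the user comment in one short sentence.',
    -- Chat Completions options: low randomness, bounded output length,
    -- best-effort determinism, plain-text response.
    'temperature' = '0.2',
    'max-tokens' = '128',
    'seed' = '42',
    'response-format' = 'text'
);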

Embeddings #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| dimension | (none) | Long | The size of the embedding result array. |
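
A minimal embeddings model sketch follows, using the embeddings endpoint from the common options above. The model name ai_embed_comment is hypothetical, text-embedding-3-small is one OpenAI model that accepts a requested dimension, and whether the dimension option is honored depends on the chosen model:

CREATE MODEL ai_embed_comment
INPUT (`input` STRING)
OUTPUT (`embedding` ARRAY<FLOAT>)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/embeddings',
    'api-key' = '<YOUR KEY>',
    'model' = 'text-embedding-3-small',
    -- Request 512-dimensional embedding vectors.
    'dimension' = '512'
);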

Schema Requirements #

| Task Type | Input Type | Output Type |
| --- | --- | --- |
| Chat Completions | STRING | STRING |
| Embeddings | STRING | ARRAY<FLOAT> |
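
To make the type mapping concrete, the sketch below feeds the STRING column user_comment into the ai_embed_comment model sketched above and writes the resulting ARRAY<FLOAT> to a print sink; the sink table and column names are illustrative:

CREATE TEMPORARY TABLE embedding_sink (
  id BIGINT,
  user_comment VARCHAR,
  embedding ARRAY<FLOAT>
) WITH (
  'connector' = 'print'
);

INSERT INTO embedding_sink
SELECT id, user_comment, embedding
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_embed_comment,
  DESCRIPTOR(user_comment));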

Available Metadata #

When error-handling-strategy is configured to IGNORE, you can optionally specify the following additional metadata columns to surface failure information in your output stream.

  • error-string (STRING): The message associated with the error.
  • http-status-code (INT): The HTTP status code of the response.
  • http-headers-map (MAP<STRING, ARRAY>): The headers returned with the response.

If these metadata columns are defined in the output schema but the call does not fail, the columns will be filled with null values.