
OpenAI #

The OpenAI Model Function allows Flink SQL to call the OpenAI API to perform inference tasks.

Overview #

This function calls a remote OpenAI model service from Flink SQL to perform prediction/inference tasks. The following task types are currently supported:

  • Chat Completions: Generates a model response from a list of conversation messages.
  • Embeddings: Produces a vector representation of a given input that can easily be consumed by machine learning models and algorithms downstream.

Usage Example #

The following example creates a chat completions model and uses it to predict sentiment labels for movie comments.

First, create the chat completions model with the following SQL statement:

CREATE MODEL ai_analyze_sentiment
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider'='openai',
    'endpoint'='https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model'='gpt-3.5-turbo',
    'system-prompt' = 'Classify the text below into one of the following labels: [positive, negative, neutral, mixed]. Output only the label.'
);

Assume the data below is stored in a table named movie_comment, and the prediction results should be written to a table named print_sink:

CREATE TEMPORARY VIEW movie_comment(id, movie_name,  user_comment, actual_label)
AS VALUES
  (1, '好东西', '最爱小孩子猜声音那段,算得上看过的电影里相当浪漫的叙事了。很温和也很有爱。', 'positive');

CREATE TEMPORARY TABLE print_sink(
  id BIGINT,
  movie_name VARCHAR,
  predict_label VARCHAR,
  actual_label VARCHAR
) WITH (
  'connector' = 'print'
);

Then the following SQL statement can be used to predict sentiment labels for the movie comments:

INSERT INTO print_sink
SELECT id, movie_name, content AS predict_label, actual_label
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_analyze_sentiment,
  DESCRIPTOR(user_comment));
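
For quick inspection during development, the same prediction can also be issued as an ad-hoc query instead of an INSERT; the following sketch simply selects from ML_PREDICT directly, reusing the model and view defined above:

SELECT id, movie_name, content AS predict_label, actual_label
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_analyze_sentiment,
  DESCRIPTOR(user_comment));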

Model Options #

Common Options #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| api-key | (none) | String | OpenAI API key for authentication. |
| context-overflow-action | truncated-tail | Enum | Action to take when the context overflows. Possible values:<br>"truncated-tail": Truncates exceeded tokens from the tail of the context.<br>"truncated-tail-log": Truncates exceeded tokens from the tail of the context and records a truncation log.<br>"truncated-head": Truncates exceeded tokens from the head of the context.<br>"truncated-head-log": Truncates exceeded tokens from the head of the context and records a truncation log.<br>"skipped": Skips the input row.<br>"skipped-log": Skips the input row and records a skipping log. |
| endpoint | (none) | String | Full URL of the OpenAI API endpoint, e.g., https://api.openai.com/v1/chat/completions or https://api.openai.com/v1/embeddings. |
| error-handling-strategy | RETRY | Enum | Strategy for handling errors during model requests. Possible values:<br>"RETRY": Retries sending the request.<br>"FAILOVER": Throws an exception and fails the Flink job.<br>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log. |
| max-context-size | (none) | Integer | Maximum number of tokens in the context. The configured context-overflow-action is triggered if this threshold is exceeded. |
| model | (none) | String | Model name, e.g., gpt-3.5-turbo, text-embedding-ada-002. |
| retry-fallback-strategy | FAILOVER | Enum | Fallback strategy to employ when the retry attempts are exhausted. Only applies when error-handling-strategy is set to RETRY. Possible values:<br>"FAILOVER": Throws an exception and fails the Flink job.<br>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log. |
| retry-num | 100 | Integer | Number of retries for OpenAI client requests. |
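
To illustrate how the common options compose, the sketch below extends the earlier chat model with explicit retry and context handling. The model name ai_analyze_sentiment_robust and all option values are illustrative assumptions, not recommended settings:

CREATE MODEL ai_analyze_sentiment_robust
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model' = 'gpt-3.5-turbo',
    'system-prompt' = 'Classify the text below into one of the following labels: [positive, negative, neutral, mixed]. Output only the label.',
    -- Common options: retry failed requests, silently skip a row after 3
    -- failed attempts, and truncate (with a log entry) inputs whose context
    -- exceeds 4096 tokens.
    'error-handling-strategy' = 'RETRY',
    'retry-num' = '3',
    'retry-fallback-strategy' = 'IGNORE',
    'max-context-size' = '4096',
    'context-overflow-action' = 'truncated-tail-log'
);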

Chat Completions #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| max-tokens | (none) | Long | The maximum number of tokens that can be generated in the chat completion. |
| n | (none) | Long | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. |
| presence-penalty | (none) | Double | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| response-format | (none) | Enum | The format of the response. Possible values: "text", "json_object". |
| seed | (none) | Long | If specified, the model platform will make a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed. |
| stop | (none) | String | A CSV list of strings to pass as stop sequences to the model. |
| system-prompt | "You are a helpful assistant." | String | The system message of a chat. |
| temperature | (none) | Double | Controls the randomness or "creativity" of the output. Typical values are between 0.0 and 1.0. |
| top-p | (none) | Double | The probability cutoff for token selection. Usually, either temperature or top-p is specified, but not both. |
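
The sketch below combines several of the chat-specific options on a single model definition; the model name ai_summarize_comment, the prompt, and all option values are hypothetical and chosen only for illustration (note that temperature and top-p are usually not set together):

CREATE MODEL ai_summarize_comment
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/chat/completions',
    'api-key' = '<YOUR KEY>',
    'model' = 'gpt-3.5-turbo',
    'system-prompt' = 'Summarize the user comment in one short sentence.',
    -- Chat Completions options: low randomness, bounded output length,
    -- best-effort determinism, plain-text response.
    'temperature' = '0.2',
    'max-tokens' = '128',
    'seed' = '42',
    'response-format' = 'text'
);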

Embeddings #

| Key | Default | Type | Description |
| --- | --- | --- | --- |
| dimension | (none) | Long | The size of the embedding result array. |
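
A minimal embeddings model sketch follows, using the embeddings endpoint from the common options above. The model name ai_embed_comment is hypothetical, text-embedding-3-small is one OpenAI model that accepts a requested dimension, and whether the dimension option is honored depends on the chosen model:

CREATE MODEL ai_embed_comment
INPUT (`input` STRING)
OUTPUT (`embedding` ARRAY<FLOAT>)
WITH (
    'provider' = 'openai',
    'endpoint' = 'https://api.openai.com/v1/embeddings',
    'api-key' = '<YOUR KEY>',
    'model' = 'text-embedding-3-small',
    -- Request 512-dimensional embedding vectors.
    'dimension' = '512'
);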

Schema Requirements #

| Task Type | Input Type | Output Type |
| --- | --- | --- |
| Chat Completions | STRING | STRING |
| Embeddings | STRING | ARRAY<FLOAT> |
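
To make the type mapping concrete, the sketch below feeds the STRING column user_comment into the ai_embed_comment model sketched above and writes the resulting ARRAY<FLOAT> to a print sink; the sink table and column names are illustrative:

CREATE TEMPORARY TABLE embedding_sink (
  id BIGINT,
  user_comment VARCHAR,
  embedding ARRAY<FLOAT>
) WITH (
  'connector' = 'print'
);

INSERT INTO embedding_sink
SELECT id, user_comment, embedding
FROM ML_PREDICT(
  TABLE movie_comment,
  MODEL ai_embed_comment,
  DESCRIPTOR(user_comment));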

Available Metadata #

When error-handling-strategy is configured to IGNORE, you can optionally specify the following additional metadata columns to surface failure information in your output stream.

  • error-string (STRING): The message associated with the error.
  • http-status-code (INT): The HTTP status code of the response.
  • http-headers-map (MAP<STRING, ARRAY>): The headers returned with the response.

If these metadata columns are defined in the output schema but the call does not fail, the columns will be filled with null values.