
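Unity Catalog Integration

This example shows how to use the Unity Catalog (UC) integration with the MLflow AI Gateway. The integration lets you use functions registered in Unity Catalog as tools to enhance your chat application.

Prerequisites

  1. Clone the MLflow repository

To download the files required for this example, clone the MLflow repository: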

git clone --depth=1 https://github.com/mlflow/mlflow.git
cd mlflow

If you don't have git installed, you can download the repository as a zip file from https://github.com/mlflow/mlflow/archive/refs/heads/master.zip.

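  1. Install the required packages

# Quote the version specifier so the shell does not treat ">" as a redirect.
pip install "mlflow>=2.14.0" openai databricks-sdk

  1. Create the UC function used in the example script by running the following SQL command in your Databricks workspace

CREATE OR REPLACE FUNCTION
my.uc_func.add (
  x INTEGER COMMENT 'The first number to add.',
  y INTEGER COMMENT 'The second number to add.'
)
RETURNS INTEGER
LANGUAGE SQL
RETURN x + y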

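To define your own function, see https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-sql-function.html#create-function-sql-and-python

  1. Create a SQL warehouse by following the instructions at https://docs.databricks.com/en/compute/sql-warehouse/create.html.

Running the gateway server

Once you have completed the prerequisites, you can start the gateway server: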

# Required to authenticate with Databricks. See https://docs.databricks.com/en/dev-tools/auth/index.html#supported-authentication-types-by-databricks-tool-or-sdk for other authentication methods.
export DATABRICKS_HOST="..."
export DATABRICKS_TOKEN="..."

# Required to execute UC functions. See https://docs.databricks.com/en/integrations/compute-details.html#get-connection-details-for-a-databricks-compute-resource for how to get the http path of your warehouse.
# The last part of the http path is the warehouse ID.
#
# /sql/1.0/warehouses/1234567890123456
# ^^^^^^^^^^^^^^^^
export DATABRICKS_WAREHOUSE_ID="..."

# Required to authenticate with OpenAI.
# See https://platform.openai.com/docs/guides/authentication for how to get your API key.
export OPENAI_API_KEY="..."

# Enable Unity Catalog integration
export MLFLOW_ENABLE_UC_FUNCTIONS=true

# Run the server
mlflow gateway start --config-path examples/gateway/deployments_server/openai/config.yaml --port 7000
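
As the comments above note, the warehouse ID is the last segment of the warehouse's HTTP path. A minimal sketch of extracting it, and of checking that the variables exported above are set before starting the server, might look like the following. The helper names `warehouse_id_from_http_path` and `missing_env_vars` are illustrative only and are not part of MLflow or the Databricks SDK.

```python
import os

# Variables exported in the shell snippet above.
REQUIRED_ENV_VARS = [
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "DATABRICKS_WAREHOUSE_ID",
    "OPENAI_API_KEY",
    "MLFLOW_ENABLE_UC_FUNCTIONS",
]


def warehouse_id_from_http_path(http_path: str) -> str:
    """Return the last path segment, e.g. '/sql/1.0/warehouses/<id>' -> '<id>'."""
    warehouse_id = http_path.rstrip("/").rsplit("/", 1)[-1]
    if not warehouse_id:
        raise ValueError(f"Could not find a warehouse ID in {http_path!r}")
    return warehouse_id


def missing_env_vars(environ=None) -> list:
    """List the required variables that are unset or empty (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


if __name__ == "__main__":
    # Prints the ID portion of the example path from the comments above.
    print(warehouse_id_from_http_path("/sql/1.0/warehouses/1234567890123456"))
    # Prints whichever required variables are not set in the current shell.
    print(missing_env_vars())
```

Running a check like this before `mlflow gateway start` gives a clearer error than a failed request later on.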

Querying the endpoint with UC functions

Once the server is running, you can run the example script:

# `run.py` uses the `openai.OpenAI` client to query the gateway server,
# but it throws an error if the `OPENAI_API_KEY` environment variable is not set.
# To avoid this error, use a dummy API key.
export OPENAI_API_KEY="test"

# Replace `my.uc_func.add` if your UC function has a different name
python examples/gateway/uc_functions/run.py --uc-function-name my.uc_func.add
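
Unity Catalog functions are addressed by a three-level name of the form `catalog.schema.function`, which is the shape of the `--uc-function-name` value in the command above. The sketch below validates such a name before it is passed along. It assumes plain identifiers without backtick quoting, and `split_uc_function_name` is a hypothetical helper, not something `run.py` provides.

```python
from typing import NamedTuple


class UCFunctionName(NamedTuple):
    catalog: str
    schema: str
    function: str


def split_uc_function_name(full_name: str) -> UCFunctionName:
    """Split 'catalog.schema.function' into its three parts.

    Assumes unquoted identifiers, i.e. no dots inside backtick-quoted names.
    """
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"Expected '<catalog>.<schema>.<function>', got {full_name!r}"
        )
    return UCFunctionName(*parts)


name = split_uc_function_name("my.uc_func.add")
print(name.catalog, name.schema, name.function)  # my uc_func add
```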

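What's happening under the hood?

When the MLflow AI Gateway receives a request whose `tools` contain a `uc_function` entry, it automatically fetches the UC function metadata to construct the function schema, queries the chat API to determine the arguments needed to call the function, and then calls the function with those arguments.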

# `client` is the `openai.OpenAI` client that `run.py` uses to query the gateway server,
# and `args` holds the parsed command-line arguments.
uc_function = {
    "type": "uc_function",
    "uc_function": {
        "name": args.uc_function_name,
    },
}

resp = client.chat.completions.create(
    model="chat",
    messages=[
        {
            "role": "user",
            "content": "What is the result of 1 + 2?",
        }
    ],
    tools=[uc_function],
)

print(resp.choices[0].message.content)  # -> The result of 1 + 2 is 3

The code above is equivalent to the following:

import json

# Function tool schema:
# https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools
function = {
    "type": "function",
    "function": {
        "description": None,
        "name": "my.uc_func.add",
        "parameters": {
            "type": "object",
            "properties": {
                "x": {
                    "type": "integer",
                    "name": "x",
                    "description": "The first number to add.",
                },
                "y": {
                    "type": "integer",
                    "name": "y",
                    "description": "The second number to add.",
                },
            },
            "required": ["x", "y"],
        },
    },
}

messages = [
    {
        "role": "user",
        "content": "What is the result of 1 + 2?",
    }
]

# First call: the model decides which arguments to call the function with.
resp = client.chat.completions.create(
    model="chat",
    messages=messages,
    tools=[function],
)

resp_message = resp.choices[0].message
messages.append(resp_message)
tool_call = resp_message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
# Stand-in for executing `my.uc_func.add` with the parsed arguments.
result = arguments["x"] + arguments["y"]
messages.append(
    {
        "tool_call_id": tool_call.id,
        "role": "tool",
        "name": "my.uc_func.add",
        "content": str(result),
    }
)

# Second call: the model turns the tool result into the final answer.
final_resp = client.chat.completions.create(
    model="chat",
    messages=messages,
)

print(final_resp.choices[0].message.content)  # -> The result of 1 + 2 is 3
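
The first step described above, turning UC function metadata into a function tool schema, can be sketched offline. The snippet below is only an illustration: the `UCParam` dataclass, the `SQL_TO_JSON_TYPE` mapping, and `build_function_schema` are hypothetical names, and the gateway's actual implementation may map types and fields differently. Given the `add` function from the prerequisites, it reproduces the `function` dictionary shown above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from SQL types to JSON Schema types; only a few are covered.
SQL_TO_JSON_TYPE = {
    "INTEGER": "integer",
    "BIGINT": "integer",
    "DOUBLE": "number",
    "STRING": "string",
    "BOOLEAN": "boolean",
}


@dataclass
class UCParam:
    name: str
    sql_type: str
    comment: Optional[str] = None


def build_function_schema(full_name, params, description=None):
    """Build an OpenAI-style function tool schema from parameter metadata."""
    properties = {
        p.name: {
            "type": SQL_TO_JSON_TYPE[p.sql_type.upper()],
            "name": p.name,
            "description": p.comment,
        }
        for p in params
    }
    return {
        "type": "function",
        "function": {
            "description": description,
            "name": full_name,
            "parameters": {
                "type": "object",
                "properties": properties,
                # Treat every parameter as required (no defaults assumed).
                "required": [p.name for p in params],
            },
        },
    }


schema = build_function_schema(
    "my.uc_func.add",
    [
        UCParam("x", "INTEGER", "The first number to add."),
        UCParam("y", "INTEGER", "The second number to add."),
    ],
)
print(schema["function"]["parameters"]["required"])  # ['x', 'y']
```

An unsupported SQL type raises a `KeyError` in this sketch, which keeps the assumed mapping explicit instead of silently guessing a JSON type.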