Building a Tool-Calling Model with mlflow.pyfunc.ChatModel
Welcome to this notebook tutorial on building a simple tool-calling model using the mlflow.pyfunc.ChatModel wrapper. ChatModel is a subclass of MLflow's highly customizable PythonModel, designed specifically to streamline the creation of GenAI workflows.
In short, some of the benefits of using ChatModel are:
- No need to define a complex signature! Chat models typically accept complex inputs with multiple levels of nesting, which can be cumbersome to define yourself.
- Support for JSON / dict inputs (no need to wrap the input or convert it to a Pandas DataFrame)
- Dataclasses for defining the expected inputs and outputs, for a simplified development experience
For a more in-depth look at ChatModel, please refer to the detailed guide.
In this tutorial, we'll build a simple OpenAI wrapper that makes use of tool-calling support (released in MLflow 2.17.0).
Environment setup
First, let's set up the environment. We'll need the OpenAI Python SDK, as well as MLflow >= 2.17.0. We'll also need to set an OpenAI API key in order to use the SDK.
%pip install 'mlflow>=2.17.0' 'openai>=1.0' -qq
Note: you may need to restart the kernel to use updated packages.
import os
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")
Step 1: Creating the tool definition
Let's get started on defining our model! As mentioned in the intro, we'll be subclassing mlflow.pyfunc.ChatModel. For this example, we'll build a toy model that uses a tool to retrieve the weather for a given city.
The first step is to create a tool definition that we can pass to OpenAI. We do this by using mlflow.types.llm.FunctionToolDefinition to describe the parameters that our tool accepts. The format of this dataclass is consistent with the OpenAI spec.
import mlflow
from mlflow.types.llm import (
    FunctionToolDefinition,
    ParamProperty,
    ToolParamsSchema,
)


class WeatherModel(mlflow.pyfunc.ChatModel):
    def __init__(self):
        # a sample tool definition. we use the `FunctionToolDefinition`
        # class to describe the name and expected params for the tool.
        # for this example, we're defining a simple tool that returns
        # the weather for a given city.
        weather_tool = FunctionToolDefinition(
            name="get_weather",
            description="Get weather information",
            parameters=ToolParamsSchema(
                {
                    "city": ParamProperty(
                        type="string",
                        description="City name to get weather information for",
                    ),
                }
            ),
            # make sure to call `to_tool_definition()` to convert the `FunctionToolDefinition`
            # to a `ToolDefinition` object. this step is necessary to normalize the data format,
            # as multiple types of tools (besides just functions) might be available in the future.
        ).to_tool_definition()

        # OpenAI expects tools to be provided as a list of dictionaries
        self.tools = [weather_tool.to_dict()]
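For reference, the normalized tool definition follows the OpenAI function-tool format. The sketch below builds the equivalent dictionary by hand (without MLflow) purely to illustrate the shape; the exact output of `to_dict()` may differ in minor details.

```python
# a plain-dict sketch of the OpenAI function-tool format that the
# normalized definition corresponds to. hand-built here for illustration
# only; `to_dict()` on the ToolDefinition produces the real payload.
weather_tool_spec = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name to get weather information for",
                },
            },
        },
    },
}

print(weather_tool_spec["function"]["name"])
```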
Step 2: Implementing the tool
Now that we have a definition for the tool, we need to actually implement it. For the purposes of this tutorial, we'll just mock a response, but the implementation can be arbitrary; you might call an actual weather service API, for example.
class WeatherModel(mlflow.pyfunc.ChatModel):
    def __init__(self):
        weather_tool = FunctionToolDefinition(
            name="get_weather",
            description="Get weather information",
            parameters=ToolParamsSchema(
                {
                    "city": ParamProperty(
                        type="string",
                        description="City name to get weather information for",
                    ),
                }
            ),
        ).to_tool_definition()

        self.tools = [weather_tool.to_dict()]

    def get_weather(self, city: str) -> str:
        # in a real-world scenario, the implementation might be more complex
        return f"It's sunny in {city}, with a temperature of 20C"
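If you wanted to back the tool with a real service, the implementation might fetch and format a JSON payload instead of returning a hard-coded string. Here's a minimal sketch; the endpoint URL and response fields (`condition`, `temp_c`) are hypothetical and would need to be swapped for your actual provider's API.

```python
# a sketch of a tool backed by a real weather service. the endpoint URL
# and response field names below are hypothetical, for illustration only.
import json
from urllib.request import urlopen


def format_weather(city: str, payload: dict) -> str:
    # turn a parsed JSON payload like {"condition": "sunny", "temp_c": 20}
    # into the same string shape that our mocked tool returns
    return f"It's {payload['condition']} in {city}, with a temperature of {payload['temp_c']}C"


def get_weather_from_api(city: str) -> str:
    # hypothetical endpoint; substitute your weather provider's API here
    with urlopen(f"https://weather.example.com/v1/current?city={city}") as resp:
        return format_weather(city, json.load(resp))
```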
Step 3: Implementing the predict method
The next thing we need to do is define a predict() function that accepts the following arguments:
- context: PythonModelContext (unused in this tutorial)
- messages: List[ChatMessage]. This is the chat input that the model uses for generation.
- params: ChatParams. These are commonly used parameters for configuring a chat model, e.g. temperature, max_tokens, etc. Tool specifications can be found here.
This is the function that will ultimately be called at inference time.
For the implementation, we'll simply forward the user's input to OpenAI, providing the get_weather tool as an option for the LLM to use. If we receive a tool call request, we'll call the get_weather() function and return the response to OpenAI. We'll use what we defined in the previous two steps to do this.
import json

from openai import OpenAI

import mlflow
from mlflow.types.llm import (
    ChatMessage,
    ChatParams,
    ChatResponse,
)


class WeatherModel(mlflow.pyfunc.ChatModel):
    def __init__(self):
        weather_tool = FunctionToolDefinition(
            name="get_weather",
            description="Get weather information",
            parameters=ToolParamsSchema(
                {
                    "city": ParamProperty(
                        type="string",
                        description="City name to get weather information for",
                    ),
                }
            ),
        ).to_tool_definition()

        self.tools = [weather_tool.to_dict()]

    def get_weather(self, city: str) -> str:
        return f"It's sunny in {city}, with a temperature of 20C"

    # the core method that needs to be implemented. this function
    # will be called every time a user sends messages to our model
    def predict(self, context, messages: list[ChatMessage], params: ChatParams):
        # instantiate the OpenAI client
        client = OpenAI()

        # convert the messages to a format that the OpenAI API expects
        messages = [m.to_dict() for m in messages]

        # call the OpenAI API
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            # pass the tools in the request
            tools=self.tools,
        )

        # if OpenAI returns a tool_calling response, then we call
        # our tool. otherwise, we just return the response as is
        tool_calls = response.choices[0].message.tool_calls
        if tool_calls:
            print("Received a tool call, calling the weather tool...")

            # for this example, we only provide the model with one tool,
            # so we can assume the tool call is for the weather tool. if
            # we had more, we'd need to check the name of the tool that
            # was called
            city = json.loads(tool_calls[0].function.arguments)["city"]
            tool_call_id = tool_calls[0].id

            # call the tool and construct a new chat message
            tool_response = ChatMessage(
                role="tool", content=self.get_weather(city), tool_call_id=tool_call_id
            ).to_dict()

            # send another request to the API, making sure to append
            # the assistant's tool call along with the tool response.
            messages.append(response.choices[0].message)
            messages.append(tool_response)
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=self.tools,
            )

        # return the result as a ChatResponse, as this
        # is the expected output of the predict method
        return ChatResponse.from_dict(response.to_dict())
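The implementation above assumes a single tool, as noted in the comments. If the model could call several tools, you'd need to dispatch on the name of the tool that was called. A small sketch of that pattern follows; the registry and the fake tool-call object below are illustrative stand-ins for the real objects returned by the OpenAI SDK.

```python
# a sketch of dispatching multiple tools by name. each entry in
# `tool_calls` from the OpenAI SDK exposes `.function.name` and
# `.function.arguments` (a JSON-encoded string), which we rely on here.
import json
from types import SimpleNamespace


def dispatch_tool_call(tool_call, registry: dict):
    # look up the handler registered under the name the model asked for,
    # parse the JSON-encoded arguments, and invoke the handler
    handler = registry[tool_call.function.name]
    kwargs = json.loads(tool_call.function.arguments)
    return handler(**kwargs)


# illustrative usage with a fake tool call shaped like the SDK's objects
registry = {"get_weather": lambda city: f"It's sunny in {city}, with a temperature of 20C"}
fake_call = SimpleNamespace(
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Tokyo"}')
)
print(dispatch_tool_call(fake_call, registry))  # It's sunny in Tokyo, with a temperature of 20C
```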
Step 4 (optional, but recommended): Enabling tracing for the model
This step is optional, but strongly recommended for improving the observability of your app. We'll be using MLflow Tracing to log the inputs and outputs of our model's internal functions, so that we can easily debug when things go wrong. Agent-style tool-calling models can make many layers of function calls during the lifecycle of a single request, so tracing is invaluable for helping us understand what's going on at each step.
Integrating tracing is easy: we simply decorate the functions we're interested in (get_weather() and predict()) with @mlflow.trace! MLflow Tracing also has integrations with many popular GenAI frameworks, such as LangChain, OpenAI, LlamaIndex, and more. For the full list, check out this documentation page. In this tutorial, we use the OpenAI SDK to make API calls, so we can enable tracing for those calls by calling mlflow.openai.autolog().
To view the traces in the UI, run mlflow ui in a separate terminal shell, and navigate to the Traces tab after performing inference with the model below.
from mlflow.entities.span import (
    SpanType,
)

# automatically trace OpenAI SDK calls
mlflow.openai.autolog()


class WeatherModel(mlflow.pyfunc.ChatModel):
    def __init__(self):
        weather_tool = FunctionToolDefinition(
            name="get_weather",
            description="Get weather information",
            parameters=ToolParamsSchema(
                {
                    "city": ParamProperty(
                        type="string",
                        description="City name to get weather information for",
                    ),
                }
            ),
        ).to_tool_definition()

        self.tools = [weather_tool.to_dict()]

    @mlflow.trace(span_type=SpanType.TOOL)
    def get_weather(self, city: str) -> str:
        return f"It's sunny in {city}, with a temperature of 20C"

    @mlflow.trace(span_type=SpanType.AGENT)
    def predict(self, context, messages: list[ChatMessage], params: ChatParams):
        client = OpenAI()

        messages = [m.to_dict() for m in messages]
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=self.tools,
        )

        tool_calls = response.choices[0].message.tool_calls
        if tool_calls:
            print("Received a tool call, calling the weather tool...")

            city = json.loads(tool_calls[0].function.arguments)["city"]
            tool_call_id = tool_calls[0].id

            tool_response = ChatMessage(
                role="tool", content=self.get_weather(city), tool_call_id=tool_call_id
            ).to_dict()

            messages.append(response.choices[0].message)
            messages.append(tool_response)
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=self.tools,
            )

        return ChatResponse.from_dict(response.to_dict())
Step 5: Logging the model
Finally, we need to log the model. This saves the model as an artifact in MLflow Tracking, and allows us to load and deploy it later on.
(Note: this is a fundamental pattern in MLflow. To learn more, check out the quickstart guide!)
In order to do this, we need to do a few things:
- Define an input example to inform users of the input we expect
- Instantiate the model
- Call mlflow.pyfunc.log_model() with the above as arguments
Take note of the model URI printed at the end of the cell; we'll need it when deploying the model later!
# messages to use as input examples
messages = [
    {"role": "system", "content": "Please use the provided tools to answer user queries."},
    {"role": "user", "content": "What's the weather in Singapore?"},
]

input_example = {
    "messages": messages,
}

# instantiate the model
model = WeatherModel()

# log the model
with mlflow.start_run():
    model_info = mlflow.pyfunc.log_model(
        name="weather-model",
        python_model=model,
        input_example=input_example,
    )

print("Successfully logged the model at the following URI: ", model_info.model_uri)
2024/10/29 09:30:14 INFO mlflow.pyfunc: Predicting on input example to validate output
Received a tool call, calling the weather tool...
Received a tool call, calling the weather tool...
Successfully logged the model at the following URI: runs:/8051850efa194a3b8b2450c4c9f4d42f/weather-model
Using the model for inference
Now that the model is logged, our work is largely done! To use the model for inference, let's load it back using mlflow.pyfunc.load_model().
import mlflow

# Load the previously logged ChatModel
tool_model = mlflow.pyfunc.load_model(model_info.model_uri)

system_prompt = {
    "role": "system",
    "content": "Please use the provided tools to answer user queries.",
}

messages = [
    system_prompt,
    {"role": "user", "content": "What's the weather in Singapore?"},
]

# Call the model's predict method
response = tool_model.predict({"messages": messages})
print(response["choices"][0]["message"]["content"])

messages = [
    system_prompt,
    {"role": "user", "content": "What's the weather in San Francisco?"},
]

# Generating another response
response = tool_model.predict({"messages": messages})
print(response["choices"][0]["message"]["content"])
2024/10/29 09:30:27 WARNING mlflow.tracing.processor.mlflow: Creating a trace within the default experiment with id '0'. It is strongly recommended to not use the default experiment to log traces due to ambiguous search results and probable performance issues over time due to directory table listing performance degradation with high volumes of directories within a specific path. To avoid performance and disambiguation issues, set the experiment for your environment using `mlflow.set_experiment()` API.
Received a tool call, calling the weather tool...
The weather in Singapore is sunny, with a temperature of 20°C.
Received a tool call, calling the weather tool...
The weather in San Francisco is sunny, with a temperature of 20°C.
Deploying the model
MLflow also allows you to deploy models using the mlflow models serve CLI tool. In another terminal shell, run the following from the same folder as this notebook:
$ export OPENAI_API_KEY=<YOUR OPENAI API KEY>
$ mlflow models serve -m <MODEL_URI>
This will start serving the model at http://127.0.0.1:5000, and the model can be queried via POST requests to the /invocations route.
import requests

messages = [
    system_prompt,
    {"role": "user", "content": "What's the weather in Tokyo?"},
]

response = requests.post("http://127.0.0.1:5000/invocations", json={"messages": messages})
response.raise_for_status()
response.json()
{'choices': [{'index': 0,
'message': {'role': 'assistant',
'content': 'The weather in Tokyo is sunny, with a temperature of 20°C.'},
'finish_reason': 'stop'}],
'usage': {'prompt_tokens': 100, 'completion_tokens': 16, 'total_tokens': 116},
'id': 'chatcmpl-ANVOhWssEiyYNFwrBPxp1gmQvZKsy',
'model': 'gpt-4o-mini-2024-07-18',
'object': 'chat.completion',
'created': 1730165599}
Conclusion
In this tutorial, we covered how to use MLflow's ChatModel class to create a convenient OpenAI wrapper with tool-calling support. Though the use case was simple, the concepts covered here can easily be extended to support more complex functionality.
If you're interested in diving deeper into building high-quality GenAI apps, you might also be interested in checking out MLflow Tracing, an observability tool you can use to trace the execution of arbitrary functions (such as your tool calls).