
Tracing OpenAI

OpenAI Tracing via autolog

MLflow Tracing provides automatic tracing capability for OpenAI. When auto-tracing is enabled for OpenAI by calling the mlflow.openai.autolog() function, MLflow captures traces for LLM invocations and logs them to the active MLflow experiment. In TypeScript, you can wrap the OpenAI client with the tracedOpenAI function instead.

import mlflow

mlflow.openai.autolog()

MLflow Tracing automatically captures the following information about OpenAI calls:

  • Prompts and completion responses
  • Latencies
  • Model name
  • Additional metadata such as temperature and max_tokens, if specified
  • Function calling if returned in the response
  • Built-in tools such as web search, file search, computer use, etc.
  • Any exception if raised
Tip

The MLflow integration with OpenAI is not only about tracing. MLflow offers a complete tracking experience for OpenAI, including model tracking, prompt management, and evaluation. Check out the MLflow OpenAI Flavor to learn more!

Supported APIs

MLflow supports automatic tracing for the following OpenAI APIs. To request support for additional APIs, please file a feature request on GitHub.

Chat Completion API

| Normal | Function Calling | Structured Outputs | Streaming | Async | Image | Audio |
| --- | --- | --- | --- | --- | --- | --- |
| ✅ | ✅ | ✅ (>= 2.21.0) | ✅ (>= 2.15.0) | ✅ (>= 2.21.0) | - | - |

Responses API

| Normal | Function Calling | Structured Outputs | Web Search | File Search | Computer Use | Reasoning | Streaming | Async | Image |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |

The Responses API is supported since MLflow 2.22.0.
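
As a brief illustration (the model name here is an assumption), the same autolog setup captures Responses API calls:

import openai
import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

client = openai.OpenAI()

# Responses API calls are traced by the same autolog setup
response = client.responses.create(
    model="gpt-4o-mini",
    input="What is the capital of France?",
)
print(response.output_text)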

Agents SDK

Refer to OpenAI Agents SDK Tracing for more details; a minimal sketch is shown below.
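
A minimal sketch, assuming the openai-agents package is installed and that the same mlflow.openai.autolog() call covers Agents SDK runs (see the linked page for the authoritative setup; the agent name and instructions here are illustrative):

import mlflow
from agents import Agent, Runner

# Enable auto-tracing before running the agent
mlflow.openai.autolog()

# A toy agent; name and instructions are illustrative
agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
)

result = Runner.run_sync(agent, "What is the capital of France?")
print(result.final_output)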

Embeddings API

| Normal | Async |
| --- | --- |
| ✅ | ✅ |
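
For example, a minimal sketch (the embedding model name is an assumption) showing that embedding calls are traced by the same setup:

import openai
import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

client = openai.OpenAI()

# Embedding calls are captured as traces as well
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="What is the capital of France?",
)
print(len(response.data[0].embedding))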

Basic Example

import openai
import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("OpenAI")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)

Streaming

MLflow Tracing supports the streaming API of the OpenAI SDK. With the same auto-tracing setup, MLflow automatically traces streaming responses and renders the concatenated output in the span UI. The actual chunks in the response stream can be found in the Event tab as well.

import openai
import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    stream=True,  # Enable streaming response
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Async

Since MLflow 2.21.0, MLflow Tracing supports the asynchronous API of the OpenAI SDK. The usage is the same as the synchronous API.

import mlflow
import openai

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()

response = await client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    # Async streaming is also supported
    # stream=True
)
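
The await expression above assumes an already-running event loop, as in a notebook cell. In a standalone script, one hedged pattern is to wrap the call in a coroutine and run it with asyncio.run (the main function name is illustrative):

import asyncio

import mlflow
import openai

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()


async def main():
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
        ],
    )
    print(response.choices[0].message.content)


asyncio.run(main())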

Function Calling

MLflow Tracing automatically captures function calling responses from OpenAI models. The function instruction in the response will be highlighted in the trace UI. Moreover, you can annotate the tool function with the @mlflow.trace decorator to create a span for the tool execution.

OpenAI Function Calling Trace

The following example implements a simple function-calling agent using OpenAI function calling and MLflow Tracing.

import json
from openai import OpenAI
import mlflow
from mlflow.entities import SpanType

client = OpenAI()


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(city: str) -> str:
    if city == "Tokyo":
        return "sunny"
    elif city == "Paris":
        return "rainy"
    return "unknown"


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
            },
        },
    }
]

_tool_functions = {"get_weather": get_weather}


# Define a simple tool-calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str):
    messages = [{"role": "user", "content": question}]

    # Invoke the model with the given question and available tools
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
    )
    ai_msg = response.choices[0].message
    messages.append(ai_msg)

    # If the model requests tool call(s), invoke the functions with the specified arguments
    if tool_calls := ai_msg.tool_calls:
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            if tool_func := _tool_functions.get(function_name):
                args = json.loads(tool_call.function.arguments)
                tool_result = tool_func(**args)
            else:
                raise RuntimeError("An invalid tool was returned from the assistant!")

            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": tool_result,
                }
            )

        # Send the tool results to the model and get a new response
        response = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages
        )

    return response.choices[0].message.content


# Run the tool calling agent
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

Token Usage

MLflow >= 3.1.0 supports token usage tracking for OpenAI. The token usage for each LLM call is logged in the mlflow.chat.tokenUsage attribute. The total token usage throughout the trace is available in the token_usage field of the trace info object.

import mlflow

mlflow.openai.autolog()

# Run the tool calling agent defined in the previous section
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f" Input tokens: {total_usage['input_tokens']}")
print(f" Output tokens: {total_usage['output_tokens']}")
print(f" Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f" Input tokens: {usage['input_tokens']}")
        print(f" Output tokens: {usage['output_tokens']}")
        print(f" Total tokens: {usage['total_tokens']}")
== Total token usage: ==
 Input tokens: 84
 Output tokens: 22
 Total tokens: 106

== Detailed usage for each LLM call: ==
Completions_1:
 Input tokens: 45
 Output tokens: 14
 Total tokens: 59
Completions_2:
 Input tokens: 39
 Output tokens: 8
 Total tokens: 47

Supported APIs:

Token usage tracking is supported for the following OpenAI APIs:

| Mode | Chat Completion | Responses | JS / TS |
| --- | --- | --- | --- |
| Normal | ✅ | ✅ | ✅ |
| Streaming | ✅ (*1) | ✅ | ✅ |
| Async | ✅ | ✅ | ✅ |

(*1) By default, OpenAI does not return token usage information for the Chat Completion API when streaming. To track token usage, you need to specify stream_options={"include_usage": True} in the request (see the OpenAI API reference).
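
For example, a minimal sketch of a streaming request with usage reporting enabled (the model name is illustrative; when usage is included, the final chunk carries the usage and has an empty choices list):

import openai
import mlflow

mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    # Ask OpenAI to include token usage in the final stream chunk
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")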

Disabling Auto-tracing

Auto-tracing for OpenAI can be disabled globally by calling mlflow.openai.autolog(disable=True) or mlflow.autolog(disable=True).
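
For example:

import mlflow

# Disable auto-tracing for OpenAI only
mlflow.openai.autolog(disable=True)

# Or disable all autologging integrations at once
mlflow.autolog(disable=True)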