Tracing OpenAI

MLflow Tracing provides automatic tracing for OpenAI. Enable auto-tracing by calling the mlflow.openai.autolog() function, and MLflow will capture traces for LLM calls and log them to the active MLflow experiment. In TypeScript, you can wrap the OpenAI client with the tracedOpenAI function.
- Python
- JS / TS
import mlflow
mlflow.openai.autolog()
import { OpenAI } from "openai";
import { tracedOpenAI } from "mlflow-openai";
const client = tracedOpenAI(new OpenAI());
MLflow traces automatically capture the following information about OpenAI calls (a sketch for inspecting these fields programmatically follows this list):
- Prompts and completion responses
- Latency
- Model name
- Additional metadata such as temperature and max_completion_tokens, if specified
- Function calls, if returned in the response
- Built-in tools such as web search, file search, computer use, etc.
- Any exception, if raised
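Once a call completes, the captured information can be inspected programmatically as well as in the UI. A minimal sketch, assuming auto-tracing is enabled and at least one OpenAI call has completed in the current session (the same trace-retrieval APIs appear again in the Token Usage section below):
import mlflow

# Fetch the trace generated by the most recent OpenAI call in this process
trace = mlflow.get_trace(mlflow.get_last_active_trace_id())

for span in trace.data.spans:
    # Prompts/completions are stored as span inputs/outputs;
    # metadata such as the model name lives in span attributes
    print(span.name, span.inputs, span.outputs)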
The MLflow OpenAI integration is not just about tracing. MLflow offers a complete tracking experience for OpenAI, including model tracking, prompt management, and evaluation. Check out the MLflow OpenAI Flavor to learn more!
Supported APIs
MLflow supports automatic tracing for the following OpenAI APIs. To request support for additional APIs, please file a feature request on GitHub.
Chat Completion API
| Normal | Function Calling | Structured Outputs | Streaming | Async | Image | Audio |
|---|---|---|---|---|---|---|
| ✅ | ✅ | ✅ (>=2.21.0) | ✅ (>=2.15.0) | ✅ (>=2.21.0) | - | - |
Responses API
| Normal | Function Calling | Structured Outputs | Web Search | File Search | Computer Use | Reasoning | Streaming | Async | Image |
|---|---|---|---|---|---|---|---|---|---|
| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | - |
The Responses API is supported since MLflow 2.22.0.
Agents SDK
See OpenAI Agents SDK Tracing for more details.
Embedding API
| Normal | Async |
|---|---|
| ✅ | ✅ |
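Embedding calls are traced with the same autolog setup. A minimal sketch (the model name text-embedding-3-small is an illustrative choice):
import openai

import mlflow

# The same autolog call covers the Embedding API
mlflow.openai.autolog()

client = openai.OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="What is the capital of France?",
)
print(len(response.data[0].embedding))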
Basic Example
- Chat Completion API
- Responses API
- JS / TS
import openai

import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("OpenAI")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="o4-mini",
    messages=messages,
    max_completion_tokens=100,
)
import openai

import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("OpenAI")

openai_client = openai.OpenAI()

response = openai_client.responses.create(
    model="o4-mini", input="What is the capital of France?"
)
import { OpenAI } from "openai";
import { tracedOpenAI } from "mlflow-openai";

// Wrap the OpenAI client with the tracedOpenAI function
const client = tracedOpenAI(new OpenAI());

// Invoke the client as usual
const response = await client.chat.completions.create({
  model: "o4-mini",
  messages: [
    { role: "system", content: "You are a helpful weather assistant." },
    { role: "user", content: "What's the weather like in Seattle?" },
  ],
});
Streaming
MLflow Tracing supports the streaming APIs of the OpenAI SDK. With the same auto-tracing setup, MLflow automatically traces streaming responses and renders the concatenated output in the span UI. The actual chunks in the response stream can also be found in the Event tab.
- Chat Completion API
- Responses API
- JS / TS
import openai

import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    stream=True,  # Enable streaming response
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
import openai

import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.responses.create(
    model="o4-mini",
    input="How fast would a glass of water freeze on Titan?",
    stream=True,  # Enable streaming response
)
for event in stream:
    print(event)
import { OpenAI } from "openai";
import { tracedOpenAI } from "mlflow-openai";

// Wrap the OpenAI client with the tracedOpenAI function
const client = tracedOpenAI(new OpenAI());

const stream = await client.chat.completions.create({
  model: "o4-mini",
  messages: [
    { role: "user", content: "How fast would a glass of water freeze on Titan?" },
  ],
  stream: true,
});
Async
MLflow Tracing supports the asynchronous APIs of the OpenAI SDK since MLflow 2.21.0. Usage is the same as for the synchronous APIs.
- Chat Completion API
- Responses API
- JS / TS
import openai

import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()

# Top-level `await` works in notebooks; in a script, wrap this in asyncio.run(...)
response = await client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    # Async streaming is also supported
    # stream=True
)
import openai

import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()

response = await client.responses.create(
    model="gpt-4o-mini", input="How fast would a glass of water freeze on Titan?"
)
The OpenAI TypeScript / JavaScript SDK is natively async. See the basic example above.
Function Calling
MLflow Tracing automatically captures function calling responses from OpenAI models. The function instructions in the response are highlighted in the trace UI. In addition, you can annotate tool functions with the @mlflow.trace decorator to create spans for tool execution.

The following example implements a simple function calling agent using OpenAI Function Calling and MLflow Tracing for OpenAI.
- Chat Completion API
- Responses API
- JS / TS
import json

from openai import OpenAI

import mlflow
from mlflow.entities import SpanType

client = OpenAI()


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(city: str) -> str:
    if city == "Tokyo":
        return "sunny"
    elif city == "Paris":
        return "rainy"
    return "unknown"


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
            },
        },
    }
]

_tool_functions = {"get_weather": get_weather}


# Define a simple tool calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str):
    messages = [{"role": "user", "content": question}]

    # Invoke the model with the given question and available tools
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
    )
    ai_msg = response.choices[0].message
    messages.append(ai_msg)

    # If the model requests tool call(s), invoke the function with the specified arguments
    if tool_calls := ai_msg.tool_calls:
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            if tool_func := _tool_functions.get(function_name):
                args = json.loads(tool_call.function.arguments)
                tool_result = tool_func(**args)
            else:
                raise RuntimeError("An invalid tool is returned from the assistant!")

            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": tool_result,
                }
            )

        # Send the tool results to the model and get a new response
        response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)

    return response.choices[0].message.content


# Run the tool calling agent
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)
import json

import requests
from openai import OpenAI

import mlflow
from mlflow.entities import SpanType

client = OpenAI()


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(latitude, longitude):
    response = requests.get(
        f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    )
    data = response.json()
    return data["current"]["temperature_2m"]


tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current temperature for provided coordinates in celsius.",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"},
            },
            "required": ["latitude", "longitude"],
            "additionalProperties": False,
        },
        "strict": True,
    }
]


# Define a simple tool calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str):
    messages = [{"role": "user", "content": question}]

    # Invoke the model with the given question and available tools
    response = client.responses.create(
        model="gpt-4o-mini",
        input=question,
        tools=tools,
    )

    # Invoke the function with the specified arguments
    tool_call = response.output[0]
    args = json.loads(tool_call.arguments)
    result = get_weather(args["latitude"], args["longitude"])

    # Send the tool results to the model and get a new response
    messages.append(tool_call)
    messages.append(
        {
            "type": "function_call_output",
            "call_id": tool_call.call_id,
            "output": str(result),
        }
    )
    response = client.responses.create(
        model="gpt-4o-mini",
        input=messages,
        tools=tools,
    )

    return response.output[0].content[0].text


# Run the tool calling agent
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)
For an example of a function calling agent with the TypeScript SDK, see the TypeScript OpenAI Quickstart.
Token Usage
MLflow >= 3.1.0 supports token usage tracking for OpenAI. The token usage for each LLM call is logged in the mlflow.chat.tokenUsage attribute. The total token usage across the whole trace is available in the token_usage field of the trace info object.
import mlflow

mlflow.openai.autolog()

# Run the tool calling agent defined in the previous section
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f" Input tokens: {total_usage['input_tokens']}")
print(f" Output tokens: {total_usage['output_tokens']}")
print(f" Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f" Input tokens: {usage['input_tokens']}")
        print(f" Output tokens: {usage['output_tokens']}")
        print(f" Total tokens: {usage['total_tokens']}")
== Total token usage: ==
 Input tokens: 84
 Output tokens: 22
 Total tokens: 106

== Detailed usage for each LLM call: ==
Completions_1:
 Input tokens: 45
 Output tokens: 14
 Total tokens: 59
Completions_2:
 Input tokens: 39
 Output tokens: 8
 Total tokens: 47
Supported APIs:
Token usage tracking is supported for the following OpenAI APIs:
| Mode | Chat Completions | Responses | JS / TS |
|---|---|---|---|
| Normal | ✅ | ✅ | ✅ |
| Streaming | ✅ (*1) | ✅ | ✅ |
| Async | ✅ | ✅ | ✅ |
(*1) By default, OpenAI does not return token usage information for the Chat Completion API when streaming. To track token usage, you need to specify stream_options={"include_usage": True} in the request (OpenAI API Reference).
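For example, a minimal sketch of a streaming Chat Completion call with usage reporting enabled:
import openai

import mlflow

mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True,
    # Ask OpenAI to include token usage in the final stream chunk,
    # so MLflow can record it in mlflow.chat.tokenUsage
    stream_options={"include_usage": True},
)
for chunk in stream:
    # The final usage-only chunk has an empty `choices` list
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")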
Disabling Auto-Tracing
Auto-tracing for OpenAI can be disabled globally by calling mlflow.openai.autolog(disable=True) or mlflow.autolog(disable=True).
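For example:
import mlflow

# Disable auto-tracing for OpenAI only
mlflow.openai.autolog(disable=True)

# Or disable all MLflow autologging integrations at once
mlflow.autolog(disable=True)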