追踪 OpenAI

OpenAI Tracing via autolog

MLflow 追踪为 OpenAI 提供自动追踪能力。通过调用 mlflow.openai.autolog() 函数为 OpenAI 启用自动追踪后，MLflow 将捕获 LLM 调用追踪并将其记录到当前的 MLflow 实验中。

import mlflow

mlflow.openai.autolog()

MLflow 追踪自动捕获有关 OpenAI 调用的以下信息

提示和完成响应
延迟
模型名称
如果指定，还包括额外的元数据，如 temperature、max_tokens。
如果响应中返回函数调用
内置工具，例如网页搜索、文件搜索、计算机使用等。
如果抛出任何异常

提示

MLflow OpenAI 集成不仅仅是追踪。MLflow 为 OpenAI 提供全面的追踪体验，包括模型追踪、提示管理和评估。请查看MLflow OpenAI Flavor以了解更多信息！

支持的 API

MLflow 支持对以下 OpenAI API 进行自动追踪。如需请求对其他 API 的支持，请在 GitHub 上提交功能请求。

聊天补全 API

普通	函数调用	结构化输出	流式传输	异步	图像	音频
✅	✅	✅(>=2.21.0)	✅ (>=2.15.0)	✅(>=2.21.0)	-	-

响应 API

普通	函数调用	结构化输出	网页搜索	文件搜索	计算机使用	推理	流式传输	异步	图像
✅	✅	✅	✅	✅	✅	✅	✅	✅	-

响应 API 自 MLflow 2.22.0 版本开始支持。

代理 SDK

请参阅OpenAI 代理 SDK 追踪了解更多详情。

嵌入 API

普通	异步
✅	✅

基本示例

聊天补全 API
响应 API

import openai
import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("https://:5000")
mlflow.set_experiment("OpenAI")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)

import openai
import mlflow

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("https://:5000")
mlflow.set_experiment("OpenAI")

openai_client = openai.OpenAI()

response = client.responses.create(
    model="gpt-4o-mini", input="What is the capital of France?"
)

流式传输

MLflow 追踪支持 OpenAI SDK 的流式 API。通过相同的自动追踪设置，MLflow 会自动追踪流式响应，并在追踪界面 (span UI) 中渲染拼接后的输出。响应流中的实际数据块也可以在 Event 选项卡中找到。

聊天补全 API
响应 API

import openai
import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    stream=True,  # Enable streaming response
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

import openai
import mlflow

# Enable trace logging
mlflow.openai.autolog()

client = openai.OpenAI()

stream = client.responses.create(
    model="gpt-4o-mini",
    input="How fast would a glass of water freeze on Titan?",
    stream=True,  # Enable streaming response
)
for event in stream:
    print(event)

异步

MLflow 追踪自 MLflow 2.21.0 版本起支持 OpenAI SDK 的异步 API。用法与同步 API 相同。

聊天补全 API
响应 API

import openai

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()

response = await client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "How fast would a glass of water freeze on Titan?"}
    ],
    # Async streaming is also supported
    # stream=True
)

import openai

# Enable trace logging
mlflow.openai.autolog()

client = openai.AsyncOpenAI()

response = await client.responses.create(
    model="gpt-4o-mini", input="How fast would a glass of water freeze on Titan?"
)

函数调用

MLflow 追踪自动捕获来自 OpenAI 模型的函数调用响应。响应中的函数指令将在追踪界面中高亮显示。此外，您可以使用 @mlflow.trace 装饰器来标注工具函数，从而为工具执行创建追踪片段 (span)。

OpenAI Function Calling Trace

以下示例展示了如何使用 OpenAI 函数调用和 MLflow 对 OpenAI 的追踪功能来实现一个简单的函数调用代理。

聊天补全 API
响应 API

import json
from openai import OpenAI
import mlflow
from mlflow.entities import SpanType

client = OpenAI()


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(city: str) -> str:
    if city == "Tokyo":
        return "sunny"
    elif city == "Paris":
        return "rainy"
    return "unknown"


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
            },
        },
    }
]

_tool_functions = {"get_weather": get_weather}


# Define a simple tool calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str):
    messages = [{"role": "user", "content": question}]

    # Invoke the model with the given question and available tools
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
    )
    ai_msg = response.choices[0].message
    messages.append(ai_msg)

    # If the model request tool call(s), invoke the function with the specified arguments
    if tool_calls := ai_msg.tool_calls:
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            if tool_func := _tool_functions.get(function_name):
                args = json.loads(tool_call.function.arguments)
                tool_result = tool_func(**args)
            else:
                raise RuntimeError("An invalid tool is returned from the assistant!")

            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": tool_result,
                }
            )

        # Sent the tool results to the model and get a new response
        response = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages
        )

    return response.choices[0].message.content


# Run the tool calling agent
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

import json
import requests
from openai import OpenAI
import mlflow
from mlflow.entities import SpanType

client = OpenAI()


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(latitude, longitude):
    response = requests.get(
        f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    )
    data = response.json()
    return data["current"]["temperature_2m"]


tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current temperature for provided coordinates in celsius.",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"},
            },
            "required": ["latitude", "longitude"],
            "additionalProperties": False,
        },
        "strict": True,
    }
]


# Define a simple tool calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str):
    messages = [{"role": "user", "content": question}]

    # Invoke the model with the given question and available tools
    response = client.responses.create(
        model="gpt-4o-mini",
        input=question,
        tools=tools,
    )

    # Invoke the function with the specified arguments
    tool_call = response.output[0]
    args = json.loads(tool_call.arguments)
    result = get_weather(args["latitude"], args["longitude"])

    # Sent the tool results to the model and get a new response
    messages.append(tool_call)
    messages.append(
        {
            "type": "function_call_output",
            "call_id": tool_call.call_id,
            "output": str(result),
        }
    )

    response = client.responses.create(
        model="gpt-4o-mini",
        input=input_messages,
        tools=tools,
    )

    return response.output[0].content[0].text


# Run the tool calling agent
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

令牌使用量

MLflow >= 3.1.0 支持 OpenAI 的令牌使用量追踪。每次 LLM 调用的令牌使用量将记录在 mlflow.chat.tokenUsage 属性中。整个追踪的总令牌使用量将在追踪信息对象的 token_usage 字段中提供。

import json
import mlflow

mlflow.openai.autolog()

# Run the tool calling agent defined in the previous section
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f"  Input tokens: {total_usage['input_tokens']}")
print(f"  Output tokens: {total_usage['output_tokens']}")
print(f"  Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f"  Input tokens: {usage['input_tokens']}")
        print(f"  Output tokens: {usage['output_tokens']}")
        print(f"  Total tokens: {usage['total_tokens']}")

== Total token usage: ==
  Input tokens: 84
  Output tokens: 22
  Total tokens: 106

== Detailed usage for each LLM call: ==
Completions_1:
  Input tokens: 45
  Output tokens: 14
  Total tokens: 59
Completions_2:
  Input tokens: 39
  Output tokens: 8
  Total tokens: 47

支持的 API：

以下 OpenAI API 支持令牌使用量追踪

模式	聊天补全	响应
普通	✅	✅
流式传输	✅(*1)	✅
异步	✅	✅

(*1) 默认情况下，当进行流式传输时，OpenAI 不会为聊天补全 API 返回令牌使用量信息。要追踪令牌使用量，您需要在请求中指定 stream_options={"include_usage": True}（参阅OpenAI API 参考）。

禁用自动跟踪

可以通过调用 mlflow.openai.autolog(disable=True) 或 mlflow.autolog(disable=True) 全局禁用 OpenAI 的自动追踪功能。

支持的 API​

聊天补全 API​

响应 API​

代理 SDK​

嵌入 API​

基本示例​

流式传输​

异步​

函数调用​

令牌使用量​

支持的 API：​

禁用自动跟踪​