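Tracing Quickstart

This quickstart walks you through setting up a simple GenAI application and tracing it with MLflow Tracing. In under 10 minutes, you will enable tracing, run a basic application, and explore the resulting traces in the MLflow UI.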

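Prerequisites

This guide requires the following packages:

mlflow>=3.1: core MLflow functionality with GenAI features
openai>=1.0.0: for OpenAI API integration

Install the required packages (the quotes keep the shell from treating >= as an output redirect):

pip install --upgrade "mlflow>=3.1" "openai>=1.0.0"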
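
To confirm that the installed packages meet the minimums listed above, a small check along these lines may help. It is a minimal sketch using only the standard library; `version_tuple`, `meets_minimum`, and `check_requirements` are hypothetical helper names, and the comparison deliberately ignores pre-release suffixes such as `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the prerequisites list in this guide.
REQUIRED = {"mlflow": "3.1", "openai": "1.0.0"}


def version_tuple(text, width=3):
    """Turn '3.1' into (3, 1, 0); stop at the first non-numeric part (e.g. '0rc1')."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    parts += [0] * (width - len(parts))  # pad so '3.1' and '3.1.0' compare equal
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(required=REQUIRED):
    """Print one status line per package and return True if all are satisfied."""
    all_ok = True
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            print(f"{name}: not installed (need >= {minimum})")
            all_ok = False
            continue
        ok = meets_minimum(installed, minimum)
        print(f"{name}: {installed} ({'ok' if ok else 'need >= ' + minimum})")
        all_ok = all_ok and ok
    return all_ok


check_requirements()
```

If either package is reported as missing or too old, rerun the install command before continuing.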

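Note

Tracing is available in MLflow 2.15.0 and later, but some advanced features may be limited compared with MLflow 3. Upgrading to MLflow 3 is strongly recommended for its substantially improved tracing capabilities.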

Step 1: Set Up Your Environment

Choose one of three setups: local MLflow, a remote MLflow server, or Databricks.

For the fastest setup, run MLflow locally:

# Start MLflow tracking server locally
mlflow ui

# This will start the server at http://127.0.0.1:5000

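If you have a remote MLflow tracking server, configure the connection using either the environment variable or the call in code: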

import os
import mlflow

# Set your MLflow tracking URI
os.environ["MLFLOW_TRACKING_URI"] = "http://your-mlflow-server:5000"
# Or directly in code
mlflow.set_tracking_uri("http://your-mlflow-server:5000")
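
Whether the server is local or remote, it can be worth checking that it answers before running the application. The sketch below assumes the server exposes a `/health` endpoint that returns HTTP 200, which is how a stock MLflow tracking server is understood to behave; a proxy or a Databricks-hosted workspace may differ. `tracking_server_is_up` is a hypothetical helper name.

```python
import http.client
import urllib.error
import urllib.request


def tracking_server_is_up(base_url="http://127.0.0.1:5000", timeout=3.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or a malformed URL.
        return False


if tracking_server_is_up():
    print("MLflow server is reachable")
else:
    print("MLflow server is not reachable; is it running at this address?")
```

Pass your own server address, for example `tracking_server_is_up("http://your-mlflow-server:5000")`, to check a remote deployment.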

If you have a Databricks account, configure the connection:

import mlflow

mlflow.login()

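This prompts you for the configuration details (your Databricks host URL and a personal access token).

Configure Your OpenAI API Key

Set your OpenAI API key as an environment variable: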

import os

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = "your-api-key-here"  # Replace with your actual API key

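Tip

You can also set the environment variables in your shell before running the script, which keeps the key out of your source code: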

export OPENAI_API_KEY="your-api-key-here"
export MLFLOW_TRACKING_URI="http://your-mlflow-server:5000"  # Optional: for remote server, set as 'databricks' if connecting to a Databricks account
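
A quick way to catch a missing key before the first API call is to check the environment up front. This is a minimal sketch; `missing_env_vars` is a hypothetical helper, and it only checks that the variable is non-empty, not that the key is valid.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or blank in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("OPENAI_API_KEY is set")
```

Add `MLFLOW_TRACKING_URI` to the list if you rely on a remote server rather than the local default.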

Step 2: Create a Simple Application with Tracing

import mlflow
from openai import OpenAI

# Set up MLflow experiment
mlflow.set_experiment("openai-tracing-quickstart")

# Enable automatic tracing for all OpenAI API calls
mlflow.openai.autolog()

client = OpenAI()


def get_weather_response(location):
    """Get a weather-related response for a given location."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful weather assistant."},
            {"role": "user", "content": f"What's the weather like in {location}?"},
        ],
        max_tokens=100,
        temperature=0.7,
    )
    return response.choices[0].message.content


# Execute the traced function
location = "San Francisco"
response = get_weather_response(location)

print(f"Query: What's the weather like in {location}?")
print(f"Response: {response}")
print("\nTraces have been captured!")
print(
    "View them in the MLflow UI at: http://127.0.0.1:5000 (or your MLflow server URL)"
)

Step 3: Run the Application

In a Jupyter Notebook

Run the code cell above. You should see output similar to the following:

Query: What's the weather like in San Francisco?
Response: I don't have real-time weather data, but San Francisco typically has mild temperatures year-round...
Traces have been captured!
View them in the MLflow UI at: http://127.0.0.1:5000

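Tip

If you are using Jupyter with MLflow 2.20 or later, the trace UI is displayed automatically in your notebook whenever a trace is generated.

As a Python script

Save the code above to a file named weather_app.py, then run the script: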

python weather_app.py

You should see output similar to the following:

Query: What's the weather like in San Francisco?
Response: I don't have real-time weather data, but San Francisco typically has mild temperatures year-round...
Traces have been captured!
View them in the MLflow UI at: http://127.0.0.1:5000

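Step 4: Explore Traces in the MLflow UI

Open the MLflow UI by navigating to http://127.0.0.1:5000 (or your MLflow server URL).
Click the "openai-tracing-quickstart" experiment in the experiment list.
Click the "Traces" tab to see the traces captured from your OpenAI API calls.
Click any individual trace to open the detailed trace view.
Explore the trace details, including input messages, the system prompt, model parameters (temperature, max_tokens, and so on), the output response, execution time, token usage, and the full request/response flow.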

MLflow Traces UI
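
The span timings shown in the UI can also be read programmatically. The sketch below is one possible approach, not the documented workflow: it assumes `mlflow.get_last_active_trace_id()` and `mlflow.get_trace()` are available (as they are understood to be in MLflow 3) and that each span exposes `name`, `start_time_ns`, and `end_time_ns`. `summarize_spans` and `print_last_trace` are hypothetical helpers, and the stand-in spans at the bottom are made-up data used only to exercise the helper without a server or API key.

```python
from types import SimpleNamespace


def summarize_spans(spans):
    """Return (name, duration in ms) for each span that has finished."""
    rows = []
    for span in spans:
        if span.end_time_ns is None:
            continue  # span still open, so there is no duration yet
        duration_ms = (span.end_time_ns - span.start_time_ns) / 1_000_000
        rows.append((span.name, duration_ms))
    return rows


def print_last_trace():
    """Fetch the most recent trace recorded in this process and print its spans."""
    import mlflow  # imported lazily so summarize_spans works without MLflow

    trace_id = mlflow.get_last_active_trace_id()
    if trace_id is None:
        print("No trace has been recorded in this process yet")
        return
    trace = mlflow.get_trace(trace_id)
    if trace is None:
        print(f"Trace {trace_id} could not be loaded")
        return
    for name, duration_ms in summarize_spans(trace.data.spans):
        print(f"{name}: {duration_ms:.1f} ms")


# Stand-in spans (illustrative values, not real trace data).
fake_spans = [
    SimpleNamespace(name="Completions", start_time_ns=0, end_time_ns=850_000_000),
    SimpleNamespace(name="still_running", start_time_ns=5, end_time_ns=None),
]
print(summarize_spans(fake_spans))
```

To inspect a real trace, call `print_last_trace()` in the same process right after `get_weather_response(...)` has run.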

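Step 5: Add Custom Tracing (Optional)

Enhance your application with custom tracing for better observability. The @mlflow.trace decorator adds a span for each decorated function, and mlflow.update_current_trace attaches tags to the active trace: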

import mlflow
from openai import OpenAI

mlflow.set_experiment("enhanced-weather-app")
mlflow.openai.autolog()

client = OpenAI()


@mlflow.trace
def preprocess_location(location):
    """Preprocess the location input."""
    # Add custom span for preprocessing
    cleaned_location = location.strip().title()
    return cleaned_location


@mlflow.trace
def get_enhanced_weather_response(location):
    """Enhanced weather response with preprocessing and custom metadata."""

    # Add tags for better organization
    mlflow.update_current_trace(
        tags={
            "location": location,
            "app_version": "1.0.0",
            "request_type": "weather_query",
        }
    )

    # Preprocess input
    cleaned_location = preprocess_location(location)

    # Make OpenAI call (automatically traced)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful weather assistant."},
            {
                "role": "user",
                "content": f"What's the weather like in {cleaned_location}?",
            },
        ],
        max_tokens=150,
        temperature=0.7,
    )

    result = response.choices[0].message.content

    # Add more tags once the response is available (tag values are stored as strings)
    mlflow.update_current_trace(
        tags={
            "response_length": str(len(result)),
            "cleaned_location": cleaned_location,
        }
    )

    return result


# Test the enhanced function
locations = ["san francisco", "  New York  ", "tokyo"]

for location in locations:
    print(f"\n--- Processing: {location} ---")
    response = get_enhanced_weather_response(location)
    print(f"Response: {response[:100]}...")

print("\nEnhanced traces captured! Check the MLflow UI for detailed trace information.")
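
To see what the cleaned_location tag will contain for each test input, the preprocessing logic can be run on its own, without MLflow or OpenAI. This sketch repeats the same `strip().title()` logic minus the decorator.

```python
def preprocess_location(location):
    """Same cleaning logic as the traced function, without the @mlflow.trace decorator."""
    return location.strip().title()


for raw in ["san francisco", "  New York  ", "tokyo"]:
    print(repr(raw), "->", repr(preprocess_location(raw)))
```

Note that `str.title()` capitalizes the letter after any non-letter character, so an input such as "st. john's" becomes "St. John'S". A real application would likely need a more careful normalizer.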

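Next Steps

Now that you have basic tracing set up, explore these more advanced features:

Search and filter traces: Learn how to use search to find specific traces.
Add tags and context: Organize your traces with custom tags for better monitoring.
Production deployment: Set up production monitoring with the lightweight SDK.
Integration with other libraries: Explore automatic tracing for LangChain, LlamaIndex, and more.
Manual instrumentation: Learn manual tracing techniques for custom applications.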

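Troubleshooting

Common Issues

Traces do not appear in the UI

Verify that the MLflow server is running and reachable.
Check that MLFLOW_TRACKING_URI is set correctly.
Make sure you are looking at the right experiment (MLflow creates it automatically if it does not exist).

OpenAI API errors

Verify that OPENAI_API_KEY is set correctly.
Check that you have API credits available.
Make sure the model name (gpt-4o-mini) is correct and accessible to your account.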

MLflow server does not start

Check whether port 5000 is already in use: lsof -i :5000
Try a different port: mlflow ui --port 5001
Verify the MLflow installation: mlflow --version
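
On systems where lsof is not available, the port check can be approximated in Python. This is a minimal sketch with a hypothetical helper name, `port_in_use`; it only reports whether something accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


for candidate in (5000, 5001):
    state = "in use" if port_in_use(candidate) else "free"
    print(f"port {candidate}: {state}")
```

If port 5000 is reported as in use by something other than MLflow, start the server on the free port and update the tracking URI to match.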

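Summary

Congratulations! You have successfully:

✅ Set up MLflow Tracing for a GenAI application
✅ Enabled automatic tracing of OpenAI API calls
✅ Generated and explored traces in the MLflow UI
✅ Learned how to add custom tracing and metadata

MLflow Tracing gives your GenAI applications strong observability, helping you monitor performance, debug issues, and understand user interactions. Keep exploring the advanced features to get the most out of your tracing setup.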
