
Tracing FireworksAI

FireworksAI Tracing via autolog

FireworksAI is an inference and customization engine for open-source AI. It provides day-zero access to the latest SOTA OSS models and lets developers build lightning-fast AI applications.

MLflow Tracing provides automatic tracing for FireworksAI through its OpenAI SDK compatibility. Because FireworksAI is compatible with the OpenAI SDK, you can enable auto-tracing by calling the mlflow.openai.autolog() function. MLflow will capture traces for LLM invocations and log them to the active MLflow experiment.

MLflow automatically captures the following information about FireworksAI calls:

  • Prompts and completion responses
  • Latencies
  • Model name
  • Additional metadata such as temperature and max_completion_tokens, if specified
  • Tool use, if returned in the response
  • Any exception, if raised

Supported APIs

Because FireworksAI is compatible with the OpenAI SDK, all APIs supported by MLflow's OpenAI integration work seamlessly with FireworksAI. For the list of models available on FireworksAI, see the model library.

Normal · Tool Use · Structured Outputs · Streaming · Async

Quickstart

python
import mlflow
import openai
import os

# Enable auto-tracing
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("FireworksAI")

# Create an OpenAI client configured for FireworksAI
openai_client = openai.OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.getenv("FIREWORKS_API_KEY"),
)

# Use the client as usual - traces will be automatically captured
response = openai_client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3-0324",  # For other models see: https://fireworks.ai/models
    messages=[
        {"role": "user", "content": "Why is open source better than closed source?"}
    ],
)

Chat Completion API Example

python
import openai
import mlflow
import os

# Enable auto-tracing
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
# If running locally you can start a server with: `mlflow server --host 127.0.0.1 --port 5000`
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("FireworksAI")

# Configure OpenAI client for FireworksAI
openai_client = openai.OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.getenv("FIREWORKS_API_KEY"),
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

# To use different models check out the model library at: https://fireworks.ai/models
response = openai_client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3-0324",
    messages=messages,
    max_completion_tokens=100,
)

Token Usage

MLflow supports tracking token usage for FireworksAI calls. The token usage for each LLM call is logged in the mlflow.chat.tokenUsage attribute, and the total token usage for the whole trace is available in the token_usage field of the trace info object.

python
import json
import mlflow

mlflow.openai.autolog()

# Run the tool calling agent defined in the previous section
question = "What's the weather like in Paris today?"
answer = run_tool_agent(question)

# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f" Input tokens: {total_usage['input_tokens']}")
print(f" Output tokens: {total_usage['output_tokens']}")
print(f" Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f"  Input tokens: {usage['input_tokens']}")
        print(f"  Output tokens: {usage['output_tokens']}")
        print(f"  Total tokens: {usage['total_tokens']}")
bash
== Total token usage: ==
Input tokens: 20
Output tokens: 283
Total tokens: 303

== Detailed usage for each LLM call: ==
Completions:
Input tokens: 20
Output tokens: 283
Total tokens: 303

Disabling Auto-Tracing

Auto-tracing for FireworksAI can be disabled globally by calling mlflow.openai.autolog(disable=True) or mlflow.autolog(disable=True).