Tracing Haystack

Haystack Tracing via autolog

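Haystack is an open-source AI orchestration framework developed by deepset, designed to help Python developers build production-ready LLM-powered applications. Its modular architecture is built around components and pipelines, and it can be used to build anything from retrieval-augmented generation (RAG) workflows to autonomous agentic systems and scalable search engines.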

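MLflow Tracing provides automatic tracing for Haystack pipelines and components. Once Haystack auto-tracing is enabled by calling the mlflow.haystack.autolog() function, every run of a Haystack pipeline or component during interactive development generates a trace and logs it automatically.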

python
import mlflow

mlflow.haystack.autolog()

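MLflow Tracing automatically captures the following information:

  • Pipelines and components
  • Latencies
  • Metadata about the components added, such as tool names
  • Token usage and cost
  • Cache hits
  • Any exception, if raised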
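
As a rough illustration of the latency information listed above, the sketch below computes per-span latency from span start and end timestamps. It assumes span objects that expose name, start_time_ns and end_time_ns, as MLflow spans do. The helper summarize_spans and the demo values are hypothetical and stand in for a real trace so that the snippet runs without MLflow or an LLM call.

```python
from types import SimpleNamespace


def summarize_spans(spans):
    """Return (name, latency_ms) for each span, slowest first."""
    rows = []
    for span in spans:
        # Timestamps are in nanoseconds; convert the difference to milliseconds.
        latency_ms = (span.end_time_ns - span.start_time_ns) / 1e6
        rows.append((span.name, latency_ms))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up stand-in spans (names and timings are illustrative only).
demo_spans = [
    SimpleNamespace(
        name="InMemoryBM25Retriever", start_time_ns=0, end_time_ns=2_000_000
    ),
    SimpleNamespace(
        name="OpenAIChatGenerator",
        start_time_ns=2_000_000,
        end_time_ns=802_000_000,
    ),
]

for name, latency_ms in summarize_spans(demo_spans):
    print(f"{name}: {latency_ms:.1f} ms")
```

With a real trace, the same helper could be given trace.data.spans from a trace fetched with mlflow.get_trace().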

Basic Example

python
import mlflow

from haystack import Document, Pipeline
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret

mlflow.haystack.autolog()
mlflow.set_experiment("Haystack Tracing")

# Write documents to InMemoryDocumentStore
document_store = InMemoryDocumentStore()
document_store.write_documents(
    [
        Document(content="My name is Jean and I live in Paris."),
        Document(content="My name is Mark and I live in Berlin."),
        Document(content="My name is Giorgio and I live in Rome."),
    ]
)

# Build a RAG pipeline
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given these documents, answer the question.\n"
        "Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{question}}\n"
        "Answer:"
    ),
]

# Define required variables explicitly
prompt_builder = ChatPromptBuilder(
    template=prompt_template, required_variables={"question", "documents"}
)

retriever = InMemoryBM25Retriever(document_store=document_store)
llm = OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm.messages")

# Ask a question
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)

print(results["llm"]["replies"])

Token Usage

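MLflow >= 3.4.0 supports token usage tracking for Haystack. The token usage of each LLM call is logged in the mlflow.chat.tokenUsage span attribute. The total token usage across the whole trace is available in the token_usage field of the trace info object.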

python
question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)

print(results["llm"]["replies"])

last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f" Input tokens: {total_usage['input_tokens']}")
print(f" Output tokens: {total_usage['output_tokens']}")
print(f" Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    # Only spans that wrap an LLM call carry the token usage attribute
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f" Input tokens: {usage['input_tokens']}")
        print(f" Output tokens: {usage['output_tokens']}")
        print(f" Total tokens: {usage['total_tokens']}")
bash
== Total token usage: ==
 Input tokens: 64
 Output tokens: 5
 Total tokens: 69

== Detailed usage for each LLM call: ==
OpenAIChatGenerator:
 Input tokens: 64
 Output tokens: 5
 Total tokens: 69
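
To make the relationship between the per-call and trace-level numbers concrete, here is a minimal sketch that sums usage dictionaries shaped like the mlflow.chat.tokenUsage attribute. It assumes the trace-level total is simply the sum over LLM calls, which matches the single-call sample output above. The helper sum_token_usage is hypothetical, and the second entry in the demo data is made up.

```python
def sum_token_usage(per_call_usage):
    """Add up input/output/total token counts across LLM calls."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in per_call_usage:
        for key in totals:
            # Treat a missing key as zero rather than failing.
            totals[key] += usage.get(key, 0)
    return totals


# The first entry mirrors the sample output above; the second is invented
# to show how multiple LLM calls would add up.
calls = [
    {"input_tokens": 64, "output_tokens": 5, "total_tokens": 69},
    {"input_tokens": 30, "output_tokens": 12, "total_tokens": 42},
]

print(sum_token_usage(calls))
```

For a real trace, the list of usage dictionaries could be collected from the spans whose mlflow.chat.tokenUsage attribute is set and compared with trace.info.token_usage.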

Disable auto-tracing

Auto-tracing for Haystack can be disabled globally by calling mlflow.haystack.autolog(disable=True) or mlflow.autolog(disable=True).
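
If tracing should be off only for part of a session, one possible pattern is a small context manager that disables auto-tracing on entry and enables it again on exit. This is a sketch and not part of MLflow: the helper tracing_disabled is hypothetical, it assumes tracing was enabled beforehand, and it takes the autolog function as an argument so that the example runs with a stand-in when MLflow is not installed.

```python
from contextlib import contextmanager


@contextmanager
def tracing_disabled(autolog):
    """Turn auto-tracing off for the duration of a with-block.

    `autolog` is the function to toggle, e.g. mlflow.haystack.autolog.
    Assumption: tracing was enabled before entering the block.
    """
    autolog(disable=True)
    try:
        yield
    finally:
        # Calling autolog without disable=True turns tracing back on.
        autolog()


# Stand-in that records how it was called, used here instead of MLflow.
calls = []


def fake_autolog(disable=False):
    calls.append(disable)


with tracing_disabled(fake_autolog):
    pass  # pipeline runs placed here would not be traced

print(calls)  # [True, False]
```

In a real session, the call would look like `with tracing_disabled(mlflow.haystack.autolog):` wrapped around the pipeline runs that should not be traced.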