Tracing DSPy🧩
DSPy is an open-source framework for building modular AI systems, with algorithms for optimizing their prompts and weights.
MLflow Tracing provides automatic tracing for DSPy. You can enable tracing for DSPy by calling the mlflow.dspy.autolog() function; nested traces are then automatically logged to the active MLflow Experiment whenever DSPy modules are invoked.
import mlflow
mlflow.dspy.autolog()
The MLflow DSPy integration is not limited to tracing. MLflow offers a full tracking experience for DSPy, including model tracking and evaluation. Check out the MLflow DSPy Flavor to learn more!
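For instance, here is a minimal sketch of the model-tracking side, assuming the MLflow DSPy flavor's mlflow.dspy.log_model / mlflow.dspy.load_model APIs (argument names vary slightly across MLflow releases):

import dspy
import mlflow

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
program = dspy.ChainOfThought("question -> answer")

# Log the DSPy program as an MLflow model
# (the second argument is the model name / artifact path, depending on the MLflow version)
with mlflow.start_run():
    model_info = mlflow.dspy.log_model(program, "model")

# Load the program back and run inference
loaded = mlflow.dspy.load_model(model_info.model_uri)
print(loaded(question="What is MLflow Tracing?"))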
Example Usage
import dspy
import mlflow
# Enabling tracing for DSPy
mlflow.dspy.autolog()
# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("DSPy")
# Configure the language model
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)


# Define a simple summarizer module and run it
class SummarizeSignature(dspy.Signature):
    """Given a passage, generate a summary."""

    passage: str = dspy.InputField(desc="a passage to summarize")
    summary: str = dspy.OutputField(desc="a one-line summary of the passage")


class Summarize(dspy.Module):
    def __init__(self):
        super().__init__()
        self.summarize = dspy.ChainOfThought(SummarizeSignature)

    def forward(self, passage: str):
        return self.summarize(passage=passage)


summarizer = Summarize()
summarizer(
    passage=(
        "MLflow Tracing is a feature that enhances LLM observability in your Generative AI (GenAI) applications "
        "by capturing detailed information about the execution of your application's services. Tracing provides "
        "a way to record the inputs, outputs, and metadata associated with each intermediate step of a request, "
        "enabling you to easily pinpoint the source of bugs and unexpected behaviors."
    )
)
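After the call, you can inspect the recorded trace programmatically as well as in the UI. A hedged sketch, assuming the trace-retrieval APIs of recent MLflow versions (mlflow.get_last_active_trace_id and mlflow.get_trace):

# Hedged sketch: fetch the trace produced by the summarizer call above
# (API availability varies by MLflow version)
trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id)
for span in trace.data.spans:
    print(span.name)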
Tracing during Evaluation
Evaluating DSPy models is an important step in developing AI systems. MLflow Tracing can help you track your program's performance after evaluation by providing detailed information about the program's execution for each input.
When MLflow auto-tracing is enabled for DSPy, traces are generated automatically when you run DSPy's built-in evaluation suite. The following example demonstrates how to run an evaluation and review the traces in MLflow:
import dspy
from dspy.evaluate.metrics import answer_exact_match
import mlflow
# Enabling tracing for DSPy evaluation
mlflow.dspy.autolog(log_traces_from_eval=True)

# Note: an LM must be configured before running the program, e.g.:
# dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Define a simple evaluation set
eval_set = [
    dspy.Example(
        question="How many 'r's are in the word 'strawberry'?", answer="3"
    ).with_inputs("question"),
    dspy.Example(
        question="How many 'a's are in the word 'banana'?", answer="3"
    ).with_inputs("question"),
    dspy.Example(
        question="How many 'e's are in the word 'elephant'?", answer="2"
    ).with_inputs("question"),
]
# Define a program
class Counter(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField(
        desc="Should only contain a single number as an answer"
    )
cot = dspy.ChainOfThought(Counter)
# Evaluate the program
with mlflow.start_run(run_name="CoT Evaluation"):
    evaluator = dspy.evaluate.Evaluate(
        devset=eval_set,
        return_all_scores=True,
        return_outputs=True,
        show_progress=True,
    )
    aggregated_score, outputs, all_scores = evaluator(cot, metric=answer_exact_match)

    # Log the aggregated score
    mlflow.log_metric("exact_match", aggregated_score)
    # Log the detailed evaluation results as a table
    mlflow.log_table(
        {
            "question": [example.question for example in eval_set],
            "answer": [example.answer for example in eval_set],
            "output": outputs,
            "exact_match": all_scores,
        },
        artifact_file="eval_results.json",
    )
If you open the MLflow UI and go to the "CoT Evaluation" run, you will see the evaluation results, along with the list of traces generated during the evaluation on the Traces tab.
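You can also fetch these traces programmatically. A minimal sketch, assuming mlflow.search_traces and mlflow.last_active_run are available in your MLflow version (search_traces returns a pandas DataFrame):

import mlflow

# Hedged sketch: query the traces linked to the "CoT Evaluation" run
run = mlflow.last_active_run()
traces = mlflow.search_traces(run_id=run.info.run_id)
print(traces.head())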
You can disable tracing for these evaluation steps by calling the mlflow.dspy.autolog() function with the log_traces_from_eval parameter set to False:
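import mlflow

# Keep autolog enabled, but do not generate traces during DSPy evaluation
mlflow.dspy.autolog(log_traces_from_eval=False)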
Tracing during Compilation (Optimization)
Compilation (optimization) is a core concept in DSPy. Through compilation, DSPy automatically optimizes the prompts and weights of your DSPy program for the best performance.
By default, MLflow does not generate traces during compilation, because compilation can trigger hundreds or thousands of invocations of your DSPy modules. To enable tracing for compilation, call the mlflow.dspy.autolog() function with the log_traces_from_compile parameter set to True.
import dspy
import mlflow
# Enable auto-tracing for compilation
mlflow.dspy.autolog(log_traces_from_compile=True)
# Optimize the DSPy program as usual
# (`metric`, `cot`, and `trainset` are assumed to be defined, e.g. as in the
# evaluation example above; the trailing "..." stands for further optimizer options)
tp = dspy.MIPROv2(metric=metric, auto="medium", num_threads=24)
optimized = tp.compile(cot, trainset=trainset, ...)
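After compilation you will often want to persist the optimized program. A minimal, hedged sketch using DSPy's save/load state APIs (the file name is illustrative):

# Hedged sketch: save and reload the optimized program's state
optimized.save("optimized_cot.json")

reloaded = dspy.ChainOfThought(Counter)
reloaded.load("optimized_cot.json")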
Disabling Auto-Tracing
Auto-tracing for DSPy can be disabled globally by calling mlflow.dspy.autolog(disable=True) or mlflow.autolog(disable=True):
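import mlflow

# Disable auto-tracing for DSPy only
mlflow.dspy.autolog(disable=True)

# Or disable all MLflow autologging integrations globally
mlflow.autolog(disable=True)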