Tracing LangChain🦜⛓️
LangChain is an open-source framework for building LLM-powered applications.
MLflow Tracing provides automatic tracing for LangChain. You can enable tracing for LangChain by calling the mlflow.langchain.autolog() function, and nested traces are automatically logged to the active MLflow Experiment when a chain is invoked. In TypeScript, you can pass the MLflow LangChain callback to the callbacks option.
- Python
- JS / TS
import mlflow
mlflow.langchain.autolog()
LangChain.js tracing is supported via OpenTelemetry ingestion. See the Getting started section below for the full setup.
Getting started
MLflow supports LangChain tracing in both Python and TypeScript/JavaScript. Select the appropriate tab below to get started.
- Python
- JS / TS (v1)
- JS / TS (v0)
1. Start MLflow
If you don't have an MLflow server yet, follow the self-hosting guide to start one.
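If you just need a local server for this quickstart, one way (assuming the mlflow CLI is installed locally) is:

```shell
# Start a local MLflow tracking server on port 5000
mlflow server --host 127.0.0.1 --port 5000
```

The UI and ingestion endpoints are then available at http://localhost:5000.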
2. Install dependencies
pip install langchain langchain-openai mlflow
3. Enable tracing
import mlflow
# Calling autolog for LangChain will enable trace logging.
mlflow.langchain.autolog()
# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("LangChain")
4. Define a chain and invoke it
import mlflow
import os
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt_template = PromptTemplate.from_template(
    "Answer the question as if you are {person}, fully embodying their style, wit, personality, and habits of speech. "
    "Emulate their quirks and mannerisms to the best of your ability, embracing their traits—even if they aren't entirely "
    "constructive or inoffensive. The question is: {question}"
)
chain = prompt_template | llm | StrOutputParser()
# Invoke the chain
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    }
)
5. View the trace in the MLflow UI
Visit http://localhost:5000 (or your custom MLflow tracking server URL) to view the trace in the MLflow UI.
1. Start MLflow
If you don't have an MLflow server yet, follow the self-hosting guide to start one.
2. Install the required dependencies:
npm i langchain @langchain/core @langchain/openai @arizeai/openinference-instrumentation-langchain
3. Enable OpenTelemetry
Enable OpenTelemetry instrumentation for LangChain in your application.
import { NodeTracerProvider, SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { LangChainInstrumentation } from "@arizeai/openinference-instrumentation-langchain";
import * as CallbackManagerModule from "@langchain/core/callbacks/manager";
// Set up the OpenTelemetry tracer provider
const provider = new NodeTracerProvider({
  spanProcessors: [
    new SimpleSpanProcessor(
      new OTLPTraceExporter({
        // Set the MLflow tracking server URL with the `/v1/traces` path. You can also use the OTEL_EXPORTER_OTLP_TRACES_ENDPOINT environment variable instead.
        url: "http://localhost:5000/v1/traces",
        // Set the experiment ID in the header. You can also use the OTEL_EXPORTER_OTLP_TRACES_HEADERS environment variable instead.
        headers: {
          "x-mlflow-experiment-id": "123",
        },
      })
    ),
  ],
});
provider.register();
// Enable LangChain instrumentation
const lcInstrumentation = new LangChainInstrumentation();
lcInstrumentation.manuallyInstrument(CallbackManagerModule);
4. Define a LangChain agent and invoke it
Note that the createAgent API is available in LangChain.js v1.0 and later. If you are on LangChain 0.x, see the v0 example.
import { createAgent, tool } from "langchain";
import * as z from "zod";
const getWeather = tool(
  (input) => `It's always sunny in ${input.city}!`,
  {
    name: "get_weather",
    description: "Get the weather for a given city",
    schema: z.object({
      city: z.string().describe("The city to get the weather for"),
    }),
  }
);
const agent = createAgent({
  model: "gpt-4o-mini",
  tools: [getWeather],
});
await agent.invoke({
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});
5. View the trace in the MLflow UI
Visit http://localhost:5000 (or your custom MLflow tracking server URL) to view the trace in the MLflow UI.
1. Start MLflow
If you don't have an MLflow server yet, follow the self-hosting guide to start one.
2. Install dependencies
Install the required dependencies:
npm i langchain @langchain/core @langchain/openai @arizeai/openinference-instrumentation-langchain
3. Enable OpenTelemetry
Enable OpenTelemetry instrumentation for LangChain in your application.
import { NodeTracerProvider, SimpleSpanProcessor } from "@opentelemetry/sdk-trace-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { LangChainInstrumentation } from "@arizeai/openinference-instrumentation-langchain";
import * as CallbackManagerModule from "@langchain/core/callbacks/manager";
// Set up the OpenTelemetry tracer provider
const provider = new NodeTracerProvider({
  spanProcessors: [
    new SimpleSpanProcessor(
      new OTLPTraceExporter({
        // Set the MLflow tracking server URL with the `/v1/traces` path. You can also use the OTEL_EXPORTER_OTLP_TRACES_ENDPOINT environment variable instead.
        url: "http://localhost:5000/v1/traces",
        // Set the experiment ID in the header. You can also use the OTEL_EXPORTER_OTLP_TRACES_HEADERS environment variable instead.
        headers: {
          "x-mlflow-experiment-id": "123",
        },
      })
    ),
  ],
});
provider.register();
// Enable LangChain instrumentation
const lcInstrumentation = new LangChainInstrumentation();
lcInstrumentation.manuallyInstrument(CallbackManagerModule);
4. Define a LangChain chain and invoke it
import { OpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
const model = new OpenAI({ model: "gpt-4o-mini" });
const prompt = PromptTemplate.fromTemplate("What is a good name for a company that makes {product}?");
const chain = prompt.pipe(model);
const res = await chain.invoke({ product: "colorful socks" });
console.log({ res });
5. View the trace in the MLflow UI
Visit http://localhost:5000 (or your custom MLflow tracking server URL) to view the trace in the MLflow UI.
The examples above have been verified to work with the following package versions:
pip install openai==1.30.5 langchain==0.2.1 langchain-openai==0.1.8 langchain-community==0.2.1 mlflow==2.14.0 tiktoken==0.7.0
Supported APIs
The following APIs are supported by LangChain automatic tracing.
- invoke
- batch
- stream
- ainvoke
- abatch
- astream
- get_relevant_documents (for retrievers)
- __call__ (for Chains and AgentExecutors)
Token usage tracking
MLflow >= 3.1.0 supports token usage tracking for LangChain. The token usage for each LLM call during a chain invocation is recorded in the mlflow.chat.tokenUsage span attribute, and the total usage across the entire trace is recorded in the mlflow.trace.tokenUsage metadata field.
import mlflow

mlflow.langchain.autolog()

# Execute the chain defined in the previous example
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    }
)

# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the total token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f"  Input tokens: {total_usage['input_tokens']}")
print(f"  Output tokens: {total_usage['output_tokens']}")
print(f"  Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Token usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f"  Input tokens: {usage['input_tokens']}")
        print(f"  Output tokens: {usage['output_tokens']}")
        print(f"  Total tokens: {usage['total_tokens']}")
== Total token usage: ==
Input tokens: 81
Output tokens: 257
Total tokens: 338
== Token usage for each LLM call: ==
ChatOpenAI:
Input tokens: 81
Output tokens: 257
Total tokens: 338
Customizing trace behavior
Sometimes you may want to customize the information recorded in traces. You can do this by creating a custom callback handler that inherits from MlflowLangchainTracer. MlflowLangchainTracer is a callback handler injected into the LangChain model inference process to log traces automatically. It starts a new span for a set of chain operations (e.g., on_chain_start, on_llm_start) and ends it when the operation completes. Various metadata such as the span type, operation name, inputs, outputs, and latency is automatically recorded on the span.
The following example demonstrates how to record an additional attribute on the span when a chat model starts running.
from typing import Any, Dict, List, Optional
from uuid import UUID

from langchain_core.messages import BaseMessage
from mlflow.entities import SpanType
from mlflow.langchain.langchain_tracer import MlflowLangchainTracer


class CustomLangchainTracer(MlflowLangchainTracer):
    # Override the handler functions to customize the behavior. The method signature is defined by LangChain Callbacks.
    def on_chat_model_start(
        self,
        serialized: Dict[str, Any],
        messages: List[List[BaseMessage]],
        *,
        run_id: UUID,
        tags: Optional[List[str]] = None,
        parent_run_id: Optional[UUID] = None,
        metadata: Optional[Dict[str, Any]] = None,
        name: Optional[str] = None,
        **kwargs: Any,
    ):
        """Run when a chat model starts running."""
        attributes = {
            **kwargs,
            **(metadata or {}),
            # Add an additional attribute to the span
            "version": "1.0.0",
        }
        # Call the _start_span method at the end of the handler function to start a new span.
        self._start_span(
            span_name=name or self._assign_span_name(serialized, "chat model"),
            parent_run_id=parent_run_id,
            span_type=SpanType.CHAT_MODEL,
            run_id=run_id,
            inputs=messages,
            attributes=attributes,
        )
Disabling automatic tracing
Automatic tracing for LangChain can be disabled globally by calling mlflow.langchain.autolog(disable=True) or mlflow.autolog(disable=True).