LangChain 与 MLflow 的使用介绍

下载此 notebook

欢迎使用本互动式教程，旨在向您介绍 LangChain 及其与 MLflow 的集成。本教程以 notebook 形式呈现，为您提供 LangChain 最简单、最核心功能的动手实践学习体验。

您将学到什么

了解 LangChain：了解 LangChain 的基础知识以及如何在语言模型驱动的应用开发中使用它。
LangChain 中的 Chain：探索 LangChain 中 chain 的概念，它是由一系列动作或操作编排而成，用于执行复杂任务。
与 MLflow 集成：了解 LangChain 如何与 MLflow 集成，MLflow 是一个用于管理机器学习生命周期的平台，包括日志记录、追踪和模型部署。
实际应用：运用您的知识构建一个像副厨师长一样的 LangChain chain，专注于食谱的准备步骤。

LangChain 背景知识

LangChain 是一个基于 Python 的框架，用于简化使用语言模型的应用开发。它旨在增强应用的上下文感知和推理能力，从而实现更复杂和互动的功能。

什么是 Chain？

Chain 定义：在 LangChain 中，chain 指的是一系列相互连接的组件或步骤，旨在完成特定任务。
Chain 示例：在本教程中，我们将创建一个模拟副厨师长为食谱准备食材和工具的 chain。

教程概述

在本教程中，您将

设置 LangChain 和 MLflow：初始化和配置 LangChain 和 MLflow。
创建一个副厨师长 Chain：开发一个 LangChain chain，列出食材、描述准备技巧、组织食材分段以及详细说明给定食谱的烹饪工具准备工作。
记录并加载模型：使用 MLflow 记录 chain 模型，然后加载它进行预测。
运行预测：执行 chain 以查看它如何为特定数量的顾客准备餐厅菜肴。

在本教程结束时，您将打下使用 LangChain 和 MLflow 的坚实基础，并了解如何构建和管理 chain 以用于实际应用。

让我们深入了解 LangChain 和 MLflow 的世界吧！

先决条件

为了开始本教程，我们首先需要准备一些东西。

一个 OpenAI API 账户。您可以在此处注册以获取访问权限，从而开始以编程方式访问全球领先的、高度复杂的 LLM 服务之一。
一个 OpenAI API 密钥。您可以在创建账户后通过导航到API 密钥页面来获取此密钥。
OpenAI SDK。它可以在 PyPI 上找到。对于本教程，我们将使用 0.28.1 版本（1.0 版本之前的最后一个版本）。
LangChain 包。您可以在 PyPI 上找到它。

Notebook 兼容性

由于像 langchain 这样库变化迅速，示例可能很快就会过时并无法正常工作。为了演示目的，以下是建议用于有效运行此 notebook 的关键依赖项

包	版本
langchain	0.1.16
lanchain-community	0.0.33
langchain-openai	0.0.8
openai	1.12.0
tiktoken	0.6.0
mlflow	2.12.1

如果您尝试使用不同版本执行此 notebook，它可能能正常工作，但建议使用上述精确版本以确保您的代码正确执行。

要安装依赖包，只需运行

pip install openai==1.12.0 tiktoken==0.6.0 langchain==0.1.16 langchain-openai==0.0.33 langchain-community==0.0.33 mlflow==2.12.1

注意：本教程不支持 openai<1，并且不保证与 langchain<1.16.0 的版本兼容

API 密钥安全概述

API 密钥，特别是对于 SaaS 大语言模型 (LLMs)，由于与计费相关联，其敏感性与财务信息相当。

如果您有兴趣了解 MLflow 提供的一种安全管理访问密钥的替代解决方案，请在此处阅读关于 MLflow AI 网关的信息。

基本实践：

保密性：始终保持 API 密钥的私密性。
安全存储：优先使用环境变量或安全服务。
定期轮换：定期更新密钥以避免未经授权的访问。

配置 API 密钥

为了安全使用，请将 API 密钥设置为环境变量。

macOS/Linux：请参阅 Apple 关于在终端中使用环境变量的指南以获取详细说明。

Windows：遵循 Microsoft 关于环境变量文档中概述的步骤。

import os

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI

import mlflow

assert "OPENAI_API_KEY" in os.environ, "Please set the OPENAI_API_KEY environment variable."

注意：如果您想将 Azure OpenAI 与 LangChain 一起使用，您需要安装 openai>=1.10.0 和 langchain-openai>=0.0.6，并指定以下凭据和参数

# NOTE: Only run this cell if you are using Azure interfaces with OpenAI. If you have a direct account with
# OpenAI, ignore this cell.

from langchain_openai import AzureOpenAI, AzureOpenAIEmbeddings

# Set this to `azure`
os.environ["OPENAI_API_TYPE"] = "azure"
# The API version you want to use: set this to `2023-05-15` for the released version.
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
assert "AZURE_OPENAI_ENDPOINT" in os.environ, (
  "Please set the AZURE_OPENAI_ENDPOINT environment variable. It is the base URL for your Azure OpenAI resource. You can find this in the Azure portal under your Azure OpenAI resource."
)
assert "OPENAI_API_KEY" in os.environ, (
  "Please set the OPENAI_API_KEY environment variable. It is the API key for your Azure OpenAI resource. You can find this in the Azure portal under your Azure OpenAI resource."
)

azure_openai_llm = AzureOpenAI(
  deployment_name="<your-deployment-name>",
  model_name="gpt-4o-mini",
)
azure_openai_embeddings = AzureOpenAIEmbeddings(
  azure_deployment="<your-deployment-name>",
)

在 LangChain 中配置 OpenAI Completions 模型

在本教程的这一部分中，我们使用适用于生成语言补全的特定参数配置了 OpenAI 模型。我们使用的是 Completions 模型，而不是 ChatCompletions，这意味着每个请求都是独立的，并且每次都需要包含整个提示才能生成响应。

理解 Completions 模型

Completions 模型：此模型不会在请求之间维护上下文信息。它非常适合每个请求都是独立的且不依赖于过去交互的任务。为各种非对话式应用提供灵活性。
无上下文记忆：缺乏对先前交互的记忆意味着该模型最适合一次性请求或不需要对话连续性的场景。
与 ChatCompletions 模型类型的比较：专为对话式 AI 量身定制，可在多次交互中保持上下文，实现连续对话。适用于聊天机器人或对话历史至关重要的应用。

在本教程中，我们使用 Completions 模型，因为它在处理独立的单个请求方面既简单又有效，这与我们教程专注于烹饪前的准备步骤的重点相符。

llm = OpenAI(temperature=0.1, max_tokens=1000)

副厨师长模拟的模板指令说明

在本教程的这一部分，我们精心设计了一个详细的提示模板，模拟了高级餐厅副厨师长的角色。此模板旨在指导 LangChain 模型准备一道菜，完全专注于 mise-en-place（备料）过程。

模板指令分解

副厨师长角色扮演：提示将语言模型置于副厨师长的角色，强调一丝不苟的准备工作。
任务大纲:
1. 列出食材：指导模型逐项列出给定菜肴所需的所有食材。
2. 准备技巧：要求模型描述食材准备所需的技巧，例如切割和处理。
3. 食材分段：要求模型为每种食材提供详细的分段说明，考虑使用顺序和时机。
4. 烹饪工具准备：指导模型列出并准备菜肴准备阶段所需的所有烹饪工具。
范围限制：此模板明确设计为在准备阶段停止，避免实际烹饪过程。它专注于设置主厨开始烹饪所需的一切。
动态输入：模板可适应不同的食谱和顾客人数，如占位符 {recipe} 和 {customer_count} 所示。

此模板指令是本教程的关键组成部分，演示了如何利用 LangChain 声明具有参数化特征的指导性提示，这些提示针对单用途的补全式应用。

template_instruction = (
  "Imagine you are a fine dining sous chef. Your task is to meticulously prepare for a dish, focusing on the mise-en-place process."
  "Given a recipe, your responsibilities are: "
  "1. List the Ingredients: Carefully itemize all ingredients required for the dish, ensuring every element is accounted for. "
  "2. Preparation Techniques: Describe the techniques and operations needed for preparing each ingredient. This includes cutting, "
  "processing, or any other form of preparation. Focus on the art of mise-en-place, ensuring everything is perfectly set up before cooking begins."
  "3. Ingredient Staging: Provide detailed instructions on how to stage and arrange each ingredient. Explain where each item should be placed for "
  "efficient access during the cooking process. Consider the timing and sequence of use for each ingredient. "
  "4. Cooking Implements Preparation: Enumerate all the cooking tools and implements needed for each phase of the dish's preparation. "
  "Detail any specific preparation these tools might need before the actual cooking starts and describe what pots, pans, dishes, and "
  "other tools will be needed for the final preparation."
  "Remember, your guidance stops at the preparation stage. Do not delve into the actual cooking process of the dish. "
  "Your goal is to set the stage flawlessly for the chef to execute the cooking seamlessly."
  "The recipe you are given is for: {recipe} for {customer_count} people. "
)

构建 LangChain Chain

我们首先在 LangChain 中设置一个 PromptTemplate，该模板专为我们的副厨师长场景量身定制。该模板设计为动态接受食谱名称和顾客人数等输入。然后，我们将 OpenAI 语言模型与提示模板结合，初始化一个 LLMChain，创建一个可以模拟副厨师长准备过程的 chain。

在 MLflow 中记录 Chain

chain 准备就绪后，我们继续在 MLflow 中记录它。这在一个 MLflow 运行中完成，该运行不仅以指定的名称记录 chain 模型，还追踪模型的各种详细信息。记录过程确保了 chain 的所有方面都被记录下来，便于高效的版本控制和将来的检索。

prompt = PromptTemplate(
  input_variables=["recipe", "customer_count"],
  template=template_instruction,
)
chain = LLMChain(llm=llm, prompt=prompt)

mlflow.set_experiment("Cooking Assistant")

with mlflow.start_run():
  model_info = mlflow.langchain.log_model(chain, "langchain_model")

如果我们导航到 MLflow UI，我们将看到我们记录的 LangChain 模型。

Our LangChain Model in the MLflow UI

使用 MLflow 加载模型并进行预测

在本教程的这一部分，我们将演示如何使用 MLflow 实际应用已记录的 LangChain 模型。我们加载模型并对一道特定菜肴运行预测，展示了该模型在烹饪准备方面的辅助能力。

模型加载与执行

在用 MLflow 记录了我们的 LangChain chain 后，我们继续使用 MLflow 的 pyfunc.load_model 函数加载模型。这一步至关重要，因为它将我们先前记录的模型置于可执行状态。

然后，我们将一个特定的食谱以及顾客人数输入到我们的模型中。在本例中，我们使用“勃艮第红烧牛肉”的食谱，并指定是为 12 位顾客准备的。模型充当副厨师长，处理这些信息并生成详细的准备说明。

模型输出

模型的输出提供了关于准备“勃艮第红烧牛肉”的全面指南，涵盖了几个关键方面

食材清单：详细列出所有必要的食材，根据指定数量的顾客进行量化和调整。
准备技巧：按照 mise-en-place（备料）原则，逐步说明如何准备每种食材。
食材分段：指导如何组织和分段食材，确保在烹饪过程中高效取用和使用。
烹饪工具准备：关于准备必要的烹饪工具和器具的说明，包括锅具、碗和滤锅等。

本示例展示了在实际场景中结合 LangChain 和 MLflow 的强大功能和实用性。它强调了这种集成如何有效地将复杂要求转化为可操作的步骤，有助于需要精确和周密计划的任务。

loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

dish1 = loaded_model.predict({"recipe": "boeuf bourginon", "customer_count": "4"})

print(dish1[0])

1. Ingredients:
- 2 pounds beef chuck, cut into 1-inch cubes
- 6 slices of bacon, diced
- 2 tablespoons olive oil
- 1 onion, diced
- 2 carrots, diced
- 2 cloves of garlic, minced
- 1 tablespoon tomato paste
- 1 bottle of red wine
- 2 cups beef broth
- 1 bouquet garni (thyme, bay leaf, parsley)
- 1 pound pearl onions, peeled
- 1 pound mushrooms, quartered
- Salt and pepper to taste
- Chopped parsley for garnish

2. Preparation Techniques:
- Cut the beef chuck into 1-inch cubes and set aside.
- Dice the bacon and set aside.
- Peel and dice the onion and carrots.
- Mince the garlic cloves.
- Prepare the bouquet garni by tying together a few sprigs of thyme, a bay leaf, and a few sprigs of parsley with kitchen twine.
- Peel the pearl onions and quarter the mushrooms.

3. Ingredient Staging:
- Place the beef cubes in a bowl and season with salt and pepper.
- In a large Dutch oven, heat the olive oil over medium-high heat.
- Add the diced bacon and cook until crispy.
- Remove the bacon from the pot and set aside.
- In the same pot, add the seasoned beef cubes and cook until browned on all sides.
- Remove the beef from the pot and set aside.
- In the same pot, add the diced onion and carrots and cook until softened.
- Add the minced garlic and cook for an additional minute.
- Stir in the tomato paste and cook for another minute.
- Add the beef and bacon back into the pot.
- Pour in the red wine and beef broth.
- Add the bouquet garni and bring to a simmer.
- Cover the pot and let it simmer for 2 hours, stirring occasionally.
- After 2 hours, add the pearl onions and mushrooms to the pot.
- Continue to simmer for an additional hour, or until the beef is tender.
- Remove the bouquet garni and discard.
- Taste and adjust seasoning with salt and pepper if needed.
- Garnish with chopped parsley before serving.

4. Cooking Implements Preparation:
- Large Dutch oven or heavy-bottomed pot
- Kitchen twine
- Cutting board
- Chef's knife
- Wooden spoon
- Measuring cups and spoons
- Bowls for prepped ingredients
- Tongs for handling meat
- Ladle for serving
- Serving dishes for the final dish.

dish2 = loaded_model.predict({"recipe": "Okonomiyaki", "customer_count": "12"})

print(dish2[0])

Ingredients:
- 2 cups all-purpose flour
- 2 teaspoons baking powder
- 1/2 teaspoon salt
- 2 eggs
- 1 1/2 cups water
- 1/2 head cabbage, thinly sliced
- 1/2 cup green onions, thinly sliced
- 1/2 cup carrots, grated
- 1/2 cup red bell pepper, thinly sliced
- 1/2 cup cooked shrimp, chopped
- 1/2 cup cooked bacon, chopped
- 1/2 cup pickled ginger, chopped
- 1/2 cup tenkasu (tempura flakes)
- 1/2 cup mayonnaise
- 1/4 cup okonomiyaki sauce
- 1/4 cup katsuobushi (dried bonito flakes)
- Vegetable oil for cooking

Preparation Techniques:
1. In a large mixing bowl, combine the flour, baking powder, and salt.
2. In a separate bowl, beat the eggs and water together.
3. Slowly pour the egg mixture into the flour mixture, stirring until well combined.
4. Set the batter aside to rest for 10 minutes.
5. Thinly slice the cabbage, green onions, and red bell pepper.
6. Grate the carrots.
7. Chop the cooked shrimp, bacon, and pickled ginger.
8. Prepare the tenkasu, mayonnaise, okonomiyaki sauce, and katsuobushi.

Ingredient Staging:
1. Place the sliced cabbage, green onions, carrots, red bell pepper, shrimp, bacon, and pickled ginger in separate bowls.
2. Arrange the tenkasu, mayonnaise, okonomiyaki sauce, and katsuobushi in small dishes.
3. Set up a large griddle or non-stick pan for cooking the okonomiyaki.

Cooking Implements Preparation:
1. Make sure the griddle or pan is clean and dry.
2. Heat the griddle or pan over medium heat.
3. Have a spatula, tongs, and a large plate ready for flipping and serving the okonomiyaki.
4. Prepare a large plate or platter for serving the finished okonomiyaki.

Remember, mise-en-place is key to a successful dish. Make sure all ingredients are prepped and ready to go before starting the cooking process. Happy cooking!

结论

在本教程的最后一步，我们使用 LangChain 模型执行了另一个预测。这次，我们探讨了为 12 位顾客准备日本料理“大阪烧”的过程。这展示了模型在各种菜肴上的适应性和多功能性。

使用加载的模型进行附加预测

模型处理“大阪烧”的输入并输出详细的准备步骤。这包括列出食材、解释准备技巧、指导食材分段以及详细说明所需的烹饪工具，展示了模型精确处理各种食谱的能力。

我们学到了什么

模型多功能性：本教程重点介绍了 LangChain 框架如何组装基本 LLM 应用的组件，将特定的指导性提示连接到 Completions 风格的 LLM。
MLflow 在模型管理中的作用：LangChain 与 MLflow 的集成展示了有效的模型生命周期管理，从创建和记录到预测执行。

结束语

本教程通过使用 MLflow 创建、管理和利用 LangChain 模型进行烹饪准备，提供了一次富有见地的旅程。它展示了 LangChain 在复杂场景中的实际应用和适应性。我们希望这次体验为您提供了宝贵的知识，并鼓励您在您的项目中进一步探索和创新使用 LangChain 和 MLflow。祝您编程愉快！

下一步是什么？

要继续学习 MLflow 和 LangChain 在更复杂示例中的功能，我们鼓励您继续学习其他 LangChain 教程。

您将学到什么​

LangChain 背景知识​

什么是 Chain？​

教程概述​

先决条件​

Notebook 兼容性​

API 密钥安全概述​

基本实践：​

配置 API 密钥​

在 LangChain 中配置 OpenAI Completions 模型​

理解 Completions 模型​

副厨师长模拟的模板指令说明​

模板指令分解​

构建 LangChain Chain​

在 MLflow 中记录 Chain​

使用 MLflow 加载模型并进行预测​

模型加载与执行​

模型输出​

结论​

使用加载的模型进行附加预测​

我们学到了什么​

结束语​

下一步是什么？​