使用 LangChain 与 MLflow 入门

下载此笔记本

欢迎参加本互动教程，旨在向您介绍 LangChain 及其与 MLflow 的集成。本教程的结构为 Notebook，通过 LangChain 最简单和最核心的功能，提供实践性的学习体验。

您将学到什么

了解 LangChain：了解 LangChain 的基础知识以及如何在开发由语言模型驱动的应用程序中使用它。
LangChain 中的 Chains (链)：探索 LangChain 中 chains (链) 的概念，它是一系列协调执行以完成复杂任务的动作或操作。
与 MLflow 集成：了解 LangChain 如何与 MLflow 集成，MLflow 是一个用于管理机器学习生命周期的平台，包括日志记录、跟踪和部署模型。
实际应用：应用您的知识来构建一个 LangChain 链，该链的作用就像一个副厨师，专注于食谱的准备步骤。

LangChain 背景知识

LangChain 是一个基于 Python 的框架，可以简化使用语言模型开发应用程序的过程。它旨在增强应用程序的上下文感知和推理能力，从而实现更复杂和交互式的功能。

什么是 Chain (链)？

Chain (链) 定义：在 LangChain 中，chain (链) 指的是一系列相互连接的组件或步骤，旨在完成特定任务。
Chain (链) 示例：在本教程中，我们将创建一个链，模拟副厨师在为食谱准备配料和工具方面的角色。

教程概述

在本教程中，您将

设置 LangChain 和 MLflow：初始化和配置 LangChain 和 MLflow。
创建副厨师链：开发一个 LangChain 链，该链列出配料，描述准备技巧，组织配料的分期，并详细说明给定食谱的烹饪工具准备。
记录和加载模型：利用 MLflow 记录链模型，然后加载它以进行预测。
运行预测：执行该链，看看它将如何为特定数量的顾客准备一道餐厅菜肴。

在本教程结束时，您将掌握使用 LangChain 和 MLflow 的扎实基础，并了解如何构建和管理链以进行实际应用。

让我们深入研究并探索 LangChain 和 MLflow 的世界！

先决条件

为了开始本教程，我们首先需要准备一些东西。

一个 OpenAI API 帐户。您可以在此处注册以获得访问权限，从而开始以编程方式访问地球上领先的高度复杂的 LLM 服务之一。
一个 OpenAI API 密钥。您可以在创建帐户后通过导航到 API 密钥页面来访问此密钥。
OpenAI SDK。它可以在 PyPI 上找到。在本教程中，我们将使用 0.28.1 版本（1.0 版本之前的最后一个版本）。
LangChain 包。您可以在此处在 PyPI 上找到它。

Notebook 兼容性

由于像 langchain 这样的库变化迅速，示例可能会很快过时并且无法再工作。为了演示的目的，以下是建议使用的关键依赖项，以便有效地运行此 Notebook。

软件包	版本
langchain	0.1.16
lanchain-community	0.0.33
langchain-openai	0.0.8
openai	1.12.0
tiktoken	0.6.0
mlflow	2.12.1

如果您尝试使用不同的版本执行此 Notebook，它可能可以正常运行，但建议使用上述精确版本，以确保您的代码正确执行。

要安装依赖包，只需运行

pip install openai==1.12.0 tiktoken==0.6.0 langchain==0.1.16 langchain-openai==0.0.33 langchain-community==0.0.33 mlflow==2.12.1

注意：本教程不支持 openai<1，并且不保证与 langchain<1.16.0 版本一起使用。

API 密钥安全概述

API 密钥，尤其是 SaaS 大型语言模型 (LLM) 的 API 密钥，由于其与计费的连接，与财务信息一样敏感。

如果您有兴趣了解有关安全管理您的访问密钥的替代 MLflow 解决方案的更多信息，请在此处阅读有关 MLflow AI Gateway 的信息。

基本实践：

保密性：始终对 API 密钥保密。
安全存储：首选环境变量或安全服务。
频繁轮换：定期更新密钥以避免未经授权的访问。

配置 API 密钥

为了安全使用，请将 API 密钥设置为环境变量。

macOS/Linux：有关详细说明，请参阅Apple 关于在 Terminal 中使用环境变量的指南。

Windows：按照Microsoft 关于环境变量的文档中概述的步骤进行操作。

import os

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import OpenAI

import mlflow

assert "OPENAI_API_KEY" in os.environ, "Please set the OPENAI_API_KEY environment variable."

注意：如果您想将 Azure OpenAI 与 LangChain 一起使用，则需要安装 openai>=1.10.0 和 langchain-openai>=0.0.6，以及指定以下凭据和参数。

# NOTE: Only run this cell if you are using Azure interfaces with OpenAI. If you have a direct account with
# OpenAI, ignore this cell.

from langchain_openai import AzureOpenAI, AzureOpenAIEmbeddings

# Set this to `azure`
os.environ["OPENAI_API_TYPE"] = "azure"
# The API version you want to use: set this to `2023-05-15` for the released version.
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
assert "AZURE_OPENAI_ENDPOINT" in os.environ, (
  "Please set the AZURE_OPENAI_ENDPOINT environment variable. It is the base URL for your Azure OpenAI resource. You can find this in the Azure portal under your Azure OpenAI resource."
)
assert "OPENAI_API_KEY" in os.environ, (
  "Please set the OPENAI_API_KEY environment variable. It is the API key for your Azure OpenAI resource. You can find this in the Azure portal under your Azure OpenAI resource."
)

azure_openai_llm = AzureOpenAI(
  deployment_name="<your-deployment-name>",
  model_name="gpt-4o-mini",
)
azure_openai_embeddings = AzureOpenAIEmbeddings(
  azure_deployment="<your-deployment-name>",
)

在 LangChain 中配置 OpenAI Completions 模型

在本教程的这一部分中，我们使用适合生成语言补全的特定参数配置了 OpenAI 模型。我们使用的是 Completions 模型，而不是 ChatCompletions，这意味着每个请求都是独立的，并且每次都需要包含整个提示才能生成响应。

了解 Completions 模型

Completions 模型：此模型不跨请求维护上下文信息。它非常适合每个请求都是独立的并且不依赖于过去交互的任务。为各种非会话应用程序提供灵活性。
没有上下文记忆：缺少以前交互的记忆意味着该模型最适合一次性请求或不需要会话连续性的场景。
与 ChatCompletions 模型类型比较：专为会话 AI 量身定制，跨多个交换维护上下文以进行持续会话。适用于聊天机器人或对话历史记录至关重要的应用程序。

在本教程中，我们使用 Completions 模型，因为它在处理单个、独立请求方面的简单性和有效性，这与我们教程对烹饪前准备步骤的关注相一致。

llm = OpenAI(temperature=0.1, max_tokens=1000)

副厨师模拟的模板说明的解释

在本教程的这一部分中，我们制作了一个详细的提示模板，该模板模拟了高级餐厅副厨师的角色。此模板旨在指导 LangChain 模型为一道菜做准备，仅专注于备料 (mise-en-place) 过程。

模板说明的细分

副厨师角色扮演：提示将语言模型置于副厨师的角色中，强调细致的准备工作。
任务大纲:
1. 列出配料：指示模型列出给定菜肴的所有必要配料。
2. 准备技巧：要求模型描述配料准备的必要技巧，例如切割和加工。
3. 配料分期：要求模型为每种配料提供详细的分期说明，并考虑使用顺序和时间。
4. 烹饪工具准备：指导模型列出和准备菜肴准备阶段所需的所有烹饪工具。
范围限制：该模板明确设计为在准备阶段停止，避免实际的烹饪过程。它专注于设置厨师开始烹饪所需的一切。
动态输入：该模板可以适应不同的食谱和客户数量，如占位符 {recipe} 和 {customer_count} 所示。

此模板说明是本教程的关键组成部分，演示了如何利用 LangChain 声明指导性提示，提示具有针对单用途完成式应用程序的参数化功能。

template_instruction = (
  "Imagine you are a fine dining sous chef. Your task is to meticulously prepare for a dish, focusing on the mise-en-place process."
  "Given a recipe, your responsibilities are: "
  "1. List the Ingredients: Carefully itemize all ingredients required for the dish, ensuring every element is accounted for. "
  "2. Preparation Techniques: Describe the techniques and operations needed for preparing each ingredient. This includes cutting, "
  "processing, or any other form of preparation. Focus on the art of mise-en-place, ensuring everything is perfectly set up before cooking begins."
  "3. Ingredient Staging: Provide detailed instructions on how to stage and arrange each ingredient. Explain where each item should be placed for "
  "efficient access during the cooking process. Consider the timing and sequence of use for each ingredient. "
  "4. Cooking Implements Preparation: Enumerate all the cooking tools and implements needed for each phase of the dish's preparation. "
  "Detail any specific preparation these tools might need before the actual cooking starts and describe what pots, pans, dishes, and "
  "other tools will be needed for the final preparation."
  "Remember, your guidance stops at the preparation stage. Do not delve into the actual cooking process of the dish. "
  "Your goal is to set the stage flawlessly for the chef to execute the cooking seamlessly."
  "The recipe you are given is for: {recipe} for {customer_count} people. "
)

构建 LangChain 链

我们首先在 LangChain 中设置一个 PromptTemplate，该模板专为我们的副厨师场景定制。该模板旨在动态接受食谱名称和客户数量等输入。然后，我们通过将 OpenAI 语言模型与提示模板相结合来初始化一个 LLMChain，从而创建一个可以模拟副厨师准备过程的链。

在 MLflow 中记录链

链准备好后，我们将继续在 MLflow 中记录它。这是在 MLflow 运行中完成的，该运行不仅在指定名称下记录链模型，还跟踪有关模型的各种详细信息。日志记录过程确保记录链的所有方面，从而实现高效的版本控制和未来的检索。

prompt = PromptTemplate(
  input_variables=["recipe", "customer_count"],
  template=template_instruction,
)
chain = LLMChain(llm=llm, prompt=prompt)

mlflow.set_experiment("Cooking Assistant")

with mlflow.start_run():
  model_info = mlflow.langchain.log_model(chain, name="langchain_model")

如果我们导航到 MLflow UI，我们将看到我们记录的 LangChain 模型。

Our LangChain Model in the MLflow UI

加载模型并使用 MLflow 进行预测

在本教程的这一部分中，我们将演示使用 MLflow 记录的 LangChain 模型的实际应用。我们加载模型并针对特定菜肴运行预测，展示了模型协助烹饪准备的能力。

模型加载和执行

使用 MLflow 记录 LangChain 链后，我们继续使用 MLflow 的 pyfunc.load_model 函数加载模型。这一步至关重要，因为它使我们之前记录的模型进入可执行状态。

然后，我们将特定的食谱以及客户数量输入到我们的模型中。在这种情况下，我们使用“勃艮第牛肉”的食谱，并指定它是为 12 位顾客准备的。该模型充当副厨师，处理此信息并生成详细的准备说明。

模型的输出

模型的输出提供了有关准备“勃艮第牛肉”的全面指南，涵盖了几个关键方面

配料清单：详细列出所有必要的配料，并根据指定的客户数量进行量化和定制。
准备技巧：有关如何准备每种配料的分步说明，遵循备料的原则。
配料分期：有关如何组织和分阶段配料的指南，确保在烹饪过程中高效访问和使用。
烹饪工具准备：有关准备必要的烹饪工具和器具的说明，从锅和平底锅到碗和滤器。

此示例演示了在实际场景中组合 LangChain 和 MLflow 的强大功能和实用性。它突出了这种集成如何有效地将复杂的需求转化为可操作的步骤，从而有助于需要精确性和仔细计划的任务。

loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

dish1 = loaded_model.predict({"recipe": "boeuf bourginon", "customer_count": "4"})

print(dish1[0])

1. Ingredients:
- 2 pounds beef chuck, cut into 1-inch cubes
- 6 slices of bacon, diced
- 2 tablespoons olive oil
- 1 onion, diced
- 2 carrots, diced
- 2 cloves of garlic, minced
- 1 tablespoon tomato paste
- 1 bottle of red wine
- 2 cups beef broth
- 1 bouquet garni (thyme, bay leaf, parsley)
- 1 pound pearl onions, peeled
- 1 pound mushrooms, quartered
- Salt and pepper to taste
- Chopped parsley for garnish

2. Preparation Techniques:
- Cut the beef chuck into 1-inch cubes and set aside.
- Dice the bacon and set aside.
- Peel and dice the onion and carrots.
- Mince the garlic cloves.
- Prepare the bouquet garni by tying together a few sprigs of thyme, a bay leaf, and a few sprigs of parsley with kitchen twine.
- Peel the pearl onions and quarter the mushrooms.

3. Ingredient Staging:
- Place the beef cubes in a bowl and season with salt and pepper.
- In a large Dutch oven, heat the olive oil over medium-high heat.
- Add the diced bacon and cook until crispy.
- Remove the bacon from the pot and set aside.
- In the same pot, add the seasoned beef cubes and cook until browned on all sides.
- Remove the beef from the pot and set aside.
- In the same pot, add the diced onion and carrots and cook until softened.
- Add the minced garlic and cook for an additional minute.
- Stir in the tomato paste and cook for another minute.
- Add the beef and bacon back into the pot.
- Pour in the red wine and beef broth.
- Add the bouquet garni and bring to a simmer.
- Cover the pot and let it simmer for 2 hours, stirring occasionally.
- After 2 hours, add the pearl onions and mushrooms to the pot.
- Continue to simmer for an additional hour, or until the beef is tender.
- Remove the bouquet garni and discard.
- Taste and adjust seasoning with salt and pepper if needed.
- Garnish with chopped parsley before serving.

4. Cooking Implements Preparation:
- Large Dutch oven or heavy-bottomed pot
- Kitchen twine
- Cutting board
- Chef's knife
- Wooden spoon
- Measuring cups and spoons
- Bowls for prepped ingredients
- Tongs for handling meat
- Ladle for serving
- Serving dishes for the final dish.

dish2 = loaded_model.predict({"recipe": "Okonomiyaki", "customer_count": "12"})

print(dish2[0])

Ingredients:
- 2 cups all-purpose flour
- 2 teaspoons baking powder
- 1/2 teaspoon salt
- 2 eggs
- 1 1/2 cups water
- 1/2 head cabbage, thinly sliced
- 1/2 cup green onions, thinly sliced
- 1/2 cup carrots, grated
- 1/2 cup red bell pepper, thinly sliced
- 1/2 cup cooked shrimp, chopped
- 1/2 cup cooked bacon, chopped
- 1/2 cup pickled ginger, chopped
- 1/2 cup tenkasu (tempura flakes)
- 1/2 cup mayonnaise
- 1/4 cup okonomiyaki sauce
- 1/4 cup katsuobushi (dried bonito flakes)
- Vegetable oil for cooking

Preparation Techniques:
1. In a large mixing bowl, combine the flour, baking powder, and salt.
2. In a separate bowl, beat the eggs and water together.
3. Slowly pour the egg mixture into the flour mixture, stirring until well combined.
4. Set the batter aside to rest for 10 minutes.
5. Thinly slice the cabbage, green onions, and red bell pepper.
6. Grate the carrots.
7. Chop the cooked shrimp, bacon, and pickled ginger.
8. Prepare the tenkasu, mayonnaise, okonomiyaki sauce, and katsuobushi.

Ingredient Staging:
1. Place the sliced cabbage, green onions, carrots, red bell pepper, shrimp, bacon, and pickled ginger in separate bowls.
2. Arrange the tenkasu, mayonnaise, okonomiyaki sauce, and katsuobushi in small dishes.
3. Set up a large griddle or non-stick pan for cooking the okonomiyaki.

Cooking Implements Preparation:
1. Make sure the griddle or pan is clean and dry.
2. Heat the griddle or pan over medium heat.
3. Have a spatula, tongs, and a large plate ready for flipping and serving the okonomiyaki.
4. Prepare a large plate or platter for serving the finished okonomiyaki.

Remember, mise-en-place is key to a successful dish. Make sure all ingredients are prepped and ready to go before starting the cooking process. Happy cooking!

结论

在本教程的最后一步中，我们使用 LangChain 模型执行另一个预测。这次，我们探索为 12 位顾客准备日本料理“大阪烧”。这证明了该模型在各种美食中的适应性和多功能性。

使用加载的模型进行额外预测

该模型处理“大阪烧”的输入并输出详细的准备步骤。这包括列出配料、解释准备技巧、指导配料分期以及详细说明所需的烹饪工具，展示了该模型能够精确地处理各种食谱。

我们学到了什么

模型多功能性：本教程重点介绍了 LangChain 框架，用于组装基本 LLM 应用程序的组件部分，将特定的指导性提示链接到完成式 LLM。
MLflow 在模型管理中的作用：LangChain 与 MLflow 的集成演示了有效的模型生命周期管理，从创建和记录到预测执行。

结束语

本教程提供了一个富有洞察力的旅程，通过使用 MLflow 创建、管理和利用 LangChain 模型进行烹饪准备。它展示了 LangChain 在复杂场景中的实际应用和适应性。我们希望这次体验能为您提供宝贵的知识，并鼓励您在项目中使用 LangChain 和 MLflow 进一步探索和创新。祝您编码愉快！

下一步是什么？

为了继续学习 MLflow 和 LangChain 在更复杂示例中的功能，我们鼓励您继续学习其他 LangChain 教程。

您将学到什么​

LangChain 背景知识​

什么是 Chain (链)？​

教程概述​

先决条件​

Notebook 兼容性​

API 密钥安全概述​

基本实践：​

配置 API 密钥​

在 LangChain 中配置 OpenAI Completions 模型​

了解 Completions 模型​

副厨师模拟的模板说明的解释​

模板说明的细分​

构建 LangChain 链​

在 MLflow 中记录链​

加载模型并使用 MLflow 进行预测​

模型加载和执行​

模型的输出​

结论​

使用加载的模型进行额外预测​

我们学到了什么​

结束语​

下一步是什么？​