Exploring Conversational AI with MLflow and DialoGPT

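Welcome to our tutorial on integrating Microsoft's DialoGPT with MLflow's transformers flavor to explore conversational AI.

Learning Objectives

In this tutorial, you will:

Set up a conversational AI pipeline using DialoGPT from the Transformers library.
Log the DialoGPT model and its configuration with MLflow.
Infer the input and output signature of the DialoGPT model.
Load a stored DialoGPT model from MLflow for interactive use.
Interact with the chatbot model and learn the nuances of conversational AI.

By the end of this tutorial, you will have a solid understanding of how to manage and deploy conversational AI models with MLflow, strengthening your natural language processing skills.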

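What is DialoGPT?

DialoGPT is a conversational model developed by Microsoft and fine-tuned on a large dataset of dialogues to generate human-like responses. As part of the GPT family, DialoGPT performs well at natural language understanding and generation, which makes it well suited to chatbots.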

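Why MLflow with DialoGPT?

Integrating MLflow with DialoGPT enhances conversational AI model development in four ways:

Experiment tracking: Tracks configurations and metrics across experiments.
Model management: Manages different versions and configurations of chatbot models.
Reproducibility: Ensures that the model's behavior can be reproduced.
Deployment: Simplifies deploying conversational models in production.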

# Disable tokenizers warnings when constructing pipelines
%env TOKENIZERS_PARALLELISM=false

import warnings

# Disable a few less-than-useful UserWarnings from setuptools and pydantic
warnings.filterwarnings("ignore", category=UserWarning)

env: TOKENIZERS_PARALLELISM=false

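Setting Up the Conversational Pipeline

We begin by setting up a conversational pipeline with DialoGPT using transformers, managed through MLflow.

We start by importing the essential libraries. The transformers library from Hugging Face offers a rich collection of pre-trained models, including DialoGPT, for a variety of NLP tasks. MLflow is a comprehensive tool for the ML lifecycle that helps with experiment tracking, reproducibility, and deployment.

Initializing the Conversational Pipeline

Using the transformers.pipeline function, we set up a conversational pipeline. We choose the "microsoft/DialoGPT-medium" model, which balances performance and resource efficiency and is well suited to conversational AI. This step ensures that the model is ready for interaction and for integration into applications.

Inferring the Model Signature with MLflow

The model signature defines how the model interacts with input data. To infer it, we take a sample input ("Hi there, chatbot!") and call mlflow.transformers.generate_signature_output to obtain the pipeline's output for that input; mlflow.models.infer_signature then derives the input and output schema from the pair. This makes the model's data requirements and prediction format explicit, which is essential for seamless deployment and use.

This configuration phase lays the groundwork for a robust conversational AI system, drawing on DialoGPT and MLflow for efficient and effective conversational interactions.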

import transformers

import mlflow

# Define our pipeline, using the default configuration specified in the model card for DialoGPT-medium
conversational_pipeline = transformers.pipeline(model="microsoft/DialoGPT-medium")

# Infer the signature by providing a representative input and the output from the pipeline inference abstraction in the transformers flavor in MLflow
signature = mlflow.models.infer_signature(
  "Hi there, chatbot!",
  mlflow.transformers.generate_signature_output(conversational_pipeline, "Hi there, chatbot!"),
)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
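Conceptually, a signature is a record of the input and output types that a model expects. The library-free sketch below illustrates only that idea. `toy_infer_signature` is a hypothetical helper, the example reply string is a placeholder rather than real model output, and the result is far simpler than the schema object that `mlflow.models.infer_signature` produces.

```python
def toy_infer_signature(example_input, example_output):
    """Hypothetical stand-in: describe example data by its Python type names."""

    def describe(value):
        if isinstance(value, list):
            # Describe a list by the sorted set of its element type names.
            inner = sorted({type(item).__name__ for item in value})
            return f"list[{', '.join(inner)}]"
        return type(value).__name__

    return {"inputs": describe(example_input), "outputs": describe(example_output)}


# The reply string is a placeholder, not an actual DialoGPT response.
toy_signature = toy_infer_signature("Hi there, chatbot!", "placeholder reply")
print(toy_signature)
```

For a single prompt string and a single reply string, both sides of the toy signature are plain strings, which mirrors the string-in, string-out usage seen in this tutorial.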

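Creating an Experiment

We create a new MLflow experiment so that the run we log our model to is not recorded under the default experiment and instead has its own contextually relevant entry.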

# If you are running this tutorial in local mode, leave the next line commented out.
# Otherwise, uncomment the following line and set your tracking uri to your local or remote tracking server.

# mlflow.set_tracking_uri("http://127.0.0.1:8080")

# Set a name for the experiment that is indicative of what the runs being created within it are in regards to
mlflow.set_experiment("Conversational")

<Experiment: artifact_location='file:///Users/benjamin.wilson/repos/mlflow-fork/mlflow/docs/source/llms/transformers/tutorials/conversational/mlruns/370178017237207703', creation_time=1701292102618, experiment_id='370178017237207703', last_update_time=1701292102618, lifecycle_stage='active', name='Conversational', tags={}>

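Logging the Model with MLflow

We now log our conversational AI model with MLflow to ensure systematic versioning, tracking, and management.

Starting an MLflow Run

Our first step is to start an MLflow run with mlflow.start_run(). This opens a new tracking context that captures all model-related data under a unique run ID, which keeps separate modeling experiments isolated and organized.

Logging the Conversational Model

We log our DialoGPT conversational model with mlflow.transformers.log_model. This specialized function logs Transformer models efficiently and takes several key parameters:

transformers_model: We pass our DialoGPT conversational pipeline.
name: The name under which the model is stored in the MLflow run, aptly set to "chatbot" (older MLflow releases call this parameter artifact_path).
task: Set to "conversational" to reflect the model's purpose.
signature: The inferred model signature, which specifies the expected inputs and outputs.
input_example: A sample prompt, such as "A clever and witty question", that demonstrates expected usage.

Through this process, MLflow not only tracks our model but also organizes its metadata, which eases future retrieval, understanding, and deployment.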

with mlflow.start_run():
  model_info = mlflow.transformers.log_model(
      transformers_model=conversational_pipeline,
      name="chatbot",
      task="conversational",
      signature=signature,
      input_example="A clever and witty question",
  )

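Loading and Interacting with the Chatbot Model

Next, we load the chatbot model that MLflow logged and interact with it to see it in action.

Loading the Model with MLflow

We load our conversational AI model with mlflow.pyfunc.load_model. This function is a key part of MLflow's Python function flavor, which provides a generic way to interact with Python models. By specifying model_uri=model_info.model_uri, we point to exactly where our DialoGPT model is stored in the MLflow tracking system.

Interacting with the Chatbot

Once loaded, the model, referred to as chatbot, is ready for interaction. We demonstrate its conversational capabilities by:

Asking a question: Posing a question to the chatbot, such as "What is the best way to get to Antarctica?".
Capturing the response: The chatbot's response, generated by the predict method, gives a practical example of its conversational skills. For instance, it might suggest reaching Antarctica by boat.

This demonstration highlights how practical and convenient it is to deploy and use models logged with MLflow, especially in dynamic, interactive scenarios such as conversational AI.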

# Load the model as a generic python function in order to leverage the integrated Conversational Context
# Note that loading a conversational model with the native flavor (i.e., `mlflow.transformers.load_model()`) will not include anything apart from the
# pipeline itself; if choosing to load in this way, you will need to manage your own Conversational Context instance to maintain state on the
# conversation history.
chatbot = mlflow.pyfunc.load_model(model_uri=model_info.model_uri)

# Validate that the model is capable of responding to a question
first = chatbot.predict("What is the best way to get to Antarctica?")

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.

print(f"Response: {first}")

Response: I think you can get there by boat.
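When several prompts need to be sent in sequence, a small helper can collect the exchange into a transcript. The sketch below is a hypothetical convenience, not part of MLflow: `run_dialogue` and `stub_predict` are names introduced here, and the stub stands in for the loaded model so that the example runs without downloading anything.

```python
from typing import Callable, Iterable, List, Tuple


def run_dialogue(
    predict: Callable[[str], str], prompts: Iterable[str]
) -> List[Tuple[str, str]]:
    """Send each prompt to `predict` in order and collect (prompt, response) pairs."""
    transcript = []
    for prompt in prompts:
        transcript.append((prompt, predict(prompt)))
    return transcript


def stub_predict(prompt: str) -> str:
    # Stand-in for `chatbot.predict`; it simply echoes the prompt in upper case.
    return prompt.upper()


transcript = run_dialogue(stub_predict, ["hello", "how are you?"])
for prompt, response in transcript:
    print(f"User: {prompt}")
    print(f"Bot: {response}")
```

Assuming `chatbot.predict` accepts a single string and returns a string, as in the call above, it could be passed in place of `stub_predict`.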

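Continuing the Conversation with the Chatbot

We further explore the conversational context state kept by the MLflow pyfunc implementation, using the DialoGPT chatbot model.

Testing Contextual Memory

We pose a follow-up question, "What sort of boat should I use?", to test the chatbot's contextual understanding. The response we get, "A boat that can go to Antarctica.", is simple, but it shows that the MLflow pyfunc model retains and uses the conversation history to produce coherent responses when working with ConversationalPipeline types of models.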
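To make the idea of retained conversation state concrete, here is a minimal, library-free sketch. It is not MLflow's implementation: `StatefulChatbot` and `toy_generate` are hypothetical names, and the toy generator only reports how many turns it was shown. The point is that each `predict` call sees all earlier turns, which is what allows a vague follow-up to be answered in context.

```python
from typing import Callable, List


class StatefulChatbot:
    """Hypothetical wrapper that remembers every turn of a conversation."""

    def __init__(self, generate: Callable[[List[str]], str]) -> None:
        # `generate` receives the full history (oldest turn first) and returns a reply.
        self._generate = generate
        self.history: List[str] = []

    def predict(self, user_input: str) -> str:
        # Record the user's turn, then let the generator see the whole history.
        self.history.append(user_input)
        reply = self._generate(self.history)
        # Record the reply so that the next call can build on it.
        self.history.append(reply)
        return reply


def toy_generate(history: List[str]) -> str:
    # Stand-in for a real model: it only reports how much context it received.
    return f"(reply based on {len(history)} turn(s) of context)"


bot = StatefulChatbot(toy_generate)
first_reply = bot.predict("What is the best way to get to Antarctica?")
second_reply = bot.predict("What sort of boat should I use?")
print(first_reply)
print(second_reply)
```

The second call receives three turns (the first question, the first reply, and the follow-up), whereas a stateless wrapper would pass only the follow-up question.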

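Understanding the Response Style

The style of the response, witty and slightly playful, reflects the nature of the training data, which consists primarily of conversational exchanges from Reddit. This source strongly influences the model's tone and style, so responses can be humorous and varied.

Implications of the Training Data

This interaction underscores how much the source of the training data shapes a model's responses. When deploying such models in real-world applications, it is essential to understand and account for the influence of the training data on the model's conversational style and knowledge base.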

# Verify that the PyFunc implementation has maintained state on the conversation history by asking a vague follow-up question that requires context
# in order to answer properly
second = chatbot.predict("What sort of boat should I use?")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.

print(f"Response: {second}")

Response: A boat that can go to Antarctica.

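Conclusion and Key Takeaways

In this tutorial, we explored the integration of MLflow with a conversational AI model, specifically Microsoft's DialoGPT. We covered several aspects and techniques that matter to anyone working with advanced machine learning models in practical, real-world settings.

Key Takeaways

MLflow for model management: We demonstrated how MLflow can be used effectively to manage and deploy machine learning models. The ability to log models, track experiments, and manage different model versions is invaluable in machine learning workflows.
Conversational AI: Using the DialoGPT model, we showed how to set up and interact with a conversational model. This included the nuances of maintaining conversational context and the effect of training data on the model's responses.
Practical implementation: Through practical examples, we showed how to log a model in MLflow, infer a model signature, and use the pyfunc model flavor for easy deployment and interaction. This hands-on approach is intended to give you the skills to apply these techniques in your own projects.
Understanding model responses: We emphasized the importance of understanding the nature of the model's training data. That understanding is crucial for interpreting the model's responses and for tailoring the model to specific use cases.
Contextual history: MLflow's transformers pyfunc implementation for ConversationalPipelines maintains a Conversation context without requiring you to manage state yourself. This lets you build a chatbot with minimal effort, since the state is maintained for you.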

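Wrapping Up

As we conclude this tutorial, we hope you have gained a deeper understanding of how to integrate MLflow with conversational AI models and of the practical considerations involved in deploying them. The skills and knowledge gained here apply not only to conversational AI but also to a broader range of machine learning applications.

Remember that the field of machine learning is vast and constantly evolving. Continuous learning and experimentation are key to staying current and making the most of these technologies.

Thank you for joining us on this exploration of MLflow and conversational AI. We encourage you to take these lessons and apply them to your own challenges and projects. Happy coding!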
