Retriever Evaluation with MLflow
In MLflow 2.8.0, we introduced a new model type "retriever" in the mlflow.evaluate() API. It helps you evaluate retrievers in RAG applications and ships with two built-in metrics: precision_at_k and recall_at_k. MLflow 2.9.0 adds ndcg_at_k.
This notebook demonstrates how to use mlflow.evaluate() to evaluate the retriever in a RAG application. It contains the following steps:
- Step 1: Install and load packages
- Step 2: Prepare the evaluation dataset
- Step 3: Call mlflow.evaluate()
- Step 4: Analyze and visualize the results
Step 1: Install and Load Packages
%pip install mlflow==2.9.0 langchain==0.0.339 openai faiss-cpu gensim nltk pyLDAvis tiktoken
import ast
import os
import pprint
import pandas as pd
from langchain.docstore.document import Document
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
import mlflow
os.environ["OPENAI_API_KEY"] = "<redacted>"
CHUNK_SIZE = 1000
# Assume running from https://github.com/mlflow/mlflow/blob/master/examples/llms/rag
OUTPUT_DF_PATH = "question_answer_source.csv"
SCRAPPED_DOCS_PATH = "mlflow_docs_scraped.csv"
EVALUATION_DATASET_PATH = "static_evaluation_dataset.csv"
DB_PERSIST_DIR = "faiss_index"
Step 2: Prepare the Evaluation Dataset
The evaluation dataset should contain three columns: the questions, the ground-truth doc IDs, and the retrieved relevant doc IDs. A "doc ID" is a unique string identifier for a document in the RAG application; for example, it can be the URL of a document web page, or the file path of a PDF document.
If you already have a list of questions to evaluate, see 1.1 Manual Preparation. If you don't have one yet, see 1.2 Generating an Evaluation Dataset.
1.1 Manual Preparation
When evaluating a retriever, it is recommended to save the retrieved doc IDs to a static dataset that can be represented as a Pandas DataFrame or an MLflow Pandas Dataset, containing the input queries, the retrieved relevant doc IDs, and the ground-truth doc IDs used for evaluation.
Concepts
A "doc ID" is a string that identifies a document.
The list of "retrieved relevant doc IDs" is the retriever's output for a specific input query and value of k.
The list of "ground-truth doc IDs" contains the labeled relevant documents for a specific input query.
Expected Data Format
For each row, the retrieved relevant doc IDs and the ground-truth doc IDs should be provided as a tuple of doc ID strings.
The column name of the retrieved relevant doc IDs should be specified with the predictions argument, and the column name of the ground-truth doc IDs should be specified with the targets argument.
Here is a simple sample dataset showing the expected data format. The doc IDs are the paths of the document pages.
data = pd.DataFrame(
    {
        "questions": [
            "What is MLflow?",
            "What is Databricks?",
            "How to serve a model on Databricks?",
            "How to enable MLflow Autologging for my workspace by default?",
        ],
        "retrieved_context": [
            [
                "mlflow/index.html",
                "mlflow/quick-start.html",
            ],
            [
                "introduction/index.html",
                "getting-started/overview.html",
            ],
            [
                "machine-learning/model-serving/index.html",
                "machine-learning/model-serving/model-serving-intro.html",
            ],
            [],
        ],
        "ground_truth_context": [
            ["mlflow/index.html"],
            ["introduction/index.html"],
            [
                "machine-learning/model-serving/index.html",
                "machine-learning/model-serving/llm-optimized-model-serving.html",
            ],
            ["mlflow/databricks-autologging.html"],
        ],
    }
)
1.2 Generating an Evaluation Dataset
Generating an evaluation dataset involves two steps: generating questions with ground-truth doc IDs, and retrieving the relevant doc IDs.
Generate Questions with Ground-Truth Doc IDs
If you don't have a list of questions to evaluate, you can generate them with LLMs. The Question Generation Notebook provides an example approach.
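For intuition, here is a minimal, hypothetical sketch of the idea using the ChatOpenAI wrapper from the pinned langchain version. The prompt and the helper name are illustrative assumptions; the actual prompts and flow live in the Question Generation Notebook:

from langchain.chat_models import ChatOpenAI

# Hypothetical sketch: ask an LLM to write one evaluation question per doc chunk,
# recording the chunk's source as the ground-truth doc ID.
llm = ChatOpenAI(temperature=0.7)

def generate_question(chunk_text: str) -> str:
    prompt = (
        "Generate a question that can be answered using only the following text:\n\n"
        f"{chunk_text[:1000]}"
    )
    return llm.predict(prompt)

Below, we load the results of actually running the Question Generation Notebook.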
generated_df = pd.read_csv(OUTPUT_DF_PATH)
generated_df.head(3)
| | question | answer | chunk | chunk_id | source |
| --- | --- | --- | --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | The purpose of the MLflow Model Registry is... | Documentation MLflow Model Registry MLflow Mo... | 0 | model-registry.html |
| 1 | What is the purpose of registering a model with the Model Registry... | The purpose of registering a model with the Model Registry is... | logged, the model can then be registered with... | 1 | model-registry.html |
| 2 | What can you do with registered models and model versions... | With registered models and model versions, you can... | associated with registered models and model versions... | 2 | model-registry.html |
# Prepare dataframe `data` with the required format
data = pd.DataFrame({})
data["question"] = generated_df["question"].copy(deep=True)
data["source"] = generated_df["source"].apply(lambda x: [x])
data.head(3)
| | question | source |
| --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | [model-registry.html] |
| 1 | What is the purpose of registering a model with the Model Registry... | [model-registry.html] |
| 2 | What can you do with registered models and model versions... | [model-registry.html] |
Retrieve Relevant Doc IDs
Once we have the list of questions with ground-truth doc IDs from the previous step, we can collect the retrieved relevant doc IDs. This tutorial uses a LangChain retriever, but you can plug in your own retriever as needed.
First, we build a FAISS retriever from the docs saved at https://github.com/mlflow/mlflow/blob/master/examples/llms/question_generation/mlflow_docs_scraped.csv. See the Question Generation Notebook for how this csv file was created.
embeddings = OpenAIEmbeddings()
scrapped_df = pd.read_csv(SCRAPPED_DOCS_PATH)
list_of_documents = [
    Document(page_content=row["text"], metadata={"source": row["source"]})
    for i, row in scrapped_df.iterrows()
]
text_splitter = CharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=0)
docs = text_splitter.split_documents(list_of_documents)
db = FAISS.from_documents(docs, embeddings)
# Save the db to local disk
db.save_local(DB_PERSIST_DIR)
# Load the db from local disk
db = FAISS.load_local(DB_PERSIST_DIR, embeddings)
retriever = db.as_retriever()
# Test the retriever with a query
retrieved_docs = retriever.get_relevant_documents(
    "What is the purpose of the MLflow Model Registry?"
)
len(retrieved_docs)
4
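The retriever returns 4 documents because that is LangChain's default for a vector-store retriever, independent of the k used by the evaluation metrics later on. If you want the retriever to return a different number of documents, you can typically set it via search_kwargs, as in this sketch (the variable name is hypothetical; this tutorial keeps the default):

# Optional: control how many documents the retriever returns
retriever_top3 = db.as_retriever(search_kwargs={"k": 3})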
After building the retriever, we define a function that takes a question string as input and returns a list of relevant doc ID strings.
# Define a function to return a list of retrieved doc ids
def retrieve_doc_ids(question: str) -> list[str]:
    docs = retriever.get_relevant_documents(question)
    return [doc.metadata["source"] for doc in docs]
We can store the retrieved doc IDs in the dataframe under the column name "retrieved_doc_ids".
data["retrieved_doc_ids"] = data["question"].apply(retrieve_doc_ids)
data.head(3)
| | question | source | retrieved_doc_ids |
| --- | --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | [model-registry.html] | [model-registry.html, introduction/index.html,... |
| 1 | What is the purpose of registering a model with the Model Registry... | [model-registry.html] | [model-registry.html, models.html, introductio... |
| 2 | What can you do with registered models and model versions... | [model-registry.html] | [model-registry.html, models.html, deployment/... |
# Persist the static evaluation dataset to disk
data.to_csv(EVALUATION_DATASET_PATH, index=False)
# Load the static evaluation dataset from disk and deserialize the source and retrieved doc ids
data = pd.read_csv(EVALUATION_DATASET_PATH)
data["source"] = data["source"].apply(ast.literal_eval)
data["retrieved_doc_ids"] = data["retrieved_doc_ids"].apply(ast.literal_eval)
data.head(3)
| | question | source | retrieved_doc_ids |
| --- | --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | [model-registry.html] | [model-registry.html, introduction/index.html,... |
| 1 | What is the purpose of registering a model with the Model Registry... | [model-registry.html] | [model-registry.html, models.html, introductio... |
| 2 | What can you do with registered models and model versions... | [model-registry.html] | [model-registry.html, models.html, deployment/... |
Step 3: Call mlflow.evaluate()
Metric Definitions
Three built-in metrics are provided for the retriever model type: precision_at_k, recall_at_k, and ndcg_at_k (see the MLflow documentation for their full definitions).
All metrics compute a per-row score between 0 and 1, measuring the corresponding quantity for the retriever model at the given value of k.
The parameter k should be a positive integer, representing the number of retrieved documents to evaluate for each row; it defaults to 3.
When the model type is "retriever", these metrics are computed automatically with the default k value of 3.
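To make these definitions concrete, here is a minimal hand computation for a single row. This is an illustrative sketch of the standard precision/recall/NDCG formulas, not MLflow's internal implementation:

import math

# One example row: ground-truth doc IDs vs. the top-k retrieved doc IDs (k = 3)
ground_truth = {"mlflow/index.html"}
retrieved = ["mlflow/index.html", "mlflow/quick-start.html", "models.html"]

k = 3
top_k = retrieved[:k]
hits = [1 if doc in ground_truth else 0 for doc in top_k]

precision_at_k = sum(hits) / len(top_k)      # 1/3: one of three retrieved docs is relevant
recall_at_k = sum(hits) / len(ground_truth)  # 1/1: the single relevant doc was retrieved

# NDCG additionally rewards placing relevant docs near the top of the ranking.
dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(hits))
ideal = sorted(hits, reverse=True)
idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
ndcg_at_k = dcg / idcg if idcg > 0 else 0.0

print(precision_at_k, recall_at_k, ndcg_at_k)  # 0.3333333333333333 1.0 1.0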
Basic Usage
There are two supported ways to specify the retriever's output:
- Case 1: Save the retriever's output to a static evaluation dataset
- Case 2: Wrap the retriever in a function
# Case 1: Evaluating a static evaluation dataset
with mlflow.start_run() as run:
    evaluate_results = mlflow.evaluate(
        data=data,
        model_type="retriever",
        targets="source",
        predictions="retrieved_doc_ids",
        evaluators="default",
    )
2023/11/22 14:39:59 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).
2023/11/22 14:39:59 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.
2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...
2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: precision_at_3
2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: recall_at_3
2023/11/22 14:39:59 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: ndcg_at_3
question_source_df = data[["question", "source"]]
question_source_df.head(3)
| | question | source |
| --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | [model-registry.html] |
| 1 | What is the purpose of registering a model with the Model Registry... | [model-registry.html] |
| 2 | What can you do with registered models and model versions... | [model-registry.html] |
# Case 2: Evaluating a function
def retriever_model_function(question_df: pd.DataFrame) -> pd.Series:
    return question_df["question"].apply(retrieve_doc_ids)

with mlflow.start_run() as run:
    evaluate_results = mlflow.evaluate(
        model=retriever_model_function,
        data=question_source_df,
        model_type="retriever",
        targets="source",
        evaluators="default",
    )
2023/11/22 14:09:12 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).
2023/11/22 14:09:12 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.
2023/11/22 14:09:12 INFO mlflow.models.evaluation.default_evaluator: Computing model predictions.
2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...
2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: precision_at_3
2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: recall_at_3
2023/11/22 14:09:24 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: ndcg_at_3
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(evaluate_results.metrics)
{   'ndcg_at_3/mean': 0.7530888125490431,
    'ndcg_at_3/p90': 1.0,
    'ndcg_at_3/variance': 0.1209151911325433,
    'precision_at_3/mean': 0.26785714285714285,
    'precision_at_3/p90': 0.3333333333333333,
    'precision_at_3/variance': 0.017538265306122448,
    'recall_at_3/mean': 0.8035714285714286,
    'recall_at_3/p90': 1.0,
    'recall_at_3/variance': 0.15784438775510204}
Trying a Different k Value
To use a different k value, pass the evaluator_config argument to the mlflow.evaluate() API, for example evaluator_config={"retriever_k": k}.
# Case 1: Specifying the model type
evaluate_results = mlflow.evaluate(
    data=data,
    model_type="retriever",
    targets="ground_truth_context",
    predictions="retrieved_context",
    evaluators="default",
    evaluator_config={"retriever_k": 5},
)
Alternatively, you can specify the desired metrics directly in the extra_metrics argument of the mlflow.evaluate() API, without specifying the model type. In that case, the k value specified in the evaluator_config argument is ignored.
# Case 2: Specifying the extra_metrics
evaluate_results = mlflow.evaluate(
    data=data,
    targets="ground_truth_context",
    predictions="retrieved_context",
    extra_metrics=[
        mlflow.metrics.precision_at_k(4),
        mlflow.metrics.precision_at_k(5),
    ],
)
with mlflow.start_run() as run:
    evaluate_results = mlflow.evaluate(
        data=data,
        targets="source",
        predictions="retrieved_doc_ids",
        evaluators="default",
        extra_metrics=[
            mlflow.metrics.precision_at_k(1),
            mlflow.metrics.precision_at_k(2),
            mlflow.metrics.precision_at_k(3),
            mlflow.metrics.recall_at_k(1),
            mlflow.metrics.recall_at_k(2),
            mlflow.metrics.recall_at_k(3),
            mlflow.metrics.ndcg_at_k(1),
            mlflow.metrics.ndcg_at_k(2),
            mlflow.metrics.ndcg_at_k(3),
        ],
    )
2023/11/22 14:40:22 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).
2023/11/22 14:40:22 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_1
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_2
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_3
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_1
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_2
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_3
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_1
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_2
2023/11/22 14:40:22 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_3
import matplotlib.pyplot as plt

# Plot each metric's mean value at k = 1, 2, 3
for metric_name in ["precision", "recall", "ndcg"]:
    y = [evaluate_results.metrics[f"{metric_name}_at_{k}/mean"] for k in range(1, 4)]
    plt.plot([1, 2, 3], y, label=f"{metric_name}@k")

# Adding labels and title
plt.xlabel("k")
plt.ylabel("Metric Value")
plt.title("Metrics Comparison at Different Ks")

# Setting x-axis ticks
plt.xticks([1, 2, 3])
plt.legend()

# Display the plot
plt.show()
Edge Case Handling
For each built-in metric, a few edge cases receive special handling.
Retrieved doc IDs are empty
When no relevant documents are retrieved:
- mlflow.metrics.precision_at_k(k) is defined as:
  - 0, if the ground-truth doc IDs are non-empty
  - 1, if the ground-truth doc IDs are also empty
- mlflow.metrics.ndcg_at_k(k) is defined as:
  - 0, if the ground-truth doc IDs are non-empty
  - 1, if the ground-truth doc IDs are also empty
Ground-truth doc IDs are empty
When no ground-truth doc IDs are provided:
- mlflow.metrics.recall_at_k(k) is defined as:
  - 0, if the retrieved doc IDs are non-empty
  - 1, if the retrieved doc IDs are also empty
- mlflow.metrics.ndcg_at_k(k) is defined as:
  - 0, if the retrieved doc IDs are non-empty
  - 1, if the retrieved doc IDs are also empty
Both empty-list rules are sketched in code right after this list.
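As a quick reference, here is a minimal sketch encoding the two empty-list rules above. The helper names are hypothetical and this is illustrative only, not MLflow's implementation:

from typing import Optional

def precision_at_k_edge_case(retrieved: list[str], ground_truth: list[str]) -> Optional[float]:
    # Nothing retrieved: perfect score only if there was also nothing to find.
    if not retrieved:
        return 1.0 if not ground_truth else 0.0
    return None  # not an edge case; computed normally

def recall_at_k_edge_case(retrieved: list[str], ground_truth: list[str]) -> Optional[float]:
    # No ground truth: perfect score only if nothing was retrieved either.
    if not ground_truth:
        return 1.0 if not retrieved else 0.0
    return None  # not an edge case; computed normally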
Duplicate retrieved doc IDs
In a RAG system, it is common for the retriever to return multiple chunks from the same document for a given query. In that case, mlflow.metrics.ndcg_at_k(k) is computed as follows:
- If the duplicated doc IDs are in the ground truth, they are treated as distinct documents. For example, if the ground-truth doc IDs are [1, 2] and the retrieved doc IDs are [1, 1, 1, 3], the score is equivalent to having ground-truth doc IDs [10, 11, 12, 2] and retrieved doc IDs [10, 11, 12, 3]. A hand computation of this example follows the list.
- If the duplicated doc IDs are not in the ground truth, the ndcg score is computed as usual.
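Here is that example worked by hand under the remapping just described, assuming the standard DCG formula with a log2 position discount (a sketch for intuition, not MLflow's code):

import math

def dcg(relevances):
    # Standard discounted cumulative gain with a log2 position discount
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

# Ground truth [1, 2], retrieved [1, 1, 1, 3]: after the remapping this is
# equivalent to ground truth [10, 11, 12, 2] vs. retrieved [10, 11, 12, 3],
# so the first three retrieved positions are relevant and the fourth is not.
retrieved_rel = [1, 1, 1, 0]
ideal_rel = [1, 1, 1, 1]  # the remapped ground truth contains 4 relevant docs

print(dcg(retrieved_rel) / dcg(ideal_rel))  # ndcg at k=4, roughly 0.83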
Step 4: Analyze and Visualize the Results
You can inspect the per-row scores in the "eval_results_table.json" logged in the run artifacts, either by loading it into a pandas dataframe as shown below or through the MLflow run comparison UI.
eval_results_table = evaluate_results.tables["eval_results_table"]
eval_results_table.head(5)
| | question | source | retrieved_doc_ids | precision_at_1/score | precision_at_2/score | precision_at_3/score | recall_at_1/score | recall_at_2/score | recall_at_3/score | ndcg_at_1/score | ndcg_at_2/score | ndcg_at_3/score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | What is the purpose of the MLflow Model Registry? | [model-registry.html] | [model-registry.html, introduction/index.html,... | 1 | 0.5 | 0.333333 | 1 | 1 | 1 | 1 | 1.0 | 0.919721 |
| 1 | What is the purpose of registering a model with the Model Registry... | [model-registry.html] | [model-registry.html, models.html, introductio... | 1 | 0.5 | 0.333333 | 1 | 1 | 1 | 1 | 1.0 | 1.000000 |
| 2 | What can you do with registered models and model versions... | [model-registry.html] | [model-registry.html, models.html, deployment/... | 1 | 0.5 | 0.333333 | 1 | 1 | 1 | 1 | 1.0 | 1.000000 |
| 3 | How can you add, modify, update, or delete a model... | [model-registry.html] | [model-registry.html, models.html, deployment/... | 1 | 0.5 | 0.333333 | 1 | 1 | 1 | 1 | 1.0 | 1.000000 |
| 4 | How can you deploy and organize models in the Model... | [model-registry.html] | [model-registry.html, deployment/index.html, d... | 1 | 0.5 | 0.333333 | 1 | 1 | 1 | 1 | 1.0 | 0.919721 |
Using the evaluation results table, you can further visualize which questions are answered well and which are answered poorly with topic analysis techniques.
import nltk
import pyLDAvis.gensim_models as gensimvis
from gensim import corpora, models
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
# Initialize NLTK resources
nltk.download("punkt")
nltk.download("stopwords")
def topical_analysis(questions: list[str]):
    stop_words = set(stopwords.words("english"))

    # Tokenize and remove stop words
    tokenized_data = []
    for question in questions:
        tokens = word_tokenize(question.lower())
        filtered_tokens = [word for word in tokens if word not in stop_words and word.isalpha()]
        tokenized_data.append(filtered_tokens)

    # Create a dictionary and corpus
    dictionary = corpora.Dictionary(tokenized_data)
    corpus = [dictionary.doc2bow(text) for text in tokenized_data]

    # Apply LDA model
    lda_model = models.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15)

    # Get topic distribution for each question
    topic_distribution = []
    for i, ques in enumerate(questions):
        bow = dictionary.doc2bow(tokenized_data[i])
        topics = lda_model.get_document_topics(bow)
        topic_distribution.append(topics)
        print(f"Question: {ques}\nTopic: {topics}")

    # Print all topics
    print("\nTopics found are:")
    for idx, topic in lda_model.print_topics(-1):
        print(f"Topic: {idx}\nWords: {topic}\n")

    return lda_model, corpus, dictionary
[nltk_data] Downloading package punkt to
[nltk_data]     /Users/liang.zhang/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/liang.zhang/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
filtered_df = eval_results_table[eval_results_table["precision_at_1/score"] == 1]
hit_questions = filtered_df["question"].tolist()
filtered_df = eval_results_table[eval_results_table["precision_at_1/score"] == 0]
miss_questions = filtered_df["question"].tolist()
lda_model, corpus, dictionary = topical_analysis(hit_questions)
vis_data = gensimvis.prepare(lda_model, corpus, dictionary)
Question: What is the purpose of the MLflow Model Registry?
Topic: [(0, 0.0400703), (1, 0.040002838), (2, 0.040673085), (3, 0.04075462), (4, 0.8384991)]
Question: What is the purpose of registering a model with the Model Registry?
Topic: [(0, 0.0334267), (1, 0.033337697), (2, 0.033401005), (3, 0.033786207), (4, 0.8660484)]
Question: What can you do with registered models and model versions?
Topic: [(0, 0.04019648), (1, 0.04000775), (2, 0.040166058), (3, 0.8391777), (4, 0.040452003)]
Question: How can you add, modify, update, or delete a model in the Model Registry?
Topic: [(0, 0.025052568), (1, 0.025006149), (2, 0.025024023), (3, 0.025236268), (4, 0.899681)]
Question: How can you deploy and organize models in the Model Registry?
Topic: [(0, 0.033460867), (1, 0.033337582), (2, 0.033362914), (3, 0.8659808), (4, 0.033857808)]
Question: What method do you use to create a new registered model?
Topic: [(0, 0.028867528), (1, 0.028582651), (2, 0.882546), (3, 0.030021703), (4, 0.029982116)]
Question: How can you deploy and organize models in the Model Registry?
Topic: [(0, 0.033460878), (1, 0.033337586), (2, 0.033362918), (3, 0.8659798), (4, 0.03385884)]
Question: How can you fetch a list of registered models in the MLflow registry?
Topic: [(0, 0.0286206), (1, 0.028577656), (2, 0.02894385), (3, 0.88495284), (4, 0.028905064)]
Question: What is the default channel logged for models using MLflow v1.18 and above?
Topic: [(0, 0.02862059), (1, 0.028577654), (2, 0.028883327), (3, 0.8851736), (4, 0.028744776)]
Question: What information is stored in the conda.yaml file?
Topic: [(0, 0.050020963), (1, 0.051287953), (2, 0.051250603), (3, 0.7968765), (4, 0.05056402)]
Question: How can you save a model with a manually specified conda environment?
Topic: [(0, 0.02862434), (1, 0.02858204), (2, 0.02886313), (3, 0.8851747), (4, 0.028755778)]
Question: What are inference params and how are they used during model inference?
Topic: [(0, 0.86457103), (1, 0.03353862), (2, 0.033417325), (3, 0.034004394), (4, 0.034468662)]
Question: What is the purpose of model signatures in MLflow?
Topic: [(0, 0.040070876), (1, 0.04000346), (2, 0.040688124), (3, 0.040469088), (4, 0.8387685)]
Question: What is the API used to set signatures on models?
Topic: [(0, 0.033873636), (1, 0.033508822), (2, 0.033337757), (3, 0.035357967), (4, 0.8639218)]
Question: What components are used to generate the final time series?
Topic: [(0, 0.028693806), (1, 0.8853218), (2, 0.028573763), (3, 0.02862714), (4, 0.0287835)]
Question: What functionality does the configuration DataFrame submitted to the pyfunc flavor provide?
Topic: [(0, 0.02519801), (1, 0.025009492), (2, 0.025004204), (3, 0.025004204), (4, 0.8997841)]
Question: What is a common configuration for lowering the total memory pressure for pytorch models within transformers pipelines?
Topic: [(0, 0.93316424), (1, 0.016669936), (2, 0.016668117), (3, 0.016788227), (4, 0.016709473)]
Question: What does the save_model() function do?
Topic: [(0, 0.10002145), (1, 0.59994656), (2, 0.10001026), (3, 0.10001026), (4, 0.10001151)]
Question: What is an MLflow Project?
Topic: [(0, 0.06667001), (1, 0.06667029), (2, 0.7321751), (3, 0.06711196), (4, 0.06737265)]
Question: What are the entry points in a MLproject file and how can you specify parameters for them?
Topic: [(0, 0.02857626), (1, 0.88541776), (2, 0.02868285), (3, 0.028626908), (4, 0.02869626)]
Question: What are the project environments supported by MLflow?
Topic: [(0, 0.040009078), (1, 0.040009864), (2, 0.839655), (3, 0.040126894), (4, 0.040199146)]
Question: What is the purpose of specifying a Conda environment in an MLflow project?
Topic: [(0, 0.028579442), (1, 0.028580135), (2, 0.8841217), (3, 0.028901232), (4, 0.029817443)]
Question: What is the purpose of the MLproject file?
Topic: [(0, 0.05001335), (1, 0.052611485), (2, 0.050071735), (3, 0.05043289), (4, 0.7968705)]
Question: How can you pass runtime parameters to the entry point of an MLflow Project?
Topic: [(0, 0.025007373), (1, 0.025498485), (2, 0.8993807), (3, 0.02504522), (4, 0.025068246)]
Question: How does MLflow run a Project on Kubernetes?
Topic: [(0, 0.04000677), (1, 0.040007353), (2, 0.83931196), (3, 0.04012452), (4, 0.04054937)]
Question: What fields are replaced when MLflow creates a Kubernetes Job for an MLflow Project?
Topic: [(0, 0.022228329), (1, 0.022228856), (2, 0.023192631), (3, 0.02235802), (4, 0.90999216)]
Question: What is the syntax for searching runs using the MLflow UI and API?
Topic: [(0, 0.025003674), (1, 0.02500399), (2, 0.02527212), (3, 0.89956146), (4, 0.025158761)]
Question: What is the syntax for searching runs using the MLflow UI and API?
Topic: [(0, 0.025003672), (1, 0.025003988), (2, 0.025272164), (3, 0.8995614), (4, 0.025158769)]
Question: What are the key parts of a search expression in MLflow?
Topic: [(0, 0.03334423), (1, 0.03334517), (2, 0.8662702), (3, 0.033611353), (4, 0.033429127)]
Question: What are the key attributes for the model with the run_id 'a1b2c3d4' and run_name 'my-run'?
Topic: [(0, 0.05017508), (1, 0.05001634), (2, 0.05058142), (3, 0.7985237), (4, 0.050703418)]
Question: What information does each run record in MLflow Tracking?
Topic: [(0, 0.03333968), (1, 0.033340227), (2, 0.86639804), (3, 0.03349555), (4, 0.033426523)]
Question: What are the two components used by MLflow for storage?
Topic: [(0, 0.0334928), (1, 0.033938777), (2, 0.033719826), (3, 0.03357158), (4, 0.86527705)]
Question: What interfaces does the MLflow client use to record MLflow entities and artifacts when running MLflow on a local machine with a SQLAlchemy-compatible database?
Topic: [(0, 0.014289577), (1, 0.014289909), (2, 0.94276434), (3, 0.014325481), (4, 0.014330726)]
Question: What is the default backend store used by MLflow?
Topic: [(0, 0.033753525), (1, 0.03379533), (2, 0.033777602), (3, 0.86454684), (4, 0.0341267)]
Question: What information does autologging capture when launching short-lived MLflow runs?
Topic: [(0, 0.028579954), (1, 0.02858069), (2, 0.8851724), (3, 0.029027484), (4, 0.028639426)]
Question: What is the purpose of the --serve-artifacts flag?
Topic: [(0, 0.06670548), (1, 0.066708855), (2, 0.067003354), (3, 0.3969311), (4, 0.40265122)]

Topics found are:
Topic: 0
Words: 0.059*"inference" + 0.032*"models" + 0.032*"used" + 0.032*"configuration" + 0.032*"common" + 0.032*"transformers" + 0.032*"total" + 0.032*"within" + 0.032*"pytorch" + 0.032*"pipelines"
Topic: 1
Words: 0.036*"file" + 0.035*"mlproject" + 0.035*"used" + 0.035*"components" + 0.035*"entry" + 0.035*"parameters" + 0.035*"specify" + 0.035*"final" + 0.035*"points" + 0.035*"time"
Topic: 2
Words: 0.142*"mlflow" + 0.066*"project" + 0.028*"information" + 0.028*"use" + 0.028*"record" + 0.028*"run" + 0.015*"key" + 0.015*"running" + 0.015*"artifacts" + 0.015*"client"
Topic: 3
Words: 0.066*"models" + 0.066*"model" + 0.066*"mlflow" + 0.041*"using" + 0.041*"registry" + 0.028*"api" + 0.028*"registered" + 0.028*"runs" + 0.028*"syntax" + 0.028*"searching"
Topic: 4
Words: 0.089*"model" + 0.074*"purpose" + 0.074*"mlflow" + 0.046*"registry" + 0.031*"used" + 0.031*"signatures" + 0.017*"kubernetes" + 0.017*"fields" + 0.017*"job" + 0.017*"replaced"
# Uncomment the following line to render the interactive widget
# pyLDAvis.display(vis_data)
lda_model, corpus, dictionary = topical_analysis(miss_questions)
vis_data = gensimvis.prepare(lda_model, corpus, dictionary)
Question: What is the purpose of the mlflow.sklearn.log_model() method?
Topic: [(0, 0.0669118), (1, 0.06701085), (2, 0.06667974), (3, 0.73235476), (4, 0.06704286)]
Question: How can you fetch a specific model version?
Topic: [(0, 0.83980393), (1, 0.040003464), (2, 0.04000601), (3, 0.040101767), (4, 0.040084846)]
Question: How can you fetch the latest model version in a specific stage?
Topic: [(0, 0.88561153), (1, 0.028575428), (2, 0.028578365), (3, 0.0286214), (4, 0.028613236)]
Question: What can you do to promote MLflow Models across environments?
Topic: [(0, 0.8661927), (1, 0.0333396), (2, 0.03362743), (3, 0.033428304), (4, 0.033411972)]
Question: What is the name of the model and its version details?
Topic: [(0, 0.83978903), (1, 0.04000637), (2, 0.04001106), (3, 0.040105395), (4, 0.040088095)]
Question: What is the purpose of saving the model in pickled format?
Topic: [(0, 0.033948876), (1, 0.03339717), (2, 0.033340737), (3, 0.86575514), (4, 0.033558063)]
Question: What is an MLflow Model and what is its purpose?
Topic: [(0, 0.7940762), (1, 0.05068333), (2, 0.050770763), (3, 0.053328265), (4, 0.05114142)]
Question: What are the flavors defined in the MLmodel file for the mlflow.sklearn library?
Topic: [(0, 0.86628276), (1, 0.033341788), (2, 0.03334801), (3, 0.03368498), (4, 0.033342462)]
Question: What command can be used to package and deploy models to AWS SageMaker?
Topic: [(0, 0.89991224), (1, 0.025005225), (2, 0.025009066), (3, 0.025006713), (4, 0.025066752)]
Question: What is the purpose of the --build-image flag when running mlflow run?
Topic: [(0, 0.033957016), (1, 0.033506736), (2, 0.034095332), (3, 0.034164555), (4, 0.86427635)]
Question: What is the relative path to the python_env YAML file within the MLflow project's directory?
Topic: [(0, 0.02243), (1, 0.02222536), (2, 0.022470985), (3, 0.9105873), (4, 0.02228631)]
Question: What are the additional local volume mounted and environment variables in the docker container?
Topic: [(0, 0.022225259), (1, 0.9110914), (2, 0.02222932), (3, 0.022227468), (4, 0.022226628)]
Question: What are some examples of entity names that contain special characters?
Topic: [(0, 0.028575381), (1, 0.88568854), (2, 0.02858065), (3, 0.028578246), (4, 0.028577149)]
Question: What type of constant does the RHS need to be if LHS is a metric?
Topic: [(0, 0.028575381), (1, 0.8856886), (2, 0.028580645), (3, 0.028578239), (4, 0.028577147)]
Question: How can you get all active runs from experiments IDs 3, 4, and 17 that used a CNN model with 10 layers and had a prediction accuracy of 94.5% or higher?
Topic: [(0, 0.015563371), (1, 0.015387185), (2, 0.015389071), (3, 0.015427767), (4, 0.9382326)]
Question: What is the purpose of the 'experimentIds' variable in the given paragraph?
Topic: [(0, 0.040206533), (1, 0.8384999), (2, 0.040013183), (3, 0.040967643), (4, 0.040312726)]
Question: What is the MLflow Tracking component used for?
Topic: [(0, 0.8390845), (1, 0.04000697), (2, 0.040462855), (3, 0.04014182), (4, 0.040303845)]
Question: How can you create an experiment in MLflow?
Topic: [(0, 0.050333958), (1, 0.0500024), (2, 0.7993825), (3, 0.050153885), (4, 0.05012722)]
Question: How can you create an experiment using MLflow?
Topic: [(0, 0.04019285), (1, 0.04000254), (2, 0.8396381), (3, 0.040091105), (4, 0.04007539)]
Question: What is the architecture depicted in this example scenario?
Topic: [(0, 0.04000523), (1, 0.040007014), (2, 0.040012203), (3, 0.04000902), (4, 0.83996654)]

Topics found are:
Topic: 0
Words: 0.078*"model" + 0.059*"mlflow" + 0.059*"version" + 0.041*"models" + 0.041*"fetch" + 0.041*"specific" + 0.041*"used" + 0.022*"command" + 0.022*"deploy" + 0.022*"sagemaker"
Topic: 1
Words: 0.030*"local" + 0.030*"container" + 0.030*"variables" + 0.030*"docker" + 0.030*"mounted" + 0.030*"environment" + 0.030*"volume" + 0.030*"additional" + 0.030*"special" + 0.030*"names"
Topic: 2
Words: 0.096*"experiment" + 0.096*"create" + 0.096*"mlflow" + 0.051*"using" + 0.009*"purpose" + 0.009*"model" + 0.009*"method" + 0.009*"file" + 0.009*"version" + 0.009*"used"
Topic: 3
Words: 0.071*"purpose" + 0.039*"file" + 0.039*"mlflow" + 0.039*"yaml" + 0.039*"directory" + 0.039*"relative" + 0.039*"within" + 0.039*"path" + 0.039*"project" + 0.039*"format"
Topic: 4
Words: 0.032*"purpose" + 0.032*"used" + 0.032*"model" + 0.032*"prediction" + 0.032*"get" + 0.032*"accuracy" + 0.032*"active" + 0.032*"layers" + 0.032*"higher" + 0.032*"experiments"
# Uncomment the following line to render the interactive widget
# pyLDAvis.display(vis_data)