Querying Endpoints in the MLflow Deployments Server

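Now that the deployments server is up and running, you can send it some data. You can interact with the gateway server through either the Deployments API or the REST API. For simplicity, this example uses the Deployments API.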

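Let's look more closely at the three supported model types:

Completions: This type of model generates predictions or suggestions based on the provided input, helping to "complete" a sequence or pattern.
Chat: This type of model supports interactive conversations, understanding user input and responding in a conversational manner.
Embeddings: An embeddings model maps input data (such as text or images) into a numerical vector space in which similar items lie close together, which is useful for a variety of machine learning tasks.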

In the following steps, we'll explore how to query the gateway server with each of these model types.
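As a rough preview, the three model types in the examples that follow differ mainly in the top-level key of the request payload: `prompt`, `messages`, or `input`. The sketch below encodes that in a small lookup table. The table, the type labels, and the helper `missing_payload_key` are hypothetical names introduced only for illustration and are not part of MLflow.

```python
from typing import Optional

# Hypothetical lookup table (not an MLflow API): model type -> the top-level
# payload key used by the corresponding example in this walkthrough.
PAYLOAD_KEY = {
    "completions": "prompt",
    "chat": "messages",
    "embeddings": "input",
}


def missing_payload_key(model_type: str, payload: dict) -> Optional[str]:
    """Return the required key if it is absent from the payload, else None."""
    key = PAYLOAD_KEY[model_type]
    return None if key in payload else key


# A completions payload that carries its required "prompt" key.
print(missing_payload_key("completions", {"prompt": "Hello", "temperature": 0.2}))
# A chat payload that mistakenly uses "prompt" instead of "messages".
print(missing_payload_key("chat", {"prompt": "Hello"}))
```

A check like this can catch a mismatched payload before any request is sent, but the gateway server remains the authority on which parameters each endpoint accepts.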

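Example 1: Completions

Completions models are designed to finish sentences or respond to prompts.

To query these models through the MLflow AI Gateway, provide a prompt parameter, which is the string the language model (LLM) will respond to. The gateway server also supports a variety of other parameters. See the documentation for details.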

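from mlflow.deployments import get_deploy_client

# NOTE: the host is missing from this URL in the source text.
# Replace it with the address of your running gateway server.
client = get_deploy_client("https://:5000")
name = "completions"  # name of the completions endpoint
data = dict(
    prompt="Name three potions or spells in harry potter that sound like an insult. Only show the names.",
    n=2,  # number of completions to request
    temperature=0.2,  # lower values give less random output
    max_tokens=1000,  # upper bound on generated tokens
)

# Send the payload to the endpoint and print the raw response.
response = client.predict(endpoint=name, inputs=data)
print(response)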
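Because the request asks for `n=2`, the response should carry more than one candidate. The sketch below shows one way to pull the generated strings out of such a response. It assumes an OpenAI-style shape with a `choices` list whose items hold a `text` field; that shape is an assumption to check against your actual output, and `mock_response` is made-up data standing in for the return value of `client.predict`.

```python
def completion_texts(response: dict) -> list:
    """Collect the generated text of every choice in a completions response.

    Assumes an OpenAI-style layout: {"choices": [{"text": ...}, ...]}.
    """
    return [choice["text"] for choice in response.get("choices", [])]


# Made-up stand-in for the value returned by client.predict(...).
mock_response = {
    "choices": [
        {"index": 0, "text": "first candidate", "finish_reason": "stop"},
        {"index": 1, "text": "second candidate", "finish_reason": "stop"},
    ]
}

for i, text in enumerate(completion_texts(mock_response)):
    print(f"choice {i}: {text}")
```

If the printed response from your server uses different field names, adjust the keys in `completion_texts` accordingly.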

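Example 2: Chat

Chat models support interactive conversations with a user, gradually accumulating context over time.

Building a chat payload is slightly more involved than for the other model types, because it can hold any number of messages from three distinct roles: system, user, and assistant. To set up a chat payload through the MLflow AI Gateway, specify a messages parameter. This parameter takes a list of dictionaries in the following format:

{"role": "system/user/assistant", "content": "user-specified content"}

For more details, see the documentation.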

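from mlflow.deployments import get_deploy_client

# NOTE: the host is missing from this URL in the source text.
# Replace it with the address of your running gateway server.
client = get_deploy_client("https://:5000")
name = "chat_3.5"  # name of the chat endpoint
data = dict(
    # Messages are listed in conversation order.
    messages=[
        {"role": "system", "content": "You are the sorting hat from harry potter."},
        {
            "role": "user",
            "content": "I am brave, hard-working, wise, and backstabbing.",
        },
        {
            "role": "user",
            "content": "Which harry potter house am I most likely to belong to?",
        },
    ],
    n=3,  # number of responses to request
    temperature=0.5,
)

# Send the payload to the endpoint and print the raw response.
response = client.predict(endpoint=name, inputs=data)
print(response)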
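To illustrate how a conversation accumulates context, the sketch below validates message roles and appends the assistant's reply plus a follow-up user message to the history. The helpers `make_message` and `continue_conversation` are hypothetical names, not MLflow functions. The response shape (`choices[0]["message"]["content"]`) is an assumed OpenAI-style layout, and `mock_response` is made-up data standing in for a real reply.

```python
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role: str, content: str) -> dict:
    """Build one chat message, rejecting roles outside the three supported ones."""
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return {"role": role, "content": content}


def continue_conversation(messages: list, response: dict, next_user_content: str) -> list:
    """Return a new history with the assistant reply and a follow-up user turn.

    Assumes an OpenAI-style response: choices[0]["message"]["content"].
    """
    reply = response["choices"][0]["message"]["content"]
    return messages + [
        make_message("assistant", reply),
        make_message("user", next_user_content),
    ]


history = [
    make_message("system", "You are the sorting hat from harry potter."),
    make_message("user", "Which harry potter house am I most likely to belong to?"),
]

# Made-up stand-in for the value returned by client.predict(...).
mock_response = {
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Slytherin."}}]
}

history = continue_conversation(history, mock_response, "Why that house?")
for message in history:
    print(message["role"], "->", message["content"])
```

The extended `history` list can then be sent as the `messages` value of the next request, so the model sees the earlier turns.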

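Example 3: Embeddings

Embeddings models convert tokens into numerical vectors.

To use an embeddings model through the MLflow AI Gateway, provide an input parameter, which can be a single string or a list of strings. The gateway server then processes these strings and returns a numerical vector for each one. Let's move on to an example.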

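from mlflow.deployments import get_deploy_client

# NOTE: the host is missing from this URL in the source text.
# Replace it with the address of your running gateway server.
client = get_deploy_client("https://:5000")
name = "embeddings"  # name of the embeddings endpoint
data = dict(
    # One vector is returned per input string.
    input=[
        "Gryffindor: Values bravery, courage, and leadership.",
        "Hufflepuff: Known for loyalty, a strong work ethic, and a grounded nature.",
        "Ravenclaw: A house for individuals who value wisdom, intellect, and curiosity.",
        "Slytherin: Appreciates ambition, cunning, and resourcefulness.",
    ],
)

# Send the payload to the endpoint and print the raw response.
response = client.predict(endpoint=name, inputs=data)
print(response)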
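Since similar items lie close together in the embedding space, a common next step is to compare the returned vectors. The sketch below computes cosine similarity in plain Python. It assumes an OpenAI-style response with a `data` list whose items hold an `embedding` field; the tiny three-dimensional vectors in `mock_response` are made up for illustration, whereas real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up stand-in for the value returned by client.predict(...).
mock_response = {
    "data": [
        {"index": 0, "embedding": [1.0, 0.0, 0.0]},
        {"index": 1, "embedding": [0.8, 0.6, 0.0]},
        {"index": 2, "embedding": [0.0, 0.0, 1.0]},
    ]
}

vectors = [item["embedding"] for item in mock_response["data"]]

# Vectors 0 and 1 point in similar directions; vectors 0 and 2 are orthogonal.
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Values near 1 indicate that two inputs are close in the embedding space, while values near 0 indicate that they are unrelated.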

And there you have it! You have successfully set up your first gateway server and served three OpenAI models.
