mlflow.deployments

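Exposes functionality for deploying MLflow models to custom serving tools.

Note: model deployment to AWS SageMaker can currently be performed via the mlflow.sagemaker module. Model deployment to Azure can be performed using the azureml library.

MLflow does not currently provide built-in support for any other deployment targets, but support for custom targets can be installed via third-party plugins. A list of known plugins is available in the plugin documentation.

This page focuses on the user-facing deployment APIs. For instructions on implementing your own deployment plugin for a custom serving tool, see the plugin documentation.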

class mlflow.deployments.BaseDeploymentClient(target_uri)[source]

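Base class exposing Python model deployment APIs.

Plugin implementers should define target-specific deployment logic by subclassing BaseDeploymentClient within the plugin module, and should customize method docstrings with target-specific information.

Note

Subclasses should raise mlflow.exceptions.MlflowException in error cases (for example, when a model deployment fails).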

abstract create_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

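Deploy a model to the specified target. By default, this method should block until deployment completes (that is, until it is possible to perform inference with the deployment). In the case of a conflict (for example, if the specified deployment cannot be created because it conflicts with an existing deployment), raises mlflow.exceptions.MlflowException, or an HTTPError for remote deployments. See the target-specific plugin documentation for details on support for asynchronous deployment and other configurations.

Parameters

name – Unique name to use for the deployment. If a deployment with the same name already exists, raises mlflow.exceptions.MlflowException.
model_uri – URI of the model to deploy.
flavor – (optional) Model flavor to deploy. If unspecified, a default flavor is chosen.
config – (optional) Dict containing updated target-specific configuration for the deployment.
endpoint – (optional) Endpoint to create the deployment under. Not all targets support this.

Returns

Dict corresponding to the created deployment, which must contain the "name" key.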
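The contract described above (unique names, an error on conflict, a returned dict containing "name") can be illustrated with a minimal in-memory sketch. It does not use MLflow: `DeploymentError` and `InMemoryTarget` are hypothetical stand-ins, and the default flavor name `"python_function"` is an assumption made for illustration only.

```python
class DeploymentError(Exception):
    """Hypothetical stand-in for mlflow.exceptions.MlflowException."""


class InMemoryTarget:
    """Hypothetical in-memory target that mimics the create_deployment contract."""

    def __init__(self):
        self._deployments = {}

    def create_deployment(self, name, model_uri, flavor=None, config=None, endpoint=None):
        # Names must be unique: clashing with an existing deployment is an error.
        if name in self._deployments:
            raise DeploymentError(f"Deployment {name!r} already exists")
        record = {
            "name": name,
            "model_uri": model_uri,
            # Assumed default flavor, chosen only when none is specified.
            "flavor": flavor or "python_function",
            "config": dict(config or {}),
        }
        self._deployments[name] = record
        # The returned dict must contain the "name" key.
        return dict(record)


target = InMemoryTarget()
created = target.create_deployment("spamDetector", "runs:/someRunId/myModel")
print(created["name"])
```

A real plugin would subclass BaseDeploymentClient and talk to an actual serving tool; the sketch only shows the shape of the return value and the conflict behavior.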

create_endpoint(name, config=None)[source]

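Create an endpoint with the specified target. By default, this method should block until creation completes (that is, until it is possible to create a deployment within the endpoint). In the case of a conflict (for example, if the specified endpoint cannot be created because it conflicts with an existing endpoint), raises mlflow.exceptions.MlflowException, or an HTTPError for remote deployments. See the target-specific plugin documentation for details on support for asynchronous creation and other configurations.

Parameters

name – Unique name to use for the endpoint. If an endpoint with the same name already exists, raises mlflow.exceptions.MlflowException.
config – (optional) Dict containing target-specific configuration for the endpoint.

Returns

Dict corresponding to the created endpoint, which must contain the "name" key.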

abstract delete_deployment(name, config=None, endpoint=None)[source]

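Delete the deployment with name name from the specified target.

Deletion should be idempotent (that is, deletion should not fail if retried on a non-existent deployment).

Parameters

name – Name of the deployment to delete.
config – (optional) Dict containing updated target-specific configuration for the deployment.
endpoint – (optional) Endpoint containing the deployment to delete. Not all targets support this.

Returns

None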
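Idempotent deletion means a retried delete is a no-op rather than an error. The following is a minimal sketch of that behavior using a hypothetical in-memory store (`InMemoryTarget` is not part of MLflow).

```python
class InMemoryTarget:
    """Hypothetical in-memory target illustrating idempotent deletion."""

    def __init__(self, names):
        self._deployments = {name: {"name": name} for name in names}

    def delete_deployment(self, name, config=None, endpoint=None):
        # dict.pop with a default never raises, so deleting twice is harmless.
        self._deployments.pop(name, None)
        return None

    def list_deployments(self, endpoint=None):
        return list(self._deployments.values())


target = InMemoryTarget(["spamDetector", "fraudModel"])
target.delete_deployment("spamDetector")
target.delete_deployment("spamDetector")  # retry: no error is raised
remaining = [d["name"] for d in target.list_deployments()]
print(remaining)  # ['fraudModel']
```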

delete_endpoint(endpoint)[source]

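Delete the endpoint from the specified target. Deletion should be idempotent (that is, deletion should not fail if retried on a non-existent endpoint).

Parameters: endpoint – Name of the endpoint to delete.
Returns: None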

explain(deployment_name=None, df=None, endpoint=None)[source]

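Generate explanations of model predictions on the specified input pandas DataFrame df for the deployed model. The explanation output format varies by deployment target and can include details such as feature importance for understanding or debugging predictions.

Parameters

deployment_name – Name of the deployment to predict against.
df – pandas DataFrame to use for explaining feature importance in model predictions.
endpoint – Endpoint to predict against. Not all targets support this.

Returns

A JSON-able object (pandas DataFrame, numpy array, or dictionary), or an exception if the deployment target's class does not provide an implementation.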

abstract get_deployment(name, endpoint=None)[source]

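Returns a dictionary describing the specified deployment. Raises mlflow.exceptions.MlflowException, or an HTTPError for remote deployments, if no deployment exists with the provided ID. The dict is guaranteed to contain a "name" key containing the deployment name. The other fields of the returned dictionary and their types may vary across deployment targets.

Parameters

name – ID of the deployment to fetch.
endpoint – (optional) Endpoint containing the deployment to fetch. Not all targets support this.

Returns

A dict corresponding to the retrieved deployment. The dict is guaranteed to contain a "name" key corresponding to the deployment name. The other fields of the returned dictionary and their types may vary across targets.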

get_endpoint(endpoint)[source]

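Returns a dictionary describing the specified endpoint. Raises mlflow.exceptions.MlflowException, or an HTTPError for remote deployments, if no endpoint exists with the provided name. The dict is guaranteed to contain a "name" key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.

Parameters: endpoint – Name of the endpoint to fetch.
Returns: A dict corresponding to the retrieved endpoint. The dict is guaranteed to contain a "name" key corresponding to the endpoint name. The other fields of the returned dictionary and their types may vary across targets.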

abstract list_deployments(endpoint=None)[source]

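List deployments.

This method is expected to return an unpaginated list of all deployments. An alternative would be to return a dict with a "deployments" field containing the actual deployments; plugins could then specify other fields in the returned dictionary (for example, a next_page_token field for pagination) and accept a pagination_args argument to this method for passing pagination-related arguments.

Parameters: endpoint – (optional) List deployments in the specified endpoint. Not all targets support this.
Returns: A list of dicts corresponding to deployments. Each dict is guaranteed to contain a "name" key containing the deployment name. The other fields of the returned dictionary and their types may vary across deployment targets.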
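The paginated alternative mentioned above can be sketched as follows. This is an illustration only: the function names, the integer-offset token format, and the `page_size` parameter are hypothetical choices, not part of the MLflow API.

```python
def list_deployments_page(all_deployments, page_size=2, page_token=None):
    """Hypothetical paginated variant returning a dict with a "deployments" field."""
    start = int(page_token) if page_token is not None else 0
    page = all_deployments[start:start + page_size]
    next_start = start + page_size
    # A token is only returned while more deployments remain.
    next_token = str(next_start) if next_start < len(all_deployments) else None
    return {"deployments": page, "next_page_token": next_token}


def list_all(all_deployments):
    """Collect every page into the unpaginated list the base method describes."""
    results, token = [], None
    while True:
        response = list_deployments_page(all_deployments, page_token=token)
        results.extend(response["deployments"])
        token = response["next_page_token"]
        if token is None:
            return results


stored = [{"name": f"model-{i}"} for i in range(5)]
first_page = list_deployments_page(stored)
names = [d["name"] for d in list_all(stored)]
print(first_page["next_page_token"], names)
```

Every dict in either form still carries the "name" key, so callers written against the unpaginated contract can consume the collected list unchanged.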

list_endpoints()[source]

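List endpoints in the specified target. This method is expected to return an unpaginated list of all endpoints. An alternative would be to return a dict with an "endpoints" field containing the actual endpoints; plugins could then specify other fields in the returned dictionary (for example, a next_page_token field for pagination) and accept a pagination_args argument to this method for passing pagination-related arguments.

Returns: A list of dicts corresponding to endpoints. Each dict is guaranteed to contain a "name" key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.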

abstract predict(deployment_name=None, inputs=None, endpoint=None)[source]

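Compute predictions on inputs using the specified deployment or model endpoint.

Note that the input and output types of this method match those of mlflow pyfunc predict.

Parameters

deployment_name – Name of the deployment to predict against.
inputs – Input data (or arguments) to pass to the deployment or model endpoint for inference.
endpoint – Endpoint to predict against. Not all targets support this.

Returns

A mlflow.deployments.PredictionsResponse instance representing the predictions and associated model server response metadata.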

predict_stream(deployment_name=None, inputs=None, endpoint=None)[source]

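Submit a query to a configured provider endpoint and get a streaming response.

Parameters

deployment_name – Name of the deployment to predict against.
inputs – The inputs to the query, as a dictionary.
endpoint – The name of the endpoint to query.

Returns

An iterator of dictionaries containing the response from the endpoint.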

abstract update_deployment(name, model_uri=None, flavor=None, config=None, endpoint=None)[source]

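Update the deployment with the specified name. You can update the URI of the model, the flavor of the deployed model (in which case the model URI must also be specified), and/or any target-specific attributes of the deployment (via config). By default, this method should block until deployment completes (that is, until it is possible to perform inference with the updated deployment). See the target-specific plugin documentation for details on support for asynchronous deployment and other configurations.

Parameters

name – Unique name of the deployment to update.
model_uri – URI of a new model to deploy.
flavor – (optional) New model flavor to use for the deployment. If provided, model_uri must also be specified. If flavor is unspecified but model_uri is specified, a default flavor is chosen and the deployment is updated using that flavor.
config – (optional) Dict containing updated target-specific configuration for the deployment.
endpoint – (optional) Endpoint containing the deployment to update. Not all targets support this.

Returns

None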
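The argument rules for update_deployment (a flavor requires a model URI; a model URI without a flavor falls back to a default flavor) can be expressed as a small validation helper. This is a sketch only: `plan_update` is a hypothetical function, and `"python_function"` as the default flavor is an assumption.

```python
def plan_update(model_uri=None, flavor=None, config=None, default_flavor="python_function"):
    """Hypothetical helper that applies the documented argument rules."""
    # A flavor on its own is not allowed: the model URI must accompany it.
    if flavor is not None and model_uri is None:
        raise ValueError("flavor was specified without model_uri")
    changes = {}
    if model_uri is not None:
        changes["model_uri"] = model_uri
        # With a model URI but no flavor, fall back to the (assumed) default flavor.
        changes["flavor"] = flavor if flavor is not None else default_flavor
    if config:
        changes["config"] = dict(config)
    return changes


plan = plan_update(model_uri="runs:/anotherRunId/myModel")
print(plan)
```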

update_endpoint(endpoint, config=None)[source]

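Update the endpoint with the specified name using the provided configuration. You can update any target-specific attributes of the endpoint (via config). By default, this method should block until the update completes (that is, until it is possible to create a deployment within the endpoint). See the target-specific plugin documentation for details on support for asynchronous updates and other configurations.

Parameters

endpoint – Unique name of the endpoint to update.
config – (optional) Dict containing target-specific configuration for the endpoint.

Returns

None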

class mlflow.deployments.DatabricksDeploymentClient(target_uri)[source]

Client for interacting with Databricks serving endpoints.

Example

First, set up credentials for authentication:

export DATABRICKS_HOST=...
export DATABRICKS_TOKEN=...

See also

See https://docs.databricks.com/en/dev-tools/auth.html for other authentication methods.

Then, create a deployment client and use it to interact with Databricks serving endpoints:

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
endpoints = client.list_endpoints()
assert endpoints == [
    {
        "name": "chat",
        "creator": "alice@company.com",
        "creation_timestamp": 0,
        "last_updated_timestamp": 0,
        "state": {...},
        "config": {...},
        "tags": [...],
        "id": "88fd3f75a0d24b0380ddc40484d7a31b",
    },
]

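create_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for DatabricksDeploymentClient.

create_endpoint(name=None, config=None, route_optimized=False)[source]

Create a new serving endpoint with the provided name and configuration.

See https://docs.databricks.com/api/workspace/servingendpoints/create for the request/response schema.

Parameters

name –
The name of the serving endpoint to create.

Warning

Deprecated. Include name in config instead.

config – A dictionary containing either the full API request payload or the configuration of the serving endpoint.
route_optimized –
A boolean that defines whether the Databricks serving endpoint is optimized for routing traffic. Only used in the deprecated approach.

Warning

Deprecated. Include route_optimized in config instead.

Returns

A DatabricksEndpoint object containing the request response.

Example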

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
endpoint = client.create_endpoint(
    config={
        "name": "test",
        "config": {
            "served_entities": [
                {
                    "external_model": {
                        "name": "gpt-4",
                        "provider": "openai",
                        "task": "llm/v1/chat",
                        "openai_config": {
                            "openai_api_key": "{{secrets/scope/key}}",
                        },
                    },
                }
            ],
            "route_optimized": True,
        },
    },
)
assert endpoint == {
    "name": "test",
    "creator": "alice@company.com",
    "creation_timestamp": 0,
    "last_updated_timestamp": 0,
    "state": {...},
    "config": {...},
    "tags": [...],
    "id": "88fd3f75a0d24b0380ddc40484d7a31b",
    "permission_level": "CAN_MANAGE",
    "route_optimized": False,
    "task": "llm/v1/chat",
    "endpoint_type": "EXTERNAL_MODEL",
    "creator_display_name": "Alice",
    "creator_kind": "User",
}

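delete_deployment(name, config=None, endpoint=None)[source]

Warning

This method is not implemented for DatabricksDeploymentClient.

delete_endpoint(endpoint)[source]

Delete the specified serving endpoint. See https://docs.databricks.com/api/workspace/servingendpoints/delete for the request/response schema.

Parameters: endpoint – The name of the serving endpoint to delete.
Returns: A DatabricksEndpoint object containing the request response.

Example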

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
client.delete_endpoint(endpoint="chat")

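get_deployment(name, endpoint=None)[source]

Warning

This method is not implemented for DatabricksDeploymentClient.

get_endpoint(endpoint)[source]

Get the specified serving endpoint. See https://docs.databricks.com/api/workspace/servingendpoints/get for the request/response schema.

Parameters: endpoint – The name of the serving endpoint to get.
Returns: A DatabricksEndpoint object containing the request response.

Example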

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
endpoint = client.get_endpoint(endpoint="chat")
assert endpoint == {
    "name": "chat",
    "creator": "alice@company.com",
    "creation_timestamp": 0,
    "last_updated_timestamp": 0,
    "state": {...},
    "config": {...},
    "tags": [...],
    "id": "88fd3f75a0d24b0380ddc40484d7a31b",
}

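list_deployments(endpoint=None)[source]

Warning

This method is not implemented for DatabricksDeploymentClient.

list_endpoints()[source]

Retrieve all serving endpoints.

See https://docs.databricks.com/api/workspace/servingendpoints/list for the request/response schema.

Returns: A list of DatabricksEndpoint objects containing the request response.

Example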

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
endpoints = client.list_endpoints()
assert endpoints == [
    {
        "name": "chat",
        "creator": "alice@company.com",
        "creation_timestamp": 0,
        "last_updated_timestamp": 0,
        "state": {...},
        "config": {...},
        "tags": [...],
        "id": "88fd3f75a0d24b0380ddc40484d7a31b",
    },
]

predict(deployment_name=None, inputs=None, endpoint=None)[source]

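Query a serving endpoint with the provided model inputs. See https://docs.databricks.com/api/workspace/servingendpoints/query for the request/response schema.

Parameters

deployment_name – Unused.
inputs – A dictionary containing the model inputs to query.
endpoint – The name of the serving endpoint to query.

Returns

A DatabricksEndpoint object containing the query response.

Example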

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
response = client.predict(
    endpoint="chat",
    inputs={
        "messages": [
            {"role": "user", "content": "Hello!"},
        ],
    },
)
assert response == {
    "id": "chatcmpl-8OLm5kfqBAJD8CpsMANESWKpLSLXY",
    "object": "chat.completion",
    "created": 1700814265,
    "model": "gpt-4-0613",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 9,
        "total_tokens": 18,
    },
}

predict_stream(deployment_name=None, inputs=None, endpoint=None) → Iterator[dict[str, typing.Any]][source]

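Submit a query to a configured provider endpoint and get a streaming response.

Parameters

deployment_name – Unused.
inputs – The inputs to the query, as a dictionary.
endpoint – The name of the endpoint to query.

Returns

An iterator of dictionaries containing the response from the endpoint.

Example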

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
chunk_iter = client.predict_stream(
    endpoint="databricks-llama-2-70b-chat",
    inputs={
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.0,
        "n": 1,
        "max_tokens": 500,
    },
)
for chunk in chunk_iter:
    print(chunk)
    # Example:
    # {
    #     "id": "82a834f5-089d-4fc0-ad6c-db5c7d6a6129",
    #     "object": "chat.completion.chunk",
    #     "created": 1712133837,
    #     "model": "llama-2-70b-chat-030424",
    #     "choices": [
    #         {
    #             "index": 0, "delta": {"role": "assistant", "content": "Hello"},
    #             "finish_reason": None,
    #         }
    #     ],
    #     "usage": {"prompt_tokens": 11, "completion_tokens": 1, "total_tokens": 12},
    # }
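Streamed chunks of the shape shown in the comment above are usually reassembled into a single message by concatenating the `delta` contents. The sketch below does this against a hypothetical generator, `fake_chunks`, which stands in for the iterator returned by predict_stream so that the example runs without a Databricks workspace; real chunks may carry additional fields.

```python
def fake_chunks():
    """Hypothetical stand-in for client.predict_stream(...): yields chunk dicts."""
    for piece, reason in [("Hello", None), ("!", None), ("", "stop")]:
        yield {
            "object": "chat.completion.chunk",
            "choices": [
                {
                    "index": 0,
                    "delta": {"role": "assistant", "content": piece},
                    "finish_reason": reason,
                }
            ],
        }


def collect_text(chunks):
    """Join the delta contents of the first choice across all chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index") == 0:
                # A delta may omit "content" (or carry None), so default to "".
                parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


text = collect_text(fake_chunks())
print(text)  # Hello!
```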

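update_deployment(name, model_uri=None, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for DatabricksDeploymentClient.

update_endpoint(endpoint, config=None)[source]

Warning

mlflow.deployments.databricks.DatabricksDeploymentClient.update_endpoint is deprecated. This method will be removed in a future release. Use update_endpoint_config, update_endpoint_tags, update_endpoint_rate_limits, or update_endpoint_ai_gateway instead.

Update the specified serving endpoint with the provided configuration. See https://docs.databricks.com/api/workspace/servingendpoints/updateconfig for the request/response schema.

Parameters

endpoint – The name of the serving endpoint to update.
config – A dictionary containing the configuration of the serving endpoint to update.

Returns

A DatabricksEndpoint object containing the request response.

Example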

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
endpoint = client.update_endpoint(
    endpoint="chat",
    config={
        "served_entities": [
            {
                "name": "test",
                "external_model": {
                    "name": "gpt-4",
                    "provider": "openai",
                    "task": "llm/v1/chat",
                    "openai_config": {
                        "openai_api_key": "{{secrets/scope/key}}",
                    },
                },
            }
        ],
    },
)
assert endpoint == {
    "name": "chat",
    "creator": "alice@company.com",
    "creation_timestamp": 0,
    "last_updated_timestamp": 0,
    "state": {...},
    "config": {...},
    "tags": [...],
    "id": "88fd3f75a0d24b0380ddc40484d7a31b",
}

rate_limits = client.update_endpoint(
    endpoint="chat",
    config={
        "rate_limits": [
            {
                "key": "user",
                "renewal_period": "minute",
                "calls": 10,
            }
        ],
    },
)
assert rate_limits == {
    "rate_limits": [
        {
            "key": "user",
            "renewal_period": "minute",
            "calls": 10,
        }
    ],
}

update_endpoint_ai_gateway(endpoint, config)[source]

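Update the AI Gateway configuration of the specified serving endpoint.

Parameters

endpoint (str) – The name of the serving endpoint to update.
config (dict) – A dictionary containing the AI Gateway configuration to update.

Returns

A dictionary containing the updated AI Gateway configuration.

Return type

dict

Example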

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
name = "test"

gateway_config = {
    "usage_tracking_config": {"enabled": True},
    "inference_table_config": {
        "enabled": True,
        "catalog_name": "my_catalog",
        "schema_name": "my_schema",
    },
}

updated_gateway = client.update_endpoint_ai_gateway(
    endpoint=name, config=gateway_config
)
assert updated_gateway == {
    "usage_tracking_config": {"enabled": True},
    "inference_table_config": {
        "catalog_name": "my_catalog",
        "schema_name": "my_schema",
        "table_name_prefix": "test",
        "enabled": True,
    },
}

update_endpoint_config(endpoint, config)[source]

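Update the configuration of the specified serving endpoint. See https://docs.databricks.com/api/workspace/servingendpoints/updateconfig for the request/response schema.

Parameters

endpoint – The name of the serving endpoint to update.
config – A dictionary containing the configuration of the serving endpoint to update.

Returns

A DatabricksEndpoint object containing the request response.

Example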

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
updated_endpoint = client.update_endpoint_config(
    endpoint="test",
    config={
        "served_entities": [
            {
                "name": "gpt-4o-mini",
                "external_model": {
                    "name": "gpt-4o-mini",
                    "provider": "openai",
                    "task": "llm/v1/chat",
                    "openai_config": {
                        "openai_api_key": "{{secrets/scope/key}}",
                    },
                },
            }
        ]
    },
)
assert updated_endpoint == {
    "name": "test",
    "creator": "alice@company.com",
    "creation_timestamp": 1729527763000,
    "last_updated_timestamp": 1729530896000,
    "state": {"ready": "READY", "config_update": "NOT_UPDATING"},
    "config": {...},
    "id": "44b258fb39804564b37603d8d14b853e",
    "permission_level": "CAN_MANAGE",
    "route_optimized": False,
    "task": "llm/v1/chat",
    "endpoint_type": "EXTERNAL_MODEL",
    "creator_display_name": "Alice",
    "creator_kind": "User",
}

update_endpoint_rate_limits(endpoint, config)[source]

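Update the rate limits of the specified serving endpoint. See https://docs.databricks.com/api/workspace/servingendpoints/put for the request/response schema.

Parameters

endpoint – The name of the serving endpoint to update.
config – A dictionary containing the updated rate limit configuration.

Returns

A DatabricksEndpoint object containing the updated rate limits.

Example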

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
name = "databricks-dbrx-instruct"
rate_limits = {
    "rate_limits": [{"calls": 10, "key": "endpoint", "renewal_period": "minute"}]
}
updated_rate_limits = client.update_endpoint_rate_limits(
    endpoint=name, config=rate_limits
)
assert updated_rate_limits == {
    "rate_limits": [{"calls": 10, "key": "endpoint", "renewal_period": "minute"}]
}

update_endpoint_tags(endpoint, config)[source]

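Update the tags of the specified serving endpoint. See https://docs.databricks.com/api/workspace/servingendpoints/patch for the request/response schema.

Parameters

endpoint – The name of the serving endpoint to update.
config – A dictionary containing the tags to add and/or remove.

Returns

A DatabricksEndpoint object containing the request response.

Example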

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
updated_tags = client.update_endpoint_tags(
    endpoint="test", config={"add_tags": [{"key": "project", "value": "test"}]}
)
assert updated_tags == {"tags": [{"key": "project", "value": "test"}]}

class mlflow.deployments.DatabricksEndpoint[source]

A dictionary-like object representing a Databricks serving endpoint.

endpoint = DatabricksEndpoint(
    {
        "name": "chat",
        "creator": "alice@company.com",
        "creation_timestamp": 0,
        "last_updated_timestamp": 0,
        "state": {...},
        "config": {...},
        "tags": [...],
        "id": "88fd3f75a0d24b0380ddc40484d7a31b",
    }
)
assert endpoint.name == "chat"
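One way a dictionary-like object can also support attribute access, as in `endpoint.name` above, is to subclass `dict` and fall back to key lookup in `__getattr__`. The class below, `AttrDict`, is a hypothetical illustration of that general technique; it is not taken from MLflow, and DatabricksEndpoint may be implemented differently.

```python
class AttrDict(dict):
    """Hypothetical dict subclass that also exposes keys as attributes."""

    def __getattr__(self, attr):
        # __getattr__ runs only when normal lookup fails, so dict methods still work.
        try:
            value = self[attr]
        except KeyError:
            raise AttributeError(attr) from None
        # Wrap nested dicts so chained access such as endpoint.state.ready works.
        return AttrDict(value) if isinstance(value, dict) else value


endpoint = AttrDict({"name": "chat", "state": {"ready": "READY"}})
print(endpoint.name, endpoint["name"], endpoint.state.ready)
```

Because the object is still a `dict`, it compares equal to a plain dictionary with the same contents, which is consistent with the `assert endpoint == {...}` style used in the examples on this page.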

class mlflow.deployments.MlflowDeploymentClient(target_uri)[source]

Client for interacting with the MLflow AI Gateway.

Example

First, start the MLflow AI Gateway:

mlflow gateway start --config-path path/to/config.yaml

Then, create a client and use it to interact with the server:

from mlflow.deployments import get_deploy_client

client = get_deploy_client("https://:5000")
endpoints = client.list_endpoints()
assert [e.dict() for e in endpoints] == [
    {
        "name": "chat",
        "endpoint_type": "llm/v1/chat",
        "model": {"name": "gpt-4o-mini", "provider": "openai"},
        "endpoint_url": "https://:5000/gateway/chat/invocations",
    },
]

create_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

create_endpoint(name, config=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

delete_deployment(name, config=None, endpoint=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

delete_endpoint(endpoint)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

get_deployment(name, endpoint=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

get_endpoint(endpoint) → Endpoint[source]

Get the specified endpoint configured for the MLflow AI Gateway.

Parameters: endpoint – The name of the endpoint to retrieve.
Returns: An Endpoint object representing the endpoint.

Example

from mlflow.deployments import get_deploy_client

client = get_deploy_client("https://:5000")
endpoint = client.get_endpoint(endpoint="chat")
assert endpoint.dict() == {
    "name": "chat",
    "endpoint_type": "llm/v1/chat",
    "model": {"name": "gpt-4o-mini", "provider": "openai"},
    "endpoint_url": "https://:5000/gateway/chat/invocations",
}

list_deployments(endpoint=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

list_endpoints() → list[Endpoint][source]

List the endpoints configured for the MLflow AI Gateway.

Returns: A list of Endpoint objects.

Example

from mlflow.deployments import get_deploy_client

client = get_deploy_client("https://:5000")

endpoints = client.list_endpoints()
assert [e.dict() for e in endpoints] == [
    {
        "name": "chat",
        "endpoint_type": "llm/v1/chat",
        "model": {"name": "gpt-4o-mini", "provider": "openai"},
        "endpoint_url": "https://:5000/gateway/chat/invocations",
    },
]

predict(deployment_name=None, inputs=None, endpoint=None) → dict[str, typing.Any][source]

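Submit a query to a configured provider endpoint.

Parameters

deployment_name – Unused.
inputs – The inputs to the query, as a dictionary.
endpoint – The name of the endpoint to query.

Returns

A dictionary containing the response from the endpoint.

Example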

from mlflow.deployments import get_deploy_client

client = get_deploy_client("https://:5000")

response = client.predict(
    endpoint="chat",
    inputs={"messages": [{"role": "user", "content": "Hello"}]},
)
assert response == {
    "id": "chatcmpl-8OLoQuaeJSLybq3NBoe0w5eyqjGb9",
    "object": "chat.completion",
    "created": 1700814410,
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 9,
        "total_tokens": 18,
    },
}

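Additional parameters that are valid for a given provider and endpoint configuration can be included with the request, as shown below using an OpenAI completions endpoint request as an example: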

from mlflow.deployments import get_deploy_client

client = get_deploy_client("https://:5000")
client.predict(
    endpoint="completions",
    inputs={
        "prompt": "Hello!",
        "temperature": 0.3,
        "max_tokens": 500,
    },
)

update_deployment(name, model_uri=None, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

update_endpoint(endpoint, config=None)[source]

Warning

This method is not implemented for MlflowDeploymentClient.

class mlflow.deployments.OpenAIDeploymentClient(target_uri)[source]

Client for interacting with OpenAI endpoints.

Example

First, set up credentials for authentication:

export OPENAI_API_KEY=...

See also

See https://mlflow.org.cn/docs/latest/python_api/openai/index.html for other authentication methods.

Then, create a deployment client and use it to interact with OpenAI endpoints:

from mlflow.deployments import get_deploy_client

client = get_deploy_client("openai")
client.predict(
    endpoint="gpt-4o-mini",
    inputs={
        "messages": [
            {"role": "user", "content": "Hello!"},
        ],
    },
)

create_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

create_endpoint(name, config=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

delete_deployment(name, config=None, endpoint=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

delete_endpoint(endpoint)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

get_deployment(name, endpoint=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

get_endpoint(endpoint)[source]: Get information about a specific model.

list_deployments(endpoint=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

list_endpoints()[source]: List the currently available models.

predict(deployment_name=None, inputs=None, endpoint=None)[source]

Query an OpenAI endpoint. See https://platform.openai.com/docs/api-reference for details.

Parameters

deployment_name – Unused.
inputs – A dictionary containing the model inputs to query.
endpoint – The name of the endpoint to query.

Returns

A dictionary containing the model outputs.

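update_deployment(name, model_uri=None, flavor=None, config=None, endpoint=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

update_endpoint(endpoint, config=None)[source]

Warning

This method is not implemented for OpenAIDeploymentClient.

mlflow.deployments.get_deploy_client(target_uri=None)[source]

Returns a subclass of mlflow.deployments.BaseDeploymentClient that exposes standard APIs for deploying models to the specified target. To see the available deployment APIs, call help() on the returned object or consult the documentation for mlflow.deployments.BaseDeploymentClient. You can also run mlflow deployments help -t <target-uri> via the CLI for more details on target-specific configuration options.

Parameters: target_uri – Optional URI of the target to deploy to. If no target URI is provided, MLflow attempts to use the deployments target returned by get_deployments_target() or set in the MLFLOW_DEPLOYMENTS_TARGET environment variable.

Example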

from mlflow.deployments import get_deploy_client
import pandas as pd

client = get_deploy_client("redisai")
# Deploy the model stored at artifact path 'myModel' under run with ID 'someRunId'. The
# model artifacts are fetched from the current tracking server and then used for deployment.
client.create_deployment("spamDetector", "runs:/someRunId/myModel")
# Load a CSV of emails and score it against our deployment
emails_df = pd.read_csv("...")
prediction_df = client.predict("spamDetector", emails_df)
# List all deployments, get details of our particular deployment
print(client.list_deployments())
print(client.get_deployment("spamDetector"))
# Update our deployment to serve a different model
client.update_deployment("spamDetector", "runs:/anotherRunId/myModel")
# Delete our deployment
client.delete_deployment("spamDetector")
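When target_uri is omitted, the target has to come from somewhere else. The sketch below shows one plausible resolution order: explicit argument first, then a programmatically set target, then the MLFLOW_DEPLOYMENTS_TARGET environment variable. The precedence between the last two is an assumption, and `set_target` and `resolve_target` are hypothetical stand-ins rather than MLflow functions.

```python
import os

_target = None  # hypothetical module-level state, set programmatically


def set_target(target):
    """Hypothetical stand-in for set_deployments_target."""
    global _target
    _target = target


def resolve_target(target_uri=None, environ=None):
    """Pick a target: explicit argument, then set target, then environment variable."""
    environ = os.environ if environ is None else environ
    if target_uri:
        return target_uri
    if _target is not None:
        return _target
    env_value = environ.get("MLFLOW_DEPLOYMENTS_TARGET")
    if env_value:
        return env_value
    # Nothing is configured anywhere, so fail loudly instead of guessing.
    raise RuntimeError("No deployments target has been set")


from_env = resolve_target(environ={"MLFLOW_DEPLOYMENTS_TARGET": "databricks"})
set_target("openai")
from_state = resolve_target(environ={"MLFLOW_DEPLOYMENTS_TARGET": "databricks"})
explicit = resolve_target("redisai", environ={})
print(from_env, from_state, explicit)
```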

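mlflow.deployments.get_deployments_target() → str[source]: Returns the currently set MLflow deployments target, if one has been set. If the deployments target has not been set by using set_deployments_target, an MlflowException is raised.

mlflow.deployments.run_local(target, name, model_uri, flavor=None, config=None)[source]

Deploys the specified model locally, for testing. Note that models deployed locally cannot be managed by other deployment APIs (for example, update_deployment or delete_deployment).

Parameters

target – Target to deploy to.
name – Name to use for the deployment.
model_uri – URI of the model to deploy.
flavor – (optional) Model flavor to deploy. If unspecified, a default flavor is chosen.
config – (optional) Dict containing updated target-specific configuration for the deployment.

Returns

None

mlflow.deployments.set_deployments_target(target: str)[source]

Sets the target deployment client for MLflow deployments.

Parameters: target – The full URI of a running MLflow AI Gateway or, if running on Databricks, "databricks".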

class mlflow.deployments.PredictionsResponse[source]

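Represents the predictions and metadata returned in response to a scoring request, such as a REST API request sent to the /invocations endpoint of an MLflow Model Server.

get_predictions(predictions_format='dataframe', dtype=None)[source]

Get the predictions returned from the MLflow Model Server in the specified format.

Parameters

predictions_format – The format in which to return the predictions. Either "dataframe" or "ndarray".
dtype – The NumPy datatype to which to coerce the predictions. Only used when predictions_format is "ndarray".

Raises

Exception – If the predictions cannot be represented in the specified format.

Returns

The predictions, represented in the specified format.

to_json(path=None)[source]

Get the JSON representation of the MLflow Predictions Response.

Parameters: path – If specified, the JSON representation is written to this file path.
Returns: If path is unspecified, the JSON representation of the MLflow Predictions Response. Otherwise, None.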
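The two modes of to_json (return a string when no path is given, write to a file and return None otherwise) can be sketched with the standard library alone. `ResponseSketch` is a hypothetical dict-backed stand-in, not the actual PredictionsResponse class, and the payload shape is only illustrative.

```python
import json
import os
import tempfile


class ResponseSketch(dict):
    """Hypothetical stand-in for a dict-backed predictions response."""

    def to_json(self, path=None):
        # Without a path, hand the JSON text back to the caller.
        if path is None:
            return json.dumps(self)
        # With a path, write the JSON to disk and return nothing.
        with open(path, "w") as f:
            json.dump(self, f)
        return None


response = ResponseSketch({"predictions": [0.1, 0.9]})
as_text = response.to_json()

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "response.json")
    returned = response.to_json(path=out_path)
    with open(out_path) as f:
        on_disk = json.load(f)

print(as_text, returned, on_disk)
```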