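Deleting Traces
You can delete traces that match specific criteria with the mlflow.client.MlflowClient.delete_traces()
method. It deletes traces from a given experiment, selected either by a maximum timestamp or by a list of trace IDs.
Tip
Deleting traces is irreversible. Make sure the arguments you pass to the delete_traces
API match the deletion scope you intend.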
Deletion Methods
- By timestamp
- By trace ID
- Batch deletion
Delete traces older than a specific timestamp
import time
from mlflow import MlflowClient
client = MlflowClient()
# Get current timestamp in milliseconds
current_time = int(time.time() * 1000)
# Delete traces older than current time, limit to 100 traces
deleted_count = client.delete_traces(
    experiment_id="1", max_timestamp_millis=current_time, max_traces=100
)
print(f"Deleted {deleted_count} traces")
Delete traces older than a specific time period
from datetime import datetime, timedelta
from mlflow import MlflowClient
client = MlflowClient()
# Calculate timestamp for 7 days ago
seven_days_ago = datetime.now() - timedelta(days=7)
timestamp_ms = int(seven_days_ago.timestamp() * 1000)
deleted_count = client.delete_traces(
    experiment_id="1", max_timestamp_millis=timestamp_ms
)
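The cutoff calculation above can be factored into a small helper so that it is easy to test with a fixed reference time. This is a minimal sketch; `cutoff_millis` is a hypothetical helper name and is not part of MLflow.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cutoff_millis(days_old: int, now: Optional[datetime] = None) -> int:
    """Return the Unix timestamp (milliseconds) for `days_old` days before `now`."""
    if days_old < 0:
        raise ValueError("days_old must be non-negative")
    # Default to the current time in UTC; callers can pin `now` in tests
    reference = now if now is not None else datetime.now(timezone.utc)
    return int((reference - timedelta(days=days_old)).timestamp() * 1000)


# Example with a pinned reference time so the result is reproducible
fixed_now = datetime(2024, 1, 8, tzinfo=timezone.utc)
print(cutoff_millis(7, now=fixed_now))
```

The returned value can be passed as `max_timestamp_millis` to `delete_traces`.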
Delete specific traces by trace ID
from mlflow import MlflowClient
client = MlflowClient()
# Delete specific traces
trace_ids = ["trace_id_1", "trace_id_2", "trace_id_3"]
deleted_count = client.delete_traces(experiment_id="1", trace_ids=trace_ids)
print(f"Deleted {deleted_count} traces")
Delete traces in batches for better performance
import time
from datetime import datetime, timedelta
from mlflow import MlflowClient
def cleanup_old_traces(experiment_id: str, days_old: int = 30, batch_size: int = 100):
    """Delete traces older than specified days in batches"""
    client = MlflowClient()
    # Calculate cutoff timestamp
    cutoff_date = datetime.now() - timedelta(days=days_old)
    cutoff_timestamp = int(cutoff_date.timestamp() * 1000)
    total_deleted = 0
    while True:
        deleted_count = client.delete_traces(
            experiment_id=experiment_id,
            max_timestamp_millis=cutoff_timestamp,
            max_traces=batch_size,
        )
        total_deleted += deleted_count
        print(f"Deleted {deleted_count} traces (total: {total_deleted})")
        # A short batch means nothing older than the cutoff remains
        if deleted_count < batch_size:
            break
        time.sleep(0.1)  # Brief pause between batches
    return total_deleted
# Usage
cleanup_old_traces(experiment_id="1", days_old=7)
Advanced Use Cases
- Selective cleanup
- Dry-run mode
- Error handling
Delete traces that match specific criteria
from mlflow import MlflowClient
def delete_error_traces(experiment_id: str):
    """Delete only traces that resulted in errors"""
    client = MlflowClient()
    # Search for error traces. MlflowClient.search_traces returns a list of
    # Trace objects, whereas mlflow.search_traces returns a pandas DataFrame
    # by default, which cannot be truth-tested or iterated as traces.
    traces = client.search_traces(
        experiment_ids=[experiment_id],
        filter_string="status = 'ERROR'",
        max_results=1000,
    )
    if traces:
        trace_ids = [trace.info.trace_id for trace in traces]
        deleted_count = client.delete_traces(
            experiment_id=experiment_id, trace_ids=trace_ids
        )
        print(f"Deleted {deleted_count} error traces")
        return deleted_count
    return 0
# Usage
delete_error_traces("1")
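When a search returns many matching IDs, it can help to send them to `delete_traces` in smaller groups so that each request stays bounded. The sketch below assumes a client object exposing `delete_traces(experiment_id=..., trace_ids=...)` as used above; `chunked`, `delete_traces_in_chunks`, and the default chunk size of 100 are hypothetical choices for illustration, not MLflow APIs or documented limits.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def delete_traces_in_chunks(
    client, experiment_id: str, trace_ids: Sequence[str], chunk_size: int = 100
) -> int:
    """Delete the given trace IDs in groups and return the total deleted count."""
    total_deleted = 0
    for chunk in chunked(trace_ids, chunk_size):
        # max_traces is intentionally omitted: it cannot be combined with trace_ids
        total_deleted += client.delete_traces(
            experiment_id=experiment_id, trace_ids=chunk
        )
    return total_deleted
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real tracking server.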
Test the deletion criteria before actually deleting
import mlflow
from mlflow import MlflowClient
def delete_with_dry_run(experiment_id: str, max_timestamp: int, dry_run: bool = True):
    """Delete traces with optional dry-run mode"""
    client = MlflowClient()
    if dry_run:
        # Search to see what would be deleted
        traces = mlflow.search_traces(
            experiment_ids=[experiment_id], filter_string=f"timestamp < {max_timestamp}"
        )
        print(f"DRY RUN: Would delete {len(traces)} traces")
        return len(traces)
    else:
        deleted_count = client.delete_traces(
            experiment_id=experiment_id,
            max_timestamp_millis=max_timestamp,
            max_traces=1000,
        )
        print(f"ACTUAL: Deleted {deleted_count} traces")
        return deleted_count
# Always test with dry run first
count = delete_with_dry_run("1", 1234567890000, dry_run=True)
if count < 100:  # Only proceed if reasonable number
    delete_with_dry_run("1", 1234567890000, dry_run=False)
Handle deletion errors gracefully
from mlflow import MlflowClient
from mlflow.exceptions import MlflowException
def safe_delete_traces(experiment_id: str, **delete_params):
    """Delete traces with error handling"""
    client = MlflowClient()
    try:
        deleted_count = client.delete_traces(
            experiment_id=experiment_id, **delete_params
        )
        print(f"Successfully deleted {deleted_count} traces")
        return deleted_count
    except MlflowException as e:
        if "experiment not found" in str(e).lower():
            print("Error: Experiment not found")
        elif "permission" in str(e).lower():
            print("Error: Permission denied")
        else:
            print(f"MLflow error: {e}")
        return 0
    except Exception as e:
        print(f"Unexpected error: {e}")
        return 0
# Usage
safe_delete_traces("1", max_timestamp_millis=1234567890000, max_traces=50)
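Best Practices
Always test first: Use a search query or dry-run mode to verify which traces will be deleted before you delete them.
Delete in batches: For large numbers of traces, delete in batches to avoid timeouts and performance problems.
Set reasonable limits: Use the max_traces parameter to avoid accidentally deleting too many traces at once.
Monitor and log: Record deletion activity for auditing, especially in production environments.
Handle errors gracefully: Implement proper error handling for network issues, permission problems, and invalid parameters.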
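The logging practice above might look like the following minimal sketch. It assumes a client object with the `delete_traces` method shown earlier; `delete_traces_with_audit` and the `trace_cleanup` logger name are hypothetical, and a real deployment would route these records to whatever audit sink it already uses.

```python
import logging

logger = logging.getLogger("trace_cleanup")


def delete_traces_with_audit(client, experiment_id: str, **delete_params) -> int:
    """Call client.delete_traces and log the request, the outcome, and any failure."""
    logger.info(
        "Deleting traces: experiment_id=%s params=%s", experiment_id, delete_params
    )
    try:
        deleted_count = client.delete_traces(
            experiment_id=experiment_id, **delete_params
        )
    except Exception:
        # Record the failure with its traceback, then let the caller decide what to do
        logger.exception("Trace deletion failed: experiment_id=%s", experiment_id)
        raise
    logger.info(
        "Deleted %d traces from experiment %s", deleted_count, experiment_id
    )
    return deleted_count
```

Re-raising after logging keeps the audit trail complete without hiding the error from the caller.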
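Parameter Reference
Parameter | Type | Description |
---|---|---|
experiment_id | str | Required. ID of the experiment that contains the traces to delete |
max_timestamp_millis | int | Delete traces created before this timestamp (in milliseconds) |
trace_ids | List[str] | Delete the traces with these specific trace IDs |
max_traces | int | Maximum number of traces to delete in this operation |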
Note
You must specify exactly one of max_timestamp_millis or trace_ids, not both. The max_traces
parameter cannot be used together with trace_ids.
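The constraints in this note can be checked on the client side before any request is sent. The following is a minimal sketch; `validate_delete_args` is a hypothetical helper, not part of MLflow, and it only mirrors the rules stated above.

```python
from typing import Optional, Sequence


def validate_delete_args(
    max_timestamp_millis: Optional[int] = None,
    trace_ids: Optional[Sequence[str]] = None,
    max_traces: Optional[int] = None,
) -> None:
    """Raise ValueError if the argument combination breaks the rules in the note."""
    has_timestamp = max_timestamp_millis is not None
    has_ids = trace_ids is not None
    # Exactly one of the two selectors must be provided
    if has_timestamp == has_ids:
        raise ValueError(
            "Specify exactly one of max_timestamp_millis or trace_ids"
        )
    # max_traces only applies to timestamp-based deletion
    if has_ids and max_traces is not None:
        raise ValueError("max_traces cannot be used together with trace_ids")


# Valid combinations pass silently
validate_delete_args(max_timestamp_millis=1234567890000, max_traces=50)
validate_delete_args(trace_ids=["trace_id_1", "trace_id_2"])
```

Running this check before calling `delete_traces` surfaces a mistaken argument combination locally instead of as a server-side error.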