dagster_mlflow.mlflow_tracking ResourceDefinition
experiment_name
    MLflow experiment name.
mlflow_tracking_uri
    MLflow tracking server URI.
    Default Value: None
parent_run_id
    MLflow run ID of parent run if this is a nested run.
    Default Value: None
env
    Environment variables for MLflow setup.
    Default Value: {}
env_to_tag
    List of environment variables to log as tags in MLflow.
    Default Value: None
extra_tags
    Any extra key-value tags to log to MLflow.
    Default Value: {}
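To make the tag-related options concrete, here is a minimal sketch of how the list of environment variables to log as tags and the extra key-value tags could combine into one set of tags. The helper collect_tags, its handling of unset variables, and the value "v1.2.3" are assumptions made for illustration; this is not the library's implementation, which may resolve missing variables or name clashes differently.

```python
import os
from typing import Dict, List, Mapping, Optional


def collect_tags(
    env_to_tag: Optional[List[str]],
    extra_tags: Optional[Mapping[str, str]],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Hypothetical helper: build the tag dict implied by env_to_tag and extra_tags."""
    tags: Dict[str, str] = {}
    # Each listed variable name becomes a tag holding that variable's value.
    for name in env_to_tag or []:
        if name in environ:  # assumption: unset variables are skipped
            tags[name] = environ[name]
    # Extra tags are added as given; on a name clash they win in this sketch.
    tags.update(extra_tags or {})
    return tags


# A fake environment is passed in so the example does not depend on the real one.
example = collect_tags(
    env_to_tag=["DOCKER_IMAGE_TAG", "NOT_SET"],
    extra_tags={"super": "experiment"},
    environ={"DOCKER_IMAGE_TAG": "v1.2.3"},
)
print(example)
```

Passing the environment as an argument keeps the helper easy to test without touching the process environment.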
This resource initializes an MLflow run that’s used for all steps within a Dagster run.
This resource provides access to all of mlflow’s methods as well as the mlflow tracking client’s methods.
Usage:
Add the mlflow resource to any ops in which you want to invoke MLflow tracking APIs.
Add the end_mlflow_on_run_finished hook to your job to end the MLflow run when the Dagster run is finished.
Examples
import mlflow
from dagster import job, op
from dagster_mlflow import end_mlflow_on_run_finished, mlflow_tracking

# some_params, some_model_name, my_experiment, etc. are placeholders to define yourself

@op(required_resource_keys={"mlflow"})
def mlflow_op(context):
    # the mlflow resource has already started the MLflow run for this Dagster run
    mlflow.log_params(some_params)
    mlflow.tracking.MlflowClient().create_registered_model(some_model_name)

@end_mlflow_on_run_finished
@job(resource_defs={"mlflow": mlflow_tracking})
def mlf_example():
    mlflow_op()

# example using an mlflow instance with s3 storage
mlf_example.execute_in_process(run_config={
    "resources": {
        "mlflow": {
            "config": {
                "experiment_name": my_experiment,
                "mlflow_tracking_uri": "http://localhost:5000",
                # to run a nested run, provide parent_run_id
                "parent_run_id": an_existing_mlflow_run_id,
                # env variables to pass to mlflow
                "env": {
                    "MLFLOW_S3_ENDPOINT_URL": my_s3_endpoint,
                    "AWS_ACCESS_KEY_ID": my_aws_key_id,
                    "AWS_SECRET_ACCESS_KEY": my_secret,
                },
                # env variables you want to log as mlflow tags
                "env_to_tag": ["DOCKER_IMAGE_TAG"],
                # key-value tags to add to your experiment
                "extra_tags": {"super": "experiment"},
            }
        }
    }
})
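The run_config dictionary in the example can also be assembled programmatically. The sketch below uses a hypothetical helper, mlflow_run_config, which is not part of dagster_mlflow; it assumes that leaving an optional field out of the config lets the resource fall back to its documented default. The experiment name "my_experiment" is a placeholder.

```python
from typing import Any, Dict, List, Optional


def mlflow_run_config(
    experiment_name: str,
    mlflow_tracking_uri: Optional[str] = None,
    parent_run_id: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,
    env_to_tag: Optional[List[str]] = None,
    extra_tags: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build run_config for a resource keyed as "mlflow"."""
    # experiment_name is the only field without a default, so it is always set.
    config: Dict[str, Any] = {"experiment_name": experiment_name}
    optional = {
        "mlflow_tracking_uri": mlflow_tracking_uri,
        "parent_run_id": parent_run_id,
        "env": env,
        "env_to_tag": env_to_tag,
        "extra_tags": extra_tags,
    }
    # Leave out anything not supplied (assumed to fall back to the defaults).
    config.update({key: value for key, value in optional.items() if value is not None})
    return {"resources": {"mlflow": {"config": config}}}


run_config = mlflow_run_config(
    "my_experiment",  # placeholder experiment name
    mlflow_tracking_uri="http://localhost:5000",
    env_to_tag=["DOCKER_IMAGE_TAG"],
)
print(run_config)
```

The resulting dictionary has the same shape as the literal passed to mlf_example.execute_in_process and could be supplied as its run_config argument.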
dagster_mlflow.end_mlflow_on_run_finished HookDefinition