I am using MLflow to track my experiments, with an S3 bucket as the artifact store. To access it, I want to use proxied artifact access, as described in the docs. However, this does not work for me: the client looks for credentials locally, even though the server should handle this.
Expected Behaviour
As described in the docs, I would expect that locally, I do not need to specify my AWS credentials, since the server handles this for me. From docs:
This eliminates the need to allow end users to have direct path access to a remote object store (e.g., s3, adls, gcs, hdfs) for artifact handling and eliminates the need for an end-user to provide access credentials to interact with an underlying object store.
Actual Behaviour / Error
Whenever I run an experiment on my machine, I am running into the following error:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
So the error is local. However, this should not happen, since the server should handle the authentication instead of me needing to store my credentials locally. Also, I would expect that I would not even need the boto3 library locally.
Solutions Tried
I am aware that I need to create a new experiment, because existing experiments might still use a different artifact location, as proposed in this SO answer as well as in the note in the docs. Creating a new experiment did not solve the error for me. Whenever I run the experiment, I get an explicit log message in the console confirming that a new one is created:
INFO mlflow.tracking.fluent: Experiment with name 'test' does not exist. Creating a new experiment.
Related Questions (#1 and #2) refer to a different scenario, which is also described in the docs
Server Config
The server runs on a kubernetes pod with the following config:
mlflow server \
  --host 0.0.0.0 \
  --port 5000 \
  --backend-store-uri postgresql://user:pw@endpoint \
  --artifacts-destination s3://my_bucket/artifacts \
  --serve-artifacts \
  --default-artifact-root s3://my_bucket/artifacts
I would expect my config to be correct, looking at doc page 1 and page 2
I am able to see the MLflow UI if I forward the port to my local machine. I also see the experiment runs marked as failed, because of the error shown above.
My Code
The relevant part of my code which fails is the logging of the model:
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("test2")
...
# this works
mlflow.log_params(hyperparameters)
model = self._train(model_name, hyperparameters, X_train, y_train)
y_pred = model.predict(X_test)
self._evaluate(y_test, y_pred)
# this fails with the error from above
mlflow.sklearn.log_model(model, "artifacts")
Question
I am probably overlooking something. Do I need to indicate locally that I want to use proxied artifact access? If yes, how do I do this? Is there something I have missed?
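One way to see which upload path the client will take is to inspect the artifact location stored for the experiment. The sketch below assumes the standard mlflow.get_experiment_by_name call; the helper names classify_artifact_location and report_experiment are hypothetical and only for illustration. An s3:// location means the client uploads directly with boto3 (which is what the traceback shows), while mlflow-artifacts:/ means uploads go through the tracking server.

```python
from urllib.parse import urlparse


def classify_artifact_location(artifact_location: str) -> str:
    """Return 'proxied', 'direct-s3', or 'other' for an artifact location URI."""
    scheme = urlparse(artifact_location).scheme
    if scheme == "mlflow-artifacts":
        # Uploads are sent over HTTP to the tracking server.
        return "proxied"
    if scheme == "s3":
        # The client talks to S3 itself and needs local AWS credentials.
        return "direct-s3"
    return "other"


def report_experiment(name: str, tracking_uri: str = "http://localhost:5000") -> str:
    """Look up an experiment on the tracking server and classify its location."""
    # Imported lazily so the helper above can be used without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)
    experiment = mlflow.get_experiment_by_name(name)
    if experiment is None:
        raise ValueError(f"Experiment {name!r} does not exist")
    return classify_artifact_location(experiment.artifact_location)
```

Calling report_experiment("test2") against the forwarded port would be expected to return "direct-s3" with the server configuration from the question, since new experiments inherit the s3:// default artifact root.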
Full Traceback
  File "/dir/venv/lib/python3.9/site-packages/mlflow/models/model.py", line 295, in log
    mlflow.tracking.fluent.log_artifacts(local_path, artifact_path)
  File "/dir/venv/lib/python3.9/site-packages/mlflow/tracking/fluent.py", line 726, in log_artifacts
    MlflowClient().log_artifacts(run_id, local_dir, artifact_path)
  File "/dir/venv/lib/python3.9/site-packages/mlflow/tracking/client.py", line 1001, in log_artifacts
    self._tracking_client.log_artifacts(run_id, local_dir, artifact_path)
  File "/dir/venv/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/client.py", line 346, in log_artifacts
    self._get_artifact_repo(run_id).log_artifacts(local_dir, artifact_path)
  File "/dir/venv/lib/python3.9/site-packages/mlflow/store/artifact/s3_artifact_repo.py", line 141, in log_artifacts
    self._upload_file(
  File "/dir/venv/lib/python3.9/site-packages/mlflow/store/artifact/s3_artifact_repo.py", line 117, in _upload_file
    s3_client.upload_file(Filename=local_file, Bucket=bucket, Key=key, ExtraArgs=extra_args)
  File "/dir/venv/lib/python3.9/site-packages/boto3/s3/inject.py", line 143, in upload_file
    return transfer.upload_file(
  File "/dir/venv/lib/python3.9/site-packages/boto3/s3/transfer.py", line 288, in upload_file
    future.result()
  File "/dir/venv/lib/python3.9/site-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/dir/venv/lib/python3.9/site-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/dir/venv/lib/python3.9/site-packages/s3transfer/tasks.py", line 139, in __call__
    return self._execute_main(kwargs)
  File "/dir/venv/lib/python3.9/site-packages/s3transfer/tasks.py", line 162, in _execute_main
    return_value = self._main(**kwargs)
  File "/dir/venv/lib/python3.9/site-packages/s3transfer/upload.py", line 758, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/dir/venv/lib/python3.9/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/dir/venv/lib/python3.9/site-packages/botocore/client.py", line 898, in _make_api_call
    http, parsed_response = self._make_request(
  File "/dir/venv/lib/python3.9/site-packages/botocore/client.py", line 921, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/dir/venv/lib/python3.9/site-packages/botocore/endpoint.py", line 119, in make_request
    return self._send_request(request_dict, operation_model)
  File "/dir/venv/lib/python3.9/site-packages/botocore/endpoint.py", line 198, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/dir/venv/lib/python3.9/site-packages/botocore/endpoint.py", line 134, in create_request
    self._event_emitter.emit(
  File "/dir/venv/lib/python3.9/site-packages/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/dir/venv/lib/python3.9/site-packages/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/dir/venv/lib/python3.9/site-packages/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/dir/venv/lib/python3.9/site-packages/botocore/signers.py", line 103, in handler
    return self.sign(operation_name, request)
  File "/dir/venv/lib/python3.9/site-packages/botocore/signers.py", line 187, in sign
    auth.add_auth(request)
  File "/dir/venv/lib/python3.9/site-packages/botocore/auth.py", line 407, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
3 Answers
The problem is that the server is running with the wrong parameters: --default-artifact-root needs to either be removed or set to mlflow-artifacts:/. See the description of this option in mlflow server --help.
I had the same problem, and the accepted answer does not seem to solve my issue. Neither removing --default-artifact-root nor setting it to mlflow-artifacts instead of s3 works for me. Moreover, it gave me an error saying that, since I have a remote backend-store-uri, I need to set default-artifact-root when running the mlflow server.
How I solved it: I find the error self-explanatory. The reason it states that it was unable to locate credentials is that mlflow uses boto3 underneath for all of the transactions. Since I had set up my environment variables in .env, just loading the file was enough for me and solved the issue. If you have a similar scenario, just run the following commands before starting your mlflow server:
This will load the environment variables and you will be good to go.
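The commands from this answer are not shown here. As a stand-in, here is a minimal Python sketch of the same idea: read KEY=VALUE pairs from a .env file and pass them to the server process. The helper names parse_env_file and start_server_with_env, the simplified .env parsing, and the variable names in the comments are assumptions for illustration, not the answerer's actual setup.

```python
import os
import subprocess
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    env = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate shell-style "export KEY=VALUE" lines.
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding quotes; no variable expansion is attempted.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def start_server_with_env(env_file, server_args):
    """Start `mlflow server` with the .env values added to its environment."""
    # Assumed contents: e.g. AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
    # which boto3 reads from the environment.
    env = {**os.environ, **parse_env_file(env_file)}
    return subprocess.Popen(["mlflow", "server", *server_args], env=env)
```

With proxied artifact access, these credentials only need to be present in the environment of the server process, not on the client machine.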
Note:
The answer by @bk_ helped me. I ended up with the following command to get my Tracking Server running with a proxied connection for artifact storage:
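The command itself is not shown here. A sketch that follows the first answer, reusing the placeholder values from the question and leaving out --default-artifact-root, might look like this. The helper name proxied_server_command is hypothetical, and the output is an illustration rather than this answerer's exact command.

```python
import shlex


def proxied_server_command(backend_store_uri, artifacts_destination,
                           host="0.0.0.0", port=5000):
    """Build an `mlflow server` argument list for proxied artifact access."""
    return [
        "mlflow", "server",
        "--host", host,
        "--port", str(port),
        "--backend-store-uri", backend_store_uri,
        # The server, not the client, reads and writes this location.
        "--artifacts-destination", artifacts_destination,
        "--serve-artifacts",
        # --default-artifact-root is left out on purpose, as the first
        # answer suggests, so it does not point clients at s3:// directly.
    ]


cmd = proxied_server_command(
    "postgresql://user:pw@endpoint",  # placeholder from the question
    "s3://my_bucket/artifacts",       # placeholder from the question
)
print(shlex.join(cmd))
```

Experiments created before the change keep their old artifact location, so a new experiment is still needed after restarting the server.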