I can’t find the proper way to add dependencies to my Azure Container Instance for ML Inference.
I basically started by following this tutorial: Train and deploy an image classification model with an example Jupyter Notebook.
It works fine.
Now I want to deploy my trained TensorFlow model for inference. I tried many ways, but I was never able to add python dependencies to the Environment.
From the TensorFlow curated environment
Using AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference:
from azureml.core import Workspace
# connect to your workspace
ws = Workspace.from_config()
# names
experiment_name = "my-experiment"
model_name = "my-model"
env_version="1"
env_name="my-env-"+env_version
service_name = str.lower(model_name + "-service-" + env_version)
# create environment for the deploy
from azureml.core.environment import Environment, DEFAULT_CPU_IMAGE
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.webservice import AciWebservice
# get a curated environment
env = Environment.get(
    workspace=ws,
    name="AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference",
)
custom_env = env.clone(env_name)
custom_env.inferencing_stack_version='latest'
# add packages
conda_dep = CondaDependencies()
python_packages = ['joblib', 'numpy', 'os', 'json', 'tensorflow']
for package in python_packages:
    conda_dep.add_pip_package(package)
    conda_dep.add_conda_package(package)
# Adds dependencies to PythonSection of env
custom_env.python.user_managed_dependencies=True
custom_env.python.conda_dependencies=conda_dep
custom_env.register(workspace=ws)
# create deployment config i.e. compute resources
aciconfig = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    tags={"experiment": experiment_name, "model": model_name},
)
from azureml.core.model import InferenceConfig
from azureml.core.model import Model
# get the registered model
model = Model(ws, model_name)
# create an inference config i.e. the scoring script and environment
inference_config = InferenceConfig(entry_script="score.py", environment=custom_env)
# deploy the service
service = Model.deploy(
    workspace=ws,
    name=service_name,
    models=[model],
    inference_config=inference_config,
    deployment_config=aciconfig,
)
service.wait_for_deployment(show_output=True)
I get the following log:
AzureML image information: tensorflow-2.4-ubuntu18.04-py37-cpu-inference:20220110.v1
PATH environment variable: /opt/miniconda/envs/amlenv/bin:/opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable:
Pip Dependencies
---------------
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T10:21:09,855130300+00:00 - iot-server/finish 1 0
2022-01-24T10:21:09,856870100+00:00 - Exit code 1 is normal. Not restarting iot-server.
absl-py==0.15.0
applicationinsights==0.11.10
astunparse==1.6.3
azureml-inference-server-http==0.4.2
cachetools==4.2.4
certifi==2021.10.8
charset-normalizer==2.0.10
click==8.0.3
Flask==1.0.3
flatbuffers==1.12
gast==0.3.3
google-auth==2.3.3
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.32.0
gunicorn==20.1.0
h5py==2.10.0
idna==3.3
importlib-metadata==4.10.0
inference-schema==1.3.0
itsdangerous==2.0.1
Jinja2==3.0.3
Keras-Preprocessing==1.1.2
Markdown==3.3.6
MarkupSafe==2.0.1
numpy==1.19.5
oauthlib==3.1.1
opt-einsum==3.3.0
pandas==1.1.5
protobuf==3.19.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.8.2
pytz==2021.3
requests==2.27.1
requests-oauthlib==1.3.0
rsa==4.8
six==1.15.0
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.4.0
tensorflow-estimator==2.4.0
termcolor==1.1.0
typing-extensions==3.7.4.3
urllib3==1.26.8
Werkzeug==2.0.2
wrapt==1.12.1
zipp==3.7.0
Entry script directory: /var/azureml-app/.
Dynamic Python package installation is disabled.
Starting AzureML Inference Server HTTP.
Azure ML Inferencing HTTP server v0.4.2
Server Settings
---------------
Entry Script Name: score.py
Model Directory: /var/azureml-app/azureml-models/my-model/1
Worker Count: 1
Worker Timeout (seconds): 300
Server Port: 31311
Application Insights Enabled: false
Application Insights Key: None
Server Routes
---------------
Liveness Probe: GET 127.0.0.1:31311/
Score: POST 127.0.0.1:31311/score
Starting gunicorn 20.1.0
Listening at: http://0.0.0.0:31311 (69)
Using worker: sync
Booting worker with pid: 100
Exception in worker process
Traceback (most recent call last):
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
worker.init_process()
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 134, in init_process
self.load_wsgi()
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
self.wsgi = self.app.wsgi()
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
return self.load_wsgiapp()
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
return util.import_app(self.app_uri)
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
mod = importlib.import_module(module)
File "/opt/miniconda/envs/amlenv/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/entry.py", line 1, in <module>
import create_app
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/create_app.py", line 4, in <module>
from routes_common import main
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/routes_common.py", line 32, in <module>
from aml_blueprint import AMLBlueprint
File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 28, in <module>
main_module_spec.loader.exec_module(main)
File "/var/azureml-app/score.py", line 4, in <module>
import joblib
ModuleNotFoundError: No module named 'joblib'
Worker exiting (pid: 100)
Shutting down: Master
Reason: Worker failed to boot.
2022-01-24T10:21:13,851467800+00:00 - gunicorn/finish 3 0
2022-01-24T10:21:13,853259700+00:00 - Exit code 3 is not normal. Killing image.
From a Conda specification
Same as before, but with a fresh environment built from a Conda specification and a bumped env_version number:
# ...
env_version="2"
# ...
custom_env = Environment.from_conda_specification(name=env_name, file_path="my-env.yml")
custom_env.docker.base_image = DEFAULT_CPU_IMAGE
# ...
with my-env.yml:
name: my-env
dependencies:
- python
- pip:
- azureml-defaults
- azureml-sdk
- sklearn
- numpy
- matplotlib
- joblib
- uuid
- requests
- tensorflow
I get this log:
2022-01-24T11:06:54,887886931+00:00 - iot-server/run
2022-01-24T11:06:54,891839877+00:00 - rsyslog/run
2022-01-24T11:06:54,893640998+00:00 - gunicorn/run
2022-01-24T11:06:54,912032812+00:00 - nginx/run
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T11:06:55,398420960+00:00 - iot-server/finish 1 0
2022-01-24T11:06:55,414425146+00:00 - Exit code 1 is normal. Not restarting iot-server.
PATH environment variable: /opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable:
Pip Dependencies
---------------
brotlipy==0.7.0
certifi==2020.6.20
cffi @ file:///tmp/build/80754af9/cffi_1605538037615/work
chardet @ file:///tmp/build/80754af9/chardet_1605303159953/work
conda==4.9.2
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1603018138503/work
cryptography @ file:///tmp/build/80754af9/cryptography_1605544449973/work
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1605545627475/work
PySocks @ file:///tmp/build/80754af9/pysocks_1594394576006/work
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
ruamel-yaml==0.15.87
six @ file:///tmp/build/80754af9/six_1605205313296/work
tqdm @ file:///tmp/build/80754af9/tqdm_1605303662894/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1603305693037/work
Starting HTTP server
2022-01-24T11:06:59,701365128+00:00 - gunicorn/finish 127 0
./run: line 127: exec: gunicorn: not found
2022-01-24T11:06:59,706177784+00:00 - Exit code 127 is not normal. Killing image.
I really don’t know what I’m missing, and I’ve been searching for too long already (Azure docs, SO, …).
Thanks for your help!
Edit: non-exhaustive list of solutions I tried:
- How to create AzureML environment and add required packages
- how to use existing conda environment as a AzureML environment
- …
- https://learn.microsoft.com/en-us/azure/machine-learning/concept-environments#environment-building-caching-and-reuse
- https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-environments#add-packages-to-an-environment
- https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-inferencing-gpus
- https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=python#define-a-deployment-configuration
- …
3 Answers
OK, I got it working: I started over from scratch and it worked.
I have no idea what was wrong in all my preceding tries, and that is terrible.
Multiple problems and how I (think I) solved them:
- joblib: I actually didn't need it to load my Keras model. But the problem was not with this specific library; rather, I couldn't add dependencies to the inference environment.
- Environment: in the end, I was only able to make things work with a custom env: Environment.from_conda_specification(name=version, file_path="conda_dependencies.yml"). I was never able to add my libraries (or pin a specific package version) to a curated environment. I don't know why, though...
- TensorFlow: the last problem was that I had trained and registered my model in the AzureML Notebook azureml_py38_PT_TF kernel (tensorflow==2.7.0), then tried to load it in the inference Docker image (tensorflow==2.4.0). So I had to specify the TensorFlow version I wanted in the inference image (which required the previous point to be solved).
What finally worked:
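In outline, the working setup looked like this (a minimal sketch of the approach described above; the exact pins, env name, and service name here are illustrative, not my verbatim code).

conda_dependencies.yml, pinning TensorFlow to the training version:

name: my-env
dependencies:
  - python=3.8
  - pip
  - pip:
    - azureml-defaults
    - numpy
    - tensorflow==2.7.0

and the deployment script:

from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# Build the env from the conda spec. Leaving user_managed_dependencies
# at its default (False) lets AzureML build the image and install the spec;
# azureml-defaults pulls in the inference HTTP server.
env = Environment.from_conda_specification(
    name="my-env-3", file_path="conda_dependencies.yml"
)
env.register(workspace=ws)

model = Model(ws, "my-model")
inference_config = InferenceConfig(entry_script="score.py", environment=env)
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(
    workspace=ws,
    name="my-model-service-3",
    models=[model],
    inference_config=inference_config,
    deployment_config=aciconfig,
)
service.wait_for_deployment(show_output=True)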
If you want to create a custom environment, you can use the code below to set the env configuration.
Creating the environment:
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies

myenv = Environment(name="Environment")
myenv.docker.enabled = True
myenv.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['numpy', 'scikit-learn', 'pip', 'pandas'],
    pip_packages=[
        'azureml-defaults~=1.34.0',
        'azureml',
        'azureml-core~=1.34.0',
        'azureml-sdk',
        'inference-schema',
        'azureml-telemetry~=1.34.0',
        'azureml-train-automl~=1.34.0',
        'azure-ml-api-sdk',
        'python-dotenv',
        'azureml-contrib-server',
        'azureml-inference-server-http',
    ],
)
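To actually deploy with it, the environment plugs into an InferenceConfig the same way as in the question (a short sketch reusing ws, score.py, and the registered model from the question's code):

from azureml.core.model import InferenceConfig

# persist the environment in the workspace, then reference it at deploy time
myenv.register(workspace=ws)
inference_config = InferenceConfig(entry_script="score.py", environment=myenv)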
Ref doc: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment(class)?view=azure-ml-py
I think there is a small security issue with the implementation of joblib on Azure servers; don't load it in your code and it will run.