
I can’t find the proper way to add dependencies to my Azure Container Instance for ML Inference.

I basically started by following this tutorial: Train and deploy an image classification model with an example Jupyter Notebook.

It works fine.

Now I want to deploy my trained TensorFlow model for inference. I tried many ways, but I was never able to add Python dependencies to the Environment.

From the TensorFlow curated environment

Using AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference:

from azureml.core import Workspace


# connect to your workspace
ws = Workspace.from_config()

# names
experiment_name = "my-experiment"
model_name = "my-model"
env_version = "1"
env_name = "my-env-" + env_version
service_name = (model_name + "-service-" + env_version).lower()


# create environment for the deploy
from azureml.core.environment import Environment, DEFAULT_CPU_IMAGE
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.webservice import AciWebservice

# get a curated environment
env = Environment.get(
    workspace=ws,
    name="AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference",
)
custom_env = env.clone(env_name)
custom_env.inferencing_stack_version = 'latest'

# add packages
conda_dep = CondaDependencies()
python_packages = ['joblib', 'numpy', 'os', 'json', 'tensorflow']
for package in python_packages:
    conda_dep.add_pip_package(package)
    conda_dep.add_conda_package(package)

# Adds dependencies to the PythonSection of the env
custom_env.python.user_managed_dependencies = True
custom_env.python.conda_dependencies = conda_dep

custom_env.register(workspace=ws)

# create deployment config i.e. compute resources
aciconfig = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    tags={"experiment": experiment_name, "model": model_name},
)

from azureml.core.model import InferenceConfig
from azureml.core.model import Model

# get the registered model
model = Model(ws, model_name)

# create an inference config i.e. the scoring script and environment
inference_config = InferenceConfig(entry_script="score.py", environment=custom_env)

# deploy the service
service = Model.deploy(
    workspace=ws,
    name=service_name,
    models=[model],
    inference_config=inference_config,
    deployment_config=aciconfig,
)

service.wait_for_deployment(show_output=True)

I get the following log:


AzureML image information: tensorflow-2.4-ubuntu18.04-py37-cpu-inference:20220110.v1


PATH environment variable: /opt/miniconda/envs/amlenv/bin:/opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable: 

Pip Dependencies
---------------
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T10:21:09,855130300+00:00 - iot-server/finish 1 0
2022-01-24T10:21:09,856870100+00:00 - Exit code 1 is normal. Not restarting iot-server.
absl-py==0.15.0
applicationinsights==0.11.10
astunparse==1.6.3
azureml-inference-server-http==0.4.2
cachetools==4.2.4
certifi==2021.10.8
charset-normalizer==2.0.10
click==8.0.3
Flask==1.0.3
flatbuffers==1.12
gast==0.3.3
google-auth==2.3.3
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.32.0
gunicorn==20.1.0
h5py==2.10.0
idna==3.3
importlib-metadata==4.10.0
inference-schema==1.3.0
itsdangerous==2.0.1
Jinja2==3.0.3
Keras-Preprocessing==1.1.2
Markdown==3.3.6
MarkupSafe==2.0.1
numpy==1.19.5
oauthlib==3.1.1
opt-einsum==3.3.0
pandas==1.1.5
protobuf==3.19.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.8.2
pytz==2021.3
requests==2.27.1
requests-oauthlib==1.3.0
rsa==4.8
six==1.15.0
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.4.0
tensorflow-estimator==2.4.0
termcolor==1.1.0
typing-extensions==3.7.4.3
urllib3==1.26.8
Werkzeug==2.0.2
wrapt==1.12.1
zipp==3.7.0


Entry script directory: /var/azureml-app/.

Dynamic Python package installation is disabled.
Starting AzureML Inference Server HTTP.

Azure ML Inferencing HTTP server v0.4.2


Server Settings
---------------
Entry Script Name: score.py
Model Directory: /var/azureml-app/azureml-models/my-model/1
Worker Count: 1
Worker Timeout (seconds): 300
Server Port: 31311
Application Insights Enabled: false
Application Insights Key: None


Server Routes
---------------
Liveness Probe: GET   127.0.0.1:31311/
Score:          POST  127.0.0.1:31311/score

Starting gunicorn 20.1.0
Listening at: http://0.0.0.0:31311 (69)
Using worker: sync
Booting worker with pid: 100
Exception in worker process
Traceback (most recent call last):
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
    mod = importlib.import_module(module)
  File "/opt/miniconda/envs/amlenv/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/entry.py", line 1, in <module>
    import create_app
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/create_app.py", line 4, in <module>
    from routes_common import main
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/routes_common.py", line 32, in <module>
    from aml_blueprint import AMLBlueprint
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 28, in <module>
    main_module_spec.loader.exec_module(main)
  File "/var/azureml-app/score.py", line 4, in <module>
    import joblib
ModuleNotFoundError: No module named 'joblib'
Worker exiting (pid: 100)
Shutting down: Master
Reason: Worker failed to boot.
2022-01-24T10:21:13,851467800+00:00 - gunicorn/finish 3 0
2022-01-24T10:21:13,853259700+00:00 - Exit code 3 is not normal. Killing image.

From a Conda specification

Same as before, but with a fresh environment built from a Conda specification, and with the env_version number changed:

# ...


env_version="2"

# ...

custom_env = Environment.from_conda_specification(name=env_name, file_path="my-env.yml")
custom_env.docker.base_image = DEFAULT_CPU_IMAGE

# ...

with my-env.yml:

name: my-env
dependencies:
- python
- pip:
  - azureml-defaults
  - azureml-sdk
  - sklearn
  - numpy
  - matplotlib
  - joblib
  - uuid
  - requests
  - tensorflow

I get this log:

2022-01-24T11:06:54,887886931+00:00 - iot-server/run 
2022-01-24T11:06:54,891839877+00:00 - rsyslog/run 
2022-01-24T11:06:54,893640998+00:00 - gunicorn/run 
2022-01-24T11:06:54,912032812+00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T11:06:55,398420960+00:00 - iot-server/finish 1 0
2022-01-24T11:06:55,414425146+00:00 - Exit code 1 is normal. Not restarting iot-server.

PATH environment variable: /opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable: 

Pip Dependencies
---------------
brotlipy==0.7.0
certifi==2020.6.20
cffi @ file:///tmp/build/80754af9/cffi_1605538037615/work
chardet @ file:///tmp/build/80754af9/chardet_1605303159953/work
conda==4.9.2
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1603018138503/work
cryptography @ file:///tmp/build/80754af9/cryptography_1605544449973/work
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1605545627475/work
PySocks @ file:///tmp/build/80754af9/pysocks_1594394576006/work
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
ruamel-yaml==0.15.87
six @ file:///tmp/build/80754af9/six_1605205313296/work
tqdm @ file:///tmp/build/80754af9/tqdm_1605303662894/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1603305693037/work

Starting HTTP server
2022-01-24T11:06:59,701365128+00:00 - gunicorn/finish 127 0
./run: line 127: exec: gunicorn: not found
2022-01-24T11:06:59,706177784+00:00 - Exit code 127 is not normal. Killing image.
    

I really don’t know what I’m missing, and I’ve been searching for too long already (Azure docs, SO, …).

Thanks for your help!

Edit: Non-exhaustive list of solutions I tried:

3 Answers


  1. Chosen as BEST ANSWER

    OK, I got it working: I started over from scratch and it worked.

    I have no idea what was wrong in all my previous attempts, and that is terrible.

    Multiple problems and how I (think I) solved them:

    • joblib: I actually didn't need it to load my Keras model. The problem was not with this specific library, but rather that I couldn't add any dependency to the inference environment.
    • Environment: in the end, I was only able to make things work with a custom env: Environment.from_conda_specification(name=version, file_path="conda_dependencies.yml"). I was never able to add my libraries (or pin a specific package version) to a curated environment. I don't know why, though...
    • TensorFlow: the last problem was that I had trained and registered my model in the AzureML Notebook azureml_py38_PT_TF kernel (tensorflow==2.7.0), then tried to load it in the inference Docker image (tensorflow==2.4.0). So I had to pin the TensorFlow version in the inference image (which required the previous point to be solved); see the version check below.
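
    For the TensorFlow point, a quick way to catch such a mismatch is to print the version used at training time (a minimal sketch; run it in the training kernel, then pin that version in conda_dependencies.yml):

    # Print the TensorFlow version of the current (training) kernel so the
    # exact same version can be pinned for the inference environment.
    import tensorflow as tf
    print(tf.__version__)  # e.g. 2.7.0 in azureml_py38_PT_TF vs 2.4.0 in the curated image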

    What finally worked (a smoke-test sketch follows the three files below):

    • notebook.ipynb
    import uuid
    from azureml.core import Workspace, Environment, Model
    from azureml.core.webservice import AciWebservice
    from azureml.core.model import InferenceConfig
    
    
    version = "test-" + str(uuid.uuid4())[:8]
    model_name = "my-model"  # the name the model was registered under earlier
    
    env = Environment.from_conda_specification(name=version, file_path="conda_dependencies.yml")
    inference_config = InferenceConfig(entry_script="score.py", environment=env)
    
    ws = Workspace.from_config()
    model = Model(ws, model_name)
    
    aci_config = AciWebservice.deploy_configuration(
        cpu_cores=1,
        memory_gb=1,
    )
    
    service = Model.deploy(
        workspace=ws,
        name=version,
        models=[model],
        inference_config=inference_config,
        deployment_config=aci_config,
        overwrite=True,
    )
    
    service.wait_for_deployment(show_output=True)
    
    • conda_dependencies.yml
    channels:
    - conda-forge
    dependencies:
    - python=3.8
    - pip:
      - azureml-defaults
      - azureml-sdk
      - numpy
      - tensorflow==2.7.0
    
    
    • score.py
    import os
    import json
    import numpy as np
    import tensorflow as tf
    
    
    def init():
        global model

        # AZUREML_MODEL_DIR points to the directory where the registered model is mounted
        model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model/data/model")
        model = tf.keras.models.load_model(model_path)
    
    
    
    def run(raw_data):
        data = np.array(json.loads(raw_data)["data"])
        y_hat = model.predict(data)
    
        return y_hat.tolist()
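
    Once wait_for_deployment succeeds, a quick smoke test can call the service directly. A minimal sketch (the payload shape here is hypothetical and must match what score.py expects for your model):

    import json
    # Dummy payload following score.py's {"data": [...]} contract; replace the
    # inner list with an input of your model's actual shape.
    payload = json.dumps({"data": [[0.0, 0.0, 0.0]]})
    print(service.run(input_data=payload))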
    
    

  2. If you want to create a custom environment, you can use the code below to set the env configuration.

    Creating the environment:

    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies

    myenv = Environment(name="Environment")
    myenv.docker.enabled = True
    myenv.python.conda_dependencies = CondaDependencies.create(
        conda_packages=['numpy', 'scikit-learn', 'pip', 'pandas'],
        pip_packages=['azureml-defaults~=1.34.0', 'azureml', 'azureml-core~=1.34.0',
                      'azureml-sdk', 'inference-schema', 'azureml-telemetry~=1.34.0',
                      'azureml-train-automl~=1.34.0', 'azure-ml-api-sdk', 'python-dotenv',
                      'azureml-contrib-server', 'azureml-inference-server-http'])
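
    To make this environment usable at deployment time, it can be registered with the workspace (a minimal sketch, assuming a ws object obtained via Workspace.from_config() as in the question):

    # Persist the environment in the workspace so it can be retrieved by name later.
    myenv.register(workspace=ws)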

    Ref doc: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment(class)?view=azure-ml-py#:~:text=Upload%20the%20private%20pip%20wheel,in%20the%20workspace%20storage%20blob.&text=Build%20a%20Docker%20image%20for%20this%20environment%20in%20the%20cloud.&text=Build%20the%20local%20Docker%20or%20conda%20environment.

  3. I think there is a small security issue with the implementation of joblib on Azure servers; do not load it in your code and it will run.
