
I am trying to create a SageMaker endpoint for model inference using the "build your own algorithm container" example (https://sagemaker-examples.readthedocs.io/en/latest/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.html), but installing NumPy fails while the image is being built.

We previously got this working with our old model, but the new Vowpal Wabbit model requires numpy, scikit-learn, pandas and the vowpalwabbit library, and installing them causes the docker build to fail. I’m not sure whether we should keep using this container or migrate to a Python or SageMaker one, but whichever we use would need to support nginx.

EDIT: Forgot to mention that the image builds successfully when I build it locally, but the build fails when it runs through CloudFormation.

Dockerfile here:

# This is a Python 3 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.
FROM ubuntu:18.04

# Retrieves information about what packages can be installed
RUN apt-get -y update && \
    apt-get install -y --no-install-recommends \
        wget \
        python3-pip \
        python3.8 \
        python3-setuptools \
        nginx \
        ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# Set python 3.8 as default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

# Get all python packages without excess cache created by pip.
COPY requirements.txt .
RUN pip3 install --upgrade pip setuptools wheel
RUN pip3 --no-cache-dir install -r requirements.txt

# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# model_output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
ENV PYTHONPATH=/model_contents

# Set up the program in the image
COPY bandit/ /opt/program/
WORKDIR /opt/program/

# create directories for storing model and vectorizer
RUN mkdir model && mkdir vectorizer

# Give permissions to run scripts
RUN chmod +x /opt/program/serve && chmod +x /opt/program/train

requirements.txt here:

sagemaker==2.25.1
typing-extensions==3.7.4.3
numpy==1.20.1
boto3==1.17.12
awscli==1.19.12
python-dotenv==0.15.0
flask==1.1.2
scikit-learn==1.0.0
pandas==1.3.5
vowpalwabbit==8.11.0

Full traceback here:

Running setup.py install for numpy: started
    Running setup.py install for numpy: finished with status 'error'
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-cd653krx/numpy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-q3eo46tw-record/install-record.txt --single-version-externally-managed --compile:
    Running from numpy source directory.
    Note: if you need reliable uninstall behavior, then install
    with pip instead of using `setup.py install`:
      - `pip install .`       (from a git repo or downloaded source
                               release)
      - `pip install numpy`   (last NumPy release on PyPi)
    Cythonizing sources
    Processing numpy/random/_bounded_integers.pxd.in
    Processing numpy/random/_bounded_integers.pyx.in
    Traceback (most recent call last):
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 53, in process_pyx
        import Cython
    ModuleNotFoundError: No module named 'Cython'

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 234, in <module>
        main()
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 230, in main
        find_process_files(root_dir)
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 221, in find_process_files
        process(root_dir, fromfile, tofile, function, hash_db)
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 187, in process
        processor_function(fromfile, tofile)
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 90, in process_tempita_pyx
        process_pyx(pyxfile, tofile)
      File "/tmp/pip-build-cd653krx/numpy/tools/cythonize.py", line 60, in process_pyx
        raise OSError(msg) from e
    OSError: Cython needs to be installed in Python as a module
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-cd653krx/numpy/setup.py", line 450, in <module>
        setup_package()
      File "/tmp/pip-build-cd653krx/numpy/setup.py", line 432, in setup_package
        generate_cython()
      File "/tmp/pip-build-cd653krx/numpy/setup.py", line 237, in generate_cython
        raise RuntimeError("Running cythonize failed!")
    RuntimeError: Running cythonize failed!
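
The log shows pip running `setup.py install` for numpy, meaning it is compiling numpy from source rather than installing a prebuilt wheel, and the compile step fails because Cython is missing. Since the same Dockerfile builds locally, it may help to compare the two build environments. The following is only a minimal sketch: the function name `build_env_report` is hypothetical, and it assumes Python 3.8+ (for `importlib.metadata`), which matches the interpreter this Dockerfile sets as the default. It could be run in a temporary `RUN` step both locally and in the CloudFormation-driven build, and the outputs compared.

```python
import platform
import sys
from importlib import metadata


def build_env_report():
    """Collect interpreter, architecture and pip details for this environment.

    `build_env_report` is a hypothetical helper, not part of any AWS tooling.
    """
    try:
        pip_version = metadata.version("pip")
    except metadata.PackageNotFoundError:
        pip_version = None  # pip is not installed for this interpreter
    return {
        "python": platform.python_version(),
        "executable": sys.executable,
        "machine": platform.machine(),
        "pip": pip_version,
    }


if __name__ == "__main__":
    for key, value in build_env_report().items():
        print(f"{key}: {value}")
```

If the Python version, pip version or machine architecture differ between the two runs, that difference is a reasonable first place to look for why one build finds a wheel and the other falls back to a source build.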

2 Answers


  1. Chosen as BEST ANSWER

    Solved the issue. The numpy version was conflicting with the rest of the packages, so I downgraded it, which fixed the build.
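
For anyone who wants to confirm that a build actually installed the versions pinned in requirements.txt, here is a rough sketch. The helper names `parse_pins` and `find_mismatches` are made up for illustration; the sketch only understands plain `name==version` lines, compares versions as exact strings, and assumes Python 3.8+ for `importlib.metadata`.

```python
from importlib import metadata


def parse_pins(text):
    """Return {package: version} for simple `name==version` lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) tuple; installed is None if it is missing."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Assumes requirements.txt is in the current working directory.
    with open("requirements.txt") as fh:
        pins = parse_pins(fh.read())
    for name, (wanted, installed) in find_mismatches(pins).items():
        print(f"{name}: pinned {wanted}, installed {installed}")
```

Running `pip check` in the finished image is a complementary step: it reports installed packages whose declared dependencies are not satisfied.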


  2. There are 2 ways to get around the issue:

    1. Add the numpy version you need to your requirements.txt (the preferred way, since it lets you manage your dependencies and versions better).

    2. Install the dependency directly in the Dockerfile.

    I work at AWS and my opinions are my own. Thanks, Raghu
