I’m trying to build a Docker image with Python 3 and google-cloud-bigquery using the following Dockerfile:

FROM python:3.10-alpine

RUN pip3 install google-cloud-bigquery

WORKDIR /home

COPY *.py /home/

ENTRYPOINT ["python3", "-u", "myscript.py"]

But I get errors at the pip3 install google-cloud-bigquery step (the output is too long to post here).
What is missing to install this package on python-alpine?

2 Answers


  1. This looks like an incompatibility between the latest version of google-cloud-bigquery (>3) and numpy:

    ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects

    Try pinning an earlier version; this works for me:

    RUN pip3 install google-cloud-bigquery==2.34.4

  2. Actually, the problem does not seem to be numpy, which builds smoothly once all its dependency libraries are installed, but pyarrow, which does not support being built with pip on Alpine. I found a workaround: use Alpine's pre-built pyarrow package (py3-apache-arrow), which is much easier than building pyarrow from source. This build works fine for me:

    FROM python:3.10.6-alpine3.16
    
    RUN apk add --no-cache build-base linux-headers \
        py3-apache-arrow=8.0.0-r0

    # Move pyarrow into the site-packages directory of the Python that actually
    # runs. Alpine's Python path (/usr/lib) differs from the Docker Hub image's
    # path (/usr/local/lib).
    RUN mv /usr/lib/python3.10/site-packages/* \
        /usr/local/lib/python3.10/site-packages/
    RUN rm -rf /usr/lib/python3.10

    # The cache mount requires BuildKit.
    RUN --mount=type=cache,target=/root/.cache/pip \
        pip install google-cloud-bigquery==3.3.2
    

    To install later releases, update the Python, Alpine, and py3-apache-arrow versions together. These were the latest versions at the time of writing.

    Make sure to remove the build dependencies (build-base, linux-headers) from your release image. I prefer multistage builds for this.
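
    If a multistage build is more than you need, one single-stage alternative is to install the build dependencies as an apk virtual package and delete them in the same layer. This is an untested sketch, not the multistage approach preferred above. It reuses the versions from the build above, the `.build-deps` name is arbitrary, and it assumes the compiled extensions need nothing from build-base at runtime:

    ```dockerfile
    FROM python:3.10.6-alpine3.16

    # Pre-built pyarrow from Alpine, moved to the path used by the image's Python.
    RUN apk add --no-cache py3-apache-arrow=8.0.0-r0 \
        && mv /usr/lib/python3.10/site-packages/* \
           /usr/local/lib/python3.10/site-packages/ \
        && rm -rf /usr/lib/python3.10

    # Build tools live only for the duration of this layer.
    # ".build-deps" is an arbitrary name for the virtual package group.
    RUN apk add --no-cache --virtual .build-deps build-base linux-headers \
        && pip install --no-cache-dir google-cloud-bigquery==3.3.2 \
        && apk del .build-deps
    ```

    After building, running `python3 -c "import pyarrow; from google.cloud import bigquery"` in the container is a quick way to check that nothing needed at runtime was removed.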
