I am pretty new to Great Expectations (GX) and very new to Docker, and now I am trying to combine the two. The Docker image builds just fine, but the container fails when I run it. Outside Docker, my GX Checkpoint runs from both the GX CLI and a Python file.
I have tried two base images: a Python base image (running the Python file from the image) and a GX base image.
One point from the GX documentation that I think is important:
You need to mount the local great_expectations directory into the container at /usr/app/great_expectations, and from there you can run all non-interactive commands, such as running checkpoints and listing items.
I will break up the two paths below:
Python Base Image
The Python Image version of my Dockerfile is basically:
FROM python:3.8-slim
COPY . ./src
RUN pip install -r ./src/requirements.txt
CMD ["python3", "./src/validate_data.py"]
(where my Python file that works outside of Docker is validate_data.py)
When I run this container, I get the following error:
Error: No great_expectations directory was found here!
- Please check that you are in the correct directory or have specified the correct directory.
- If you have never run Great Expectations in this project, please run `great_expectations init` to get started.
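The error suggests that GX could not find a great_expectations/ directory from wherever the script starts inside the container. Below is a minimal diagnostic sketch that could be run in the container in place of validate_data.py to show where the project config actually landed. It assumes GX looks for the directory relative to the working directory, which the post does not state. The helper name find_gx_project_dirs is hypothetical and not part of GX.

```python
import os
from pathlib import Path
from typing import List


def find_gx_project_dirs(start: Path, max_depth: int = 3) -> List[Path]:
    """Return directories at or below `start` that contain great_expectations.yml.

    Hypothetical helper for debugging only; it is not part of Great Expectations.
    """
    start = start.resolve()
    found = []
    for root, dirs, files in os.walk(start):
        depth = len(Path(root).relative_to(start).parts)
        if depth >= max_depth:
            dirs[:] = []  # stop descending below max_depth
        if "great_expectations.yml" in files:
            found.append(Path(root))
    return sorted(found)


if __name__ == "__main__":
    cwd = Path.cwd()
    print(f"working directory: {cwd}")
    # Assumption: GX expects ./great_expectations relative to the working directory.
    expected = cwd / "great_expectations"
    print(f"{expected} exists: {expected.is_dir()}")
    for path in find_gx_project_dirs(cwd):
        print(f"found great_expectations.yml in: {path}")
```

If the config turns up under a directory with a different name (for example ./src), the copy destination or the working directory in the Dockerfile is the likely thing to adjust.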
GX Base Image
The GX Image version of my Dockerfile (which is contained in my great_expectations/ folder) is similar to:
FROM greatexpectations/great_expectations:python-3.7-buster-ge-0.12.0
ADD . /usr/app/great_expectations
COPY . ./src
CMD ["checkpoint", "run", "data_checkpoint"]
(where my Checkpoint that works from the CLI outside of Docker is data_checkpoint)
Note: Prior to adding ADD . /usr/app/great_expectations to the Dockerfile, I was getting an identical error to the Python path.
With that line in place, I get the following error:
{'include_rendered_content': ['Unknown field.'], 'checkpoint_store_name': ['Unknown field.']}
Encountered errors during loading data context config. See ValidationError for more details.
Things I have tried:
Python Base Image
All the things I have tried:
- Adding ADD . /usr/app/great_expectations to my Dockerfile
- Moving the Dockerfile from within my great_expectations/ folder to a level above
- Adding great_expectations init to the Dockerfile. (The image doesn’t build in this case)
- Mounting my local GX directory to /usr/app/great_expectations when I run the container
No matter what I have tried, I get the same error.
GX Base Image
I found include_rendered_content and checkpoint_store_name in my great_expectations.yml config file. I commented out those lines because I was unsure of their utility, and I got a new error:
You appear to have an invalid config version (3.0). The maximum valid version is 2.
So, I am guessing I am getting these new errors because the GX base image (tagged ge-0.12.0) was built on an older Great Expectations release that only accepts config version 2, while I have been using config version 3 when building out the GX testing infrastructure on my local machine.
So, that is really leading me to want to make the Python base image path described above work, but that’s the one I have made less progress on solving.
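The version mismatch guessed at above can be checked before starting a container. The sketch below reads config_version from great_expectations.yml with the standard library only and compares it with the maximum the image accepts. The value 2.0 is taken from the error message quoted above. The helper names are hypothetical, and the line-based parsing assumes config_version is a top-level key on its own line.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a top-level line such as "config_version: 3.0" (optional trailing comment).
_VERSION_LINE = re.compile(r"^config_version:\s*([0-9]+(?:\.[0-9]+)?)\s*(?:#.*)?$")


def read_config_version(yml_path: Path) -> Optional[float]:
    """Return the config_version found in the file, or None if there is none."""
    for line in yml_path.read_text().splitlines():
        match = _VERSION_LINE.match(line)
        if match:
            return float(match.group(1))
    return None


def is_config_supported(yml_path: Path, max_supported: float) -> bool:
    """True if the project's config_version does not exceed max_supported."""
    version = read_config_version(yml_path)
    if version is None:
        raise ValueError(f"no config_version found in {yml_path}")
    return version <= max_supported


if __name__ == "__main__":
    # Assumed location; adjust to wherever the project directory lives.
    config = Path("great_expectations") / "great_expectations.yml"
    if config.is_file():
        # 2.0 is the maximum reported by the ge-0.12.0 image in the error above.
        print("supported by image:", is_config_supported(config, max_supported=2.0))
    else:
        print(f"{config} not found")
```

A False result would be consistent with the guess that the image and the local project were built against different GX releases.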
2 Answers
I am not sure if this is a legit solution or just a hack, but I was able to get around my problem by changing COPY . ./src in my Dockerfile to COPY . ./great_expectations, so that my Dockerfile (which lives inside my great_expectations/ directory) copies the project into a directory named great_expectations.

If you want to use the image based on python:3.8-slim, it looks like what you need is a Docker volume that mounts your local great_expectations directory into the container. You can do that by adding -v <your_local_great_expectations_folder>:/usr/app/great_expectations to your docker run command. Note the order: the host path comes first, then the path inside the container.
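The volume flag above is easy to get backwards, and Docker treats a non-absolute host path as a named volume rather than a bind mount. Below is a small sketch that builds the docker run argument list with the host path resolved first. The image name gx-validate and the helper build_docker_run_args are hypothetical. The optional -w working directory is an assumption that GX resolves great_expectations/ relative to the working directory, which the answers do not confirm.

```python
from pathlib import Path
from typing import List, Optional

# Container path quoted in the GX documentation excerpt in the question.
CONTAINER_GX_DIR = "/usr/app/great_expectations"


def build_docker_run_args(
    local_gx_dir: str, image: str, workdir: Optional[str] = None
) -> List[str]:
    """Build a `docker run` argv that bind-mounts the local GX directory.

    Hypothetical helper; the -v value is always host_path:container_path.
    """
    host_path = Path(local_gx_dir).expanduser().resolve()  # bind mounts need absolute paths
    args = ["docker", "run", "--rm", "-v", f"{host_path}:{CONTAINER_GX_DIR}"]
    if workdir is not None:
        args += ["-w", workdir]  # assumption: run from the parent of great_expectations/
    args.append(image)
    return args


if __name__ == "__main__":
    cmd = build_docker_run_args(
        "./great_expectations", image="gx-validate", workdir="/usr/app"
    )
    print(" ".join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to subprocess.run to start the container from Python.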