I’ve seen a lot of tutorials on building an application with Docker, and most of the time the Dockerfile copies package.json and package-lock.json, then runs the install, and finally copies the rest of the application.
Example of a Dockerfile that copies package.json and package-lock.json before the rest of the application:
FROM node:12.18.2 as build
ARG REACT_APP_SERVICES_HOST=/services/m
WORKDIR /app
COPY ./package.json /app/package.json
COPY ./package-lock.json /app/package-lock.json
RUN yarn install
COPY . .
RUN yarn build
FROM nginx
COPY ./nginx/nginx.conf /etc/nginx/conf.d/default.conf
COPY --from=build /app/build /usr/share/nginx/html
This made me wonder why this approach is used instead of copying the entire application and then running the install. If the intention is to avoid copying the node_modules folder, adding that folder to the .dockerignore file would solve it. Or does doing things in this order prevent some other problem?
2 Answers
Docker images consist of layers. We can see layers, for example, when we pull an image. If we pull an image, we may see a message like "Already exists" next to a layer, indicating that we have already downloaded this layer and do not need to download it again.
When we build an image from a containerfile, the instructions RUN, COPY and ADD in the containerfile each create a new layer. Our goal is to layer our image so that, read from top to bottom, the content goes from least frequently changed to most frequently changed.
Within our application, the most likely part to change is the source code itself. The dependencies of an app change less frequently. Hence, we want to put the dependencies in a separate, earlier layer, so that when we rebuild the container and only the source code has changed, only the layers containing the source code need to be pushed (and consequently pulled when we deploy), which speeds up our deployment.
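As a rough illustration of that ordering, here is the build stage from the question with each step annotated by how often its content is likely to change. The comments are my own reading, and the two manifest COPY lines are condensed into one for brevity; treat it as a sketch rather than a definitive layout.

```dockerfile
# Changes least often: the base image.
FROM node:12.18.2 AS build
WORKDIR /app

# Changes occasionally: the dependency manifests.
COPY ./package.json ./package-lock.json /app/

# Depends only on the layers above, so it is reused
# until one of the manifests (or the base image) changes.
RUN yarn install

# Changes most often: the application source.
COPY . .
RUN yarn build
```

With this order, a source-only change leaves the base image and dependency layers identical, so only the last layers differ between builds.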
It’s done because of Docker layer caching. If you run docker build when none of the copied files have changed, the image layers will just be read from the cache. If a file referenced by a COPY instruction has changed, that layer and all layers after it in the image will be rebuilt. RUN yarn install is an expensive operation, and we don’t want to execute it again every time some source file in the project changes. This way, it’s only re-executed if the package.json or package-lock.json files have changed.
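For contrast, here is a sketch of the alternative described in the question, where everything is copied before the install. It reuses the question's base image and yarn commands and assumes node_modules is listed in .dockerignore; the comments describe the expected cache behaviour rather than measured results.

```dockerfile
FROM node:12.18.2 AS build
WORKDIR /app

# Manifests and source are copied together, so editing any file that is
# not excluded by .dockerignore invalidates the cache for this layer.
COPY . .

# Every layer after an invalidated one is rebuilt, so dependencies are
# reinstalled on each source change, even if package.json is untouched.
RUN yarn install
RUN yarn build
```

Excluding node_modules through .dockerignore keeps it out of the build context, but it does not change which edits invalidate the COPY layer, which is why the install step is usually placed before the source is copied.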