In my Dockerfile, I have
RUN apk update && apk add tesseract-ocr=3.04
Which errors with:
unable to select packages:
  tesseract-ocr-4.1.3-r0:
    breaks: world[tesseract-ocr=3.04]
I’ve also tried apk add tesseract-ocr=3.04.01, which is how it’s listed on the releases page.
A plain apk add tesseract-ocr installs version 4.1.3, but I need 3.04 specifically.
2 Answers
Answer in progress:
From this question: Install older package version in Alpine
I see that each new version of Alpine updates its packages, tossing out the older versions to stay lean.
So we need to find the version of Alpine that corresponds to the date that Tesseract 3.04 was released, and use FROM alpine:3.5 in the Dockerfile (image names must be lowercase).
Then apk add tesseract-ocr will add the only version available in that Alpine version.
It seems that Alpine 3.5 or 3.4 should have Tesseract 3.04.
EDIT: I've run into a problem, which is that FROM alpine:3.5 fails. Investigating.
EDIT 2: the earliest official version of Alpine available on Docker Hub is 3.12, which is why alpine:3.5 fails. So... If anyone knows a better way, please shout it out!
It’s true that Alpine repositories only keep the latest version of each package, and that old package versions may still be found in the repositories of old Alpine versions. However, you don’t necessarily need to revert to an older Alpine version as the base image. Instead, you can instruct apk to install a specific package version from the desired repository. This will work if the installed package doesn’t have dependencies which conflict with other installed Alpine packages.
tesseract-ocr 3.04 is available up to Alpine 3.6: https://pkgs.alpinelinux.org/package/v3.6/community/x86_64/tesseract-ocr
To install it on a newer Alpine image, use apk’s --repository parameter to specify the 3.6 community repository. Also, make sure to specify the full version name for tesseract, which is 3.04.01-r1. To locate the exact package version number, use Alpine’s package search.
Putting it together:
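The snippet that followed is not preserved here, so this is only a minimal sketch of what the steps above describe. The base image tag and the mirror URL (the usual dl-cdn.alpinelinux.org layout) are assumptions for illustration; only the --repository parameter, the 3.6 community repository, and the version string 3.04.01-r1 come from the answer.

```dockerfile
# Base image tag is an assumption; substitute the newer Alpine release you actually use.
FROM alpine:3.12

# Install tesseract-ocr from the Alpine 3.6 community repository rather than
# the base image's own repositories, pinning the full version string.
# The mirror URL is assumed; adjust it if you use a different Alpine mirror.
RUN apk add --no-cache \
      --repository https://dl-cdn.alpinelinux.org/alpine/v3.6/community \
      tesseract-ocr=3.04.01-r1
```

As noted above, this only succeeds if the dependencies of the 3.6 package can be resolved without conflicting with packages on the newer base image, so it is worth checking the build output on your own image.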