I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to recognize the numbers in an image as follows, but Tesseract returned wrong results. How can I get correct results?

My environment:

  • Windows 11 22H2
  • WSL2 Ubuntu 22.04.1 LTS
  • tesseract 5.3.1-20-g58b7

I ran Tesseract like this:

    tesseract hoge.jpg output -l eng

and output.txt is


Here is hoge.jpg.

Thank you in advance for your help. I’m a Japanese student, so my English may not be very good. If anything reads badly, please edit this post to make it more readable.



  1. I tried this image with some basic image manipulation in Python with pytesseract, with mixed results. There seem to be two challenges in this image: the noisy background and the slant of the numbers. With thresholding to set pixels to either black or white, I was able to get the bottom number almost right, as "6/0", but the slanted "1" keeps being recognized as a "/". The top number gets read as "SEF", and I haven’t figured out how to get a better result there.

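    from PIL import Image
    import numpy as np
    import pytesseract as tess
    # Load the image and convert it to a NumPy array
    img = Image.open('zgKoF.jpg')
    img_arr = np.array(img)
    # Push light pixels to white and dark pixels to black
    img_arr[img_arr > 150] = 255
    img_arr[img_arr < 100] = 0
    # Run OCR on the thresholded image
    print(tess.image_to_string(Image.fromarray(img_arr)))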
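One way to attack the slant described above is a horizontal shear applied before OCR. The following is only a minimal sketch using NumPy: `shear_rows` and its `slope` parameter are hypothetical names, it assumes a 2-D grayscale array with a white background, and the right slope for this particular image would have to be found by trial and error.

```python
import numpy as np

def shear_rows(img_arr, slope, fill=255):
    """Shift each row of a 2-D grayscale array sideways to undo a slant.

    slope is the horizontal shift in pixels per row, measured from the
    bottom row (which stays in place). A negative slope moves the upper
    rows to the left, which straightens strokes that lean like "/".
    Pixels that enter from outside the image are set to fill.
    """
    height, width = img_arr.shape
    out = np.full_like(img_arr, fill)
    for y in range(height):
        # Rows further from the bottom are shifted further.
        shift = int(round(slope * (height - 1 - y)))
        if abs(shift) >= width:
            continue  # the whole row was shifted out of the image
        if shift >= 0:
            out[y, shift:] = img_arr[y, :width - shift]
        else:
            out[y, :width + shift] = img_arr[y, -shift:]
    return out
```

The result can be passed to `Image.fromarray` and then to pytesseract in the same way as the thresholded array above.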
  2. From a bad picture you will never get good results. I played around a bit and got this:

    import subprocess
    import cv2
    import pytesseract
    # Image manipulation
    # Commands
    mag_img = r'D:\Programme\ImageMagic\magick.exe'
    con_bw = r"D:\Programme\ImageMagic\convert.exe"
    in_file = r'ZZ_Numbers.jpg'
    out_file = r'ZZ_Numbers_bw.png'
    # Play with black and white and rotate for better results
    process = subprocess.run([con_bw, in_file, "-resize", "70%", "-threshold", "60%", "-rotate", "-17", "-brightness-contrast", "-15x30", out_file])
    # Text processing
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
    img = cv2.imread(out_file)
    # Parameters: see the Tesseract docs (-c sets a config variable)
    custom_config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789'
    tex = pytesseract.image_to_string(img, config=custom_config)
    # Save the recognized text
    with open("cartootn.txt", 'w') as f:
        f.write(tex)
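For the WSL setup in the question, the same digit-only options can also be passed directly to the `tesseract` command line. The following is a minimal sketch, assuming `tesseract` is on the PATH and the image has already been cleaned up; `build_tesseract_cmd` and `run_ocr` are hypothetical helper names, and the page segmentation mode may need tuning for a given image.

```python
import subprocess

def build_tesseract_cmd(image_path, output_base, psm=11,
                        whitelist="0123456789", lang="eng"):
    """Build the argument list for a digit-only Tesseract run."""
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "--psm", str(psm),
        # -c sets a config variable; here it restricts the allowed characters
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]

def run_ocr(image_path, output_base="output"):
    """Run Tesseract and return the text it wrote to <output_base>.txt."""
    subprocess.run(build_tesseract_cmd(image_path, output_base), check=True)
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read().strip()
```

Calling `run_ocr("hoge.jpg")` corresponds to running `tesseract hoge.jpg output -l eng --psm 11 -c tessedit_char_whitelist=0123456789` in the shell and then reading `output.txt`.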

