
I installed Tesseract 5 on WSL (Ubuntu 22.04.1 LTS) and tried to read the numbers in an image as shown below, but Tesseract returned the wrong text. How can I get the correct result?

My environment:

  • Windows 11 22H2
  • WSL2 Ubuntu 22.04.1LTS
  • tesseract 5.3.1-20-g58b7

I ran Tesseract like this:

tesseract hoge.jpg output -l eng

and output.txt contains:

Fb¥
&/0

Here is hoge.jpg.

enter image description here

Thank you in advance for your help. I’m a Japanese student, so my English may not be very good. If anything is unclear, please edit this post to make it more readable.

2 Answers


  1. I tried this image with basic image manipulation in Python and pytesseract, with mixed results. There seem to be two challenges in this image: the noisy background and the slant of the numbers. Thresholding the pixels to either black or white almost got the bottom number, reading it as "6/0", but the slanted "1" keeps being recognized as a "/". The top number is read as "SEF", and I haven’t figured out how to get a better result there.

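    from PIL import Image
    import numpy as np
    import pytesseract as tess
    
    img = Image.open('zgKoF.jpg')
    
    # Convert to a NumPy array so pixel values can be thresholded in place
    img_arr = np.array(img)
    
    # Push bright pixels to white and dark pixels to black
    img_arr[img_arr > 150] = 255
    img_arr[img_arr < 100] = 0
    
    print(tess.image_to_string(img_arr))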
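
The thresholding above leaves pixels between 100 and 150 untouched, so the result is not strictly black and white. Below is a minimal sketch of a single-cutoff variant, assuming Pillow's grayscale conversion and `Image.point` with a 256-entry lookup table. The helper names and the default cutoff of 128 are illustrative assumptions, not part of the original attempt, and this does nothing about the slant of the digits:

```python
def build_threshold_lut(threshold=128):
    """Return a 256-entry table mapping each gray level to 0 (black) or 255 (white)."""
    return [255 if level > threshold else 0 for level in range(256)]


def binarize(path, threshold=128):
    """Open an image, convert it to grayscale, and force every pixel to black or white."""
    from PIL import Image  # imported lazily so the table helper works without Pillow

    gray = Image.open(path).convert("L")
    # For an "L" image, point() accepts a sequence of 256 output values
    return gray.point(build_threshold_lut(threshold))


def read_text(path, threshold=128):
    """Run Tesseract on the binarized image and return the recognized text."""
    import pytesseract  # requires the tesseract binary to be installed

    return pytesseract.image_to_string(binarize(path, threshold))
```

Calling `read_text('zgKoF.jpg', threshold=125)` would then be one way to experiment with different cutoffs; the value 125 is only a starting guess.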
    
  2. You will never get good results from a bad picture. I experimented a bit and got this:

    import subprocess
    import cv2
    import pytesseract
    
    # Image manipulation
    # Commands https://imagemagick.org/script/convert.php
    mag_img = r'D:\Programme\ImageMagic\magick.exe'
    con_bw = r"D:\Programme\ImageMagic\convert.exe"
    
    in_file = r'ZZ_Numbers.jpg'
    out_file = r'ZZ_Numbers_bw.png'
    
    # Play with black and white and rotate for better results
    process = subprocess.run([con_bw, in_file, "-resize", "70%","-threshold","60%", "-rotate", "-17", "-brightness-contrast","-15x30",out_file])
    
    # Text processing
    pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'
    img = cv2.imread(out_file)
    
    # Parameters: see the Tesseract documentation.
    # Config variables such as the whitelist must be passed with -c.
    custom_config = r'--psm 11 --oem 3 -c tessedit_char_whitelist=0123456789'
    
    tex = pytesseract.image_to_string(img, config=custom_config)
    print(tex)
    
    # Save the recognized text to a file
    with open("cartootn.txt", 'w') as f:
        f.write(tex)
    
    cv2.imshow('image',img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    Output:
    enter image description here
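
Since the answer suggests playing with the threshold and rotation, one way to do that systematically is to loop over a few candidate values and compare the OCR output. This is a rough sketch, assuming the same ImageMagick `convert` options as above. The helper names, the candidate values, and the `sweep_*.png` file names are hypothetical, and the config string passes the digit whitelist through Tesseract's `-c` option:

```python
import itertools
import subprocess


def build_convert_args(convert_exe, in_file, out_file, threshold_pct, rotate_deg):
    """Build the ImageMagick argument list for one threshold/rotation combination."""
    return [
        convert_exe, in_file,
        "-resize", "70%",
        "-threshold", f"{threshold_pct}%",
        "-rotate", str(rotate_deg),
        "-brightness-contrast", "-15x30",
        out_file,
    ]


def build_digit_config(psm=11, oem=3):
    """Build a Tesseract config string that restricts recognition to digits."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist=0123456789"


def sweep(convert_exe, in_file, thresholds=(50, 60, 70), angles=(-20, -17, -14)):
    """Preprocess and OCR the image once per combination; return the text for each."""
    import cv2          # imported lazily: only needed when actually running OCR
    import pytesseract

    results = {}
    for threshold_pct, rotate_deg in itertools.product(thresholds, angles):
        out_file = f"sweep_t{threshold_pct}_r{rotate_deg}.png"
        args = build_convert_args(convert_exe, in_file, out_file, threshold_pct, rotate_deg)
        subprocess.run(args, check=True)
        text = pytesseract.image_to_string(cv2.imread(out_file), config=build_digit_config())
        results[(threshold_pct, rotate_deg)] = text.strip()
    return results
```

Printing the dictionary returned by `sweep(con_bw, in_file)` would show which combinations, if any, yield plausible digits for this particular image.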
