python - Improving pytesseract correct text recognition from image

Question

asked Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I am trying to read a captcha using the pytesseract module. It returns the correct text most of the time, but not always.

This is the code that reads the image, manipulates it, and extracts the text from it:

import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)

    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))  # "erase" the small white points in the resulting mask
    mask = cv2.bitwise_not(mask)  # invert mask

    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk

    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # get masked background, mask must be inverted 
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # combine masked foreground and masked background 
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # upscale the mask 3x (the mask, not `final`, is what gets sent to OCR)
    img = cv2.resize(mask, (0, 0), fx=3, fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')

    return text

For the image manipulation, I got help from this Stack Overflow post.

And this is the original captcha image:

And this is the image generated after the manipulation:

But pytesseract gives me the text: AX#7rL.

Can anyone guide me on how to improve the success rate to 100% here?

1 Answer

深蓝 · Answer 1 · 2021-10-23T21:39:50+0000

Since there are tiny holes in your resulting image, a morphological transformation, specifically cv2.MORPH_CLOSE, should work here to close the holes and smooth the image.

Threshold to obtain a binary image (black and white)

Perform morphological operations to close small holes in the foreground

Invert the image to get the result

4X#7rL
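
To see why closing removes the holes, here is a dependency-free sketch of the same idea on a tiny binary grid. It is only an illustration, not how OpenCV implements it: the helper names (`_morph`, `dilate`, `erode`, `closing`) and the `glyph` grid are made up for this example, and pixels outside the grid are simply ignored, which is an assumption about border handling.

```python
def _morph(img, pick):
    """Apply a 3x3 square neighbourhood operation to a grid of 0/1 values.

    pick is max for dilation and min for erosion. Neighbours that fall
    outside the grid are ignored (assumed border handling for this sketch).
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neighbours = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(pick(neighbours))
        out.append(row)
    return out


def dilate(img):
    # A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1.
    return _morph(img, max)


def erode(img):
    # A pixel stays 1 only if every pixel in its 3x3 neighbourhood is 1.
    return _morph(img, min)


def closing(img):
    # Closing is dilation followed by erosion.
    return erode(dilate(img))


# Hypothetical 7x9 grid: a solid 3x5 block of 1s with a one-pixel hole.
glyph = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

closed = closing(glyph)
for row in closed:
    print("".join("#" if v else "." for v in row))
```

The single 0 inside the block of 1s gets filled while the outline of the block stays the same, which is the effect cv2.MORPH_CLOSE has on the small holes in the thresholded captcha.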

Applying cv2.GaussianBlur() before passing the image to Tesseract might help too.

import cv2
import pytesseract

# Path to the Tesseract executable on Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Read in image as grayscale
image = cv2.imread('1.png', cv2.IMREAD_GRAYSCALE)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]

# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)

# Run Tesseract on the inverted image
print(pytesseract.image_to_string(result))
cv2.waitKey()
