How to get better OCR results (without confusing digits for letters)

Hello all,
I'm trying to use OCR to determine the axes scale on a graph:
(I want to be able to extract the numbers "0, 32000, 4000, etc." on the y-axis, and "-50, 50, 150, etc." on the x-axis)
My initial attempt is this code:
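% Run OCR on the cropped axes image, treating the text as a single block
detect = ocr(justAxes, 'TextLayout', "Block");
% Overlay each detected word with its bounding box and confidence score
Iocr = insertObjectAnnotation(justAxes, 'rectangle', ...
    detect.WordBoundingBoxes, ...
    detect.Words + " " + detect.WordConfidences);
figure; imshow(Iocr);
% Keep the recognized words for later parsing
words_string = detect.Words;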
Which gives me this result:
The results aren't bad, but I'm wondering if there is any preprocessing I can do to keep the OCR from misreading digits as letters (e.g. the '50' as 'so', the '8000' as 'sooo', and the '0' as 'o'). Can I somehow bias the OCR toward detecting digits rather than letters? Or do I have to preprocess the image further in some way?

Accepted Answer

Image Analyst 2021-7-6
Your digits need to be at least 20 pixels high, as stated in the help. I also had trouble when the image chunk I passed in contained numbers that were only 10 or 12 pixels high: a human could tell what they were, but the ocr() function misidentified them. I called imresize() on each image chunk to make the image 20 pixels high, and then it identified the number properly. If that doesn't work, write back and attach your code and image.
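The resizing step described above might look roughly like the sketch below. The function name readTickLabel, the roi argument (an [x y width height] rectangle assumed to enclose a single tick label tightly), and the targetHeight argument are hypothetical and do not come from the original post. Restricting 'CharacterSet' to digits and the minus sign is an extra option of ocr() that the answer does not mention; it is included here only as something worth trying.

```matlab
function value = readTickLabel(img, roi, targetHeight)
% readTickLabel  Hypothetical helper: upscale one tick-label chunk, then OCR it.
%   img          - axes image, e.g. justAxes from the question
%   roi          - [x y width height] around one label (assumed, not from the post)
%   targetHeight - chunk height in pixels after resizing, e.g. 20 as in the answer

    % Cut the label out of the full axes image
    chunk = imcrop(img, roi);

    % Resize to the target height; NaN lets imresize keep the aspect ratio.
    % If the crop includes padding, a larger targetHeight may be needed so
    % that the digits themselves reach about 20 pixels.
    chunk = imresize(chunk, [targetHeight NaN]);

    % Limit recognition to digits and the minus sign (an addition to the
    % answer's advice, not part of it)
    result = ocr(chunk, 'TextLayout', 'Block', 'CharacterSet', '0123456789-');

    % Returns NaN when nothing usable was recognized
    value = str2double(strtrim(result.Text));
end
```

A call such as readTickLabel(justAxes, roi, 20) would then be made once per label, with each roi taken from wherever the label positions are known, for example the WordBoundingBoxes of a first OCR pass.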
