ocrText
Store OCR results
Description
The ocrText object contains recognized text and metadata collected during optical character recognition (OCR). You can locate text that matches a specific pattern with the locateText function.
Creation
Create an ocrText object or array using the ocr function.
Properties
Text
— Text recognized by OCR
cell array of characters
Text recognized by OCR, specified as a cell array of characters. The text includes white space and new line characters.
CharacterBoundingBoxes
— Character bounding box locations and sizes
M-by-4 matrix
Character bounding box locations and sizes, specified as an M-by-4 matrix. Each row of the matrix is of the form [x y width height]. The [x y] elements correspond to the upper-left corner of the bounding box. The [width height] elements correspond to the horizontal and vertical size, respectively, of the rectangular region in pixels. The bounding boxes enclose the text found in an image by the ocr function. Bounding boxes that correspond to new line characters have widths and heights of zero. Bounding boxes for character modifiers found in languages such as Hindi, Tamil, and Bengali also have widths and heights of zero.
CharacterConfidences
— Character recognition confidences
array
Character recognition confidences, specified as an array. Confidence values are in the range [0, 1], and you can interpret each value set by the ocr function as a probability. The ocr function sets the confidence values for spaces between words and for new line characters to NaN, because OCR does not explicitly recognize spaces and new line characters. You can use the confidence values to identify the location of misclassified text within the image by extracting characters with low confidence.
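The following is a minimal sketch of extracting low-confidence characters, assuming the businessCard.png sample image used in the examples on this page is available. The 0.5 cutoff and the variable names are illustrative assumptions, not recommended values.

```matlab
% Sketch: outline characters that OCR recognized with low confidence.
businessCard = imread("businessCard.png");
ocrResults = ocr(businessCard);

threshold = 0.5;                         % hypothetical cutoff, tune for your data
conf = ocrResults.CharacterConfidences;

% Comparisons with NaN are false, so spaces and new line characters
% (whose confidence is NaN) are never selected.
isLow = conf < threshold;
lowBoxes = ocrResults.CharacterBoundingBoxes(isLow,:);

% Drop zero-size boxes (for example, character modifiers) before drawing.
lowBoxes = lowBoxes(lowBoxes(:,3) > 0 & lowBoxes(:,4) > 0,:);

Ilow = businessCard;
if ~isempty(lowBoxes)
    Ilow = insertShape(businessCard,"rectangle",lowBoxes);
end
figure
imshow(Ilow)
```

Because the NaN entries drop out of the comparison automatically, no separate isnan check is needed when thresholding.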
Words
— Recognized words
cell array of character vectors
Recognized words, specified as a cell array of character vectors.
WordBoundingBoxes
— Word bounding box locations and sizes
N-by-4 matrix
Word bounding box locations and sizes, specified as an N-by-4 matrix. Each row of the matrix is of the form, [x y width height], and specifies the upper-left corner position and size of a rectangular region in pixels.
WordConfidences
— Word recognition confidences
vector
Word recognition confidences, specified as a vector of probability values in the range [0, 1]. The ocr function sets confidence values for spaces between words and new line characters to NaN, as OCR does not explicitly recognize spaces and new line characters. You can use confidence values to identify the location of misclassified text within the image by extracting words with low confidence.
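As a rough illustration of inspecting low-confidence words, this sketch lists the least confident words together with their bounding boxes. It assumes the businessCard.png sample image; the choice of showing five words is arbitrary.

```matlab
% Sketch: list the words that OCR is least confident about.
businessCard = imread("businessCard.png");
ocrResults = ocr(businessCard);

% Sort ascending so the least confident words come first.
% sort places any NaN values last in ascending order.
[sortedConf,order] = sort(ocrResults.WordConfidences,"ascend");

numToShow = min(5,numel(order));         % hypothetical number of words to list
for k = 1:numToShow
    idx = order(k);
    bbox = round(ocrResults.WordBoundingBoxes(idx,:));
    fprintf("%-20s %.2f  [%d %d %d %d]\n", ...
        ocrResults.Words{idx},sortedConf(k),bbox);
end
```

The printed bounding boxes use the [x y width height] form described for WordBoundingBoxes, so they can be passed directly to a function such as insertShape.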
TextLines
— Recognized text lines
cell array of character vectors
Recognized text lines, specified as a cell array of character vectors.
TextLineBoundingBoxes
— Text line bounding box location and size
N-by-4 matrix
Text line bounding box location and size, specified as an N-by-4 matrix. Each row of the matrix is of the form, [x y width height], and specifies the upper-left corner position and size of a rectangular region in pixels.
TextLineConfidences
— Text line confidences
vector
Text line confidences, specified as a vector of probability values in the range [0,1]. Use confidence values to identify the location of misclassified text lines within the image by extracting text lines with low confidence.
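One possible way to use the text line properties is to keep only the lines that meet a minimum confidence. This sketch assumes a release that provides the TextLines and TextLineConfidences properties described above and the businessCard.png sample image. The 0.7 cutoff is an illustrative assumption.

```matlab
% Sketch: keep only the text lines recognized with reasonable confidence.
businessCard = imread("businessCard.png");
ocrResults = ocr(businessCard);

minConfidence = 0.7;                     % hypothetical cutoff
keep = ocrResults.TextLineConfidences >= minConfidence;

% Trim surrounding white space from each retained line.
goodLines = strtrim(ocrResults.TextLines(keep));

% Join the retained lines into a single character vector.
filteredText = strjoin(reshape(goodLines,1,[]),newline);
disp(filteredText)
```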
Object Functions
locateText | Locate text pattern
Examples
Find and Highlight Text in Image
Load an image containing text into the workspace.
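businessCard = imread("businessCard.png");
% Perform OCR on the image.
ocrResults = ocr(businessCard);
% Locate the text "Math", ignoring case.
bboxes = locateText(ocrResults,"Math",IgnoreCase=true);
% Highlight the located text and display the result.
Iocr = insertShape(businessCard,"FilledRectangle",bboxes);
figure
imshow(Iocr)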
Find Text Using Regular Expressions
Load an image containing text into the workspace.
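businessCard = imread("businessCard.png");
% Perform OCR on the image.
ocrResults = ocr(businessCard);
% Locate text that matches the regular expression "www.*com".
bboxes = locateText(ocrResults,"www.*com","UseRegexp",true);
% Highlight the located text and display the result.
img = insertShape(businessCard,"FilledRectangle",bboxes);
figure
imshow(img)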
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
Use in a MATLAB Function block is not supported.
The Words property cannot be accessed in code generation. Use the Text property in place of the Words property to access the OCR results.
Version History
Introduced in R2014a
See Also
Functions
ocr | insertShape | regexp | strfind | quantizeOCR | evaluateOCR | trainOCR | ocrTrainingData