Wolfram Language & System Documentation Center

FindImageText

FindImageText[image]

detects text in image and returns a single bounding box.

FindImageText[image,level]

returns a list of bounding boxes at the specified structural level.

FindImageText[image,level,prop]

returns prop for text at the given level.

FindImageText[video,…]

detects text in frames of video.
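
The call forms above can be sketched on a small synthetic image. This is a minimal illustration, not an example from the original page: the test string, the padding amount, and the variable names `img`, `whole` and `words` are assumptions chosen for the sketch, and the detections depend on the model shipped with your version.

```wolfram
(* Build a small synthetic test image; the string and styling are illustrative only *)
img = ImagePad[Rasterize[Style["Hello World", 48], "Image"], 30, White];

(* Default level (Automatic): all detected text as a single result *)
whole = FindImageText[img];

(* "Word" level: a list of bounding boxes, one per detected word *)
words = FindImageText[img, "Word"];

(* Overlay the word-level boxes on the input for visual inspection *)
HighlightImage[img, words]
```

Rasterizing a styled string keeps the sketch self-contained; any image that contains text can be substituted for `img`.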

Details and Options

FindImageText is used to detect the region of an image containing text. When asking for a specific structural level, it returns a list of bounding boxes, each given as a Parallelogram.

Coordinates {x,y} are assumed to be in the standard image coordinate system.
Use TextRecognize to recognize the content of the detected text.
Possible settings for level include:

	Automatic	text found in the whole image as a single result (default)
	"Block"	a list of results for each block of text
	"Line"	a list of results for each line
	"Word"	a list of results for each word
	"Character"	a list of results for each character

Possible settings for prop include:

	"AlignedImage"	cropped aligned image containing each detected text
	"BoundingBox"	bounding box around each detected text as a Rectangle
	"BoundingBoxArea"	area of the bounding box around each text
	"Confidence"	strength of the recognized text
	"DeskewAngle"	deskew angle of the detected text
	"Image"	cropped image containing the detected text
	"OrientedBoundingBox"	parallelogram around each detected text (default)
	"RegionCentroid"	centroid of bounding box around the text
	{prop₁,prop₂,…}	a list of properties
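
As a rough sketch of how the properties listed above are requested, the following asks for several of them at the "Word" level. The synthetic image and all variable names are assumptions made for illustration; the exact shape of the result for a list of properties is not shown here because it is not stated in this section.

```wolfram
(* Synthetic input; any image containing text can be used instead *)
img = ImagePad[Rasterize[Style["Detect these words", 40], "Image"], 30, White];

(* Axis-aligned boxes instead of the default oriented parallelograms *)
boxes = FindImageText[img, "Word", "BoundingBox"];

(* Scalar and point-valued properties for the same detections *)
areas = FindImageText[img, "Word", "BoundingBoxArea"];
centroids = FindImageText[img, "Word", "RegionCentroid"];

(* Several properties in a single call *)
combined = FindImageText[img, "Word", {"BoundingBox", "Confidence"}]
```

Requesting a list of properties in one call avoids running detection once per property.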

The following options can be specified:

	AcceptanceThreshold	Automatic	detection acceptance threshold
	MaxFeatures	All	number of text boxes to return
	MaxOverlapFraction	Automatic	maximum allowed overlap fraction
	Method	Automatic	method to use
	PaddingSize	0	amount of padding around each detection
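
The options in the table above might be combined as in the sketch below. The specific values (0.3, 3, 0) and the variable names are assumptions picked for illustration; in particular, using `MaxOverlapFraction -> 0` to request non-overlapping boxes is inferred from the option's description rather than quoted from the page.

```wolfram
(* Synthetic input with a handful of words *)
img = ImagePad[Rasterize[Style["several short words to find", 36], "Image"], 30, White];

(* Baseline with all options at their defaults *)
baseline = FindImageText[img, "Word"];

(* A lower acceptance threshold may admit weaker detections *)
permissive = FindImageText[img, "Word", AcceptanceThreshold -> 0.3];

(* Keep at most the three strongest detections *)
top3 = FindImageText[img, "Word", MaxFeatures -> 3];

(* Assumed setting for disallowing any overlap between boxes *)
disjoint = FindImageText[img, "Word", MaxOverlapFraction -> 0];

(* Compare how many boxes each setting returns *)
Length /@ <|"baseline" -> baseline, "permissive" -> permissive,
  "top3" -> top3, "disjoint" -> disjoint|>
```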

Possible settings for Method include:

	Automatic	automatic choice of the method
	"Document"	optimized for detection in scanned documents
	"NaturalScene"	optimized for detection in natural scene images
	detector	the text detection method to use

Possible settings for detector are:

	"DBNet"	differentiable binarization net
	"Tesseract"	Tesseract engine

FindImageText uses machine learning. Its methods, training sets and biases included therein may change and yield varied results in different versions of the Wolfram Language.
FindImageText may download resources that will be stored in your local object store at $LocalBase, and that can be listed using LocalObjects[] and removed using ResourceRemove.
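
To see how the Method settings listed above behave on a given input, one option is to run each of them on the same image and compare the number of detections. This is only a sketch: the synthetic image and the names `methods` and `results` are assumptions, the first use of a method may trigger a resource download, and the counts will vary between versions.

```wolfram
(* Synthetic two-line input, closer to a document than to a natural scene *)
img = ImagePad[Rasterize[Style["first line of text\nsecond line", 36], "Image"], 30, White];

(* Every Method setting named in this section *)
methods = {Automatic, "Document", "NaturalScene", "DBNet", "Tesseract"};

(* Detect lines with each method and keep the results keyed by method *)
results = AssociationMap[FindImageText[img, "Line", Method -> #] &, methods];

(* Number of detected lines per method *)
Length /@ results
```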

Examples

Basic Examples (2)

Detect text in an image:

Extract cropped images containing the detected words:

Scope (8)

Basic Uses (2)

Find text in an image:

Highlight the detected text in an image:

Level (1)

Detect the text content of an image:

Detect blocks of text:

Detect lines of text:

Detect single words:

Detect single characters:
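
The level settings walked through above can be compared side by side. The sketch below is an assumption-laden illustration rather than the page's original example: it uses a synthetic two-line image and hypothetical names `levels` and `counts`, and simply counts the detections returned at each level.

```wolfram
(* Synthetic input with two lines of text *)
img = ImagePad[Rasterize[Style["first line of text\nsecond line", 36], "Image"], 30, White];

(* The four explicit structural levels *)
levels = {"Block", "Line", "Word", "Character"};

(* Number of detections at each level *)
counts = AssociationMap[Length[FindImageText[img, #]] &, levels]

(* Visual check of the finest level *)
HighlightImage[img, FindImageText[img, "Character"]]
```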

Properties (5)

By default, an oriented bounding box is returned for every detection:

Return the standard axis-aligned bounding box:

Extract the portion of the image containing each detected word:

Align the crops to the image frame:

Extract the bounding box area of each detected word:

Compute the region centroid of every word:

Compute and return multiple properties at once:

Options (7)

AcceptanceThreshold (1)

By default, an acceptance threshold of 0.5 is used:

Decreasing the acceptance threshold may help detect more text:

MaxFeatures (1)

By default, all detected text is returned:

Use MaxFeatures30 to return only the 30 strongest detections:

MaxOverlapFraction (1)

By default, boxes that are slightly overlapping are returned:

Specify a custom maximum overlap:

Return only non-overlapping boxes:

Method (3)

By default, FindImageText tries to pick the detection method most suitable for the image:

Specify a custom method:

Using an unsuitable method might not give a good result:

Use Method->"NaturalScene" to detect text present in natural scenes:

Use Method->"Document" for scanned documents:

PaddingSize (1)

Use PaddingSizes to specify a padding for the detected word bounding boxes:

Use different padding sizes along the two bounding box axes:

Use a relative padding size:

Use a negative padding to return tighter bounding boxes:
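
A minimal sketch of the scalar padding cases described above follows. The values 5 and -2 and the variable names are assumptions chosen for illustration; the per-axis and relative forms are left out because their exact syntax is not given in this section.

```wolfram
(* Synthetic input *)
img = ImagePad[Rasterize[Style["padding around words", 40], "Image"], 30, White];

(* Default: no padding *)
plain = FindImageText[img, "Word"];

(* Positive padding enlarges each box *)
padded = FindImageText[img, "Word", PaddingSize -> 5];

(* Negative padding returns tighter boxes *)
tight = FindImageText[img, "Word", PaddingSize -> -2];

(* Inspect the two variants separately *)
{HighlightImage[img, padded], HighlightImage[img, tight]}
```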

Applications (1)

It might be difficult to perform OCR on an image with a lot of non-textual content:

Use FindImageText first to preprocess the image:

Performing OCR on the crops yields better results:
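
The preprocessing workflow above can be sketched as follows. This is an illustrative outline under stated assumptions: the input here is a clean synthetic image, so it will not reproduce the difficulty of a cluttered photograph, and the names `direct`, `crops` and `perCrop` are hypothetical.

```wolfram
(* Stand-in input; replace with a photograph containing non-textual content *)
img = ImagePad[Rasterize[Style["first line of text\nsecond line", 36], "Image"], 30, White];

(* OCR applied directly to the whole image *)
direct = TextRecognize[img];

(* Detect lines first and extract one cropped image per line *)
crops = FindImageText[img, "Line", "Image"];

(* OCR applied to each crop separately *)
perCrop = TextRecognize /@ crops
```

Cropping before recognition limits what TextRecognize sees to the detected regions, which is the design idea behind this application.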

Properties & Relations (3)

FindImageText can detect text regardless of the orientation:
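
One way to probe orientation handling is to rotate a synthetic image and ask for the oriented box together with the deskew angle. The 20-degree rotation, the white background fill, and the variable names are assumptions for this sketch, and no particular output is claimed.

```wolfram
(* Synthetic input *)
img = ImagePad[Rasterize[Style["rotated text", 40], "Image"], 30, White];

(* Rotate by an arbitrary illustrative angle, filling new pixels with white *)
rotated = ImageRotate[img, 20 Degree, Background -> White];

(* Oriented boxes and deskew angles for each detected word *)
info = FindImageText[rotated, "Word", {"OrientedBoundingBox", "DeskewAngle"}]

(* Overlay the oriented boxes on the rotated image *)
HighlightImage[rotated, FindImageText[rotated, "Word"]]
```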

FindImageText is used to detect text content within an image:

Use TextRecognize to perform OCR on the image content:

Use FindImageText to detect the license plate in the image:

Use TextRecognize to recognize the license plate and highlight it in the original image:

Possible Issues (4)

The detection is not optimized for handwritten text:

Depending on the text orientation, the detected bounding boxes may extend beyond the image borders:

The detection might fail for images with multiple text orientations:

Text in certain orientations might not be detected:
