predict

Predict entities using named entity recognition (NER) model

Since R2023a

Syntax

tbl = predict(mdl,documents)

Description

The predict function detects named entities in text using a hmmEntityModel object.

To add entity details to documents using a custom NER model, use addEntityDetails and set the Model option to the custom model.

tbl = predict(mdl,documents) predicts the named entities of the tokens in the specified documents using the NER model mdl.

Examples

Make Predictions Using Custom HMM Entity Model

Load the trained example hmmEntityModel object.

load exampleEntityModel
mdl

mdl = 
  hmmEntityModel with properties:

    Entities: [3x1 categorical]

Create a tokenized document object of text data.

str = "MathWorks develops MATLAB and Simulink.";
document = tokenizedDocument(str);

Make predictions using the predict function.

tbl = predict(mdl,document)

tbl=6×2 table
       Token           Entity    
    ___________    ______________

    "MathWorks"    B-organization
    "develops"     non-entity    
    "MATLAB"       B-product     
    "and"          non-entity    
    "Simulink"     B-product     
    "."            non-entity
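To work only with the detected entities, one option is to filter the returned table on the Entity variable. The following is a minimal sketch, assuming Text Analytics Toolbox is installed, that the exampleEntityModel MAT-file from this example is on the path, and that nonentity tokens carry the tag "non-entity" as in the output above. The variable names isEntity and entityTbl are illustrative only.

```matlab
% Recreate the prediction from the example above.
load exampleEntityModel
document = tokenizedDocument("MathWorks develops MATLAB and Simulink.");
tbl = predict(mdl,document);

% Convert the tags to strings so the comparison does not depend on
% the data type of the Entity variable, then keep the entity rows.
isEntity = string(tbl.Entity) ~= "non-entity";
entityTbl = tbl(isEntity,:)
```

The result keeps the Token and Entity variables of tbl, so the same indexing pattern applies to tables returned for larger document arrays.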

Input Arguments

`mdl` — Custom NER model
`hmmEntityModel` object

Custom NER model, specified as a hmmEntityModel object. To train a custom NER model, use the trainHMMEntityModel function.

For an example, see Train Custom Named Entity Recognition Model.

`documents` — Input documents
`tokenizedDocument` array

Input documents, specified as a tokenizedDocument array.

Output Arguments

`tbl` — Predicted entities
table

Predicted entities, returned as a table with these variables:

Token — Input token
Entity — Predicted entity in the IOB2 labeling scheme. For more information, see Inside, Outside, Beginning (IOB) Labeling Schemes.

Algorithms

Inside, Outside, Beginning (IOB) Labeling Schemes

The inside, outside (IO) labeling scheme tags each token either with "O" or with an entity tag that has the prefix "I-". The tag "O" (outside) denotes nonentities. For each token in an entity, the tag is prefixed with "I-" (inside), which signifies that the token is part of an entity.

The IO labeling scheme does not specify entity boundaries between adjacent entities of the same type. The inside, outside, beginning (IOB) labeling scheme, also known as the beginning, inside, outside (BIO) labeling scheme, addresses this limitation by introducing a "beginning" prefix.

The IOB labeling scheme has two variants: IOB1 and IOB2.

IOB2 Labeling Scheme

For each token in an entity, the tag is prefixed with one of these values:

"B-" (beginning) — The token is a single-token entity or the first token of a multitoken entity.
"I-" (inside) — The token is a subsequent token of a multitoken entity.

For a list of entity tags Entity, the IOB2 labeling scheme helps identify boundaries between adjacent entities of the same type by using this logic:

If Entity(i) has the prefix "B-" and Entity(i+1) is "O" or has the prefix "B-", then Token(i) is a single-token entity.
If Entity(i) has the prefix "B-", Entity(i+1), ..., Entity(N) have the prefix "I-", and Entity(N+1) is "O" or has the prefix "B-", then the phrase Token(i:N) is a multitoken entity.
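The boundary logic above can be sketched in a few lines of base MATLAB. This is an illustrative sketch, not part of the toolbox: the tokens and tags are made-up example data that use "O" for nonentities, and the variable names (phrases, types, entities) are hypothetical. The sketch follows the stated rules only and does not check that the type after each "I-" prefix matches the preceding "B-" tag.

```matlab
% Hypothetical example data in the IOB2 labeling scheme.
tokens = ["John" "Smith" "visited" "New" "York" "Boston" "."];
tags   = ["B-person" "I-person" "O" "B-location" "I-location" "B-location" "O"];

phrases = strings(0,1);
types = strings(0,1);
i = 1;
while i <= numel(tags)
    if startsWith(tags(i),"B-")
        % Extend the entity over the "I-" tags that follow the "B-" tag.
        N = i;
        while N < numel(tags) && startsWith(tags(N+1),"I-")
            N = N + 1;
        end
        % Token(i:N) is one entity (a single-token entity when N equals i).
        phrases(end+1,1) = join(tokens(i:N)," ");
        types(end+1,1) = extractAfter(tags(i),"B-");
        i = N + 1;
    else
        i = i + 1;
    end
end

entities = table(phrases,types,VariableNames=["Phrase" "Type"])
```

In this data, "York" and "Boston" are adjacent location tokens, and the "B-" prefix on "Boston" is what separates them into two entities.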

IOB1 Labeling Scheme

The IOB1 labeling scheme does not use the prefix "B-" when an entity token follows the tag "O". In this case, an entity token that is the first token in a list or that follows a nonentity token is the first token of an entity. That is, if Entity(i) has the prefix "I-" and i is equal to 1 or Entity(i-1) is "O", then Token(i) is a single-token entity or the first token of a multitoken entity.
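To make the difference between the two variants concrete, the following base MATLAB sketch converts IOB1 tags to IOB2 tags. The tags are made-up example data and the variable names are hypothetical. In addition to the rule stated above, the sketch treats an "I-" tag that follows an entity of a different type as the start of a new entity; that extra case is an assumption based on common IOB1 usage rather than something this page states.

```matlab
% Hypothetical IOB1 tags: "B-" appears only between adjacent
% entities of the same type.
tagsIOB1 = ["I-person" "I-person" "O" "I-location" "I-location" "B-location" "O"];

tagsIOB2 = tagsIOB1;
for i = 1:numel(tagsIOB1)
    if startsWith(tagsIOB1(i),"I-")
        entityType = extractAfter(tagsIOB1(i),"I-");
        if i == 1
            startsEntity = true;
        else
            prev = tagsIOB1(i-1);
            % New entity after a nonentity tag or (assumption) after an
            % entity of a different type. extractAfter(prev,2) drops the
            % two-character "I-" or "B-" prefix.
            startsEntity = prev == "O" || extractAfter(prev,2) ~= entityType;
        end
        if startsEntity
            tagsIOB2(i) = "B-" + entityType;
        end
    end
end

tagsIOB2
```

Existing "B-" tags are left unchanged, because in both schemes they mark the first token of an entity.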

Alternative Functionality

To add entity details to documents using a custom NER model, use addEntityDetails and set the Model option to the custom model.
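As a rough illustration of this alternative, the sketch below attaches entity details to a document with the addEntityDetails function and reads them back with tokenDetails. It assumes Text Analytics Toolbox is installed and that the exampleEntityModel MAT-file from the example on this page is on the path. The exact set of variables in the returned details table can vary, so treat the output layout as an assumption.

```matlab
% Load the example model and create a tokenized document.
load exampleEntityModel
document = tokenizedDocument("MathWorks develops MATLAB and Simulink.");

% Add entity details using the custom model instead of the default one.
document = addEntityDetails(document,Model=mdl);

% Inspect the token details, which include the entity information.
details = tokenDetails(document)
```

Unlike predict, which returns a separate table, this approach stores the entity information in the tokenizedDocument array itself.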

Version History

Introduced in R2023a
