bertDocumentClassifier

BERT document classifier

Since R2023b

    Description

    A Bidirectional Encoder Representations from Transformers (BERT) model is a transformer neural network that can be fine-tuned for natural language processing tasks such as document classification and sentiment analysis. The network uses attention layers to analyze text in context and capture long-range dependencies between words.

    Creation

    Description

    mdl = bertDocumentClassifier creates a bertDocumentClassifier object.

    mdl = bertDocumentClassifier(net,tokenizer) creates a bertDocumentClassifier object from the specified BERT neural network and tokenizer.

    mdl = bertDocumentClassifier(___,Name=Value) sets the ClassNames property and additional options using one or more name-value arguments.

    Input Arguments

    net — BERT neural network, specified as a dlnetwork (Deep Learning Toolbox) object.

    If you specify the net argument, then you must not specify the Model argument. The network must have three sequence input layers with input sizes of one. The output size of the network must match the number of classes in the ClassNames property. The inputs in net.InputNames(1), net.InputNames(2), and net.InputNames(3) must be the inputs for the input data, the attention mask, and the segments, respectively.
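
    For example, this minimal sketch inspects the input names of a network that satisfies these requirements (here, the network of a default classifier, assuming that the required support package is installed):

    mdl = bertDocumentClassifier;
    mdl.Network.InputNames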

    tokenizer — BERT tokenizer, specified as a bertTokenizer object.

    If you specify the tokenizer argument, then you must not specify the Model argument.

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: bertDocumentClassifier(Model="tiny") creates a BERT-Tiny document classifier.

    Model — BERT model, specified as one of these options:

    • "base" — BERT-Base model. This option requires the Text Analytics Toolbox™ Model for BERT-Base Network support package. This model has 108.8 million learnable parameters.

    • "tiny" — BERT-Tiny model. This option requires the Text Analytics Toolbox Model for BERT-Tiny Network support package. This model has 4.3 million learnable parameters.

    • "mini" — BERT-Mini model. This option requires the Text Analytics Toolbox Model for BERT-Mini Network support package. This model has 11.1 million learnable parameters.

    • "small" — BERT-Small model. This option requires the Text Analytics Toolbox Model for BERT-Small Network support package. This model has 28.5 million learnable parameters.

    • "large" — BERT-Large model. This option requires the Text Analytics Toolbox Model for BERT-Large Network support package. This model has 334 million learnable parameters.

    • "multilingual" — BERT-Base multilingual model. This option requires the Text Analytics Toolbox Model for BERT-Base Multilingual Cased Network support package. This model has 177.2 million learnable parameters.

    If you specify the Model argument, then you must not specify the net and tokenizer arguments.
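
    For example, this minimal sketch combines the Model argument with the ClassNames property. The class names here are purely illustrative, and the "mini" option assumes that the BERT-Mini support package is installed:

    mdl = bertDocumentClassifier(Model="mini",ClassNames=["bug" "enhancement" "question"])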

    Tip

    To customize the BERT neural network architecture, modify the dlnetwork (Deep Learning Toolbox) object output of the bert function and use the net and tokenizer arguments.
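
    For example, this minimal sketch of the net and tokenizer syntax reuses the network and tokenizer of a default classifier. In practice, modify the network first, for example, by using Deep Learning Toolbox network editing functions such as addLayers and connectLayers, and make sure that the output size of the modified network matches the number of classes:

    base = bertDocumentClassifier;
    mdl = bertDocumentClassifier(base.Network,base.Tokenizer)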

    Probability of dropping out input elements in dropout layers, specified as a scalar in the range [0, 1).

    When you train a neural network with dropout layers, the layer randomly sets input elements to zero using the dropout mask rand(size(X)) < p, where X is the layer input and p is the layer dropout probability. The layer then scales the remaining elements by 1/(1-p).

    This operation helps to prevent the network from overfitting [2], [3]. A higher number results in the network dropping more elements during training. At prediction time, the output of the layer is equal to its input.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

    Probability of dropping out input elements in attention layers, specified as a scalar in the range [0, 1).

    When you train a neural network with attention layers, the layer randomly sets attention scores to zero using the dropout mask rand(size(scores)) < p, where scores is the array of attention scores and p is the layer dropout probability. The layer then scales the remaining elements by 1/(1-p).

    This operation helps to prevent the network from overfitting [2], [3]. A higher number results in the network dropping more elements during training. At prediction time, the output of the layer is equal to its input.

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
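
    This minimal sketch illustrates the masking operation that both dropout probabilities describe, where X stands for the layer input (or the attention scores) and p for the dropout probability:

    p = 0.1;                     % dropout probability
    X = rand(4,4);               % example layer input
    mask = rand(size(X)) < p;    % elements to drop
    Y = X .* ~mask ./ (1-p);     % zero dropped elements and rescale the rest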

    Properties

    This property is read-only.

    Network — Pretrained BERT model, specified as a dlnetwork (Deep Learning Toolbox) object corresponding to the net or Model argument.

    This property is read-only.

    Tokenizer — BERT tokenizer, specified as a bertTokenizer object corresponding to the tokenizer or Model argument.

    ClassNames — Class names, specified as a categorical vector, a string array, or a cell array of character vectors.

    If you specify the net argument, then the output size of the network must match the number of classes.

    To set this property, use the corresponding name-value argument when you create the bertDocumentClassifier object. After you create a bertDocumentClassifier object, this property is read-only.

    Data Types: string | cell | categorical

    Object Functions

    classify — Classify document using BERT document classifier
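
    For example, this minimal sketch classifies two strings with an untrained classifier; because the model is untrained, the predicted labels are not meaningful:

    mdl = bertDocumentClassifier;
    str = ["The product worked well." "The device stopped working."];
    labels = classify(mdl,str)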

    Examples

    Create a BERT document classifier that is ready for training.

    mdl = bertDocumentClassifier
    mdl = 
      bertDocumentClassifier with properties:
    
           Network: [1x1 dlnetwork]
         Tokenizer: [1x1 bertTokenizer]
        ClassNames: ["positive"    "negative"]
    
    

    View the class names.

    mdl.ClassNames
    ans = 1x2 string
        "positive"    "negative"
    
    

    Create a BERT document classifier for the classes "Electrical Failure", "Leak", "Mechanical Failure", and "Software Failure".

    classNames = ["Electrical Failure" "Leak" "Mechanical Failure" "Software Failure"];
    mdl = bertDocumentClassifier(ClassNames=classNames)
    mdl = 
      bertDocumentClassifier with properties:
    
           Network: [1x1 dlnetwork]
         Tokenizer: [1x1 bertTokenizer]
        ClassNames: ["Electrical Failure"    "Leak"    "Mechanical Failure"    "Software Failure"]
    
    

    View the class names.

    mdl.ClassNames
    ans = 1x4 string
        "Electrical Failure"    "Leak"    "Mechanical Failure"    "Software Failure"
    
    

    References

    [1] Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding." Preprint, submitted May 24, 2019. https://doi.org/10.48550/arXiv.1810.04805.

    [2] Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting." The Journal of Machine Learning Research 15, no. 1 (January 1, 2014): 1929–58.

    [3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet Classification with Deep Convolutional Neural Networks." Communications of the ACM 60, no. 6 (May 24, 2017): 84–90. https://doi.org/10.1145/3065386.

    Version History

    Introduced in R2023b