Incremental Anomaly Detection Overview

What Is Incremental Anomaly Detection?

Incremental anomaly detection is a branch of machine learning that processes incoming data from a data stream continuously and in real time, computing anomaly scores with possibly little to no knowledge of the distribution of the predictor variables or the sample size. Observations with scores above a score threshold are detected as anomalies.

Incremental learning algorithms are flexible, efficient, and adaptive. The following characteristics distinguish incremental learning from traditional machine learning:

An incremental model is fit to data quickly and efficiently, which means it can adapt to changes (or drifts) in the data distribution, in real time.
Little information might be known about the population before incremental learning starts. Therefore, the algorithm can run with a cold start. For example, the anomaly contamination fraction and score threshold might not be known until after the model processes observations. When enough information is known before learning begins, you can specify such information to provide the model with a warm start.
Because observations can arrive in a stream, the sample size is likely unknown and possibly large, which makes data storage inefficient or impossible. Therefore, the algorithm must process observations when they are available and before the system discards them. This incremental learning characteristic makes hyperparameter tuning difficult or impossible.

Suppose an incremental model is prepared to compute scores and detect anomalies. Given incoming chunks of observations, the incremental learning algorithm processes data in real time and does the following:

Detect anomalies — Identify observations with scores above the current score threshold as anomalies.
Train model — Update the model by training it on the incoming observations, computing scores, and updating the score threshold.

If insufficient information exists for an incremental model to generate predictions, or if you do not want to track the model's performance before it has been trained enough, you can include optional initial steps: an estimation period, during which the software finds adequate values for hyperparameters, and a score warm-up period, during which the model trains before it returns scores and identifies anomalies.

Incremental Anomaly Detection with MATLAB

Statistics and Machine Learning Toolbox™ functionalities enable you to implement incremental anomaly detection on streaming data. As with other machine learning functionalities, the entry point into incremental anomaly detection is an incremental learning object, which you pass to functions with data to implement incremental anomaly detection. Unlike other machine learning functions, incrementalRobustRandomCutForest and incrementalOneClassSVM do not require data to create an incremental learning object. However, the incremental learning object specifies how to process incoming data, such as whether to standardize the predictor data and when to compute scores and identify anomalies. The object also specifies the parametric form of the model and problem-specific options.

Incremental Learning Model Objects

This table describes the available entry-point model objects for incremental anomaly detection.

Model Object	Model Type	Characteristics
`incrementalRobustRandomCutForest`	Robust random cut forest	Supports categorical predictors
`incrementalOneClassSVM`	One-class support vector machine (SVM)	Does not support categorical predictors

Properties of an incremental learning model object specify:

Data characteristics, such as the number of predictor variables NumPredictors and their first and second moments
Model characteristics, such as the number of trees and the number of training observations in each tree (for robust random cut forest models)
Training options, such as the objective solver Solver and solver-specific hyperparameters including the ridge penalty Lambda for standard and average stochastic gradient descent (for one-class SVM models)

Unlike when working with other machine learning model objects, you can create an incremental learning model by calling the object directly and specifying property values using name-value arguments. You do not need to fit a model to data to create an incremental learning model. This feature is convenient when you have little information about the data or model before training it. Depending on your specifications, the software can enforce estimation and score warm-up periods, during which incremental fitting functions infer data characteristics and then train the model for anomaly detection. By default, for one-class SVM models, the software solves the objective function using the adaptive scale-invariant solver, which does not require tuning and is insensitive to the predictor variable scales [3].
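As a minimal sketch of the cold-start creation described above, the following code creates both model types without any data. The specific values (a contamination fraction of 0.01 and a warm-up period of 500 observations) are illustrative assumptions, not recommendations.

```matlab
% Create incremental anomaly detection models without training data.
% The property values below are assumptions chosen for illustration only.
ForestMdl = incrementalRobustRandomCutForest( ...
    ContaminationFraction=0.01, ...
    ScoreWarmupPeriod=500);

SVMMdl = incrementalOneClassSVM( ...
    ContaminationFraction=0.01, ...
    ScoreWarmupPeriod=500);

% Inspect the initial state of each model before it sees any data.
disp(ForestMdl.IsWarm)
disp(ForestMdl.NumTrainingObservations)
disp(SVMMdl.IsWarm)
```

Because neither model has processed any observations, both start cold and become warm only after the fitting function has processed enough data.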

Alternatively, you can convert a traditionally trained model to a model for incremental learning by using the incrementalLearner function. For example, incrementalLearner converts a trained robust random cut forest model of type RobustRandomCutForest to an incrementalRobustRandomCutForest object. This table lists the convertible models and their conversion functions.

Traditionally Trained Convertible Model Object	Conversion Function	Model Object for Incremental Anomaly Detection
`RobustRandomCutForest`	`incrementalLearner`	`incrementalRobustRandomCutForest`
`OneClassSVM`	`incrementalLearner`	`incrementalOneClassSVM`

By default, the software considers converted models to be prepared for all aspects of incremental learning (converted models are warm). The incrementalLearner function transfers data characteristics (such as predictor names), the score threshold, and options available for incremental anomaly detection from the traditionally trained model being converted. For example:

For robust random cut forest models, incrementalLearner transfers all predictor names in the data expected during incremental learning, as well as the list of categorical predictors.
For one-class SVM models, if the objective solver of the traditionally trained model is SGD, incrementalLearner sets the incremental learning solver to SGD.

For more details, see the output argument descriptions on each incrementalLearner function page.
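The conversion path can be sketched as follows, assuming synthetic Gaussian data in place of a real training set. The data, the variable names, and the contamination fraction are hypothetical; `rrcforest` and `ocsvm` are the functions that train the traditional models.

```matlab
rng(0,"twister")                 % for reproducibility of the synthetic data
X = randn(1000,3);               % hypothetical training data, 3 predictors

% Train traditional models, then convert each one for incremental learning.
TTForest = rrcforest(X,ContaminationFraction=0.01);
IncrementalForest = incrementalLearner(TTForest);

TTSVM = ocsvm(X,ContaminationFraction=0.01);
IncrementalSVM = incrementalLearner(TTSVM);

% Converted models are warm by default and carry over the score threshold.
disp(IncrementalForest.IsWarm)
disp(IncrementalForest.ScoreThreshold)
disp(class(IncrementalSVM))
```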

Incremental Anomaly Detection Functions

The incremental learning model object specifies all aspects of the incremental learning algorithm, from preparation through training and anomaly detection. To implement incremental anomaly detection, you pass the configured incremental learning model to an incremental fitting function (fit) or anomaly detection function (isanomaly). You can find the list of supported incremental learning functions in the Object Functions section of each incremental learning model object page.

Statistics and Machine Learning Toolbox incremental learning functions offer a workflow that is well suited for anomaly detection. For simplicity, the following workflow description assumes that the model is already prepared to compute scores and detect anomalies (in other words, the model is warm).

After you create an incremental learning model object, write a loop that implements incremental learning:

Read a chunk of observations from a data stream, when the chunk is available.
Pass the chunk and the current model to isanomaly and fit. Overwrite the input model with the output model of fit so that the next iteration uses the updated model. For example:
```
[isanom,scores] = isanomaly(IncrementalMdl,X);
IncrementalMdl = fit(IncrementalMdl,X);
```
isanomaly calculates scores and identifies observations in the incoming data chunk with scores higher than the current score threshold. The fit function trains the model on the incoming data chunk and updates the score threshold. You can check for anomalies and update the model in either order.
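A complete version of this loop might look like the following sketch. It assumes a warm model obtained by converting a forest trained on an initial batch, and it simulates the stream with synthetic data. The chunk size, the data, and the injected outlier are hypothetical.

```matlab
rng(1,"twister")                         % reproducible synthetic data
Xinit = randn(500,2);                    % hypothetical initial batch
Xstream = randn(2000,2);                 % hypothetical data stream
Xstream(1500,:) = [8 8];                 % injected outlier, for illustration

% Start from a warm model by converting a traditionally trained forest.
TTMdl = rrcforest(Xinit,ContaminationFraction=0.01);
IncrementalMdl = incrementalLearner(TTMdl);

chunkSize = 100;
numObs = size(Xstream,1);
numChunks = numObs/chunkSize;
isanom = false(numObs,1);
scores = zeros(numObs,1);

for j = 1:numChunks
    idx = (j-1)*chunkSize + (1:chunkSize);   % rows of the current chunk
    % Score the chunk against the current threshold, then update the model.
    [isanom(idx),scores(idx)] = isanomaly(IncrementalMdl,Xstream(idx,:));
    IncrementalMdl = fit(IncrementalMdl,Xstream(idx,:));
end

fprintf("Flagged %d of %d observations.\n",sum(isanom),numObs);
```

Scoring before fitting means each chunk is evaluated by a model that has not yet seen it. Reversing the two calls is also valid.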

Incremental Learning Periods

Given incoming chunks of data, the actions performed by incremental learning functions depend on the current configuration or state of the model. This table describes the actions performed by incremental learning functions during each period.

Period	Associated Model Properties	Actions
Estimation	`EstimationPeriod`	When required, fitting functions choose values for hyperparameters based on estimation period observations. Actions can include the following: Estimate the predictor means `Mu` and standard deviations `Sigma` for data standardization. Adjust the learning rate `LearnRate` for SGD solvers according to the learning rate schedule `LearnRateSchedule` (applies to one-class SVM models only). Store information buffers required for estimation. Update corresponding properties at the end of the period. For more details, see the Algorithms section of each object and `incrementalLearner` function page.
Score Warm-up	`ScoreWarmupPeriod`	When the property `IsWarm` is `false`, fitting functions perform the following actions: Fit the model to the incoming chunk of data. Update corresponding model properties and the score threshold after fitting the model. Return all scores as `NaN` and anomaly values as `false`. At the end of the period, the model is warm (the `IsWarm` property becomes `true`).
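To illustrate the score warm-up period, the following sketch starts from a cold model and records, after each chunk, whether the model is warm and how many scores are NaN. The warm-up length of 300 observations, the chunk size, and the synthetic data are assumptions made for illustration.

```matlab
rng(2,"twister")                         % reproducible synthetic stream
Mdl = incrementalRobustRandomCutForest(ScoreWarmupPeriod=300);

chunkSize = 100;
numChunks = 5;
warmAfterChunk = false(numChunks,1);
numNaNScores = zeros(numChunks,1);

for j = 1:numChunks
    X = randn(chunkSize,2);              % hypothetical incoming chunk
    Mdl = fit(Mdl,X);                    % train and update the threshold
    [~,scores] = isanomaly(Mdl,X);       % score the same chunk
    warmAfterChunk(j) = Mdl.IsWarm;
    numNaNScores(j) = sum(isnan(scores));
end

% Summarize the model state after each chunk.
disp(table(warmAfterChunk,numNaNScores))
```

Based on the table above, the model is expected to report NaN scores until it has processed the warm-up observations, after which IsWarm becomes true.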

References

[1] Bartos, Matthew D., A. Mullapudi, and S. C. Troutman. "rrcf: Implementation of the Robust Random Cut Forest Algorithm for Anomaly Detection on Streams." Journal of Open Source Software 4, no. 35 (2019): 1336.

[2] Guha, Sudipto, N. Mishra, G. Roy, and O. Schrijvers. "Robust Random Cut Forest Based Anomaly Detection on Streams." Proceedings of The 33rd International Conference on Machine Learning 48 (June 2016): 2712–21.

[3] Kempka, Michał, Wojciech Kotłowski, and Manfred K. Warmuth. "Adaptive Scale-Invariant Online Algorithms for Learning Linear Models." Preprint, submitted February 10, 2019. https://arxiv.org/abs/1902.07528.