
DeepSORTVideoTracker

Deep Simple Online and Realtime Tracking (DeepSORT) video tracker

Since R2026a

Description

The DeepSORTVideoTracker System object™ is a tracker capable of processing detections of multiple targets from a video using the Deep Simple Online and Realtime Tracking (DeepSORT) algorithm [1]. The tracker initializes, confirms, corrects, predicts (performs coasting), and deletes tracks. Inputs to the tracker are bounding boxes and appearance vectors. The bounding boxes are in the form [x y w h], where x and y define the upper-left corner of the box, and w and h define its width and height, respectively. The tracker outputs tracks with the same bounding box definition. The appearance vector is a floating-point vector such as the ones reported by re-identification (reID) networks.

This tracker supports the outputs of object detectors and reID networks from the Computer Vision Toolbox™. For more information, see Choose an Object Detector (Computer Vision Toolbox) and reidentificationNetwork (Computer Vision Toolbox).

To track targets in a video using this object:

  1. Create the DeepSORTVideoTracker object and set its properties.

  2. Call the object with arguments, as if it were a function.

To learn more about how System objects work, see What Are System Objects?

Creation

To create a DeepSORTVideoTracker System object, use the videoTracker function and specify "deepsort" as the tracking algorithm. For example:

tracker = videoTracker("deepsort")

Properties


Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.

Tracking frame size in pixels, specified as a row or column vector of the form [width height].

Data Types: single | double

Tracking frame rate in frames per second, specified as a positive scalar.

Data Types: single | double

Minimum intersection-over-union (IoU) ratio for track assignment, specified as a scalar in the range (0,1]. If the IoU between a measurement and a track is less than this threshold, the measurement is not considered a match to that track.

Tunable: Yes

Data Types: single | double
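As an illustration of this gating rule, the following sketch (not the tracker's internal code) computes the IoU between a predicted track box and a detected box using the Computer Vision Toolbox function bboxOverlapRatio; the threshold value 0.3 is an assumption for illustration only.

```matlab
% Illustrative IoU gating check; the 0.3 threshold is an assumed value.
trackBox = [20 30 50 120];                 % predicted track box, [x y w h]
detBox   = [25 32 50 118];                 % detected box, [x y w h]
iou = bboxOverlapRatio(trackBox,detBox);   % ratio in the range [0,1]
isCandidate = iou >= 0.3;                  % below the threshold -> not a match
```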

Minimum number of updates required for track confirmation, specified as a 1-by-2 vector, [M N]. A new track is confirmed if it is updated at least M times in the last N frames.

Data Types: single | double
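The M-of-N confirmation rule described above can be sketched as follows. This is an illustration of the logic only, not the tracker's implementation.

```matlab
% Illustrative M-of-N confirmation check. hitHistory records hits (true)
% and misses (false) over the last N frames; here M = 3 and N = 5.
hitHistory = [true false true true false];
M = 3;
isConfirmed = nnz(hitHistory) >= M;   % at least M hits in the last N frames
```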

Minimum number of misses required for track deletion, specified as a 1-by-2 vector, [P Q]. The tracker deletes an existing track if the track misses at least P updates in the last Q frames.

Data Types: single | double

Minimum appearance similarity for track assignment, specified as a scalar in the range (0,1]. If the appearance similarity between a measurement and a track is less than this threshold, the measurement is not considered a match to that track.

Tunable: Yes

Data Types: single | double

Maximum Mahalanobis distance for track assignment, specified as a positive scalar. If the Mahalanobis distance between a measurement and a track is greater than this threshold, the measurement is not considered a match to that track. For more information on the Mahalanobis distance, see Gating.

Tunable: Yes

Data Types: single | double

Weight of appearance similarity relative to Mahalanobis distance, specified as a scalar in the range [0,1]. A value of 0 means the tracker ignores appearance similarity and relies entirely on Mahalanobis distance for association. A value of 1 means the tracker ignores Mahalanobis distance and relies entirely on appearance similarity. For intermediate values, the tracker combines both metrics, weighted by this property. The default value is 0, matching the appearance weight used in [1].

Tunable: Yes

Data Types: single | double
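A minimal sketch of this weighted combination, with illustrative cost values and variable names (the tracker's internal cost scaling may differ):

```matlab
% Illustrative weighted association cost; w is the appearance weight.
w = 0.5;                         % value of this property (illustrative)
appearanceCost  = 1 - 0.87;      % e.g., 1 minus appearance similarity
mahalanobisCost = 2.1;           % motion-based distance (illustrative)
cost = w*appearanceCost + (1 - w)*mahalanobisCost;
```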

Track appearance update method, specified as "Gallery", "EMA", or "EMAGallery".

Method | Description
"Gallery" | Maintains a gallery of past appearance vectors for each track.
"EMA" | Updates the track appearance using an exponential moving average (EMA) of the appearance vectors.
"EMAGallery" | Combines EMA with a gallery of past appearance vectors.

Number of appearance frames to keep in the gallery, specified as a positive integer.

Dependencies

To enable this argument, set the AppearanceUpdate property to "Gallery" or "EMAGallery".

Data Types: single | double

Weight of old appearance vectors in the EMA update, specified as a scalar in the range [0,1]. Denoting this weight by α, the track updates its appearance vector at frame k as:

TrackAppearance_k = α*TrackAppearance_(k-1) + (1 - α)*DetectionAppearance

Dependencies

To enable this argument, set the AppearanceUpdate property to "EMA" or "EMAGallery".

Tunable: Yes

Data Types: single | double
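The update formula above can be sketched in code as follows; the 128-element vector size is illustrative, not a requirement.

```matlab
% Illustrative EMA appearance update following the formula above.
alpha = 0.9;                        % weight of the old appearance vector
trackAppearance = rand(128,1);      % previous track appearance (illustrative)
detAppearance   = rand(128,1);      % appearance of the matched detection
trackAppearance = alpha*trackAppearance + (1 - alpha)*detAppearance;
```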

Usage

Description

confirmedTracks = tracker(bboxes,appearance) returns a list of confirmed tracks updated from a video based on the input bounding boxes and appearance vectors.


[confirmedTracks,tentativeTracks,allTracks] = tracker(___) also provides a list of tentative tracks and a list of all tracks. Tentative tracks are tracks that have not yet reached the threshold specified in the NumUpdatesForConfirmation property.
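The call pattern can be sketched as follows. The data values and the 128-element appearance size are placeholders; in practice the inputs come from an object detector and a reID network.

```matlab
% Illustrative call with two detections; tracker properties keep defaults.
tracker = videoTracker("deepsort");
bboxes = [20 30 50 120; 200 40 55 130];    % each row is [x y w h]
appearance = rand(2,128,"single");         % one appearance vector per row
[confirmedTracks,tentativeTracks,allTracks] = tracker(bboxes,appearance);
```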

Input Arguments


Bounding boxes of detections within the video frame, specified as an N-by-4 numeric matrix. Each row is a bounding box of the form [x y w h]. N is the number of detected bounding boxes within the video frame.

You can use the output bounding boxes of object detectors from the Computer Vision Toolbox. For more information, see Choose an Object Detector (Computer Vision Toolbox).

Appearance vectors of the detections within the video frame, specified as an N-by-M numeric matrix. N is the number of detected bounding boxes within the video frame, and M is the length of the appearance vector.

You can use the reID networks from the Computer Vision Toolbox. For more information, see reidentificationNetwork (Computer Vision Toolbox) and extractReidentificationFeatures (Computer Vision Toolbox).

Output Arguments


Confirmed tracks, returned as a structure array containing these fields:

Field Name | Description
TrackID | Unique identifier for each track.
Age | Number of times the track has been updated. When a track is initialized, its Age is 1. Each subsequent update, with a hit or a miss, increases the Age by 1.
Time | Time at which the tracker last updated the track.
BoundingBox | Current bounding box of the track, in the same format as the bboxes argument.
IsConfirmed | Logical value indicating whether the track is confirmed.

A track becomes confirmed when it meets the confirmation threshold defined by the NumUpdatesForConfirmation property. In this case, the tracker logs the track in confirmedTracks and sets the IsConfirmed field to true.

Tentative tracks, returned as a structure array containing these fields:

Field Name | Description
TrackID | Unique identifier for each track.
Age | Number of times the track has been updated. When a track is initialized, its Age is 1. Each subsequent update, with a hit or a miss, increases the Age by 1.
Time | Time at which the tracker last updated the track.
BoundingBox | Current bounding box of the track, in the same format as the bboxes argument.
IsConfirmed | Logical value indicating whether the track is confirmed.

A track is tentative if it does not meet the confirmation threshold defined by the NumUpdatesForConfirmation property. In this case, the tracker logs the track in tentativeTracks and sets the IsConfirmed field to false.

All tracks, returned as a structure array containing these fields:

Field Name | Description
TrackID | Unique identifier for each track.
Age | Number of times the track has been updated. When a track is initialized, its Age is 1. Each subsequent update, with a hit or a miss, increases the Age by 1.
Time | Time at which the tracker last updated the track.
BoundingBox | Current bounding box of the track, in the same format as the bboxes argument.
IsConfirmed | Logical value indicating whether the track is confirmed.

allTracks consists of confirmed and tentative tracks.
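For example, you can iterate over a returned structure array and read its fields. This sketch assumes confirmedTracks was returned by a prior call to the tracker.

```matlab
% Illustrative access to track fields.
for k = 1:numel(confirmedTracks)
    id  = confirmedTracks(k).TrackID;       % unique identifier
    box = confirmedTracks(k).BoundingBox;   % [x y w h]
end
```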

Object Functions

To use an object function, specify the System object as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)


step | Run System object algorithm
release | Release resources and allow changes to System object property values and input characteristics
reset | Reset internal states of System object
isLocked | Determine if System object is in use
clone | Create duplicate System object

Examples


Create a DeepSORTVideoTracker object. Specify the frame size of the tracking video as 800-by-600 pixels and the frame rate as 1 frame per second.

DSTracker = videoTracker("deepsort");
DSTracker.FrameSize = [800 600];
DSTracker.FrameRate = 1;

Specify the appearance update method as "Gallery", and the number of appearance frames to keep in the gallery as 50.

DSTracker.AppearanceUpdate = "Gallery";
DSTracker.NumAppearanceFrames = 50;
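The snippets above configure the tracker; a typical per-frame loop can be sketched as follows. Here detector, reidNet, and reader stand in for an object detector, a reID network, and a VideoReader that you create separately, and the exact argument pattern of the reID feature extraction is an assumption; consult the extractReidentificationFeatures reference page.

```matlab
% Illustrative per-frame tracking loop; detector and reidNet are
% placeholders for networks created with the Computer Vision Toolbox.
while hasFrame(reader)
    frame = readFrame(reader);
    bboxes = detect(detector,frame);             % rows of [x y w h]
    appearance = extractReidentificationFeatures(reidNet,frame,bboxes);
    confirmedTracks = DSTracker(bboxes,appearance);
end
```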

For an example showing how to use a DeepSORTVideoTracker object to track targets in a video, see Multi-Object Tracking with DeepSORT.

References

[1] Wojke, Nicolai, Alex Bewley, and Dietrich Paulus. "Simple Online and Realtime Tracking with a Deep Association Metric." In 2017 IEEE International Conference on Image Processing (ICIP), pp. 3645-3649. IEEE, 2017.

Version History

Introduced in R2026a