Classify Streaming Webcam Video Using SlowFast Video Classifier
This example shows how to classify a streaming video from a webcam using a pretrained SlowFast Video Classifier. To learn more about how to train a video classifier network for your dataset, see Gesture Recognition using Videos and Deep Learning.
Download Pretrained Video Classifier
Download the pretrained SlowFast video classifier.
downloadFolder = fullfile(tempdir,"gesture");
zipFile = "slowFastPretrained_fourClasses.zip";
if ~isfile(fullfile(downloadFolder,zipFile))
    disp("Downloading the pretrained network...");
    downloadURL = "https://ssd.mathworks.com/supportfiles/vision/data/" + zipFile;
    zipFile = fullfile(downloadFolder,zipFile);
    websave(zipFile,downloadURL);
    unzip(zipFile,downloadFolder);
end
Load the pretrained SlowFast video classifier.
pretrainedDataFile = fullfile(downloadFolder,"slowFastPretrained_fourClasses.mat");
pretrained = load(pretrainedDataFile);
slowFastClassifier = pretrained.data.slowFast;
Display the class label names of the pretrained video classifier. The SlowFast video classifier recognizes gestures such as "clapping" and "wavingHello" when you perform them in front of the webcam.
classes = slowFastClassifier.Classes
classes = 4×1 categorical
clapping
noAction
somethingElse
wavingHello
Set Up the Webcam and the Video Player
In this example, a webcam object is used to capture streaming video. A Video Player is used to display the streaming video along with the predicted class.
Create a webcam object using the webcam function.
cam = webcam;
Create a video player using the vision.VideoPlayer System object. Place the video player where you can clearly see the streaming video while the classification runs.
player = vision.VideoPlayer;
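If the default window location is inconvenient, the player can be given an explicit position in place of the default constructor call above. The following is a minimal sketch, assuming the Position property of vision.VideoPlayer takes pixel coordinates in the form [left bottom width height]; the specific numbers are illustrative only and should be adapted to your screen.

```matlab
% Assumed window geometry in pixels: [left bottom width height].
% These values are placeholders, not taken from the original example.
playerPosition = [100 100 680 520];

% Create the player with an explicit window position.
player = vision.VideoPlayer('Position',playerPosition);
```

Choosing a position near the webcam makes it easier to watch the predicted label while performing a gesture.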
Classify the Webcam Streaming Video
Specify how frequently the classifier should be applied to incoming video frames.
classifyInterval = 10;
With a value of 10, the classifier runs once every 10 frames, which balances runtime performance against classification performance. Increasing this value improves runtime performance but makes it more likely that gestures in the live video stream are missed.
Obtain the sequence length of the SlowFast video classifier. Classify only after capturing at least sequenceLength frames from the webcam.
sequenceLength = slowFastClassifier.InputSize(4);
Specify the maximum number of frames to capture in a loop using the maxNumFrames variable. While the loop runs, wave one hand for the classifier to recognize the "wavingHello" label, and clap with both hands for it to recognize the "clapping" label.
maxNumFrames = 280;
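To see how classifyInterval, sequenceLength, and maxNumFrames interact, the following sketch lists the frame numbers at which classification would be triggered, without using a webcam or the classifier. The value of sequenceLength here is an assumption chosen for illustration; the real value comes from the InputSize property of the classifier.

```matlab
% Illustrative values. sequenceLength = 32 is an assumption, not read
% from the pretrained classifier.
classifyInterval = 10;
sequenceLength   = 32;
maxNumFrames     = 280;

frameIdx = 1:maxNumFrames;

% A frame triggers classification when its index is a multiple of the
% interval and at least sequenceLength frames have been captured.
isClassified = mod(frameIdx,classifyInterval) == 0 & frameIdx >= sequenceLength;
classifiedFrames = frameIdx(isClassified);

fprintf("First classification at frame %d\n",classifiedFrames(1));
fprintf("Number of classifications: %d\n",numel(classifiedFrames));
```

Under these assumed values, the first classification occurs at frame 40 and the classifier runs 25 times over 280 frames. Rerunning the sketch with a larger classifyInterval shows how quickly the number of classifications drops.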
Capture the webcam snapshot in a loop. Update the streaming video sequence of the classifier using the updateSequence method, and classify the streaming sequence using the classifySequence method.
numFrames = 0;
text = "";

% Capture exactly maxNumFrames frames.
while numFrames < maxNumFrames
    frame = snapshot(cam);
    numFrames = numFrames + 1;
    slowFastClassifier = updateSequence(slowFastClassifier,frame);

    % Classify every classifyInterval frames once enough frames are buffered.
    if mod(numFrames,classifyInterval) == 0 && numFrames >= sequenceLength
        [label,scores] = classifySequence(slowFastClassifier);
        if ~isempty(label)
            text = string(label) + "; " + num2str(max(scores),"%0.2f");
        end
    end

    % Overlay the most recent prediction and display the frame.
    frame = insertText(frame,[30,30],text,'FontSize',18);
    step(player,frame);
end
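The overlay string built inside the loop can be tried in isolation. The sketch below uses made-up scores in place of real classifySequence output and assumes the scores are ordered like the class list shown earlier; the variable names classNames, topScore, and overlayText are hypothetical and exist only for this illustration.

```matlab
% Hypothetical classifier output for one sequence. The scores are made up
% and are assumed to follow the same order as the class names.
classNames = ["clapping";"noAction";"somethingElse";"wavingHello"];
scores = [0.05 0.02 0.02 0.91];

% Pick the highest score and build the matching categorical label.
[topScore,idx] = max(scores);
label = categorical(classNames(idx),classNames);

% Same formatting as in the capture loop: label, then score with two decimals.
overlayText = string(label) + "; " + num2str(topScore,"%0.2f");
disp(overlayText)
```

For these assumed scores, the overlay reads "wavingHello; 0.91". Because the loop updates the string only when a new classification is available, the displayed label persists on the frames in between.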