
Code Generation for Convolutional LSTM Network That Uses Intel MKL-DNN

This example shows how to generate a MEX function for a deep learning network containing both convolutional and bidirectional long short-term memory (BiLSTM) layers that uses the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN). The generated MEX function reads the data from a specified video file as a sequence of video frames and outputs a label that classifies the activity in the video. For more information on the training of this network, see the example Classify Videos Using Deep Learning (Deep Learning Toolbox).

Third-Party Prerequisites

This example is supported on Mac®, Linux®, and Windows® platforms and is not supported for MATLAB Online.
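
Code generation with the MKL-DNN target also requires the MATLAB Coder Interface for Deep Learning support package. As an optional, illustrative check (this step is not part of the original example and assumes the coder.getDeepLearningLayers function is available in your installation), you can list the layers supported for MKL-DNN code generation and confirm that convolution, LSTM, and BiLSTM layers appear:

% Optional check: list the layers supported for MKL-DNN code generation.
supportedLayers = coder.getDeepLearningLayers('mkldnn');
disp(supportedLayers)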

Prepare Input

Read the video file pushup.mp4 by using the readVideo helper function attached to this example as a supporting file. To view the video, loop over the individual frames of the video file and use the imshow function.

filename = "pushup.mp4";
video = readVideo(filename);
numFrames = size(video,4);
figure
for i = 1:numFrames
    frame = video(:,:,:,i);
    imshow(frame/255);
    drawnow
end

Figure: the individual frames of the video displayed in sequence.
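
The readVideo helper function itself is not listed in this example. For reference, a minimal sketch of such a helper, which reads every frame into an H-by-W-by-3-by-numFrames array, might look like the following (this is an assumed implementation, not the supporting file shipped with the example):

function video = readVideo(filename)
% Illustrative sketch (assumption): read every frame of the video file
% into an H-by-W-by-3-by-numFrames array of pixel values.
vr = VideoReader(filename);
video = zeros(vr.Height,vr.Width,3,0);
i = 0;
while hasFrame(vr)
    i = i + 1;
    video(:,:,:,i) = double(readFrame(vr));
end
end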

Center-crop the input video frames to the input size of the trained network by using the centerCrop helper function attached as a supporting file.

inputSize = [224 224 3];
video = centerCrop(video,inputSize);
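
The centerCrop helper function is likewise provided as a supporting file. One plausible implementation crops each frame to a centered square and then resizes it to the network input size; the sketch below is an assumption and can differ from the shipped supporting file:

function videoOut = centerCrop(video,inputSize)
% Illustrative sketch (assumption): crop each frame to a centered square
% and resize it to the network input size.
[h,w,~,~] = size(video);
cropSize = min(h,w);
top = floor((h - cropSize)/2) + 1;
left = floor((w - cropSize)/2) + 1;
video = video(top:top+cropSize-1,left:left+cropSize-1,:,:);
videoOut = imresize(video,inputSize(1:2));
end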

The video_classify Entry-Point Function

The video_classify.m entry-point function takes a sequence of video frames and passes it to a trained network for prediction. This function uses the convolutional LSTM network trained in the example Classify Videos Using Deep Learning (Deep Learning Toolbox). The function loads the network object from the file net.mat into a persistent variable and then uses the classify (Deep Learning Toolbox) function to perform the prediction. On subsequent calls, the function reuses the already loaded persistent object.

type('video_classify.m')
function out = video_classify(in) %#codegen

% During the execution of the first function call, the network object is
% loaded in the persistent variable mynet. In subsequent calls, this loaded
% object is reused. 

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('net.mat');
end

% Provide input and perform prediction
out = classify(mynet,in); 

Generate MEX

To generate a MEX function, create a coder.MexCodeConfig object cfg. Set the TargetLang property of cfg to C++. Use the coder.DeepLearningConfig function to create a deep learning configuration object for MKL-DNN and assign it to the DeepLearningConfig property of cfg.

cfg = coder.config('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');

Run the getVideoClassificationNetwork helper function to download the video classification network and save the network in the MAT file net.mat.

getVideoClassificationNetwork();

Use the coder.typeof function to specify the type and size of the input argument to the entry-point function. In this example, the input is of double type with size [224 224 3] and a variable sequence length.

Input = coder.typeof(double(0),[224 224 3 Inf],[false false false true]);

Generate a MEX function by running the codegen command.

codegen -config cfg video_classify -args {Input} -report
Code generation successful: View report

Run Generated MEX

Run the generated MEX function with center-cropped video input.

output = video_classify_mex(video)
output = categorical
     pushup 

Overlay the prediction onto the input video.

video = readVideo(filename);
numFrames = size(video,4);
figure
for i = 1:numFrames
    frame = video(:,:,:,i);
    frame = insertText(frame, [1 1], char(output), 'TextColor', [255 255 255],'FontSize',30, 'BoxColor', [0 0 0]);
    imshow(frame/255);
    drawnow
end

Figure: the video frames with the predicted label overlaid on each frame.
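
Optionally, clear the generated MEX function from memory when you are finished. Clearing the MEX function releases the network object stored in the persistent variable (this cleanup step is an addition and is not part of the original example):

% Optional: clear the MEX function from memory to release the loaded network.
clear video_classify_mex;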
