How to compute inference time (ms) to compare my Original, Projected and Fine-Tuned models? Error: "For code generation of convolution1dLayer, when convolving over the time dimension ('T'), the 'T' dimension of the input must be fixed size."
I am trying to compute the inference time of three different models (Original, Projected, and Fine-Tuned) so that I can compare them not only on my evaluation metrics and model size (number of learnable parameters) but also on inference time. I am following this example: https://it.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html. The architectures of my networks are as follows:
Original Net:
- 'input' Sequence Input Sequence input with 1 dimensions
- 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
- 'batchnorm1' Batch Normalization Batch normalization with 10 channels
- 'relu1' ReLU ReLU
- 'gru1' GRU GRU with 32 hidden units
- 'output' Fully Connected 1 fully connected layer
Projected and Fine-Tuned Net:
- 'input' Sequence Input Sequence input with 1 dimensions
- 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
- 'batchnorm1' Batch Normalization Batch normalization with 10 channels
- 'relu1' ReLU ReLU
- 'gru1' Projected Layer Projected GRU with 32 hidden units
- 'output' Projected Layer Projected fully connected layer with output size 1
This is my code:
cfg = coder.config("mex");
cfg.TargetLang = "C++";
cfg.DeepLearningConfig = coder.DeepLearningConfig("none");
noisyInputType = coder.typeof(double(0), [Inf 1], [1 0]); % coder.typeof expects an example value (e.g., double(0)), not a class name
codegen -config cfg FinalFineTuned_predict -args {noisyInputType}
codegen -config cfg FinalProjected_predict -args {noisyInputType}
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
Where the functions are:
function out = FinalOriginal_predict(in) %#codegen
% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.
% Copyright 2019-2021 The MathWorks, Inc.
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('1DCNN_LSTM07.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
2nd function:
function out = FinalProjected_predict(in) %#codegen
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('FinalProjected_unpacked.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
3rd function:
function out = FinalFineTuned_predict(in) %#codegen
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('FinalFineTuned_unpacked.mat');
end
outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
out = extractdata(outDlarray);
end
I had to unpack the projected layers in both the Projected and Fine-Tuned networks; otherwise I got an error while compiling.
In all the cases, the error I am encountering now is: "For code generation of convolution1dLayer, when convolving over the time dimension ('T'), the 'T' dimension of the input must be fixed size." Can you help me?
Thank you in advance,
Silvia
Accepted Answer
Katja Mogalle
2024-11-7,9:55
What the error message ("the 'T' dimension of the input must be fixed size.") is trying to say is that C/C++ code generation of networks containing a convolution 1D layer is not supported if your sequences have variable length. All sequences in your inference data must always have the same number of time steps.
So, let's assume all your sequences have 100 time steps, then you need to specify the input data to the codegen command as follows:
noisyInputType = coder.typeof(double(0), [100 1], [false false]); % fixed-size 100-by-1 double input
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
If you indeed have variable length sequences, you'd have to cut off sequences either on the left or right side to make them all fixed length (or pad shorter sequences). If you want to do this in MATLAB, you can use the padsequences function.
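For example, here is a minimal sketch, assuming your sequences are stored as numTimeSteps-by-1 vectors in a cell array (the variable names and lengths below are made up for illustration):
% seqs is a cell array of variable-length sequences, with time along dimension 1
seqs = {randn(80,1); randn(100,1); randn(95,1)};
% Pad shorter sequences (and truncate longer ones) to a fixed 100 time steps
padded = padsequences(seqs,1,'Length',100,'Direction','right','PaddingValue',0);
% padded is 100-by-1-by-3; padded(:,:,k) is the k-th fixed-length sequence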
Hope that helps.
5 comments
Katja Mogalle
2024-11-14,12:17
I am certain the act of unpacking the projected layers is not the issue. Just make sure to save the "unpacked" network to the MAT-file before re-generating the C/C++ code.
However, I do find inference speed is a tricky thing to measure and to understand. First, we need to make sure we have reliable measurements. This documentation example shows how to use timeit to measure inference speed and compare the original against the projected network: https://www.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html#CompressNetworkForEstimatingBatteryStateOfChargeExample-13
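For instance, a minimal sketch of such a measurement, assuming the generated MEX files keep the default _mex suffix and your sequences are fixed at 100 time steps:
% x is one representative fixed-length input sequence
x = randn(100,1);
% Call each MEX once first so loading the persistent network is not part of the timing
FinalOriginal_predict_mex(x);
FinalProjected_predict_mex(x);
% timeit calls each function several times and returns a robust estimate in seconds
tOriginal = timeit(@() FinalOriginal_predict_mex(x));
tProjected = timeit(@() FinalProjected_predict_mex(x));
fprintf('Original: %.3f ms, Projected: %.3f ms\n',1000*tOriginal,1000*tProjected)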
Just to double-check: you are running the generated code on the CPU, not a GPU, correct? And you are not using any third-party deep learning libraries for codegen?
Would you be able to share your inference measurements (original network vs. projected network)?
The next thing we can look at is how much each layer was compressed using the projection technique. If a layer was not compressed very much, projection can actually hurt inference speed, because projected layers carry some overhead of their own. If you are using MATLAB R2024a or newer, you can use the analyzeNetwork function to analyze the projected network (before unpacking). If you see any small (or even negative) values in the "Learnables Reduction" column of the layer analysis table, you should consider not projecting those layers (by utilizing the LayerNames argument of the compressNetworkUsingProjection function).
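As a rough sketch of that workflow (the variable names are placeholders; net is your original trained network and mbq is the same minibatchqueue of representative data you used for projection):
% Inspect per-layer compression of the projected network (before unpacking)
analyzeNetwork(netProjected) % R2024a or newer; check the "Learnables Reduction" column
% If, say, only the GRU layer compressed well, re-project only that layer
netProjected2 = compressNetworkUsingProjection(net,mbq,LayerNames="gru1");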