How to recreate a fitnet neural network with Deep Network Designer

8 views (last 30 days)
I want to reconstruct the fitnet neural network via Deep Network Designer. The following code is what I have tried. To be specific, I plan to generate a regression neural network with four hidden layers (6-12-12-6, where the activation functions are sigmoid functions). First, I create the neural network via Deep Network Designer. Then I create the same network via fitnet. After running this code, I have two questions:
  1. The network from Deep Network Designer returns only one single value for all different inputs. (The returned YPredicted is a 500x1 vector with the same value in every entry, which seems weird.)
  2. Since both neural networks have the same structure, I would expect their results to be the same. However, the results from Deep Network Designer and fitnet are totally different.
I really have no idea why this happens, especially the first issue.
%% Regression using Deep Network Designer
numFeatures = 4;                     % number of input features (was undefined)
numObs = 3000;
Xfea = randn(numFeatures,numObs);    % random features, one observation per column
Ytrain = rand(numObs,1);             % random regression targets
A_freq = 2500;                       % training/validation split point
dsfeature = arrayDatastore(Xfea(:,1:A_freq),'IterationDimension',2);
dsYTrain = arrayDatastore(Ytrain(1:A_freq));
dsTrain = combine(dsfeature,dsYTrain);
dsfeature2 = arrayDatastore(Xfea(:,A_freq+1:end),'IterationDimension',2);
dsYTrain2 = arrayDatastore(Ytrain(A_freq+1:end));
dsVal = combine(dsfeature2,dsYTrain2);
layers = [
    featureInputLayer(numFeatures)
    fullyConnectedLayer(6)
    sigmoidLayer
    fullyConnectedLayer(12)
    sigmoidLayer
    fullyConnectedLayer(12)
    sigmoidLayer
    fullyConnectedLayer(6)
    sigmoidLayer
    fullyConnectedLayer(1)
    regressionLayer];   % sigmoid placed after each hidden FC layer, matching fitnet's layout
options = trainingOptions("adam", ...
    MaxEpochs=30, ...
    InitialLearnRate=0.1, ...
    ValidationData=dsVal, ...
    Plots="training-progress", ...
    Verbose=0);
net = trainNetwork(dsTrain,layers,options);
YPredicted = predict(net,dsfeature2);
%% Regression using fitnet
net1 = fitnet([6 12 12 6],'trainlm');
net1.trainParam.epochs=2000;
net1.trainParam.max_fail =20;
[net1,tr]=train(net1,Xfea(:,1:A_freq),Ytrain(1:A_freq)');
YPredicted2 = net1(Xfea(:,A_freq+1:end));

Answer (1)

Akshat, 2023-10-20
Hi Sixiong,
As I understand the question, you want to know why the two methods give different results for what should be the same network, and why one of them returns the same value for every input.
  • For the first part, the results differ because of a very fundamental difference: the solver you specified in the Deep Network Designer implementation is Adam.
  • The fitnet implementation, on the other hand, uses the train function, whose default training function is Levenberg-Marquardt. The training function can be changed to various other functions, but Adam isn't one of them.
  • Refer to this documentation for more information on the available options: https://www.mathworks.com/help/deeplearning/ref/fitnet.html
  • To try to make the outputs the same, you can use the "trainNetwork" function. Refer to this page for more information: https://www.mathworks.com/help/deeplearning/ref/trainnetwork.html
  • As for the second part of the question (the same value for every input), the only cause I can think of is the use of the "arrayDatastore" function.
  • This function loads every column of your input as a standalone observation and then trains the model.
  • You can try removing the "combine" step; that might help.
  • It would help me reproduce the issue and answer with higher confidence if you could share the datasets you used.
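To illustrate the point about training functions, here is a minimal sketch (not from the original post) of how fitnet's training function can be switched; the names below are standard Deep Learning Toolbox training functions, and Adam is not among them:

```matlab
% Sketch: choosing a different training function for fitnet.
% 'trainlm' (Levenberg-Marquardt) is the default; 'trainbr' (Bayesian
% regularization) and 'trainscg' (scaled conjugate gradient) are two of
% the documented alternatives.
net1 = fitnet([6 12 12 6],'trainscg');   % pick the training function at creation
% ...or change it on an existing network:
net1.trainFcn = 'trainbr';
```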
I hope this helps!
Regards
Akshat Wadhwa
1 Comment
James Gross, 2023-10-20
Hello,
Just to add to the awesome answer already provided by Akshat, there are quite a few differences between fitnet and trainNetwork that will lead to different results, regardless of whether the architecture is the same.
One very big difference is that fitnet and trainNetwork are using different network initializations. That is, the actual parameters of the networks at the beginning of training are different, so the result of training the two will be different.
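As a sketch of that point (my own illustration, not part of the original comment): seeding the random number generator makes each tool reproducible on its own, but the two still start from different weights because the initialization schemes themselves differ:

```matlab
% Seeding gives reproducibility within each framework, but the initial
% weights still differ: trainNetwork's fullyConnectedLayer uses
% Glorot-style initialization by default, while fitnet uses
% Nguyen-Widrow initialization by default.
rng(0);
net  = trainNetwork(dsTrain,layers,options);
rng(0);
net1 = fitnet([6 12 12 6],'trainlm');
net1 = train(net1,Xfea(:,1:A_freq),Ytrain(1:A_freq)');
```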
Moreover, as Akshat has already noted, the different solvers will lead to different results after training. Additionally, because Levenberg-Marquardt is a full-batch solver (i.e. it uses all training data at each iteration), while Adam is a mini-batch solver (i.e. it uses a subset of the training data at each iteration), the difference in results can be even greater. In particular, it is often observed that full-batch solvers outperform mini-batch solvers for small data regimes, particularly for tabular data, like the problem shown here.
We introduced support for L-BFGS, another full-batch solver, with our new training function trainnet in R2023b. You may find that using this solver gives better results than mini-batch solvers like Adam for these types of problems.
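A minimal sketch of what that could look like (assuming R2023b or later; with trainnet the loss is passed separately, so the layer array ends at the final fullyConnectedLayer instead of a regressionLayer, and feature data is passed with one observation per row):

```matlab
% Sketch: training the same 6-12-12-6 architecture with the full-batch
% L-BFGS solver via trainnet (R2023b+), reusing Xfea/Ytrain/A_freq from
% the question's code.
layers2 = [
    featureInputLayer(4)
    fullyConnectedLayer(6)
    sigmoidLayer
    fullyConnectedLayer(12)
    sigmoidLayer
    fullyConnectedLayer(12)
    sigmoidLayer
    fullyConnectedLayer(6)
    sigmoidLayer
    fullyConnectedLayer(1)];
opts = trainingOptions("lbfgs",MaxIterations=200,Verbose=false);
net2 = trainnet(Xfea(:,1:A_freq)',Ytrain(1:A_freq),layers2,"mse",opts);
YPred = predict(net2,Xfea(:,A_freq+1:end)');
```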
I hope this information helps!
James


Release: R2022a
