NaN in Neural network training and simulation; tonndata

Hello,
I have two questions. Thank you very much for any input and ideas!
1) How should NaNs be handled in neural network training and simulation? My datasets are as follows: the input dataset is a 6*204 matrix with several NaNs, the output dataset is a 6*204 matrix with many NaNs, and the simulation dataset is a 6*864000 matrix with many NaNs. I used the nntool GUI to train the network and run the simulation. However, the simulation results contain numbers even for simulation samples that are entirely NaN. Is there a way to keep NaN from being replaced by anything, so that it stays NaN during training and simulation?
2) When I use a cell array generated by tonndata, the neural network treats the cell array as one sample, so there is no way to split the samples into training, validation, and test data. Can anyone explain why cell arrays are used with neural networks?
I searched online but could not find a good answer or documentation on these two issues. Thank you very much for any input and help!
3 Comments
Rong Yu 2015-6-19
Hi Greg,
Thank you very much for your comments!
An example of the data is below. There are "no value" entries in the input, target, and simulation datasets. I am wondering whether the neural network will treat NaNs as "no value" during training and simulation, and whether the trained network can predict "no value", since my target datasets contain many such entries.
INPUT = [47.1166687000000 12.5424995400000 12.4942998900000 35.6557006800000 37.4290008500000; 970 5 38 1200 545; 0.308835652000000 NaN 0.340448478000000 NaN 0.361919623000000; 920.754849700000 1479.82682300000 1509.00637800000 834.223571300000 765.186007600000;
141.011858700000 240.689134100000 244.049134300000 195.169579100000 173.674052000000; 3.24222217100000 27.4396667500000 27.5893885300000 12.0933616600000 13.2053058200000];
TARGET = [2.40500000000000 NaN NaN NaN NaN; NaN NaN 1.20000000000000 NaN NaN; NaN NaN 0.676000000000000 0.949000000000000 1.55900000000000; NaN NaN NaN -1.60500000000000 NaN; NaN 1.50000000000000 NaN NaN NaN; 1.06700000000000 NaN NaN NaN NaN];
SIMU_DATA = [[30.1250038146973,30.3750038146973,30.6250038146973;921.869995117188,994.147766113281,1023.59332275391;0.0893665217391298,NaN,NaN;139.424161800494,138.530641879061,141.447721067883;242.061606131660,242.608550919427,243.153662321303;19.1725284152561,18.4053618537055,18.0085001627604]]
Thank you very much!!!~~~
Rong Yu 2015-6-19
Thank you, Greg, for your time and help! Would you mind trying it again? I copied INPUT, TARGET, and SIMU_DATA from here and pasted them directly into the MATLAB command window, and they work fine. INPUT and TARGET are 6*5 double matrices; SIMU_DATA is a 6*3 double matrix.
Thank you!~~~


Accepted Answer

Greg Heath 2015-6-22
Do not refer to NaN as "No value". It stands for "Not a Number" and is simply referred to as NaN, pronounced en-ay-en.
close all, clear all, clc
% Assumption: INPUT and TARGET from the question are (re)defined after the clear
x = INPUT;
t = TARGET;
[ I, N ] = size(x)                  % [ 6 5 ]
[ O, N ] = size(t)                  % [ 6 5 ]
net = fitnet;                       % default H = 10 hidden nodes
net.divideFcn = 'dividetrain';      % Not much data: use all of it for training
Hub = -1+ceil((N*O-O)/(I+O+1))      % = 1, so H = 10 is overfitting: need overtraining mitigation
rng('default')
for i = 1:20
    net = configure(net,x,t);
    [ net, tr ] = trainbr(net,x,t); % Bayesian regularization to mitigate overtraining
    y = net(x);
    stopcrit{i,1} = tr.stop;        % reason training stopped on trial i
    MSE(i,1) = mse(t-y);
end
lasty = y
% = [  2.405    2.405    2.405    2.405    2.405
%      1.2      1.2      1.2      1.2      1.2
%      1.6819   NaN      0.676    NaN      1.559
%     -1.605   -1.605   -1.605   -1.605   -1.605
%      1.5      1.5      1.5      1.5      1.5
%      1.067    NaN      1.9648   NaN      1.1768 ]
stopcrit1 = stopcrit{1}
% = Minimum gradient reached.
stopcrit = stopcrit                 % repmat(stopcrit1,20,1)
MSEp = MSE'
% MSEp = 1e-17 * [ ...
%   104.51     6.77     8.44   103.22   200.35     0.30     0.14
%   131.41   177.22   195.06   832.83     0.55     0.22     0.89
%     0.87     0.85   583.46   430.18     0.54     0.33 ]
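Regarding the first question, one possible way to keep simulation samples that are entirely NaN as NaN is to mask the outputs after simulation. This is only a minimal sketch in base MATLAB, not a toolbox feature. The names xs, ys, and allNaN are hypothetical, and ys is a placeholder standing in for the result of net(xs).

```matlab
% Stand-in simulation data: 3 inputs x 4 samples; sample 2 is entirely NaN
xs = [1 NaN 3 4; 5 NaN NaN 8; 9 NaN 11 12];
ys = ones(2, 4);               % placeholder for the network output net(xs)

allNaN = all(isnan(xs), 1);    % 1 x N logical: true where every input of a sample is NaN
ys(:, allNaN) = NaN;           % force the outputs of those samples back to NaN
```

Sample 3 has only one NaN input, so it is left untouched here. Replacing all(...) with any(...) would instead blank every sample that has at least one NaN input.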
3 Comments
Greg Heath 2015-6-23
Don't skip the calculation. It helps set an upper bound on the search for an optimal H. For example, it is desirable to have H << Hub.
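As a sketch of where the bound comes from, assuming a network with a single hidden layer of H nodes, I inputs, and O outputs: the network has Nw = (I+1)*H + (H+1)*O unknown weights, and N training samples give Ntrneq = N*O training equations. Requiring Nw < Ntrneq and solving for H gives the Hub expression used in the answer. The names Ntrneq and Nw are illustrative; the sizes are those of the example INPUT and TARGET.

```matlab
I = 6; O = 6; N = 5;                     % sizes of the example INPUT and TARGET
Ntrneq = N*O;                            % number of training equations
Hub = -1 + ceil((Ntrneq - O)/(I+O+1));   % largest H for which Nw < Ntrneq
H = 10;                                  % fitnet default hidden layer size
Nw = (I+1)*H + (H+1)*O;                  % number of unknown weights for that H
fprintf('Hub = %d, Nw = %d, Ntrneq = %d\n', Hub, Nw, Ntrneq)
```

With H = 10 there are 136 weights but only 30 equations, which is why the answer relies on trainbr to mitigate overtraining.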


More Answers (1)

Eric Lin 2015-6-19
1 Comment
Rong Yu 2015-6-19
Thank you, Eric! You are right. The link you shared answered part of my question. It looks like the Neural Network Toolbox fills the missing values in the input data with values calculated by a specific function (such as averages). Do you know whether it deals with missing values in the target data in the same way? Thank you!
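To make the idea of filling with averages concrete, here is a minimal base-MATLAB sketch that replaces each NaN in an input matrix with the mean of the known values in the same row. It is an illustration under that assumption, not the toolbox's own code, and the names x, known, rowMean, and filled are hypothetical.

```matlab
x = [1 NaN 3; 4 5 6];                    % small stand-in input with one NaN
known = ~isnan(x);                       % logical mask of known entries

xz = x;
xz(~known) = 0;                          % zero out NaNs so they do not affect the sums
rowMean = sum(xz, 2) ./ sum(known, 2);   % per-row mean of the known values only

filled = x;
[r, ~] = find(~known);                   % row index of each NaN, in column-major order
filled(~known) = rowMean(r);             % replace each NaN with its row mean
```

A row with no known values keeps its NaNs, because its mean is 0/0.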
