What is the best size of input data for a neural network?
I am doing a speech recognition project. After performing MFCC analysis I get a huge 4903x1 matrix (for a single sample), so I decided to downsample it by a factor of 19, which yields a 91x1 matrix. I am then using 91 neurons in the hidden layer (nprtool); I tried to keep the input dimension below 100.
Now my queries are:
1) Is this affecting my network's performance for speech recognition? My network is not giving good results for untrained (test) speech data.
2) How many neurons should be used in the hidden layer, relative to the input dimension?
3) What is the difference between a weight layer and a node layer?
Accepted Answer
Greg Heath
2012-9-8
% I am doing a speech recognition project. After performing MFCC analysis I get a huge 4903x1 matrix (for a single sample).
Statistical terminology: a sample is a selected GROUP of data (typically not a single data point).
Geometrical fact: the minimum number of points needed to define an I-dimensional input space is min(N) = I+1. Typically you want N >> I+1.
How many measurements do you have?
[ I N ] = size(input)    % I = No. of input variables, N = No. of cases
[ O N ] = size(target)   % columns of eye(O); O = No. of speech categories
Neq = N*O                % No. of training equations
What is the rank and condition number of your input matrix?
rankinput = rank(input)
condinput = cond(input)
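For example, a hypothetical check with made-up sizes (not the poster's data), where one variable duplicates another:
% Illustrative only: a 91-variable input with 500 cases and one redundant row
input = rand(91,500);
input(91,:) = 2*input(1,:);     % row 91 duplicates row 1 (scaled)
rankinput = rank(input)         % 90 < 91 => at least one row can be removed
condinput = cond(input)         % very large => near (or exact) linear dependence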
% So I decided to downsample it by a factor of 19, which yields a 91x1 matrix.
Better to obtain the 19 from MFCC ranking than just 'downsampling'. You may be choosing the worst 19 variables.
% I am using 91 neurons in the hidden layer (nprtool). I tried to keep the input matrix dimension less than 100.
H = 91 is probably very excessive. See below.
% Now my queries are: 1) Is this affecting my network performance for speech recognition? My network is not giving good results for untrained or testing speech data.
Most likely.
% 2) How many neurons should be used in the hidden layer, relative to the input dimension?
It depends on the training algorithm. If you are not using validation-set stopping or regularization via trainbr, it is wise to have many more training equations than unknown weights to estimate. For an I-H-O node topology:
Nw = (I+1)*H + (H+1)*O = O + (I+O+1)*H   % No. of estimated weights
Neq >= Nw   % Required ( H <= (Neq-O)/(I+O+1) )
Neq >> Nw   % Desired  ( H << (Neq-O)/(I+O+1) )
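For example, a sketch of the arithmetic with assumed sizes (the numbers below are made up):
% Assumed sizes, for illustration only
I    = 91;   O = 10;   Ntrn = 500;      % inputs, speech categories, training cases
Neq  = Ntrn*O                           % No. of training equations (5000)
Hub  = floor((Neq-O)/(I+O+1))           % upper bound on H from Neq >= Nw (48)
H    = 10;                              % a much smaller trial value, H << Hub
Nw   = (I+1)*H + (H+1)*O                % 1030 weights; Neq >> Nw is satisfied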
% 3) What is the difference between a weight layer and a node layer?
There are 3 node layers: input, hidden and output
There are 2 weight layers: input-to-hidden and hidden-to-output
Any mention of layers in the MATLAB documentation refers to weight layers.
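A quick illustrative check (the one-hidden-layer net below is just an example):
net = patternnet(10);     % one hidden node layer of 10 neurons
net.numLayers             % 2 -> MATLAB counts the two WEIGHT layers
net.IW                    % input-to-hidden weight layer
net.LW                    % hidden-to-output weight layer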
Hope this helps.
Greg
3 Comments
Greg Heath
2012-9-9
The default configuration is represented by the matrix equation
y = b2 + LW*tansig( b1 + IW*x)
Substituting training input x=xtrn and training target y = ttrn, the matrix equation can be decomposed into Ntrn equations for each of the O outputs.
If Ntrn > I, rank(xtrn) <= I indicates the practical dimension of the input space. Then removing redundant or irrelevant rows in x to reduce I decreases the number of unknown weights and, generally, leads to an improved design.
rank(xtrn) indicates the "true" dimensionality of the input data. Therefore it is an indication of how many input-variable rows can be deleted.
Similar reductions can be made to the output dimensionality if rank(ttrn) < O.
Downsampling means reducing Ntrn. You want to decrease I; I call that input variable reduction.
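A minimal sketch of that matrix equation (assuming a fitnet-style network with the default mapminmax processing removed, so the raw weights reproduce the network output exactly):
% Illustrative only: verify y = b2 + LW*tansig(b1 + IW*x) on a toy network
x = rand(3,50);  t = rand(1,50);        % I = 3, O = 1, N = 50 (made-up data)
net = fitnet(5);                        % H = 5: tansig hidden, purelin output
net.inputs{1}.processFcns  = {};        % drop input normalization
net.outputs{2}.processFcns = {};        % drop output normalization
net = train(net,x,t);
IW = net.IW{1,1};  LW = net.LW{2,1};    % the two weight layers
b1 = net.b{1};     b2 = net.b{2};
N  = size(x,2);
y  = LW*tansig(IW*x + b1*ones(1,N)) + b2*ones(1,N);
max(abs(y - net(x)))                    % ~1e-15: same output as the network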
More Answers (1)
Jigar Gada
2012-9-8
What kind of network are you using? If you are using a back-propagation network, then:
1. Back propagation gives good results only if it is trained with proper training data.
2. If the activation function can vary with the function being approximated, an n-input, m-output function requires at most 2n+1 hidden units (see the sketch after this list). If more hidden layers are present, the calculation of the deltas is repeated for each additional hidden layer, summing the deltas of the units in the previous layer that feed into the layer currently being calculated.
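As a rough illustration of that 2n+1 rule of thumb (the sizes below are assumed, and the rule is only a starting point, not a guarantee):
% Sketch with assumed sizes: start from the 2n+1 rule of thumb, then prune
I = 19;                    % assumed number of input features
H = 2*I + 1;               % rule-of-thumb ceiling on hidden units (= 39)
net = patternnet(H);       % one hidden layer with H neurons
% train, then compare against smaller H chosen by validation/test error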
3 Comments
Greg Heath
2016-1-22
Edited: Greg Heath
2016-1-22
I can only guess what you mean. Next time PLEASE put more beef in your question.
Proper training data has to at least span the input space of all of the data (trn+val+tst). Therefore, in addition to
Ntrneq >> Nw
you would like to have
rank(input) = I
If it is less, you can reduce the number of inputs.
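One way to do that reduction (a sketch, assuming you simply want a linearly independent, well-conditioned subset of the input rows):
% Illustrative sketch: keep a linearly independent subset of input rows
r = rank(input);            % "true" dimensionality of the inputs
[~,~,E] = qr(input',0);     % column-pivoted QR of the transposed data
keep  = E(1:r);             % indices of r well-conditioned variables
input = input(keep,:);      % reduced input matrix, now full row rank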
Hope this helps.
Greg