Does a neural network work better with a small dataset than with a larger one?

Hi, I created a neural network using nprtool. At first I used a 9*981 input matrix, but the accuracy in the confusion matrix was only 65%. Then I reduced the number of samples, used a 9*102 input matrix, and got an accuracy of 94.1%. Is this possible and correct? I would also like to know the reason for it.
Thanks

Accepted Answer

Jeong_evolution 2017-6-7
Edited: Jeong_evolution 2017-6-7
If the input parameters in the 9*102 historical dataset are highly correlated with (important to) the target, it is possible. I think that in the 9*981 historical dataset the number of samples increased, but the correlation with (or importance to) the target seems to have decreased.
  3 Comments
Jeong_evolution 2017-6-7
Edited: Jeong_evolution 2017-6-7
Input parameter = Input
target = output
historical dataset = Input+Output(=all dataset)
If you tell me the characteristics of the dataset, I will share what I know.
afef 2017-6-10
I have some statistical features extracted from EEG signals to detect epileptic seizures, and this is a part of the input and target that I used.

More Answers (2)

Jeong_evolution 2017-6-7
In addition, you should select the input parameters that are most related to the target before using the NN.
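One simple way to screen inputs, sketched here in Python under stated assumptions, is to rank each feature by the absolute Pearson correlation with the target. The helper names `pearson` and `rank_features` and the toy data are made up for illustration; they are not from the thread. Pearson correlation only measures linear association, so a low score does not prove a feature is useless to a neural network.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence has no linear relationship to report
    return cov / math.sqrt(var_x * var_y)

def rank_features(inputs, target):
    """inputs: one row per feature (features x samples, like a 9*N matrix).
    Returns (feature_index, |r|) pairs, most correlated first."""
    scores = [(i, abs(pearson(row, target))) for i, row in enumerate(inputs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Made-up toy data: 3 features x 6 samples, binary target (1 = seizure)
inputs = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],  # rises with the target
    [0.5, 0.4, 0.6, 0.5, 0.4, 0.6],  # unrelated to the target
    [0.9, 0.7, 0.8, 0.3, 0.1, 0.2],  # falls as the target rises
]
target = [0, 0, 0, 1, 1, 1]

for index, score in rank_features(inputs, target):
    print(f"feature {index}: |r| = {score:.3f}")
```

Running the same ranking on both the 9*981 and the 9*102 data would show whether the inputs are in fact less correlated with the target in the larger set.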

Greg Heath 2017-6-10
With respect to the original question:
You really cannot deduce anything worthwhile about performance on the N = 981 dataset by using one subset of n = 102. Also, it is not clear if the 102 are all training data or are divided into trn/val/tst subsets.
A more rigorous approach would be to use m-fold cross validation which uses data RANDOMLY divided into m subsets of size M ~= 981/m. This can be repeated as many times as you want because all of the data is randomly distributed. In particular you can optimize m and separate the 3 trn/val/tst performances.
Note that this is different from traditional stratified m-fold crossval where each point is only in one of the m subsets. However, it is MUCH easier to implement and can be repeated as many times as needed to reduce prediction uncertainties.
Hope this helps.
Thank you for formally accepting my answer
Greg
  1 Comment
afef 2017-6-11
At first I used a dataset with N = 981, and because I did not get good accuracy I tried a smaller dataset with N = 102 to see if the performance was better. Concerning the m-fold cross validation, how could I do it, please?
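A minimal sketch of the repeated random division described in Greg Heath's answer, written in Python with only the standard library. It is one possible reading of that answer (often called repeated random subsampling, or Monte Carlo cross validation), not his exact procedure. The names `random_split`, `repeated_random_validation`, and `evaluate` are hypothetical, the data is synthetic, and the threshold classifier is only a placeholder for a real network.

```python
import random
import statistics

def random_split(n_samples, m, rng):
    """Randomly pick test and validation subsets of about n_samples/m
    indices each; the remaining indices form the training subset."""
    if m < 3:
        raise ValueError("m must be at least 3 so the training set is not empty")
    indices = list(range(n_samples))
    rng.shuffle(indices)
    size = round(n_samples / m)
    test_idx = indices[:size]
    val_idx = indices[size:2 * size]
    train_idx = indices[2 * size:]
    return train_idx, val_idx, test_idx

def repeated_random_validation(n_samples, m, n_trials, evaluate, seed=0):
    """Repeat the random split n_trials times (n_trials >= 2) and return
    the mean and standard deviation of the test accuracies."""
    rng = random.Random(seed)
    scores = [evaluate(*random_split(n_samples, m, rng)) for _ in range(n_trials)]
    return statistics.mean(scores), statistics.stdev(scores)

# Synthetic stand-in data (NOT the EEG features from the thread):
# one feature, two classes whose means differ by 2 standard deviations.
data_rng = random.Random(1)
N = 981
labels = [data_rng.randint(0, 1) for _ in range(N)]
features = [data_rng.gauss(2.0 * y, 1.0) for y in labels]

def evaluate(train_idx, val_idx, test_idx):
    # Placeholder model: threshold halfway between the two class means,
    # estimated on the training subset only. A real network would also
    # use val_idx, for example for early stopping.
    mean0 = statistics.mean(features[i] for i in train_idx if labels[i] == 0)
    mean1 = statistics.mean(features[i] for i in train_idx if labels[i] == 1)
    threshold = (mean0 + mean1) / 2
    correct = sum((features[i] > threshold) == (labels[i] == 1) for i in test_idx)
    return correct / len(test_idx)

mean_acc, std_acc = repeated_random_validation(N, m=10, n_trials=30, evaluate=evaluate)
print(f"test accuracy over 30 random splits: {mean_acc:.3f} +/- {std_acc:.3f}")
```

The spread across trials is the useful part: it shows how much a single accuracy figure from one subset can move by chance, which a single run on 102 samples cannot reveal.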
