Does a neural network work better with a small dataset than with a larger one?

Question

afef 2017-6-7

Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/343715-neural-network-work-better-with-small-dataset-than-largest-one

Commented: afef 2017-6-11

Hi, I created a neural network using nprtool. At the beginning I used a 9*981 input matrix, but the accuracy in the confusion matrix was only 65%. Then I reduced the number of samples and used a 9*102 input matrix, and I got an accuracy of 94.1%. Is this possible and correct? I would also like to know the reason for it.

Thanks


Answer 1

Jeong_evolution 2017-6-7

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/343715-neural-network-work-better-with-small-dataset-than-largest-one#answer_269878

Edited: Jeong_evolution 2017-6-7

If the input parameters in the smaller historical dataset (9*102) are highly correlated with (important to) the target, this is possible. The larger historical dataset (9*981) has more samples, but its inputs seem to be less correlated with, or less important to, the target.

3 comments

Jeong_evolution 2017-6-7

Edited: Jeong_evolution 2017-6-7

Input parameter = input

target = output

historical dataset = input + output (= the whole dataset)

If you tell me the characteristics of the dataset, I will share what I know.

afef 2017-6-10

I have some statistical features extracted from EEG signals to detect epileptic seizures, and this is part of the input and target that I used.


Answer 2

Jeong_evolution 2017-6-7

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/343715-neural-network-work-better-with-small-dataset-than-largest-one#answer_269880

In addition, you have to select the input parameters that are most related to the target before using the NN.
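One simple way to check how related each input is to the target is to rank the inputs by the absolute value of their correlation with the class label. The following is only a minimal sketch in plain Python, not something from this thread: the helper names `pearson` and `rank_features` are made up for illustration, the data are synthetic (not the EEG features), and a linear, one-feature-at-a-time correlation can miss inputs that only matter in combination.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_features(X, t):
    """X: list of feature rows (features x samples, like the 9*981 layout).
    t: 0/1 class labels, one per sample.
    Returns (feature_index, correlation) pairs, strongest |correlation| first."""
    scores = [(i, pearson(row, t)) for i, row in enumerate(X)]
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)

# Synthetic stand-in data (an assumption for illustration only).
rng = random.Random(0)
t = [rng.randint(0, 1) for _ in range(200)]
X = [
    [rng.gauss(0, 1) for _ in t],            # feature 0: pure noise
    [2.0 * c + rng.gauss(0, 1) for c in t],  # feature 1: shifted by the class
    [rng.gauss(0, 1) for _ in t],            # feature 2: pure noise
]

ranking = rank_features(X, t)
for idx, r in ranking:
    print(f"feature {idx}: r = {r:+.3f}")
```

Inputs that end up near the bottom of such a ranking are candidates for removal, but it is worth confirming that by retraining the network with and without them.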


Answer 3

Greg Heath 2017-6-10

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/343715-neural-network-work-better-with-small-dataset-than-largest-one#answer_270244

With respect to the original question:

You really cannot deduce anything worthwhile about performance on the N = 981 dataset by using one subset of n = 102. Also, it is not clear if the 102 are all training data or are divided into trn/val/tst subsets.
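To see why a single small subset says so little, it can help to look at how uncertain an accuracy figure is when it is measured on few samples. The sketch below uses the normal approximation to the binomial, which is rough for small n or accuracies near 100%, and it assumes the samples are independent and were not used for training. The 15% test share used for n = 15 and n = 147 is a hypothetical split, since the thread does not say how the data were divided.

```python
import math

def accuracy_interval(acc, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n independent samples.
    Uses the normal approximation to the binomial, clipped to [0, 1]."""
    se = math.sqrt(acc * (1.0 - acc) / n)
    return max(0.0, acc - z * se), min(1.0, acc + z * se)

# (accuracy, number of samples it was measured on)
# 15 and 147 assume a hypothetical 15% test share of 102 and 981 samples.
cases = [(0.941, 102), (0.941, 15), (0.65, 981), (0.65, 147)]
for acc, n in cases:
    low, high = accuracy_interval(acc, n)
    print(f"acc={acc:.3f}, n={n:4d} -> roughly [{low:.3f}, {high:.3f}]")
```

The interval widens as n shrinks, so a high accuracy on a small subset is weaker evidence than the same figure on the full dataset.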

A more rigorous approach would be to use m-fold cross validation which uses data RANDOMLY divided into m subsets of size M ~= 981/m. This can be repeated as many times as you want because all of the data is randomly distributed. In particular you can optimize m and separate the 3 trn/val/tst performances.

Note that this is different from traditional stratified m-fold crossval where each point is only in one of the m subsets. However, it is MUCH easier to implement and can be repeated as many times as needed to reduce prediction uncertainties.
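One reading of the procedure described above is repeated random splitting (sometimes called Monte Carlo cross-validation): shuffle the data, cut it into trn/val/tst parts, train, score each part, and repeat. The sketch below is in plain Python and rests on several assumptions: a nearest-centroid classifier stands in for the neural network, the 70/15/15 ratios and the function name `repeated_random_splits` are illustrative choices, the data are synthetic, and samples are stored as rows (the transpose of the 9*N layout in the question).

```python
import random
import statistics

def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT a neural network): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, c in zip(X, y) if c == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def accuracy(centroids, X, y):
    hits = sum(nearest_centroid_predict(centroids, x) == c for x, c in zip(X, y))
    return hits / len(y)

def repeated_random_splits(X, y, repeats=20, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle, split into trn/val/tst, train on trn, score all three; repeat.
    Returns a dict of per-repeat accuracies for each part."""
    rng = random.Random(seed)
    n = len(y)
    n_trn = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    results = {"trn": [], "val": [], "tst": []}
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        parts = {
            "trn": idx[:n_trn],
            "val": idx[n_trn:n_trn + n_val],
            "tst": idx[n_trn + n_val:],
        }
        model = nearest_centroid_fit([X[i] for i in parts["trn"]],
                                     [y[i] for i in parts["trn"]])
        # With a neural network the val part would typically drive early stopping;
        # this stand-in has no such step, so val is only scored.
        for name, ids in parts.items():
            results[name].append(
                accuracy(model, [X[i] for i in ids], [y[i] for i in ids]))
    return results

# Synthetic two-class data with 9 features (an assumption, not the EEG data).
rng = random.Random(1)
y = [rng.randint(0, 1) for _ in range(300)]
X = [[c + rng.gauss(0, 1) for _ in range(9)] for c in y]

results = repeated_random_splits(X, y)
for name, accs in results.items():
    print(f"{name}: mean={statistics.mean(accs):.3f}, std={statistics.stdev(accs):.3f}")
```

The spread of the tst accuracies across repeats gives a feel for how much a single split, such as one subset of 102 samples, can mislead.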

Hope this helps.

Thank you for formally accepting my answer

Greg

1 comment

afef 2017-6-11

At first I used a dataset with N = 981, and because I did not get good accuracy I tried a smaller dataset with N = 102 to see whether the performance would be better. Concerning the m-fold cross validation, how could I do it, please?
