How to select the number of samples to train a Machine Learning algorithm?
I am working with a dataset of 12,000 samples covering about 5 years of an industrial process.
It is likely that during this time the plant has undergone changes (equipment replacements, performance degradation, different chemical products).
Is there a tool for identifying the best subset of this data? In my view, a temporal cut in the data could improve the quality of the models created.
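One way to test that idea directly is to hold out the most recent part of the data as a test set and train a model on each candidate temporal cut. A minimal MATLAB sketch, assuming placeholder predictors X (variables x samples, ordered in time) and targets T (1 x samples), neither of which comes from the original post:

% X: predictors (variables x samples), T: targets (1 x samples) -- placeholders.
numSamples    = size(X, 2);
testIdx       = round(0.8*numSamples)+1 : numSamples;      % hold out the most recent 20%
cutCandidates = [1, round(0.25*numSamples), round(0.5*numSamples)];  % candidate start points

rmse = zeros(size(cutCandidates));
for k = 1:numel(cutCandidates)
    trainIdx = cutCandidates(k) : testIdx(1)-1;             % train only on data after the cut
    net  = fitnet(10);
    net.trainParam.showWindow = false;
    net  = train(net, X(:,trainIdx), T(trainIdx));
    pred = net(X(:,testIdx));
    rmse(k) = sqrt(mean((pred - T(testIdx)).^2));            % error on the held-out recent data
end
[~, best] = min(rmse);
fprintf('Best cut starts at sample %d\n', cutCandidates(best));

The cut that gives the lowest error on the held-out recent data is the one whose training window best matches the plant's current behaviour.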
3 Comments
Greg Heath
2019-2-4
As a common-sense rule of thumb, I try to use at least 10 to 30 times as many training points as there are unknown parameters to be estimated.
In addition, I use 10 to 20 sets of random initial weights (sketched below).
I assume, of course, that you have examined plots of the data to initialize your common sense.
Hope this helps
Greg
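As a rough illustration of those two rules of thumb, the sketch below (not from the original comment) counts the unknown weights of a small fitnet and retrains it with several random initializations, keeping the best run. X and T are placeholders for your own predictors and targets:

% X: predictors (variables x samples), T: targets (1 x samples) -- placeholders.
hiddenSize = 10;
net0 = configure(fitnet(hiddenSize), X, T);
numParams  = net0.numWeightElements;          % total number of weights and biases
fprintf('Aim for roughly %d to %d training samples\n', 10*numParams, 30*numParams);

% Train with several sets of random initial weights and keep the best run
bestPerf = Inf;
for trial = 1:10
    net = init(net0);                         % fresh random initial weights
    net.trainParam.showWindow = false;
    [net, tr] = train(net, X, T);
    if tr.best_vperf < bestPerf               % compare best validation performance
        bestPerf = tr.best_vperf;
        bestNet  = net;
    end
end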
Answers (1)
BERGHOUT Tarek
2019-2-3
You can use deep belief networks; they work well for feature selection and mapping. Train your network on data-driven chunks by randomly choosing pairs of (inputs, targets), and at the same time pay attention to your approximation function: you must keep your error function at its local minimum. Deep belief nets rely on a set of stacked autoencoders, which allows all of the network's parameters to be tuned with a small amount of training data.
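A minimal sketch of the related stacked-autoencoder workflow in the MATLAB Deep Learning Toolbox (not necessarily the exact setup the answer has in mind): pre-train an autoencoder on the inputs, encode the data into the learned feature space, then fit a small supervised network on those features. X and T are placeholders, and the layer sizes are arbitrary:

% X: predictors (variables x samples), T: targets (1 x samples) -- placeholders.
hiddenSize = 20;

% Unsupervised pre-training: learn a compressed representation of the inputs
autoenc = trainAutoencoder(X, hiddenSize, ...
    'MaxEpochs', 200, ...
    'L2WeightRegularization', 0.001, ...
    'SparsityRegularization', 4, ...
    'SparsityProportion', 0.1);

% Encode the raw inputs into the learned feature space
features = encode(autoenc, X);

% Supervised fine-tuning: fit a small regression network on the encoded features
net = fitnet(10);
net = train(net, features, T);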
0 Comments