- Split data into train and test sets.
- K-Fold cross-validate on the training data to estimate generalization error.
- Select the model with the lowest estimated generalization error.
- Retrain it from scratch on the entire training data.
- Test it on the test data.
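The steps above can be sketched roughly as follows. This is only a minimal illustration on synthetic data, not the poster's actual setup: the data `X` and `y`, the candidate list `kernels`, the 5 folds, and the 20% hold-out fraction are all assumptions chosen for the example.

```matlab
% Sketch of the five steps on synthetic data (X and y are made up for illustration).
rng(0);                                    % reproducible split
X = randn(200, 5);
y = X(:,1) - 2*X(:,2) + 0.1*randn(200, 1);

% 1. Hold out 20% of the rows as the test set.
holdout = cvpartition(size(X,1), 'HoldOut', 0.2);
Xtrain = X(training(holdout), :);  ytrain = y(training(holdout));
Xtest  = X(test(holdout), :);      ytest  = y(test(holdout));

% 2. 5-fold cross-validate each candidate on the training data only.
kernels = {'linear', 'gaussian'};          % hypothetical candidate models
cvMSE = zeros(numel(kernels), 1);
for k = 1:numel(kernels)
    cvMdl = fitrsvm(Xtrain, ytrain, 'KernelFunction', kernels{k}, ...
                    'Standardize', true, 'KFold', 5);
    cvMSE(k) = kfoldLoss(cvMdl);           % mean squared error over the folds
end

% 3. Select the candidate with the lowest cross-validated error.
[~, best] = min(cvMSE);

% 4. Retrain it from scratch on all of the training data.
finalMdl = fitrsvm(Xtrain, ytrain, 'KernelFunction', kernels{best}, ...
                   'Standardize', true);

% 5. Evaluate once on the untouched test data.
testMSE = loss(finalMdl, Xtest, ytest);
```

Keeping the test rows out of steps 2 to 4 is what makes `testMSE` a fair estimate; the cross-validated errors in `cvMSE` are only used to pick between candidates.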
How to select samples to leave out when building my regression model and automate it
I am using fitrsvm (support vector regression) on my data matrix (I have attached an Excel example of it).
I have about 22 blocks. The block labels are in the first column, the response values are in column 3, and the intensities for each sample run from column 4 to the end.
I use 80% of the data to train and 20% to test.
Each time I build/train a model, I leave out all the samples from one or more blocks, and I repeat this until I end up with the best model.
Each time I leave samples out, I also need to leave out the matching rows of the response column.
I have been doing this manually, but it is not a trivial process.
I would like code to automate this, bearing in mind that fitrsvm does not accept non-numeric values, such as the block column, in the training data matrix.
Can you help me with code that automates leaving out one or more blocks each time I build a model, please?
Thank you.
Accepted Answer
Nayan
2023-3-8
As described above, I would suggest performing k-fold cross-validation. It splits the data into K folds, trains the model on K-1 of them, and tests the trained model on the remaining fold to estimate the generalization error, repeating until every fold has been held out once. The steps to perform are the five listed above.
These steps can help you to automate the process.
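For the block layout described in the question, one possible way to automate the leave-out step is a leave-one-block-out loop, a grouped variant of cross-validation rather than plain K-fold. The sketch below is only an illustration on synthetic data that mimics the described layout (block labels, a response, and intensity columns). The variable names, the block names `"B1"`, `"B2"`, and so on, and the file name in the comment are assumptions, not taken from the attached spreadsheet.

```matlab
% Synthetic stand-in for the spreadsheet. With real data, something like
% T = readtable('data.xlsx') would replace this ('data.xlsx' is a hypothetical name).
rng(1);
nBlocks = 22;  nPerBlock = 10;  nFeat = 6;
blockLabels = repelem("B" + (1:nBlocks)', nPerBlock);  % column 1: non-numeric block names
X = randn(nBlocks*nPerBlock, nFeat);                   % columns 4..end: intensities
y = X(:,1) + 0.5*X(:,2) + 0.1*randn(size(X,1), 1);     % column 3: response

% Convert block labels to integer group ids; the labels never enter fitrsvm.
g = findgroups(blockLabels);
nGroups = max(g);

blockRMSE = zeros(nGroups, 1);
for b = 1:nGroups
    leftOut = (g == b);               % logical row mask for the held-out block
    % The same mask indexes X and y, so predictors and responses stay matched.
    mdl  = fitrsvm(X(~leftOut, :), y(~leftOut), 'Standardize', true);
    pred = predict(mdl, X(leftOut, :));
    blockRMSE(b) = sqrt(mean((pred - y(leftOut)).^2));
end

fprintf('Mean leave-one-block-out RMSE: %.3f\n', mean(blockRMSE));
```

To leave out several blocks at once, the mask could instead be built with `ismember(g, [2 5 9])` for whichever group ids are chosen. Holding out 4 or 5 of the 22 blocks at a time comes close to the 80/20 split mentioned in the question.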
You can go through the following link to learn more about cross-validation and to make coding your model easier.