Does k-fold cross validation in the Classification Learner app stratify the data by default?
I would like to know whether the data are stratified by default when using k-fold cross-validation in the Classification Learner app.
If one selects the "Generate Function" option in the app, the resulting script uses the following call for cross-validation:
% Perform cross-validation
partitionedModel = crossval(trainedClassifier.ClassificationEnsemble, 'KFold', 5);
According to the official MathWorks documentation for the function "crossval", an alternative to the argument 'KFold' is 'Stratify', which would explicitly stratify the dataset. Does anybody know how this works internally in the Classification Learner app?
Thanks
Answers (1)
Sameer
2025-6-30
Hi @Juan
Yes, the Classification Learner app in MATLAB uses stratified k-fold cross-validation by default when performing classification tasks.
Here’s how it works:
- When you select cross-validation in the app, it internally calls the "cvpartition" function with the response variable (the class labels) as the grouping variable.
- When "cvpartition" is given a grouping variable, it stratifies automatically, so each fold keeps approximately the same class distribution as the original dataset.
- The same holds for the generated code (e.g., crossval(trainedClassifier.ClassificationEnsemble, 'KFold', 5)): the crossval function uses the model’s response variable behind the scenes, which results in stratified partitioning.
This behavior is specific to classification problems. For regression tasks, stratification is not applied because there are no discrete class labels to stratify by.
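To check the stratification yourself, or to make it explicit rather than relying on the default, a minimal sketch follows. It assumes the Statistics and Machine Learning Toolbox and uses the bundled fisheriris data; the imbalanced subset and the variable names (keep, c, perFold, mdl, cvmdl) are illustrative and are not taken from the app's generated code.

```matlab
% Illustrative data: an imbalanced subset of fisheriris
% (50 setosa, 50 versicolor, 10 virginica).
load fisheriris                      % provides meas (150x4) and species (150x1 cell)
keep = [1:100, 101:110];
X = meas(keep, :);
y = species(keep);

rng(1);                              % fix the seed so the partition is reproducible

% Passing the class labels as the grouping variable gives a stratified partition.
c = cvpartition(y, 'KFold', 5);

% Count the minority class in each test fold; stratification spreads it evenly.
perFold = zeros(1, c.NumTestSets);
for k = 1:c.NumTestSets
    perFold(k) = sum(strcmp(y(test(c, k)), 'virginica'));
end
disp(perFold)

% Make the stratification explicit by handing the partition to crossval.
mdl = fitcensemble(X, y);
cvmdl = crossval(mdl, 'CVPartition', c);
fprintf('5-fold classification error: %.3f\n', kfoldLoss(cvmdl));
```

If the counts in perFold differ by at most one across folds, the partition is stratified. Passing 'CVPartition' instead of 'KFold' keeps the fold assignment under your control and lets you reuse the same folds across models.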
For more details, please refer to the MathWorks documentation:
Hope this helps!