Does k-fold cross validation in the Classification Learner app stratify the data by default?

I would like to know whether k-fold cross-validation in the Classification Learner app stratifies the data by default.
If one selects the "Generate Function" option in the app, the resulting script uses the following call for cross-validation:
%Perform cross-validation
partitionedModel = crossval(trainedClassifier.ClassificationEnsemble, 'KFold', 5);
According to the official MathWorks documentation for the function "crossval", besides the 'KFold' argument there is also a 'Stratify' argument, which would stratify the dataset explicitly. Does anybody know how the Classification Learner app handles this internally?
Thanks

Answers (1)

Sameer 2025-6-30
Yes, the Classification Learner app in MATLAB uses stratified k-fold cross-validation by default when performing classification tasks.
Here’s how it works:
  • When you select cross-validation in the app, it internally uses the "cvpartition" function with the response variable (class labels) as the grouping variable.
  • When "cvpartition" is given a grouping variable, it stratifies automatically, so each fold has approximately the same class proportions as the original dataset.
  • Even in the generated code (e.g., crossval(trainedClassifier.ClassificationEnsemble, 'KFold', 5)), the crossval function uses the model’s response variable behind the scenes, which results in stratified partitioning.
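The "cvpartition" behavior described above can be checked with a small sketch. It assumes the fisheriris sample dataset that ships with Statistics and Machine Learning Toolbox (50 observations per class) and a release that supports the 'Stratify' option of cvpartition. The variable names are illustrative only and are not taken from the app's generated code.

```matlab
% Sketch: compare per-fold class counts for a partition built from class
% labels (stratified) and one built from the observation count only.
load fisheriris                      % species: 150x1 cell array of class labels
labels = categorical(species);
rng(0);                              % reproducible folds

cStrat = cvpartition(labels, 'KFold', 5);          % grouping variable given
cPlain = cvpartition(numel(labels), 'KFold', 5);   % no grouping variable

numClasses  = numel(categories(labels));
countsStrat = zeros(cStrat.NumTestSets, numClasses);
countsPlain = zeros(cPlain.NumTestSets, numClasses);
for k = 1:cStrat.NumTestSets
    % countcats returns one count per category, including empty ones
    countsStrat(k, :) = countcats(labels(test(cStrat, k)))';
    countsPlain(k, :) = countcats(labels(test(cPlain, k)))';
end

disp('Test-fold class counts, partition built from labels:');
disp(countsStrat);
disp('Test-fold class counts, partition built from n only:');
disp(countsPlain);
```

Each row is one test fold and each column one class. The rows of the first matrix should be nearly identical, while the second matrix typically shows more variation from fold to fold.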
This behavior is specific to classification problems. For regression tasks, stratification is not applied because there are no discrete class labels to stratify by.
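To verify this on a cross-validated model rather than take it on trust, one option is to inspect the Partition property of the object returned by crossval. The sketch below is a minimal example under stated assumptions: a bagged ensemble trained on fisheriris stands in for trainedClassifier.ClassificationEnsemble, and the last lines show how an explicitly stratified partition could be passed through the 'CVPartition' argument if you prefer not to rely on the default.

```matlab
% Sketch: inspect the folds stored in a cross-validated classifier.
load fisheriris                      % meas (150x4), species (150x1 cell)
rng(0);                              % reproducible folds

% Stand-in for trainedClassifier.ClassificationEnsemble (illustration only).
ens = fitcensemble(meas, species, 'Method', 'Bag', 'NumLearningCycles', 20);

partitionedModel = crossval(ens, 'KFold', 5);
c = partitionedModel.Partition;      % cvpartition object that defines the folds

labels = categorical(species);
foldCounts = zeros(c.NumTestSets, numel(categories(labels)));
for k = 1:c.NumTestSets
    foldCounts(k, :) = countcats(labels(test(c, k)))';
end
disp(array2table(foldCounts, 'VariableNames', categories(labels)'));

% Make the stratification explicit by supplying your own partition.
cExplicit = cvpartition(species, 'KFold', 5, 'Stratify', true);
partitionedExplicit = crossval(ens, 'CVPartition', cExplicit);
fprintf('5-fold misclassification rate: %.3f\n', kfoldLoss(partitionedExplicit));
```

If the per-fold counts in the table are roughly equal across rows, the default partition was stratified by class.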
For more details, please refer to the MathWorks documentation:
Hope this helps!
