Automated hyperparameter optimization with costmatrix

Hello everyone,
I am a student and new to the subject of machine learning. I need help with the following binary classification problem:
(training data = 50,000, test data = 480,000)
I am using an Ensemble Boosted Trees model with a cost matrix to avoid critical false negative errors as much as possible.
costMatrix = [0 1;100 0];
function Mdl = trainBoostedTrees(trainData, trainLabels, costMatrix)
% Train an Ensemble Boosted Trees model with a misclassification cost matrix
template = templateTree('MaxNumSplits', 20, 'NumVariablesToSample', 'all');
Mdl = fitcensemble(trainData, trainLabels, 'Method', 'AdaBoostM1', ...
    'NumLearningCycles', 30, 'Learners', template, 'LearnRate', 0.1, ...
    'Cost', costMatrix);
end
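For context on the cost matrix above: MATLAB interprets Cost(i,j) as the cost of predicting class j when the true class is i, with rows and columns ordered as in the model's ClassNames. A minimal sketch of how the reported numbers could be computed on the test set, assuming the critical class is the second entry of ClassNames and that testData/testLabels are the test-set variables:

```matlab
% Sketch: evaluate accuracy and critical false negatives on the test set.
% Assumes the critical class is Mdl.ClassNames(2), matching
% costMatrix = [0 1; 100 0], where Cost(2,1) = 100 penalizes predicting
% class 1 for a true class-2 observation (a false negative).
predLabels = predict(Mdl, testData);
posClass = Mdl.ClassNames(2);
accuracy = mean(predLabels == testLabels);
critical_FN_error = sum(testLabels == posClass & predLabels ~= posClass);
```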
When I test the model with the test data, I get the following overall accuracy and number of critical false negative errors:
accuracy = 0.72
critical_FN_error = 71
Now I want to improve both the accuracy and the number of critical errors by running an automated hyperparameter optimization for this model.
function Mdl = Hp_OP_BoostedTrees(trainData, trainLabels, costMatrix)
% Train an Ensemble Boosted Trees model with automated hyperparameter optimization
template = templateTree('NumVariablesToSample', 'all');
Optimize = {'NumLearningCycles', 'MinLeafSize', 'MaxNumSplits'};
options = struct('Optimizer', 'bayesopt', 'MaxObjectiveEvaluations', 25, ...
    'UseParallel', true, 'ShowPlots', true);
Mdl = fitcensemble(trainData, trainLabels, 'Method', 'AdaBoostM1', ...
    'Learners', template, 'Cost', costMatrix, ...
    'OptimizeHyperparameters', Optimize, ...
    'HyperparameterOptimizationOptions', options);
end
However, when I do this with the code above, it seems that my stored cost matrix is no longer taken into account. With the optimized parameters, I get the following results:
accuracy = 0.85
critical_FN_error = 1547
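One way to sanity-check the inference that the cost matrix is being ignored is to inspect the Cost property of the fitted model. This is only a sketch of a diagnostic, not a guaranteed explanation; note that fitcensemble may normalize the cost matrix internally, so it is safer to compare the ratio of the off-diagonal entries than the raw values:

```matlab
% Sketch: inspect the cost matrix stored in the optimized model.
% fitcensemble can normalize Cost internally, so compare the ratio of
% off-diagonal entries rather than the raw values.
Mdl = Hp_OP_BoostedTrees(trainData, trainLabels, costMatrix);
disp(Mdl.Cost)  % a symmetric [0 1; 1 0] here would suggest the cost was dropped
```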
Can someone please explain to me why I get worse results for critical_FN_error?
How can I improve my model?
1 Comment
the cyclist on 7 Aug 2023
I can't explain what you are seeing, so I won't post an "answer". But I can make a couple suggestions about how I would proceed.
First, you mention that you got a much higher value for critical_FN_error, and from that you infer that your stored cost matrix is no longer considered. You could be right about that inference, but I'm not sure.
With 480,000 observations in your test data, 1547 is still a relatively small number of false negatives (depending on how imbalanced your true positive/negative case ratio is). So, the first thing I would try would be to remove that cost matrix from the code, and see what happens. It could be that you'll get vastly more false negatives. That is something of a test of whether the cost matrix is ignored, as you surmise.
(You could also breakpoint the code just before the call to fitcensemble, and step line-by-line through the code, and see if it actually uses Cost. That's pretty tedious, though, because calculations are often deeply nested.)
If it turns out that Cost is being used, and you are getting an unacceptably higher number of false negatives, then I guess you just need to penalize them even more, e.g. by a factor of 1000 instead of 100.
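The penalty suggestion above could be sketched as a small sweep over the false-negative cost, recording the accuracy/FN trade-off for each value. The penalty values, the use of the poster's trainBoostedTrees function, and the testData/testLabels variable names are illustrative assumptions:

```matlab
% Sketch: sweep the false-negative penalty and record the trade-off.
penalties = [10 100 1000 10000];   % illustrative values
results = zeros(numel(penalties), 2);
for k = 1:numel(penalties)
    C = [0 1; penalties(k) 0];
    Mdl = trainBoostedTrees(trainData, trainLabels, C);
    pred = predict(Mdl, testData);
    posClass = Mdl.ClassNames(2);          % assumed critical class
    results(k, 1) = mean(pred == testLabels);                       % accuracy
    results(k, 2) = sum(testLabels == posClass & pred ~= posClass); % critical FN
end
table(penalties(:), results(:,1), results(:,2), ...
    'VariableNames', {'FN_penalty', 'accuracy', 'critical_FN'})
```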


Answers (1)

Udit06 on 5 Sep 2023
Hi Lars,
I understand that you are obtaining more false negative errors after you perform automated hyperparameter optimization. You can try out the following approaches to resolve the issue.
1) Instead of using “AdaBoostM1” as the ensemble method, you can try using “Bag”. The key difference between bagging and boosting is that boosting automatically assigns higher weights to the samples misclassified in previous iterations, whereas bagging assigns equal weights to all samples. This automatic reweighting of misclassified points could be creating the problem in your case. You can refer to the MathWorks documentation on ensemble algorithms to understand more about how boosting works.
2) The size of the training dataset (50,000) is much smaller than the size of the testing dataset (480,000), which is unusual when training machine learning models. Hence, you could try increasing the size of the training data.
3) As suggested in the comments, you can try penalizing the false negative more by increasing the value corresponding to the false negatives in the costMatrix variable.
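Suggestion 1) could be sketched as follows, keeping the same cost matrix. The 'NumLearningCycles' value is an illustrative choice, and note that 'LearnRate' does not apply to bagging:

```matlab
% Sketch: bagged trees with the same misclassification cost matrix.
template = templateTree('NumVariablesToSample', 'all');
MdlBag = fitcensemble(trainData, trainLabels, 'Method', 'Bag', ...
    'NumLearningCycles', 100, 'Learners', template, 'Cost', costMatrix);
```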
I hope this helps.
