CatBoost in MATLAB for a high-dimensional dataset

Dear friend,
Currently, I am trying various approaches to improve the performance of my model on a high-dimensional spectrometry dataset for binary classification. My aim is to improve upon the 0.74 AUC that Python's LightGBM achieves on this dataset. However, I am struggling to get anywhere close to this using the MATLAB packages for variable selection and statistical/ML modelling. Is it possible to provide CatBoost for MATLAB, or a model that would perform better than LightGBM on a high-dimensional dataset (e.g., a spectrometry dataset with 6000 variables)?
Thanks,
s0810110

Answers (1)

Shubham 2024-1-18
Hi Tim,
There isn't a direct implementation of CatBoost for MATLAB. However, there are a few strategies you could consider to potentially improve the performance of your models on high-dimensional data in MATLAB:
Feature Selection/Reduction:
  • Use MATLAB's built-in functions for feature selection, such as sequentialfs (sequential feature selection), relieff (ReliefF algorithm), or fscmrmr (Minimum Redundancy Maximum Relevance). Refer to this documentation link: https://in.mathworks.com/help/stats/sequentialfs.html
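  • Consider dimensionality reduction with PCA (pca function) to reduce the number of variables while retaining most of the variance in the data. t-SNE (tsne function) is better suited to visualizing the data in two or three dimensions, because it does not aim to preserve variance and the tsne function does not return a mapping that can be applied to new observations. Refer to this documentation link: https://in.mathworks.com/help/stats/tsne.html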
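As a rough sketch of the two options above, the snippet below ranks variables with fscmrmr and, separately, fits PCA on the training rows only. The data is a synthetic stand-in rather than the spectrometry dataset, and the sizes, the number of kept variables k, and the 95% variance cutoff are all assumptions chosen for illustration.

```matlab
% Synthetic stand-in for a wide spectrometry matrix (hypothetical data).
rng(0);                                   % reproducible random numbers
n = 200; p = 500;                         % observations x variables
X = randn(n, p);
y = double(X(:,1) + X(:,2) + 0.5*randn(n,1) > 0);   % labels driven by two variables

% Hold out a test set so that selection is fitted on training rows only.
c   = cvpartition(y, 'HoldOut', 0.3);
Xtr = X(training(c), :);  ytr = y(training(c));
Xte = X(test(c), :);

% Option A: rank variables with MRMR and keep the top k (k is arbitrary here).
k    = 20;
idx  = fscmrmr(Xtr, ytr);                 % indices ordered by importance
XtrA = Xtr(:, idx(1:k));
XteA = Xte(:, idx(1:k));

% Option B: PCA fitted on training rows, keeping ~95% of the variance.
[coeff, scoreTr, ~, ~, explained, mu] = pca(Xtr);
m    = find(cumsum(explained) >= 95, 1);  % number of components to keep
XtrB = scoreTr(:, 1:m);
XteB = (Xte - mu) * coeff(:, 1:m);        % project test rows with the training transform
```

Fitting the selection or projection on the training rows and then applying it to the test rows avoids leaking test information into the AUC estimate, which matters when there are far more variables than samples.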
Ensemble Methods:
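One way to try a boosted tree ensemble in MATLAB is fitcensemble from the Statistics and Machine Learning Toolbox. The sketch below uses the LogitBoost method on synthetic data; it is not an equivalent of CatBoost or LightGBM, and the tree depth, number of learning cycles, and learning rate are untuned assumptions.

```matlab
% Synthetic binary classification data (hypothetical stand-in).
rng(0);
n = 300; p = 200;
X = randn(n, p);
y = double(X(:,1) - X(:,2) + 0.5*randn(n,1) > 0);

% Stratified hold-out split.
c   = cvpartition(y, 'HoldOut', 0.3);
Xtr = X(training(c), :);  ytr = y(training(c));
Xte = X(test(c), :);      yte = y(test(c));

% Shallow trees as weak learners; values are illustrative, not tuned.
t   = templateTree('MaxNumSplits', 10);
mdl = fitcensemble(Xtr, ytr, ...
    'Method', 'LogitBoost', ...
    'NumLearningCycles', 200, ...
    'LearnRate', 0.1, ...
    'Learners', t);

% Score the hold-out rows and compute the AUC for class 1.
[~, score] = predict(mdl, Xte);
posCol = find(mdl.ClassNames == 1);       % column of the positive class
[~, ~, ~, auc] = perfcurve(yte, score(:, posCol), 1);
fprintf('Hold-out AUC: %.3f\n', auc);
```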
Hyperparameter Optimization:
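fitcensemble can also search its own hyperparameters through the 'OptimizeHyperparameters' option. The following is a minimal sketch on synthetic data; the choice of parameters to tune and the small evaluation budget are assumptions made to keep the example quick, and a real search would likely need a larger budget.

```matlab
% Small synthetic problem so the search finishes quickly (hypothetical data).
rng(0);
X = randn(150, 50);
y = double(X(:,1) + 0.5*randn(150,1) > 0);

% Small budget and no plots; these settings are illustrative only.
opts = struct('MaxObjectiveEvaluations', 10, ...
              'ShowPlots', false, ...
              'Verbose', 0);

% Bayesian optimization over three boosting parameters,
% scored by cross-validated classification loss.
mdl = fitcensemble(X, y, ...
    'Method', 'LogitBoost', ...
    'OptimizeHyperparameters', {'NumLearningCycles', 'LearnRate', 'MaxNumSplits'}, ...
    'HyperparameterOptimizationOptions', opts);

% One-row table with the best parameter values found.
best = mdl.HyperparameterOptimizationResults.XAtMinObjective;
disp(best);
```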
Advanced Preprocessing:
Deep Learning:
AUC is a good metric for binary classification problems, but you should also consider others such as accuracy, precision, recall, and F1-score for a comprehensive evaluation.
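For reference, these metrics can be computed from true labels and predicted scores with confusionmat and perfcurve. The labels and scores below are made-up values, and the 0.5 decision threshold is an arbitrary assumption.

```matlab
% Hypothetical true labels and predicted scores for the positive class.
yTrue  = [1 1 1 1 0 0 0 0 0 1]';
scores = [0.9 0.8 0.35 0.6 0.4 0.2 0.7 0.1 0.3 0.55]';
yPred  = double(scores >= 0.5);           % 0.5 threshold is an arbitrary choice

% Confusion matrix with rows = true class, columns = predicted class.
cm = confusionmat(yTrue, yPred, 'Order', [0 1]);
tn = cm(1,1); fp = cm(1,2); fn = cm(2,1); tp = cm(2,2);

accuracy  = (tp + tn) / sum(cm(:));
precision = tp / (tp + fp);
recall    = tp / (tp + fn);
f1        = 2 * precision * recall / (precision + recall);

% AUC uses the raw scores, so it does not depend on the threshold.
[~, ~, ~, auc] = perfcurve(yTrue, scores, 1);

fprintf('acc=%.2f prec=%.2f rec=%.2f f1=%.2f auc=%.2f\n', ...
    accuracy, precision, recall, f1, auc);
```

Unlike AUC, the other four metrics change with the chosen threshold, so it helps to report the threshold alongside them.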
