Multi-Cost-SVM

Version 1.0.0 (3.4 MB) by Alberto Carlevaro
Multi cost SVM and probabilistic safety regions for exponential distributions.
23.0 downloads
Updated 2024/5/21

Multi-Cost-SVM (and Probabilistic Safety Regions for exponential distributions)

Multi Cost SVM (MC-SVM) is a variant of Support Vector Machines (SVM) designed to accommodate multiple cost scenarios. By introducing multiple weighting parameters $\tau$, MC-SVM adapts the cost function to balance false positive and false negative errors, which makes the model more robust across scenarios. The result is a separating hyperplane that is independent of the sample probability of the data.

This algorithm was inspired by the concept of Probabilistic Safety Region (PSR), i.e., the region where the event $S$ is observed with high probability; here $S$ is assumed to represent a "safe" situation. As noted in the code, for exponential distributions the PSR takes the form of a radius-controllable set.

Key Features:

Parameterized Cost Function: MC-SVM incorporates a parameter $\tau$ that controls how the cost function penalizes the different types of errors. Training the SVMs with different values of this weighting parameter reduces the effect of class imbalance in the data and helps train a more robust classifier.

System of SVMs: The algorithm constructs a system of $m$ SVMs using the same dataset but varying weights and offsets. Each SVM corresponds to a different value of $\tau$, enabling the model to adapt to various cost scenarios.
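To make the role of $\tau$ more concrete, here is a minimal sketch of a $\tau$-weighted hinge loss evaluated for several values of $\tau$. The weighting rule used here (weight $\tau$ on class +1 errors, $1-\tau$ on class -1 errors), the toy data, and all variable names are assumptions made for illustration; the exact cost function used by the repository may differ.

```matlab
% Illustrative sketch only: tau-weighted hinge loss for m values of tau.
% Labels are assumed to be in {-1,+1}; f holds toy decision values f(x_i).
Y   = [ 1;  1; -1; -1;  1];          % toy labels
f   = [0.8; -0.3; -1.2; 0.4; 1.5];   % toy decision values
Tau = [0.1 0.5 0.9];                 % m = 3 weighting parameters

hinge = max(0, 1 - Y .* f);          % standard hinge loss per sample

loss = zeros(size(Tau));
for k = 1:numel(Tau)
    % Assumed weighting: tau on class +1 samples, (1 - tau) on class -1 samples
    w = Tau(k) * (Y == 1) + (1 - Tau(k)) * (Y == -1);
    loss(k) = sum(w .* hinge);       % total weighted loss for this tau
end
disp(loss)
```

Under this assumed weighting, a larger $\tau$ makes mistakes on the +1 class more expensive, so each of the $m$ SVMs trades false positives against false negatives differently.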

The optimization problem is solved in its dual form, leading to the separating hyperplane.

The prediction error (false positive or false negative rate) is then controlled by an algorithm based on an idea from quantile regression: once the regularization parameter is discarded (possible because the hyperplane computed above is independent of the sample probability), the weighting parameter corresponds to the false negative rate.
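As a rough illustration of quantile-based error control, the sketch below picks an offset from the scores of class -1 calibration points so that at most a fraction `epsilon` of them is predicted as +1. This is a generic empirical-quantile rule on toy data, not the repository's `offset_c` routine; the variable names and the decision rule `score > b` are assumptions.

```matlab
% Illustrative sketch only: choose an offset b from calibration scores so that
% the false positive rate on the calibration set does not exceed epsilon.
epsilon = 0.1;                                             % target false positive rate
s_neg = [-2.1 -1.4 -0.9 -0.6 -0.2 0.1 0.3 0.5 0.8 1.2]';   % toy scores of class -1 points

s_sorted = sort(s_neg, 'ascend');
n = numel(s_sorted);
k = n - floor(epsilon * n);        % allow at most floor(epsilon*n) scores above b
b = s_sorted(k);                   % empirical quantile used as offset

y_pred  = 2 * (s_neg > b) - 1;     % assumed rule: predict +1 only if score > b
FPR_cal = mean(y_pred == 1);       % fraction of class -1 points predicted +1
disp(['Offset: ', num2str(b), ', calibration FPR: ', num2str(FPR_cal)])
```

The guarantee here holds only on the calibration set itself; how well it transfers to new data depends on the calibration set being representative.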

Usage:

To utilize MC-SVM in your projects, follow these steps:

Download the Code: Clone the repository containing the MC-SVM implementation.

Configure Parameters: Adjust the value of $\tau$, the kernel, and the other parameters according to your application requirements.

Train the Model: Provide your dataset and train the MC-SVM model using the provided training algorithm.

Evaluate Performance: Evaluate the model's performance on your test dataset and analyze its behavior under different cost scenarios.
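For the evaluation step, the rates of interest can be computed directly from true and predicted labels. The following is a minimal sketch on toy data, assuming labels in {-1,+1}; it is a stand-in for illustration and not the repository's `ConfusionMatrix` function.

```matlab
% Illustrative sketch only: confusion-matrix rates for labels in {-1,+1}.
Ytrue = [ 1  1  1 -1 -1 -1 -1  1]';   % toy ground-truth labels
Ypred = [ 1 -1  1 -1  1 -1 -1  1]';   % toy predictions

TP = sum(Ytrue ==  1 & Ypred ==  1);  % true positives
FN = sum(Ytrue ==  1 & Ypred == -1);  % false negatives
TN = sum(Ytrue == -1 & Ypred == -1);  % true negatives
FP = sum(Ytrue == -1 & Ypred ==  1);  % false positives

TPR = TP / (TP + FN);                 % true positive rate (recall)
FNR = FN / (TP + FN);                 % false negative rate
TNR = TN / (TN + FP);                 % true negative rate
FPR = FP / (TN + FP);                 % false positive rate
ACC = (TP + TN) / numel(Ytrue);       % accuracy
F1  = 2 * TP / (2 * TP + FP + FN);    % F1 score

disp(['False positive rate: ', num2str(FPR)])
```

Repeating this computation for models trained with different values of $\tau$ shows how the error rates move against each other across cost scenarios.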

Example:

Matlab

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% For a dataset composed of data sampled with different probabilities
Tau = rand(1,9); m = size(Tau,2);
kernel = 'polynomial';
param = 3;
eta = .001;
alpha_bar = MCSVM_Train(Xtr, Ytr, kernel, param, Tau, eta); % best hyperplane common to all the data

% Specializing to a dataset with a known (or estimated) sample probability
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
tau = 1-epsilon; % to control the false positives
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
alpha_c = SSVM_Train_c(Xtr, Ytr, Xcl_p, Ycl_p, kernel, param, tau, eta, alpha_bar);
% best offset that realizes the control of the false positive ratio on the desired (calibration) set
b = offset_c(Xtr, Ytr, Xcl_p, Ycl_p, alpha_c, kernel, param, eta, tau, alpha_bar);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% test
y_pred_ts = SSVM_Test(Xtr, Ytr, Xts_p, alpha_bar, b, 0, kernel, param, eta);
[TPR_SSVM, FPR_SSVM, TNR_SSVM, FNR_SSVM, F1_SSVM, ACC_SSVM] = ConfusionMatrix(Yts_p, y_pred_ts,'on');
disp(['False positive rate:',num2str(FPR_SSVM)])
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Contributions and Feedback:

Contributions to the MC-SVM algorithm are welcome! Feel free to submit bug reports, feature requests, or pull requests to improve the algorithm's functionality and usability.

References:

Cite As

Alberto Carlevaro (2025). Multi-Cost-SVM (https://github.com/AlbiCarle/Multi-Cost-SVM), GitHub. Retrieved: .

MATLAB Release Compatibility
Created with R2024a
Compatible with R2021a and later releases
Platform Compatibility
Windows macOS Linux

