Probability of outputs of binary classification in MATLAB

Hi
I have a binary classification problem and am using a neural network and an SVM for it. I chose a threshold (for instance 0.5) for the output of the neural network: if the output is greater than 0.5 the sample belongs to class 1, and if it is smaller than 0.5 it belongs to class 2. After training the network, how can I calculate the probability of the outputs for out-of-sample data? For out-of-sample data I use the same criterion (0.5) to find the class of these new data. Can we say that if the output of the neural network is 1, the probability of belonging to class 1 (greater than 0.5 means class 1) is higher than for an output of, say, 0.55? (I used the tansig transfer function for the output layer of the neural network.)
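To be concrete, this is roughly what I do now (a minimal sketch; net is my trained network and xnew holds the new samples, one column per sample):

y = net(xnew);                % tansig output, so y lies in (-1, 1)
class = ones(1, size(y,2));   % default: class 1
class(y < 0.5) = 2;           % output < 0.5 -> class 2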
Another question: how can I find the probability (or possibility) of belonging to each class in an SVM?
I want to find how much an out-of-sample point belongs to a specific class. Can I do that? Is there any function in MATLAB for it?
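In the documentation I came across fitcsvm and fitPosterior in the Statistics Toolbox (R2014a or newer). Is something like this sketch the right idea? (X, Y and Xnew stand for my own data.)

SVMModel = fitcsvm(X, Y);                 % X: N-by-p predictors, Y: class labels
ScoreSVMModel = fitPosterior(SVMModel);   % fits a score-to-posterior transform
[label, postProb] = predict(ScoreSVMModel, Xnew);
% postProb(:,k) = estimated probability that a row of Xnew belongs to class k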
Thanks.

Accepted Answer

Greg Heath
Greg Heath 2014-4-23
Edited: Greg Heath 2014-4-25
If you use columns of eye(2) for targets, the outputs will be consistent (i.e., as N -> inf) estimates of the input-conditional posterior class probabilities, provided the correct objective function is used.
Typically purelin, logsig and softmax are used as output transfer functions. Although MSE is reasonable for the first two, crossentropy should be used for the latter.
In spite of being consistent estimates, purelin does not enforce [0,1] and logsig does not enforce sum(estimates) = 1.
Conversion between the probability targets/estimates and the class indices is done with the functions ind2vec and vec2ind.
help ind2vec, doc ind2vec
help vec2ind, doc vec2ind
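For example (a minimal sketch; trueInd is a 1-by-N vector of class indices):

trueInd = [1 2 2 1 2];       % class index of each sample
t = full(ind2vec(trueInd));  % targets = columns of eye(2)
% ... train on (x, t); then, for network outputs y:
assignedInd = vec2ind(y);    % assigned class = row of the largest output
pctErr = 100*mean(assignedInd ~= trueInd)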
Hope this helps.
Thank you for formally accepting my answer
Greg
PS search using
greg patternnet vec2ind
greg patternnet ind2vec
2 Comments
Jack
Jack 2014-5-7
Edited: Jack 2014-5-7
Hi Greg.
A brief question about your answer. Are you saying that the only way to get a neural network whose outputs are probabilities of membership in each class is to use 'softmax' as the output transfer function and cross-entropy as the performance function?
Can I use only 'softmax' for the output layer? For instance 'softmax' for the output layer, 'tansig' for the hidden layer, 'mse' for the performance measurement and 'trainlm' for the training function, because 'trainlm' gives the best classification accuracy in my case and I can't use it with cross-entropy.
You said "In spite of being consistent estimates, 'purelin' does not enforce [0,1] and 'logsig' does not enforce sum(estimates) = 1," so by using 'logsig' or 'purelin' we can't obtain probabilities. Is this true? For example, by using 'MSE', 'trainlm' and logsig for the output in a binary classification problem I can get 0.7 and 1.3 for the outputs! Is this true? How should I interpret these numbers?
Thanks
Greg Heath
Greg Heath 2014-5-8
>Can I use only 'softmax' for the output layer? For instance 'softmax' for the output layer, 'tansig' for the hidden layer, 'mse' for the performance measurement and 'trainlm' for the training function, because 'trainlm' gives the best classification accuracy in my case and I can't use it with cross-entropy.
>You said "In spite of being consistent estimates, 'purelin' does not enforce [0,1] and 'logsig' does not enforce sum(estimates) = 1," so by using 'logsig' or 'purelin' we can't obtain probabilities. Is this true? For example, by using 'MSE', 'trainlm' and logsig for the output in a binary classification problem I can get 0.7 and 1.3 for the outputs! Is this true? How should I interpret these numbers?
Why would you want to use trainlm? trainscg is the default for patternnet.
Logsig cannot yield 1.3.
Purelin can yield 1.3, BUT NOT 1.3 AND 0.7.
I did not say XENT/SOFTMAX "is the only way".
I did not say outputs "ARE probabilities".
I did say outputs will be CONSISTENT ESTIMATES.
The MOST IMPORTANT THING is that the correct class corresponds to the largest output. The relative values reflect the confidence in the estimate.
The following is only a rough recollection of more precise posts of mine in comp.ai.neural-nets and comp.soft-sys.matlab. They can be found using the search keywords
greg softmax
If you use columns of eye(c) for targets of c classes and MSE, XENT1 (non-mutually exclusive classes) or XENT2 (mutually exclusive classes) as the minimization objective function, the outputs y(i), 1 <= i <= c, will be CONSISTENT (i.e., as N -> inf) ESTIMATES of the input-conditional posterior class probabilities P(i|x).
There are 3 traditional canonical objective-function/transfer-function pairs with the following properties at the objective function minimum:
1. MSE/PURELIN: sum(outputs) = 1
2. XENT1/LOGSIG: 0 < outputs < 1
3. XENT2/SOFTMAX: 0 < outputs < 1, sum(outputs) = 1
Using the canonical pairs leads to simple expressions for the objective function derivatives.
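For example, pair 3 with patternnet looks roughly like this (a sketch; property names from the Neural Network Toolbox, check the defaults of your release):

net = patternnet(10);                   % 10 hidden nodes, tansig hidden layer
net.layers{2}.transferFcn = 'softmax';  % canonical pair 3: XENT2/SOFTMAX
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';              % the patternnet default
[net, tr] = train(net, x, t);           % t = columns of eye(c) (see ind2vec)
y = net(x);                             % 0 < y < 1, each column sums to 1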
If multiple classes are mutually exclusive, softmax is the most reasonable choice because the estimates always sum to 1. However, logsig is often used and, after convergence, the outputs are divided by their sum. I don't recall the latter being that reliable.
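The division itself is just (sketch):

y = net(x);                             % logsig outputs, each in (0,1)
ynorm = bsxfun(@rdivide, y, sum(y,1));  % force each column to sum to 1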
Unfortunately, classes with many more training examples than the others will adversely affect the results. Therefore some correction methods discussed in previous posts should be considered. Search using
greg unbalanced (yea, I know)
