risk.validation.bayesianErrorRate

Compute Bayesian error rate

Since R2026a

    Description

    berValue = risk.validation.bayesianErrorRate(Score,BinaryResponse) returns the Bayesian error rate, berValue, where Score is a numeric vector that represents quantities such as rankings, predictions, probability of default (PD), or loss given default (LGD) estimates. The values in Score can represent individual credit scores or other credit data. BinaryResponse specifies the target state of each value in Score.

    berValue = risk.validation.bayesianErrorRate(Score,BinaryResponse,SortDirection=sortdir) optionally specifies the sorting direction of the unique values in Score.

    [berValue,Output] = risk.validation.bayesianErrorRate(___) also returns Output, a structure containing the fields BayesianErrorRate, MeanResponse, ReferenceBayesianErrorRate, and Metrics. Metrics is a table with the columns Threshold, TruePositiveRate, FalsePositiveRate, ErrorRate, and ReferenceErrorRate. Use this syntax when you need detailed performance metrics across multiple threshold levels.

    Examples

    Compute the Bayesian error rate of credit scores by using a credit validation data set contained in CreditValidationData.mat. This data set includes a table, ScorecardValidationData, that contains credit scores and their corresponding default status.

    Load the credit validation data and display the table.

    load CreditValidationData.mat
    head(ScorecardValidationData)
        CreditScore      PD       Default
        ___________    _______    _______
    
          579.86       0.14182       0   
          563.65       0.17143       0   
          549.52       0.20106       0   
          546.25       0.20845       0   
          485.34       0.37991       0   
          482.07       0.39065       0   
          579.86       0.14182       1   
          451.73         0.494       0   
    

    Extract the variables CreditScore and Default from the table ScorecardValidationData. Use Default as the BinaryResponse input argument.

    Scores = ScorecardValidationData.CreditScore;
    BinaryResponse = ScorecardValidationData.Default;

    Sort the credit scores in ascending order and compute the Bayesian error rate. For credit models, sorting scores from low to high ranks individuals from higher to lower risk.

    [berValue,Output] = risk.validation.bayesianErrorRate(Scores,BinaryResponse,SortDirection="ascending")
    berValue = 
    0.3222
    
    Output = struct with fields:
                 BayesianErrorRate: 0.3222
                      MeanResponse: 0.3500
        ReferenceBayesianErrorRate: 0.4115
                           Metrics: [107×5 table]
    
    

    Display the metrics contained in Output.

    head(Output.Metrics)
        Threshold    TruePositiveRate    FalsePositiveRate    ErrorRate    ReferenceErrorRate
        _________    ________________    _________________    _________    __________________
    
         408.99                 0                   0             0.35              0.5      
         408.99          0.071429            0.012821          0.33333           0.4707      
         410.12          0.079365            0.017094          0.33333          0.46886      
         430.66          0.087302            0.017094          0.33056           0.4649      
         435.52          0.087302            0.025641          0.33611          0.46917      
         436.65           0.10317            0.029915          0.33333          0.46337      
         439.33           0.11905            0.029915          0.32778          0.45543      
         440.45           0.13492            0.029915          0.32222           0.4475      
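
    As a cross-check, the ErrorRate and ReferenceErrorRate columns can be reproduced from the displayed TruePositiveRate and FalsePositiveRate values and the reported MeanResponse of 0.35. The sketch below copies only the eight rows shown above, rounded as displayed, so it agrees with the table to display precision and does not cover the full 107-row table. The variable names are illustrative.

```matlab
% Values copied from the displayed output above (first eight rows only).
meanResponse = 0.35;
tpr = [0; 0.071429; 0.079365; 0.087302; 0.087302; 0.10317; 0.11905; 0.13492];
fpr = [0; 0.012821; 0.017094; 0.017094; 0.025641; 0.029915; 0.029915; 0.029915];

% Error rate weighted by the observed mean response.
errorRate = meanResponse*(1 - tpr) + (1 - meanResponse)*fpr;

% Reference error rate with equal weights of 0.5.
referenceErrorRate = 0.5*(1 - tpr) + 0.5*fpr;

disp(table(tpr, fpr, errorRate, referenceErrorRate))
```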
    

    Input Arguments

    Score values, specified as a numeric vector representing quantities such as rankings, predictions, PD, or LGD estimates.

    Binary response, specified as a numeric or logical 1 (true) or 0 (false) vector. This vector represents the target state for each value in Score. For example, you can use BinaryResponse to represent a discretized LGD target, where a 1 indicates a high LGD value.
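
    For instance, a discretized LGD target might be built by thresholding observed LGD values. The sketch below uses made-up LGD values and a hypothetical cutoff of 0.5. Neither comes from this page, and the names observedLGD and lgdCutoff are illustrative.

```matlab
% Hypothetical observed LGD values (illustrative only).
observedLGD = [0.05; 0.62; 0.48; 0.91; 0.30];

% Hypothetical cutoff; choose a value that suits your portfolio.
lgdCutoff = 0.5;

% Logical vector in which 1 (true) marks a high LGD value.
BinaryResponse = observedLGD > lgdCutoff;
disp(BinaryResponse')
```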

    Sorting direction of unique values in Score, specified as "descending" or "ascending". If Score contains credit scores where lower values indicate higher risk, set sortdir to "ascending" to ensure that TruePositiveRate represents the proportion of defaulters. If Score contains PD values where higher values indicate higher risk, sorting in descending order is common practice.
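
    The effect of the sorting direction can be illustrated on toy data without calling the function. This sketch assumes that "ascending" flags observations with scores at or below a threshold as positive and that "descending" flags scores at or above it. That rule is inferred from the example output on this page, not stated by it, and the data and variable names are made up.

```matlab
% Toy credit scores in which a lower score means higher risk.
score     = [410; 430; 450; 520; 560; 600];
response  = logical([1; 1; 0; 1; 0; 0]);   % 1 = default
threshold = 450;

% Assumed rule for "ascending": scores at or below the threshold are flagged.
flaggedAsc = score <= threshold;
tprAsc = sum(flaggedAsc & response) / sum(response);

% Assumed rule for "descending": scores at or above the threshold are flagged.
flaggedDesc = score >= threshold;
tprDesc = sum(flaggedDesc & response) / sum(response);

fprintf("TPR ascending: %.4f, TPR descending: %.4f\n", tprAsc, tprDesc)
```

    With these low-score-is-risky values, the ascending rule captures two of the three defaulters at the threshold, while the descending rule captures only one.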

    Output Arguments

    Bayesian error rate for the values in Score, returned as a numeric scalar. berValue is defined as the minimum of ErrorRate.

    Output metrics, returned as a structure containing the following fields:

    • BayesianErrorRate — Same value as berValue.

    • MeanResponse — Mean of BinaryResponse, used to compute ErrorRate and BayesianErrorRate.

    • ReferenceBayesianErrorRate — Bayesian error rate computed with a probability of 0.5 in place of MeanResponse, as in ReferenceErrorRate.

    • Metrics — Table with columns:

      • Threshold — Unique score values sorted by sortdir.

      • TruePositiveRate — True positive rates for unique scores in Threshold. In credit scoring models, this metric represents the proportion of defaulters that are correctly classified as defaulters at each threshold.

      • FalsePositiveRate — False positive rates for unique scores in Threshold. In credit scoring models, this metric represents the proportion of nondefaulters that are incorrectly classified as defaulters at each threshold. This rate is also known as the false alarm rate, fallout, or 1 - specificity, where specificity is the true negative rate.

      • ErrorRate — Vector defined by MeanResponse*(1 - TruePositiveRate) + (1 - MeanResponse)*FalsePositiveRate.

      • ReferenceErrorRate — Vector defined by 0.5*(1 - TruePositiveRate) + 0.5*FalsePositiveRate.
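
    The definitions above can be sketched end to end on toy data. This is not the function's implementation. It assumes the "ascending" convention in which scores at or below a threshold are flagged as positive, and it adds a leading row in which nothing is flagged, mirroring the first row of the example table. All data and variable names are illustrative.

```matlab
% Toy data (illustrative only): a lower score means higher risk.
score    = [410; 430; 450; 520; 560; 600; 640; 700];
response = [1; 1; 0; 1; 0; 0; 0; 0];       % 1 = default

meanResponse = mean(response);
thresholds   = unique(score);              % unique returns ascending order
nThresholds  = numel(thresholds);

% The first row corresponds to flagging no observations.
tpr = zeros(nThresholds + 1, 1);
fpr = zeros(nThresholds + 1, 1);

for k = 1:nThresholds
    flagged = score <= thresholds(k);      % assumed "ascending" rule
    tpr(k + 1) = sum(flagged & (response == 1)) / sum(response == 1);
    fpr(k + 1) = sum(flagged & (response == 0)) / sum(response == 0);
end

% Error rates as defined in the Metrics table.
errorRate          = meanResponse*(1 - tpr) + (1 - meanResponse)*fpr;
referenceErrorRate = 0.5*(1 - tpr) + 0.5*fpr;

% The Bayesian error rate is the minimum error rate over all thresholds.
berSketch    = min(errorRate);
refBerSketch = min(referenceErrorRate);
fprintf("BER: %.4f, reference BER: %.4f\n", berSketch, refBerSketch)
```

    For this toy data, the mean response is 0.375, the sketch gives a Bayesian error rate of 0.1250, and the reference value with equal weights is 0.1000.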

    Version History

    Introduced in R2026a