risk.validation.kendallTau

Compute Kendall tau correlation

Since R2026a

Syntax

kendallValue = risk.validation.kendallTau(X,Y)

kendallValue = risk.validation.kendallTau(X,Y,Type=variant)

[kendallValue,Output] = risk.validation.kendallTau(___)

Description

kendallValue = risk.validation.kendallTau(X,Y) returns the Kendall tau correlation between X and Y, where each input can represent quantities such as rankings or predictions, probability of default (PD), or loss given default (LGD) estimates. For example, in credit scoring models, you can use this metric to measure the correlation between predicted and observed credit rating grades.

kendallValue = risk.validation.kendallTau(X,Y,Type=variant) optionally specifies the Kendall tau variant used in the correlation computation.

"a" — tau-a variant
"b" — tau-b variant (default)
"c" — tau-c variant

For information about choosing a Kendall tau variant, see variant.

[kendallValue,Output] = risk.validation.kendallTau(___) also returns Output, a structure containing fields with additional details about the correlation:

KendallTau — Kendall tau correlation (same value as kendallValue)
Type — Variant used in the correlation computation
PValue — p-value for the test of the null hypothesis of no association

Use this syntax to confirm the applied variant and evaluate the statistical significance of the correlation.

Examples

Compute Kendall tau Correlation of Credit Rating Data

Load the LGD data, which contains the expected and realized LGD grades for 200 defaulted customers, as well as the year the recovery process was closed.

lgdData = readtable("LGDRatingGradeData.csv")

lgdData=200×5 table
    ExpectedLGD    ExpectedLGDGrade    RealizedLGD    RealizedLGDGrade    RecoveryClosedYear
    ___________    ________________    ___________    ________________    __________________

      0.21067              4             0.11221              3                  2023       
      0.68432              8             0.90534             11                  2023       
       0.4815              6             0.35169              5                  2023       
      0.86389             10             0.91321             11                  2023       
      0.85797             10             0.84803             10                  2023       
      0.29985              4             0.24432              4                  2023       
      0.30411              5                   0              1                  2023       
      0.88206             10             0.65058              8                  2023       
        1.054             12             0.73065              9                  2023       
      0.96353             11             0.61439              8                  2023       
       0.3936              5            0.061432              2                  2023       
      0.55109              7             0.59951              7                  2023       
      0.75181              9             0.55973              7                  2023       
      0.78397              9             0.68201              8                  2023       
      0.40728              6             0.27822              4                  2023       
      0.61732              8             0.38996              5                  2023       
      ⋮

The ExpectedLGDGrade and RealizedLGDGrade table variables contain values from 1 to 12, which correspond to 12 rating grades. The values in ExpectedLGDGrade were calculated based on the characteristics of the borrower before the default event and are predictions for the rating grade at the end of the recovery period. RealizedLGDGrade contains the actual grades of the portfolios at the end of the observation period.

Compute the Kendall tau correlation for the grades.

X = lgdData.ExpectedLGDGrade;
Y = lgdData.RealizedLGDGrade;
[kendallValue,Output] = risk.validation.kendallTau(X,Y)

kendallValue = 
0.5789

Output = struct with fields:
    KendallTau: 0.5789
          Type: "b"
        PValue: 2.2079e-28

The Kendall tau value of 0.5789 indicates a moderate to strong positive association between the expected and realized grades. The very small p-value (2.2079e-28) rejects the null hypothesis of no association, so the correlation is statistically significant.

Input Arguments

`X` — Input vector
numeric vector | logical vector

Input vector, specified as a numeric or logical vector that contains values such as rankings or predictions, PD, or LGD estimates. X must be the same size as Y.

`Y` — Input vector
numeric vector | logical vector

Input vector, specified as a numeric or logical vector that contains values such as rankings or predictions, PD, or LGD estimates. Y must be the same size as X.

`variant` — Variant option
`"a"` | `"b"` (default) | `"c"`

Variant option to compute the Kendall tau correlation, specified as "a", "b", or "c".

"a" — Applies the tau-a variant, appropriate when both X and Y have unique ranks with no ties.
"b" — Applies the tau-b variant, an adjusted measure that accounts for ties in X, Y, or both (also called Kendall's correlation).
"c" — Applies the tau-c variant, an adjusted measure that addresses ties and accounts for cases where the number of unique values in X differs from the number of unique values in Y.

For definitions of the different variants, see Kendall tau Correlation.

Output Arguments

`kendallValue` — Kendall tau correlation
scalar in the interval `[-1,1]`

Kendall tau correlation, returned as a scalar in the interval [-1,1].

-1 — indicates complete disagreement between X and Y
1 — indicates complete agreement
0 — indicates no association

`Output` — Output metrics
structure

Output metrics, returned as a structure with the following fields:

KendallTau — Represents the same value as kendallValue.
Type — Variant option used to compute kendallValue.
PValue — p-value for the hypothesis test where the null hypothesis states that there is no association between X and Y. A small p-value indicates that kendallValue differs significantly from zero.
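As an illustration of how these fields might be used together, the following sketch calls the function with the documented two-output syntax and compares PValue against a significance level. The input vectors and the threshold alpha are hypothetical choices for illustration; the function does not prescribe a significance level.

```matlab
% Hypothetical inputs: any two numeric vectors of the same size.
X = [1 2 3 4 5 6 7 8];
Y = [2 1 3 5 4 6 8 7];
alpha = 0.05;   % assumed significance level, chosen for illustration only

% Documented syntax: request both the correlation and the Output structure.
[kendallValue,Output] = risk.validation.kendallTau(X,Y,Type="b");

% Output.KendallTau repeats kendallValue; Output.Type reports the variant.
if Output.PValue < alpha
    fprintf("tau-%s = %.4f differs significantly from zero (p = %.3g).\n", ...
        Output.Type, kendallValue, Output.PValue);
else
    fprintf("tau-%s = %.4f is not significant at alpha = %.2f (p = %.3g).\n", ...
        Output.Type, kendallValue, alpha, Output.PValue);
end
```

Reading the variant from Output.Type rather than hard-coding it keeps the report correct if the Type argument is changed later.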

More About

Kendall tau Correlation

The Kendall tau correlation measures the ordinal association between two input variables X and Y. The function supports three variants: tau-a, tau-b, and tau-c.

The tau-a variant is defined by:

$\tau_{a} = \frac{n_{c} - n_{d}}{n_{0}}$

where:

n is the length of X.
n₀ = n(n-1)/2 is the total number of pairs.
n_c is the number of concordant pairs (x_i,y_i) and (x_j,y_j) in X and Y.
n_d is the number of discordant pairs.

A pair is concordant if one of the following conditions is met:

x_i<x_j and y_i<y_j
x_i>x_j and y_i>y_j

A pair is discordant if one of the following conditions is met:

x_i<x_j and y_i>y_j
x_i>x_j and y_i<y_j

A pair with x_i=x_j or y_i=y_j is tied and counts as neither concordant nor discordant.
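The pair-counting rules can be sketched with a brute-force loop over all pairs. This is a minimal illustration on hypothetical toy data, not the implementation used by risk.validation.kendallTau. It uses only base MATLAB functions.

```matlab
% Hypothetical toy data (not from the LGD example): five paired grades, no ties.
x = [1 2 3 4 5];
y = [2 1 4 3 5];

n  = numel(x);
nc = 0;   % number of concordant pairs
nd = 0;   % number of discordant pairs
for i = 1:n-1
    for j = i+1:n
        s = sign(x(i) - x(j)) * sign(y(i) - y(j));
        if s > 0
            nc = nc + 1;   % x and y order the pair the same way
        elseif s < 0
            nd = nd + 1;   % x and y order the pair in opposite ways
        end                % s == 0: tied in x or y, counted as neither
    end
end

n0   = n*(n-1)/2;          % total number of pairs
tauA = (nc - nd)/n0;
fprintf("nc = %d, nd = %d, tau-a = %.4f\n", nc, nd, tauA);
% prints: nc = 8, nd = 2, tau-a = 0.6000
```

The double loop costs O(n²) operations, which is fine for illustration but slow for large inputs.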

The tau-b variant is also known as Kendall's correlation and is defined by:

$\tau_{b} = \frac{n_{c} - n_{d}}{\sqrt{(n_{0} - n_{1}) (n_{0} - n_{2})}}$

where n₁ is an adjustment for ties in X and n₂ is an adjustment for ties in Y. Kendall's correlation is supported in the corr function in Statistics and Machine Learning Toolbox™.
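The following sketch shows one way to evaluate the tau-b formula by hand. It assumes the commonly used tie adjustments, where n₁ and n₂ count the pairs that are tied within X and within Y, respectively (the sum of t(t-1)/2 over each group of t equal values). The text above does not spell out the adjustment, so treat this as an assumption. The data is hypothetical.

```matlab
% Hypothetical toy data with ties in both inputs (not from the LGD example).
x = [1 1 2 3 3 3];
y = [1 2 2 3 3 4];

n  = numel(x);
n0 = n*(n-1)/2;                      % total number of pairs

% Sizes of the groups of equal values in each input.
[~,~,ix] = unique(x);
[~,~,iy] = unique(y);
tx = accumarray(ix(:), 1);
ty = accumarray(iy(:), 1);

% Assumed standard tie adjustments: number of pairs tied within each input.
n1 = sum(tx.*(tx-1)/2);
n2 = sum(ty.*(ty-1)/2);

% Sign matrix: +1 for concordant, -1 for discordant, 0 for tied pairs.
S = sign(x(:) - x(:).') .* sign(y(:) - y(:).');
ncMinusNd = sum(sum(triu(S,1)));     % upper triangle counts each pair once

tauA = ncMinusNd/n0;
tauB = ncMinusNd/sqrt((n0 - n1)*(n0 - n2));
fprintf("n1 = %d, n2 = %d, tau-a = %.4f, tau-b = %.4f\n", n1, n2, tauA, tauB);
% prints: n1 = 4, n2 = 2, tau-a = 0.6667, tau-b = 0.8362
```

Because ties shrink the denominator, tau-b is larger in magnitude than tau-a on this data. If Statistics and Machine Learning Toolbox is available, corr(x(:),y(:),"Type","Kendall") offers an independent check of the tau-b value.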

The tau-c variant adjusts for the difference in the number of unique values in X and Y and is given by:

$\tau_{c} = \frac{2 (n_{c} - n_{d})}{n^{2} \frac{(m - 1)}{m}} = \tau_{a} \frac{n - 1}{n} \frac{m}{m - 1}$

where m is the minimum of the number of unique values in X and the number of unique values in Y.
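The equality of the two forms of tau-c follows from substituting the tau-a definition with n(n-1)/2 total pairs:

$\tau_{a} \frac{n - 1}{n} \frac{m}{m - 1} = \frac{2 (n_{c} - n_{d})}{n (n - 1)} \cdot \frac{n - 1}{n} \cdot \frac{m}{m - 1} = \frac{2 (n_{c} - n_{d})}{n^{2}} \cdot \frac{m}{m - 1} = \frac{2 (n_{c} - n_{d})}{n^{2} \frac{(m - 1)}{m}}$

As a numeric check with hypothetical values n = 6, m = 3, and n_c - n_d = 10, the first form gives 2(10)/(36 · 2/3) = 20/24 ≈ 0.8333, and the second gives (10/15)(5/6)(3/2) ≈ 0.8333.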

Since the Kendall tau value is rarely exactly 0, it is important to assess whether the observed correlation is statistically significant. A common approach is to compute a p-value under the null hypothesis of no association. A small p-value indicates that the correlation differs significantly from zero.

Version History

Introduced in R2026a
