
transprobprep

Preprocess credit ratings data to estimate transition probabilities

Description

[prepData] = transprobprep(data) preprocesses credit ratings historical data (that is, credit migration data) for the subsequent estimation of transition probabilities.


[prepData] = transprobprep(___,Name,Value) adds optional name-value pair arguments.


Examples


Load input data from the file Data_TransProb.mat and preprocess it. In this example, the inputs are provided in character vector format.

load Data_TransProb
  
% Preprocess credit ratings data.
prepData = transprobprep(data)
prepData = struct with fields:
           idStart: [1506x1 double]
      numericDates: [4315x1 double]
    numericRatings: [4315x1 double]
     ratingsLabels: {'AAA'  'AA'  'A'  'BBB'  'BB'  'B'  'CCC'  'D'}
           weights: []

Estimate transition probabilities with the default settings.

transMat = transprob(prepData)
transMat = 8×8

   93.1170    5.8428    0.8232    0.1763    0.0376    0.0012    0.0001    0.0017
    1.6166   93.1518    4.3632    0.6602    0.1626    0.0055    0.0004    0.0396
    0.1237    2.9003   92.2197    4.0756    0.5365    0.0661    0.0028    0.0753
    0.0236    0.2312    5.0059   90.1846    3.7979    0.4733    0.0642    0.2193
    0.0216    0.1134    0.6357    5.7960   88.9866    3.4497    0.2919    0.7050
    0.0010    0.0062    0.1081    0.8697    7.3366   86.7215    2.5169    2.4399
    0.0002    0.0011    0.0120    0.2582    1.4294    4.2898   81.2927   12.7167
         0         0         0         0         0         0         0  100.0000

Estimate transition probabilities with the 'cohort' algorithm.

transMatCoh = transprob(prepData,'algorithm','cohort')
transMatCoh = 8×8

   93.1345    5.9335    0.7456    0.1553    0.0311         0         0         0
    1.7359   92.9198    4.5446    0.6046    0.1560         0         0    0.0390
    0.1268    2.9716   91.9913    4.3124    0.4711    0.0544         0    0.0725
    0.0210    0.3785    5.0683   89.7792    4.0379    0.4627    0.0421    0.2103
    0.0221    0.1105    0.6851    6.2320   88.3757    3.6464    0.2873    0.6409
         0         0    0.0761    0.7230    7.9909   86.1872    2.7397    2.2831
         0         0         0    0.3094    1.8561    4.5630   80.8971   12.3743
         0         0         0         0         0         0         0  100.0000

Input Arguments


Historical credit ratings data (that is, credit migration data) to preprocess before estimating transition probabilities with transprob, specified as one of the following:

  • An nRecords-by-3 MATLAB® table containing the historical credit ratings data of the form:

     ID          Date          Rating  
    __________  _____________  ______ 
    '00010283'  '10-Nov-1984'  'CCC'  
    '00010283'  '12-May-1986'  'B'    
    '00010283'  '29-Jun-1988'  'CCC'  
    '00010283'  '12-Dec-1991'  'D'    
    '00013326'  '09-Feb-1985'  'A'    
    '00013326'  '24-Feb-1994'  'AA'   
    '00013326'  '10-Nov-2000'  'BBB'  
    '00014413'  '23-Dec-1982'  'B'    
    Or an nRecords-by-4 MATLAB table containing weights and the historical credit ratings data of the form:
     ID          Date          Rating    Weight
    __________  _____________  ______    _____
    '00010283'  '10-Nov-1984'  'CCC'       1
    '00010283'  '12-May-1986'  'B'       1.4
    '00010283'  '29-Jun-1988'  'CCC'     1.8
    '00010283'  '12-Dec-1991'  'D'       0.2
    '00013326'  '09-Feb-1985'  'A'         0
    '00013326'  '24-Feb-1994'  'AA'        2
    '00013326'  '10-Nov-2000'  'BBB'     1.7
    '00014413'  '23-Dec-1982'  'B'       1.1    
    where each row contains an ID (column 1), a date (column 2), a credit rating (column 3), and an optional weight (column 4). Column 3 is the rating assigned to the corresponding ID on the corresponding date. All information corresponding to the same ID must be stored in contiguous rows. Sorting this information by date is not required, but recommended for efficiency. When using a MATLAB table input, the names of the columns are irrelevant, but the ID, date, rating information, and weights are assumed to be in the first, second, third, and fourth columns, respectively. Also, when using a table input, the first and third columns can be categorical arrays, and the second can be a datetime array. The following summarizes the supported data types for table input:

    Data Input Type: Table

    Column                        Supported Data Types
    ____________________________  ____________________________________________
    ID (1st Column)               Numeric array, cell array of character vectors, string array, or categorical array
    Date (2nd Column)             Numeric array, cell array of character vectors, string array, or datetime array
    Rating (3rd Column)           Numeric array, cell array of character vectors, string array, or categorical array
    Weight (Optional 4th Column)  Numeric array with nonnegative values

    Note

    For an example of using the data input argument with an optional fourth column for Weight, see Create Exposure-Based Transition Matrix From Historical Data of Credit Ratings with Exposures. If no weights are provided in a fourth column of the data, the default is to set all weights equal to 1. In this case, the weighted transition matrix output agrees with the ordinary, count-based transition matrix.

  • An nRecords-by-3 cell array of character vectors with the historical credit ratings data of the form:

    '00010283'  '10-Nov-1984'  'CCC' 
    '00010283'  '12-May-1986'  'B' 
    '00010283'  '29-Jun-1988'  'CCC'
    '00010283'  '12-Dec-1991'  'D'  
    '00013326'  '09-Feb-1985'  'A'  
    '00013326'  '24-Feb-1994'  'AA' 
    '00013326'  '10-Nov-2000'  'BBB' 
    '00014413'  '23-Dec-1982'  'B'  
    Or an nRecords-by-4 cell array of character vectors if weights are included with the historical credit ratings data of the form:
    '00010283'  '10-Nov-1984'  'CCC'  '1.2'
    '00010283'  '12-May-1986'  'B'    '1'
    '00010283'  '29-Jun-1988'  'CCC'  '1.2'
    '00010283'  '12-Dec-1991'  'D'    '0.2'
    '00013326'  '09-Feb-1985'  'A'    '1.7'
    '00013326'  '24-Feb-1994'  'AA'   '1.3'
    '00013326'  '10-Nov-2000'  'BBB'  '1'
    '00014413'  '23-Dec-1982'  'B'    '1.8'
    where each row contains an ID (column 1), a date (column 2), a credit rating (column 3), and an optional weight (column 4). Column 3 is the rating assigned to the corresponding ID on the corresponding date. All information corresponding to the same ID must be stored in contiguous rows. Sorting this information by date is not required, but is recommended for efficiency. IDs, dates, and ratings are shown here in character vector format, but they can also be entered in numeric format. The following summarizes the supported data types for cell array input:

    Data Input Type: Cell

    Column                        Supported Data Types
    ____________________________  ____________________________________________
    ID (1st Column)               Numeric elements or character vector elements
    Date (2nd Column)             Numeric elements or character vector elements
    Rating (3rd Column)           Numeric elements or character vector elements
    Weight (Optional 4th Column)  Numeric elements with nonnegative values

    Note

    If no weights are provided in a fourth column of the data, the default is to set all weights equal to 1. In this case, the weighted transition matrix output agrees with the ordinary, count-based transition matrix.

Data Types: table | cell | struct
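The two input forms described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the eight records are copied from the sample layouts above, the variable names (ids, dateStr, ratingStr, dataTable, dataCell) are chosen here for illustration only, and the code assumes that the toolbox providing transprobprep is installed.

```matlab
% Sample records copied from the layouts shown above (illustrative only).
ids       = {'00010283';'00010283';'00010283';'00010283'; ...
             '00013326';'00013326';'00013326';'00014413'};
dateStr   = {'10-Nov-1984';'12-May-1986';'29-Jun-1988';'12-Dec-1991'; ...
             '09-Feb-1985';'24-Feb-1994';'10-Nov-2000';'23-Dec-1982'};
ratingStr = {'CCC';'B';'CCC';'D';'A';'AA';'BBB';'B'};
weights   = [1; 1.4; 1.8; 0.2; 0; 2; 1.7; 1.1];   % nonnegative weights

% Table input: the date column is a datetime array, the rating column is
% a categorical array, and the optional weights go in the fourth column.
% Column names are irrelevant; only the column order matters.
Date      = datetime(dateStr,'InputFormat','dd-MMM-yyyy','Locale','en_US');
Rating    = categorical(ratingStr);
dataTable = table(ids,Date,Rating,weights);
prepFromTable = transprobprep(dataTable);

% Cell array input without weights: nRecords-by-3, character vectors.
dataCell = [ids, dateStr, ratingStr];
prepFromCell = transprobprep(dataCell);
```

Both calls rely on the records for each ID being stored in contiguous rows, as required above.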

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: prepData = transprobprep(data,'labels',{'AAA','AA','A','BBB','BB','B','CCC','F'})

Credit-rating scale, specified as the comma-separated pair consisting of 'labels' and an nRatings-by-1 or 1-by-nRatings cell array of character vectors.

labels must be consistent with the ratings labels used in the third column of data. Use a cell array of numbers for numeric ratings, and a cell array of character vectors for categorical ratings.

Data Types: cell
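As a minimal sketch of the 'labels' argument, the following uses a small hypothetical data set whose default state is labeled 'F' instead of 'D', matching the scale in the example above. The records are invented for illustration, and the code assumes that the toolbox providing transprobprep is installed.

```matlab
% Hypothetical records that use 'F' (not 'D') for the default state.
data = {'00010283','10-Nov-1984','CCC'; ...
        '00010283','12-May-1986','B';   ...
        '00010283','12-Dec-1991','F';   ...
        '00013326','09-Feb-1985','A';   ...
        '00013326','24-Feb-1994','AA'};

% The scale must contain every rating that appears in column 3 of data.
customLabels = {'AAA','AA','A','BBB','BB','B','CCC','F'};

prepData = transprobprep(data,'labels',customLabels);

% Inspect the scale stored with the preprocessed data.
disp(prepData.ratingsLabels)
```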

Output Arguments


Preprocessed credit ratings data, returned as a structure with the following fields:

  • idStart — Array of size (nIDs+1)-by-1, where nIDs is the number of distinct IDs in column 1 of data. This array summarizes where the credit ratings information corresponding to each company starts and ends. The dates and ratings corresponding to company j in data are stored from row idStart(j) to row idStart(j+1)−1 of numericDates and numericRatings.

  • numericDates — Array of size nRecords-by-1, containing the dates in column 2 of data, in numeric format.

  • numericRatings — Array of size nRecords-by-1, containing the ratings in column 3 of data, mapped into numeric format.

  • weights — Array of size nRecords-by-1, containing the weights in column 4 of data. If weights are not provided in data, this field is empty ([]), as in the example output on this page.

  • ratingsLabels — Cell array of size 1-by-nRatings, containing the credit rating scale.
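The idStart indexing described above can be sketched as follows. The records are copied from the sample cell array in the input arguments section, and the code assumes that the toolbox providing transprobprep is installed. The last step also assumes that the values in numericRatings are positions in ratingsLabels; the text above only says that ratings are mapped into numeric format, so treat that step as an assumption to verify.

```matlab
% Sample records: three IDs with four, three, and one record each.
data = {'00010283','10-Nov-1984','CCC'; ...
        '00010283','12-May-1986','B';   ...
        '00010283','29-Jun-1988','CCC'; ...
        '00010283','12-Dec-1991','D';   ...
        '00013326','09-Feb-1985','A';   ...
        '00013326','24-Feb-1994','AA';  ...
        '00013326','10-Nov-2000','BBB'; ...
        '00014413','23-Dec-1982','B'};

prepData = transprobprep(data);

% Rows of numericDates and numericRatings that belong to company j.
j    = 2;
rows = prepData.idStart(j):(prepData.idStart(j+1) - 1);

datesJ   = prepData.numericDates(rows);     % dates in numeric format
ratingsJ = prepData.numericRatings(rows);   % ratings in numeric format

% Assumption: numericRatings holds indices into ratingsLabels.
labelsJ = prepData.ratingsLabels(ratingsJ);
```

Because idStart has one more element than the number of distinct IDs, the same expression for rows also works for the last company.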

Version History

Introduced in R2011b
