## Credit Scorecard Modeling Using Observation Weights

When you create a `creditscorecard` object, the table used for the input `data` argument either defines or does not define observational weights. If `data` does not use weights, then the credit scorecard functions use the "counts" for `Good`, `Bad`, and `Odds`. However, if the optional `WeightsVar` argument is specified when creating a `creditscorecard` object, then the "counts" for `Good`, `Bad`, and `Odds` are the sum of weights.

For example, here is a snippet of an input table that does not define observational weights:

If you bin the customer age predictor data, with customers up to 45 years old in one bin, and 46 and up in another bin, you get these statistics:

`Good` is the total number of rows with a `0` value in the `status` response variable. `Bad` is the number of `1`s in the `status` column. `Odds` is the ratio of `Good` to `Bad`. The `Good`, `Bad`, and `Odds` values are reported for each bin. This means that the sample contains 381 people aged 45 and under who paid their loans and 241 in the same age range who defaulted, so the odds of being good for that age range are `1.581`.
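The counting logic described above can be sketched in a few lines of plain Python. This only illustrates the arithmetic and is not the toolbox implementation; the function name `bin_counts` and the toy rows are hypothetical, and only the 381 and 241 figures come from the text.

```python
def bin_counts(rows, cut=45):
    """Count Good (status 0) and Bad (status 1) rows per age bin.

    rows: iterable of (age, status) pairs; `cut` is the upper edge of the
    first bin. The function name and the toy data are hypothetical.
    """
    stats = {"up to 45": [0, 0], "46 and up": [0, 0]}  # [Good, Bad]
    for age, status in rows:
        label = "up to 45" if age <= cut else "46 and up"
        stats[label][status] += 1  # status 0 -> Good slot, 1 -> Bad slot
    return {
        label: {"Good": g, "Bad": b, "Odds": g / b if b else float("inf")}
        for label, (g, b) in stats.items()
    }

# Hypothetical rows: (customer age, status)
toy_rows = [(30, 0), (41, 1), (45, 0), (52, 0), (60, 0), (63, 1), (70, 0)]
result = bin_counts(toy_rows)
print(result)

# Odds for the first bin reported in the text: 381 Good to 241 Bad.
first_bin_odds = 381 / 241
print(round(first_bin_odds, 3))
```

Each row contributes exactly one unit to either `Good` or `Bad`, which is why the unweighted statistics are integer counts.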

Suppose that the modeler thinks that people 45 and younger are underrepresented in this sample and wants to give all rows with ages up to 45 a higher weight. Assume that the modeler thinks the up-to-45 age group should have 50% more weight than rows with ages 46 and up. The table data is expanded to include the observation weights: a `Weight` column is added to the table, where all rows with ages 45 and under have a weight of `1.5`, and all other rows have a weight of `1`. There are other reasons to use weights. For example, recent data points may be given higher weights than older data points.

If you bin the weighted data based on age (45 and under versus 46 and up), the expectation is that each row with age 45 and under counts as 1.5 observations, and therefore the `Good` and `Bad` “counts” for that bin increase by 50%:

The “counts” are now “weighted frequencies” and are no longer integer values. The `Odds` do not change for the first bin. The particular weights given in this example scale the total `Good` and `Bad` counts in the first bin by the same factor, so their ratio does not change. However, the `Odds` value of the total sample does change: the first bin now carries a higher weight, and because the odds in that bin are lower, the total `Odds` are now lower, too. Other credit scorecard statistics not shown here, such as `WOE` and `Information Value`, are affected in a similar way.
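The scaling argument can be checked with a short calculation. In this sketch, the first-bin counts (381 and 241) and the weight of 1.5 come from the text. The second-bin counts (400 and 150) are hypothetical values chosen only for illustration, as is the helper `total_odds`.

```python
# First-bin counts come from the text; second-bin counts are HYPOTHETICAL.
bins = {
    "up to 45": {"good": 381, "bad": 241, "weight": 1.5},
    "46 and up": {"good": 400, "bad": 150, "weight": 1.0},  # assumed values
}

def total_odds(bins, weighted):
    """Return total Good / total Bad, optionally applying each bin's weight."""
    good = sum(b["good"] * (b["weight"] if weighted else 1.0) for b in bins.values())
    bad = sum(b["bad"] * (b["weight"] if weighted else 1.0) for b in bins.values())
    return good / bad

first = bins["up to 45"]
weighted_good = first["good"] * first["weight"]  # 381 * 1.5
weighted_bad = first["bad"] * first["weight"]    # 241 * 1.5

# A uniform weight within the bin cancels in the ratio, so the bin odds stay the same.
bin_odds_before = first["good"] / first["bad"]
bin_odds_after = weighted_good / weighted_bad

odds_unweighted = total_odds(bins, weighted=False)
odds_weighted = total_odds(bins, weighted=True)

print(weighted_good, weighted_bad)
print(round(bin_odds_before, 3), round(bin_odds_after, 3))
print(odds_weighted < odds_unweighted)
```

Under these assumed second-bin counts, the weighted total odds are lower than the unweighted total odds, because the lower-odds first bin now contributes more to both totals.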

In general, the effect of weights is not simply to scale the frequencies in a particular bin, because members of that bin can have different weights. The goal of this example is to demonstrate the concept of switching from counts to the sum of weights.
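To see the general case, where weights differ from row to row inside one bin, consider the following minimal sketch. The function `weighted_bin_stats` and the five-row bin are hypothetical, and the code assumes the bin contains at least one `Bad` row so that the ratio is defined.

```python
def weighted_bin_stats(statuses, weights):
    """Sum the weights of Good (status 0) and Bad (status 1) rows in one bin."""
    good = sum(w for s, w in zip(statuses, weights) if s == 0)
    bad = sum(w for s, w in zip(statuses, weights) if s == 1)
    return good, bad, good / bad  # assumes bad > 0

# Hypothetical bin with five rows; weights differ from row to row.
statuses = [0, 0, 0, 1, 1]
weights = [1.0, 1.0, 1.0, 2.0, 1.5]

# With unit weights, the sums of weights reduce to plain counts.
unit = weighted_bin_stats(statuses, [1.0] * len(statuses))
mixed = weighted_bin_stats(statuses, weights)

print(unit)
print(mixed)
```

With unit weights the function reproduces the plain counts. With mixed weights the `Good` and `Bad` sums are scaled by different amounts, so the odds of the bin itself change.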

## See Also

`creditscorecard` | `autobinning` | `bininfo` | `fitmodel` | `validatemodel`

## Related Examples

- Bias Mitigation in Credit Scoring by Reweighting (Risk Management Toolbox)