# Cox Proportional Hazards Model for Censored Data

This example shows how to construct a Cox proportional hazards model, and assess the significance of the predictor variables.

### Step 1. Load sample data.

`load readmissiontimes`

The response variable is `ReadmissionTime`, which shows the readmission times for 100 patients. The predictor variables are `Age`, `Sex`, `Weight`, and the smoking status of each patient, `Smoker`. 1 indicates the patient is a smoker, and 0 indicates that the patient does not smoke. The column vector `Censored` has the censorship information for each patient, where 1 indicates censored data, and 0 indicates the exact readmission times are observed. This is simulated data.

### Step 2. Fit Cox proportional hazards function.

Fit a Cox proportional hazard function with the variable Sex as the predictor variable, taking the censoring into account.

```X = Sex; [b,logl,H,stats] = coxphfit(X,ReadmissionTime,'censoring',Censored);```

Assess the statistical significance of the term Sex.

`stats`
```stats = struct with fields: covb: 0.1016 beta: -1.7642 se: 0.3188 z: -5.5335 p: 3.1392e-08 csres: [100x1 double] devres: [100x1 double] martres: [100x1 double] schres: [100x1 double] sschres: [100x1 double] scores: [100x1 double] sscores: [100x1 double] ```

The $p$-value, `p`, indicates that the term Sex is statistically significant.

Save the loglikelihood value with a different name. You will use this to assess the significance of the extended models.

`loglSex = logl`
```loglSex = -262.1365 ```

### Step 3. Add Age and Weight to the model.

Fit a Cox proportional hazards model with the variables Sex, Age, and Weight.

```X = [Sex Age Weight]; [b,logl,H,stats] = coxphfit(X,ReadmissionTime,'censoring',Censored);```

Assess the significance of the terms.

`stats.beta`
```ans = 3×1 -0.5441 0.0143 0.0250 ```
`stats.p`
```ans = 3×1 0.4953 0.3842 0.0960 ```

None of the terms, adjusted for others, is statistically significant.

Assess the significance of the terms using the log likelihood ratio. You can assess the significance of the new model using the likelihood ratio statistic. First find the difference between the log-likelihood statistic of the model without the terms Age and Weight and the log-likelihood of the model with Sex, Age, and Weight.

`-2*[loglSex - logl]`
```ans = 3.6705 ```

Now, compute the $p$-value for the likelihood ratio statistic. The likelihood ratio statistic has a Chi-square distribution with a degrees of freedom equal to the number of predictor variables being assessed. In this case, the degrees of freedom is 2.

`p = 1 - cdf('chi2',3.6705,2)`
```p = 0.1596 ```

The $p$-value of 0.1596 indicates that the terms Age and Weight are not statistically significant, given the term Sex in the model.

### Step 4. Add Smoker to the model.

Fit a Cox proportional hazards model with the variables Sex and Smoker.

```X = [Sex Smoker]; [b,logl,H,stats] = coxphfit(X,ReadmissionTime,... 'censoring',Censored);```

Assess the significance of the terms in the model.

`stats.p`
```ans = 2×1 0.0000 0.0148 ```

Compare this model to the first model where Sex is the only term.

` -2*[loglSex - logl]`
```ans = 5.5789 ```

Compute the $p$-value for the likelihood ratio statistic. The likelihood ratio statistic has a Chi-square distribution with a degree of freedom of 1.

`p = 1 - cdf('chi2',5.5789,1)`
```p = 0.0182 ```

The $p$-value of 0.0182 indicates that Sex and Smoker are statistically significant given the other is in the model. The model with Sex and Smoker is a better fit compared to the model with only Sex.

Request the coefficient estimates.

` stats.beta`
```ans = 2×1 -1.7165 0.6338 ```

The default baseline is the mean of `X`, so the final model for the hazard ratio is

`$HR=\frac{{h}_{X}\left(t\right)}{{h}_{\underset{}{\overset{‾}{X}}}\left(t\right)}=\mathrm{exp}\left[{\beta }_{s}\left({X}_{s}-{\underset{}{\overset{‾}{X}}}_{s}\right)+{\beta }_{\alpha }\left({X}_{\alpha }-{\underset{}{\overset{‾}{X}}}_{\alpha }\right)\right].$`

Fit a Cox proportional hazards model with a baseline of 0.

```X = [Sex Smoker]; [b,logl,H,stats] = coxphfit(X,ReadmissionTime,... 'censoring',Censored,'baseline',0);```

The model for the hazard ratio is

`$HR=\frac{{h}_{X}\left(t\right)}{{h}_{0}\left(t\right)}=\mathrm{exp}\left[{\beta }_{s}{X}_{s}+{\beta }_{\alpha }{X}_{\alpha }\right].$`

Request the coefficient estimates.

` stats.beta`
```ans = 2×1 -1.7165 0.6338 ```

The coefficients are not affected, but the hazard rate differs from when the baseline is the mean of `X`.