## Fully Independent Conditional Approximation for GPR Models

The fully independent conditional (FIC) approximation [1] systematically approximates the true GPR kernel function so as to avoid the predictive variance problem of the subset of regressors (SR) approximation while still maintaining a valid Gaussian process. To use FIC for parameter estimation, specify the `'FitMethod','fic'` name-value pair argument in the call to `fitrgp`. To use FIC for prediction, specify the `'PredictMethod','fic'` name-value pair argument in the call to `fitrgp`.

### Approximating the Kernel Function

The FIC approximation to $k\left({x}_{i},{x}_{j}|\theta \right)$ for active set $\mathcal{A}\subset \mathcal{N}=\left\{1,2,...,n\right\}$ is given by:

`$\begin{array}{l}{\stackrel{^}{k}}_{FIC}\left({x}_{i},{x}_{j}|\theta ,\mathcal{A}\right)={\stackrel{^}{k}}_{SR}\left({x}_{i},{x}_{j}|\theta ,\mathcal{A}\right)+{\delta }_{ij}\left(k\left({x}_{i},{x}_{j}|\theta \right)-{\stackrel{^}{k}}_{SR}\left({x}_{i},{x}_{j}|\theta ,\mathcal{A}\right)\right),\\ \text{ }\text{ }\text{ }\text{ }\text{ }\text{ }{\delta }_{ij}=\left\{\begin{array}{ll}1,\hfill & \text{if}\text{\hspace{0.17em}}i=j,\hfill \\ 0\hfill & \text{if}\text{\hspace{0.17em}}i\ne j.\hfill \end{array}\end{array}$`

That is, the FIC approximation is equal to the SR approximation if $i\ne j$. For $i=j$, the software uses the exact kernel value rather than an approximation. Define an n-by-n diagonal matrix $\Omega \left(X|\theta ,\mathcal{A}\right)$ as follows:

`$\begin{array}{ll}{\left[\Omega \left(X|\theta ,\mathcal{A}\right)\right]}_{ij}\hfill & ={\delta }_{ij}\left(k\left({x}_{i},{x}_{j}|\theta \right)-{\stackrel{^}{k}}_{SR}\left({x}_{i},{x}_{j}|\theta ,\mathcal{A}\right)\right)\hfill \\ \hfill & =\left\{\begin{array}{ll}k\left({x}_{i},{x}_{j}|\theta \right)-{\stackrel{^}{k}}_{SR}\left({x}_{i},{x}_{j}|\theta ,\mathcal{A}\right)\hfill & \text{if}\text{\hspace{0.17em}}i=j,\hfill \\ 0\hfill & \text{if}\text{\hspace{0.17em}}i\ne j.\hfill \end{array}\hfill \end{array}$`

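The FIC approximation to $K\left(X,X|\theta \right)$ is then given by:

`$\begin{array}{ll}{\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)\hfill & ={\stackrel{^}{K}}_{SR}\left(X,X|\theta ,\mathcal{A}\right)+\Omega \left(X|\theta ,\mathcal{A}\right)\hfill \\ \hfill & =K\left(X,{X}_{\mathcal{A}}|\theta \right)K{\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)+\Omega \left(X|\theta ,\mathcal{A}\right).\hfill \end{array}$`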
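
As a numerical illustration of this construction, the following NumPy sketch builds the SR matrix, the diagonal correction $\Omega$, and their sum for a toy data set. It is not the `fitrgp` implementation. The squared exponential kernel, the helper name `sq_exp_kernel`, the data, and the active set are all assumptions chosen for illustration, and the SR form $K\left(X,{X}_{\mathcal{A}}|\theta \right)K{\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)$ is the standard one implied by the matrix identities used later in this section.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, sigma_f=1.0):
    """Squared exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length_scale**2)


# Toy data: n = 8 one-dimensional inputs and a hypothetical active set.
X = np.linspace(-3.0, 3.0, 8).reshape(-1, 1)
active = np.array([0, 3, 6])

K = sq_exp_kernel(X, X)                 # exact K(X, X | theta)
K_XA = K[:, active]                     # K(X, X_A | theta)
K_AA = K[np.ix_(active, active)]        # K(X_A, X_A | theta)

# SR approximation (assumed form): K(X, X_A) K(X_A, X_A)^{-1} K(X_A, X)
K_SR = K_XA @ np.linalg.solve(K_AA, K_XA.T)

# Omega keeps only the diagonal of the residual K - K_SR.
Omega = np.diag(np.diag(K - K_SR))

# FIC approximation: SR off the diagonal, exact kernel on the diagonal.
K_FIC = K_SR + Omega

off_diag = ~np.eye(len(X), dtype=bool)
print(np.allclose(np.diag(K_FIC), np.diag(K)))        # compares diagonals with the exact kernel
print(np.allclose(K_FIC[off_diag], K_SR[off_diag]))   # compares off-diagonals with SR
```

Because $\Omega$ is diagonal, only its $n$ diagonal entries need to be stored in practice; the dense matrix here is for readability only.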

### Parameter Estimation

Replacing $K\left(X,X|\theta \right)$ by ${\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)$ in the marginal log likelihood function produces its FIC approximation:

`$\begin{array}{ll}\mathrm{log}{P}_{FIC}\left(y|X,\beta ,\theta ,{\sigma }^{2},\mathcal{A}\right)=\hfill & -\frac{1}{2}{\left(y-H\beta \right)}^{T}{\left[{\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}\right]}^{-1}\left(y-H\beta \right)\hfill \\ \hfill & -\frac{n}{2}\mathrm{log}2\pi -\frac{1}{2}\mathrm{log}|{\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}|.\hfill \end{array}$`

As in the exact method, the software estimates the parameters by first computing $\stackrel{^}{\beta }\left(\theta ,{\sigma }^{2}\right)$, the optimal estimate of $\beta$ given $\theta$ and ${\sigma }^{2}$. It then estimates $\theta$ and ${\sigma }^{2}$ using the $\beta$-profiled marginal log likelihood. The FIC estimate of $\beta$ for given $\theta$ and ${\sigma }^{2}$ is

`$\begin{array}{l}{\stackrel{^}{\beta }}_{FIC}\left(\theta ,{\sigma }^{2},\mathcal{A}\right)={\left[\underset{*}{\underbrace{{H}^{T}{\left({\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}\right)}^{-1}H}}\right]}^{-1}\underset{**}{\underbrace{{H}^{T}{\left({\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}\right)}^{-1}y}},\\ *={H}^{T}\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}H-{H}^{T}\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}K\left(X,{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}H,\\ **={H}^{T}\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}y-{H}^{T}\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}K\left(X,{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}y,\\ {B}_{\mathcal{A}}=K\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)+K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}K\left(X,{X}_{\mathcal{A}}|\theta \right),\\ \Lambda \left(\theta ,{\sigma }^{2},\mathcal{A}\right)=\Omega \left(X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}.\end{array}$`
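
The expressions for $*$ and $**$ can be checked numerically. The NumPy sketch below assumes that the estimate has the usual generalized least squares form, ${\left[*\right]}^{-1}**$, and compares it with a direct solve against the dense matrix ${\stackrel{^}{K}}_{FIC}+{\sigma }^{2}{I}_{n}$. The kernel, the helper name `sq_exp_kernel`, the data, the linear basis, and the active set are hypothetical choices for illustration and do not reflect the internals of `fitrgp`.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, sigma_f=1.0):
    """Squared exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length_scale**2)


# Toy problem (all values are illustrative assumptions).
X = np.linspace(-3.0, 3.0, 10).reshape(-1, 1)
n = X.shape[0]
y = np.sin(X[:, 0]) + 0.5 * X[:, 0]
H = np.hstack([np.ones((n, 1)), X])      # linear basis [1, x]
active = np.array([1, 4, 8])             # hypothetical active set
sigma2 = 0.1                             # noise variance sigma^2

K = sq_exp_kernel(X, X)
K_XA = K[:, active]
K_AA = K[np.ix_(active, active)]
K_SR = K_XA @ np.linalg.solve(K_AA, K_XA.T)

# Lambda = Omega + sigma^2 I_n is diagonal, so store only its diagonal.
lam = np.diag(K - K_SR) + sigma2

# Products with Lambda^{-1} reduce to row-wise division.
Linv_H = H / lam[:, None]
Linv_y = y / lam
Linv_K = K_XA / lam[:, None]

B_A = K_AA + K_XA.T @ Linv_K

# The two bracketed terms, built only from m-by-m and diagonal solves.
star = H.T @ Linv_H - (H.T @ Linv_K) @ np.linalg.solve(B_A, K_XA.T @ Linv_H)
star2 = H.T @ Linv_y - (H.T @ Linv_K) @ np.linalg.solve(B_A, K_XA.T @ Linv_y)
beta_fic = np.linalg.solve(star, star2)

# Reference value: generalized least squares with the dense n-by-n matrix.
C = K_SR + np.diag(lam)
beta_direct = np.linalg.solve(H.T @ np.linalg.solve(C, H),
                              H.T @ np.linalg.solve(C, y))

print(np.allclose(beta_fic, beta_direct))
```

The structured route never factors an $n$-by-$n$ matrix; the only dense solves involve ${B}_{\mathcal{A}}$, whose size is set by the active set.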

Using ${\stackrel{^}{\beta }}_{FIC}\left(\theta ,{\sigma }^{2},\mathcal{A}\right)$, the $\beta$-profiled marginal log likelihood for the FIC approximation is:

`$\begin{array}{l}\mathrm{log}{P}_{FIC}\left(y|X,{\stackrel{^}{\beta }}_{FIC}\left(\theta ,{\sigma }^{2},\mathcal{A}\right),\theta ,{\sigma }^{2},\mathcal{A}\right)=\\ \begin{array}{ll}\hfill & -\frac{1}{2}{\left(y-H{\stackrel{^}{\beta }}_{FIC}\left(\theta ,{\sigma }^{2},\mathcal{A}\right)\right)}^{T}{\left({\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}\right)}^{-1}\left(y-H{\stackrel{^}{\beta }}_{FIC}\left(\theta ,{\sigma }^{2},\mathcal{A}\right)\right)\hfill \\ \hfill & -\frac{n}{2}\mathrm{log}2\pi -\frac{1}{2}\mathrm{log}|{\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}|,\hfill \end{array}\end{array}$`

where

`$\begin{array}{l}{\left({\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}\right)}^{-1}\\ \text{ }\text{ }\text{ }=\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}-\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}K\left(X,{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1},\\ \mathrm{log}|{\stackrel{^}{K}}_{FIC}\left(X,X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}|=\mathrm{log}|\Lambda \left(\theta ,{\sigma }^{2},\mathcal{A}\right)|+\mathrm{log}|{B}_{\mathcal{A}}|-\mathrm{log}|K\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)|.\end{array}$`
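
The inverse and log-determinant identities above are what make the FIC log likelihood cheap to evaluate. The NumPy sketch below checks both identities against dense computations and then evaluates the log likelihood by each route. The kernel, the helper name `sq_exp_kernel`, the data, the active set, and the fixed coefficient vector `beta` are assumptions for illustration only; the sketch says nothing about how `fitrgp` organizes the computation internally.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, sigma_f=1.0):
    """Squared exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length_scale**2)


# Toy problem (all values are illustrative assumptions).
X = np.linspace(-3.0, 3.0, 10).reshape(-1, 1)
n = X.shape[0]
y = np.sin(X[:, 0]) + 0.5 * X[:, 0]
H = np.hstack([np.ones((n, 1)), X])
beta = np.array([0.0, 0.5])              # hypothetical fixed coefficients
active = np.array([1, 4, 8])
sigma2 = 0.1

K = sq_exp_kernel(X, X)
K_XA = K[:, active]
K_AA = K[np.ix_(active, active)]
K_SR = K_XA @ np.linalg.solve(K_AA, K_XA.T)
lam = np.diag(K - K_SR) + sigma2         # diagonal of Lambda
Linv_K = K_XA / lam[:, None]             # Lambda^{-1} K(X, X_A)
B_A = K_AA + K_XA.T @ Linv_K

C = K_SR + np.diag(lam)                  # dense K_FIC + sigma^2 I_n

# Inverse identity: Lambda^{-1} - Lambda^{-1} K B_A^{-1} K^T Lambda^{-1}
C_inv_fast = np.diag(1.0 / lam) - Linv_K @ np.linalg.solve(B_A, Linv_K.T)
inverse_ok = np.allclose(C_inv_fast, np.linalg.inv(C))

# Log-determinant identity: log|Lambda| + log|B_A| - log|K_AA|
logdet_fast = (np.sum(np.log(lam))
               + np.linalg.slogdet(B_A)[1]
               - np.linalg.slogdet(K_AA)[1])
logdet_direct = np.linalg.slogdet(C)[1]

# FIC marginal log likelihood, evaluated by the structured and dense routes.
r = y - H @ beta
quad_fast = r @ (r / lam) - (Linv_K.T @ r) @ np.linalg.solve(B_A, Linv_K.T @ r)
loglik_fast = -0.5 * quad_fast - 0.5 * n * np.log(2 * np.pi) - 0.5 * logdet_fast
loglik_direct = (-0.5 * r @ np.linalg.solve(C, r)
                 - 0.5 * n * np.log(2 * np.pi) - 0.5 * logdet_direct)

print(inverse_ok, np.isclose(logdet_fast, logdet_direct),
      np.isclose(loglik_fast, loglik_direct))
```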

### Prediction

The FIC approximation to the distribution of ${y}_{new}$ given $y$, $X$, ${x}_{new}$ is

`$\begin{array}{ll}P\left({y}_{new}|y,X,{x}_{new}\right)\hfill & =\mathcal{N}\left({y}_{new}|h{\left({x}_{new}\right)}^{T}\beta +{\mu }_{FIC},{\sigma }_{new}^{2}+{\Sigma }_{FIC}\right)\hfill \end{array},$`

where ${\mu }_{FIC}$ and ${\Sigma }_{FIC}$ are the FIC approximations to $\mu$ and $\Sigma$ given for prediction using the exact GPR method. As in the SR case, ${\mu }_{FIC}$ and ${\Sigma }_{FIC}$ are obtained by replacing all occurrences of the true kernel with its FIC approximation. The final forms of ${\mu }_{FIC}$ and ${\Sigma }_{FIC}$ are as follows:

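`$\begin{array}{ll}{\mu }_{FIC}\hfill & =K\left({x}_{new}^{T},{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}\left(y-H\beta \right),\hfill \\ {\Sigma }_{FIC}\hfill & =k\left({x}_{new},{x}_{new}|\theta \right)-K\left({x}_{new}^{T},{X}_{\mathcal{A}}|\theta \right)K{\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)}^{-1}K\left({X}_{\mathcal{A}},{x}_{new}^{T}|\theta \right)\hfill \\ \hfill & +K\left({x}_{new}^{T},{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},{x}_{new}^{T}|\theta \right),\hfill \end{array}$`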

where

`$\begin{array}{l}{B}_{\mathcal{A}}=K\left({X}_{\mathcal{A}},{X}_{\mathcal{A}}|\theta \right)+K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}K\left(X,{X}_{\mathcal{A}}|\theta \right),\\ \Lambda \left(\theta ,{\sigma }^{2},\mathcal{A}\right)=\Omega \left(X|\theta ,\mathcal{A}\right)+{\sigma }^{2}{I}_{n}.\end{array}$`
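
To see how these quantities fit together, the NumPy sketch below computes the predictive mean and variance terms for one new point in two ways: with the compact formulas, and by substituting the FIC kernel directly into the exact GPR expressions. The mean formula used here, $K\left({x}_{new}^{T},{X}_{\mathcal{A}}|\theta \right){B}_{\mathcal{A}}^{-1}K\left({X}_{\mathcal{A}},X|\theta \right)\Lambda {\left(\theta ,{\sigma }^{2},\mathcal{A}\right)}^{-1}\left(y-H\beta \right)$, follows from that substitution and should be read as an assumption of this sketch. The kernel, the helper name `sq_exp_kernel`, the data, the coefficients `beta`, and the active set are likewise hypothetical.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, sigma_f=1.0):
    """Squared exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length_scale**2)


# Toy problem (all values are illustrative assumptions).
X = np.linspace(-3.0, 3.0, 10).reshape(-1, 1)
n = X.shape[0]
y = np.sin(X[:, 0]) + 0.5 * X[:, 0]
H = np.hstack([np.ones((n, 1)), X])
beta = np.array([0.0, 0.5])              # hypothetical fixed coefficients
active = np.array([1, 4, 8])
sigma2 = 0.1
x_new = np.array([[0.7]])                # new point, not in the training set

K = sq_exp_kernel(X, X)
K_XA = K[:, active]
K_AA = K[np.ix_(active, active)]
K_SR = K_XA @ np.linalg.solve(K_AA, K_XA.T)
lam = np.diag(K - K_SR) + sigma2         # diagonal of Lambda
B_A = K_AA + K_XA.T @ (K_XA / lam[:, None])

k_new_A = sq_exp_kernel(x_new, X[active])        # K(x_new^T, X_A | theta)
k_new_new = sq_exp_kernel(x_new, x_new).item()   # k(x_new, x_new | theta)
r = y - H @ beta

# Compact FIC formulas: only solves against B_A and K_AA are needed.
mu_fic = (k_new_A @ np.linalg.solve(B_A, K_XA.T @ (r / lam))).item()
Sigma_fic = (k_new_new
             - (k_new_A @ np.linalg.solve(K_AA, k_new_A.T)).item()
             + (k_new_A @ np.linalg.solve(B_A, k_new_A.T)).item())

# Direct substitution of the FIC kernel into the exact GPR expressions.
# x_new is not a training point, so the cross-covariance is the SR one.
k_new_X = k_new_A @ np.linalg.solve(K_AA, K_XA.T)
C = K_SR + np.diag(lam)
mu_direct = (k_new_X @ np.linalg.solve(C, r)).item()
Sigma_direct = k_new_new - (k_new_X @ np.linalg.solve(C, k_new_X.T)).item()

print(np.isclose(mu_fic, mu_direct), np.isclose(Sigma_fic, Sigma_direct))
```

The predicted response adds the basis term $h{\left({x}_{new}\right)}^{T}\beta$ to `mu_fic`, and the predictive variance adds the noise term to `Sigma_fic`.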

## References

[1] Quiñonero-Candela, J., and C. E. Rasmussen. A Unifying View of Sparse Approximate Gaussian Process Regression. Journal of Machine Learning Research. Vol. 6, pp. 1939–1959, 2005.