Traditional experimental designs (Full Factorial Designs, Fractional Factorial Designs, and Response Surface Designs) are appropriate for calibrating
linear models in experimental settings where factors are relatively
unconstrained in the region of interest. In some cases, however, models
are necessarily nonlinear. In other cases, certain treatments (combinations
of factor levels) may be expensive or infeasible to measure. *D-optimal
designs* are model-specific designs that address these
limitations of traditional designs.

A D-optimal design is generated by an iterative search algorithm
and seeks to minimize the covariance of the parameter estimates for
a specified model. This is equivalent to maximizing the determinant *D* =
|*X*^{T}*X*|,
where *X* is the design matrix of model terms (the
columns) evaluated at specific treatments in the design space (the
rows). Unlike traditional designs, D-optimal designs do not require
orthogonal design matrices, and as a result, parameter estimates may
be correlated. Parameter estimates may also be locally, but not globally,
D-optimal.

There are several Statistics and Machine Learning Toolbox™ functions for generating D-optimal designs:

Function | Description |
---|---|

`candexch` | Uses a row-exchange algorithm to generate a D-optimal
design with a specified number of runs for a specified model and a
specified candidate set. This is the second component of the algorithm
used by |

`candgen` | Generates a candidate set for a specified model. This
is the first component of the algorithm used by |

`cordexch` | Uses a coordinate-exchange algorithm to generate a D-optimal design with a specified number of runs for a specified model. |

`daugment` | Uses a coordinate-exchange algorithm to augment an existing D-optimal design with additional runs to estimate additional model terms. |

`dcovary` | Uses a coordinate-exchange algorithm to generate a D-optimal design with fixed covariate factors. |

`rowexch` | Uses a row-exchange algorithm to generate a D-optimal
design with a specified number of runs for a specified model. The
algorithm calls |

The following sections explain how to use these functions to generate D-optimal designs.

The Statistics and Machine
Learning Toolbox function `rsmdemo`

generates
simulated data for experimental settings specified by either the user
or by a D-optimal design generated by `cordexch`

.
It uses the `rstool`

interface
to visualize response surface models fit to the data, and it uses
the `nlintool`

interface to visualize
a nonlinear model fit to the data.

Two Statistics and Machine Learning Toolbox algorithms generate D-optimal designs:

Both `cordexch`

and `rowexch`

use iterative search algorithms.
They operate by incrementally changing an initial design matrix *X* to
increase *D* = |*X*^{T}*X*|
at each step. In both algorithms, there is randomness built into the
selection of the initial design and into the choice of the incremental
changes. As a result, both algorithms may return locally, but not
globally, D-optimal designs. Run each algorithm multiple times and
select the best result for your final design. Both functions have
a `'tries'`

parameter that automates this repetition
and comparison.

At each step, the row-exchange algorithm exchanges an entire
row of *X* with a row from a design matrix *C* evaluated
at a *candidate
set* of feasible treatments. The `rowexch`

function
automatically generates a *C* appropriate for a specified
model, operating in two steps by calling the `candgen`

and `candexch`

functions in sequence. Provide
your own *C* by calling `candexch`

directly.
In either case, if *C* is large, its static presence
in memory can affect computation.

The coordinate-exchange algorithm, by contrast, does not use
a candidate set. (Or rather, the candidate set is the entire design
space.) At each step, the coordinate-exchange algorithm exchanges
a single element of *X* with a new element evaluated
at a neighboring point in design space. The absence of a candidate
set reduces demands on memory, but the smaller scale of the search
means that the coordinate-exchange algorithm is more likely to become
trapped in a local minimum than the row-exchange algorithm.

For example, suppose you want a design to estimate the parameters in the following three-factor, seven-term interaction model:

$$y={\beta}_{0}+{\beta}_{1}x{}_{1}+{\beta}_{2}x{}_{2}+{\beta}_{3}x{}_{3}+{\beta}_{12}x{}_{1}x{}_{2}+{\beta}_{13}x{}_{1}x{}_{3}+{\beta}_{23}x{}_{2}x{}_{3}+\epsilon $$

Use `cordexch`

to generate
a D-optimal design with seven runs:

nfactors = 3; nruns = 7; [dCE,X] = cordexch(nfactors,nruns,'interaction','tries',10) dCE = -1 1 1 -1 -1 -1 1 1 1 -1 1 -1 1 -1 1 1 -1 -1 -1 -1 1 X = 1 -1 1 1 -1 -1 1 1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 -1 1 -1 -1 1 -1 1 1 -1 1 -1 1 -1 1 1 -1 -1 -1 -1 1 1 -1 -1 1 1 -1 -1

Columns of the design matrix `X`

are the model
terms evaluated at each row of the design `dCE`

.
The terms appear in order from left to right:

Constant term

Linear terms (1, 2, 3)

Interaction terms (12, 13, 23)

Use `X`

in a linear regression model fit to
response data measured at the design points in `dCE`

.

Use `rowexch`

in a similar
fashion to generate an equivalent design:

[dRE,X] = rowexch(nfactors,nruns,'interaction','tries',10) dRE = -1 -1 1 1 -1 1 1 -1 -1 1 1 1 -1 -1 -1 -1 1 -1 -1 1 1 X = 1 -1 -1 1 1 -1 -1 1 1 -1 1 -1 1 -1 1 1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 1 1 1 -1 1 -1 -1 1 -1 1 -1 1 1 -1 -1 1

In practice, you may want to add runs to a completed experiment
to learn more about a process and estimate additional model coefficients.
The `daugment`

function uses a
coordinate-exchange algorithm to augment an existing D-optimal design.

For example, the following eight-run design is adequate for estimating main effects in a four-factor model:

dCEmain = cordexch(4,8) dCEmain = 1 -1 -1 1 -1 -1 1 1 -1 1 -1 1 1 1 1 -1 1 1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 -1 1 -1

To estimate the six interaction terms in the model, augment the design with eight additional runs:

dCEinteraction = daugment(dCEmain,8,'interaction') dCEinteraction = 1 -1 -1 1 -1 -1 1 1 -1 1 -1 1 1 1 1 -1 1 1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 1 1 1 -1 -1 -1 -1 1 -1 1 -1 1 1 -1 1 -1 1 1 -1 1 1 -1 -1 1 -1 1 1 1 1 1 -1

The augmented design is full factorial, with the original eight runs in the first eight rows.

The `'start'`

parameter of the `candexch`

function provides the same functionality
as `daugment`

, but uses a row exchange algorithm
rather than a coordinate-exchange algorithm.

In many experimental settings, certain factors and their covariates
are constrained to a fixed set of levels or combinations of levels.
These cannot be varied when searching for an optimal design. The `dcovary`

function allows you to specify
fixed covariate factors in the coordinate exchange algorithm.

For example, suppose you want a design to estimate the parameters in a three-factor linear additive model, with eight runs that necessarily occur at different times. If the process experiences temporal linear drift, you may want to include the run time as a variable in the model. Produce the design as follows:

time = linspace(-1,1,8)'; [dCV,X] = dcovary(3,time,'linear') dCV = -1.0000 1.0000 1.0000 -1.0000 1.0000 -1.0000 -1.0000 -0.7143 -1.0000 -1.0000 -1.0000 -0.4286 1.0000 -1.0000 1.0000 -0.1429 1.0000 1.0000 -1.0000 0.1429 -1.0000 1.0000 -1.0000 0.4286 1.0000 1.0000 1.0000 0.7143 -1.0000 -1.0000 1.0000 1.0000 X = 1.0000 -1.0000 1.0000 1.0000 -1.0000 1.0000 1.0000 -1.0000 -1.0000 -0.7143 1.0000 -1.0000 -1.0000 -1.0000 -0.4286 1.0000 1.0000 -1.0000 1.0000 -0.1429 1.0000 1.0000 1.0000 -1.0000 0.1429 1.0000 -1.0000 1.0000 -1.0000 0.4286 1.0000 1.0000 1.0000 1.0000 0.7143 1.0000 -1.0000 -1.0000 1.0000 1.0000

The column vector `time`

is a fixed factor,
normalized to values between ±`1`

. The number
of rows in the fixed factor specifies the number of runs in the design.
The resulting design `dCV`

gives factor settings
for the three controlled model factors at each time.

Categorical factors take values in a discrete set of levels.
Both `cordexch`

and `rowexch`

have a `'categorical'`

parameter
that allows you to specify the indices of categorical factors and
a `'levels'`

parameter that allows you to specify
a number of levels for each factor.

For example, the following eight-run design is for a linear additive model with five factors in which the final factor is categorical with three levels:

dCEcat = cordexch(5,8,'linear','categorical',5,'levels',3) dCEcat = -1 -1 1 1 2 -1 -1 -1 -1 3 1 1 1 1 3 1 1 -1 -1 2 1 -1 -1 1 3 -1 1 -1 1 1 -1 1 1 -1 3 1 -1 1 -1 1

The row-exchange algorithm exchanges rows of an initial design
matrix *X* with rows from a design matrix *C* evaluated
at a candidate set of feasible treatments. The `rowexch`

function
automatically generates a *C* appropriate for a specified
model, operating in two steps by calling the `candgen`

and `candexch`

functions in sequence. Provide
your own *C* by calling `candexch`

directly.

For example, the following uses `rowexch`

to
generate a five-run design for a two-factor pure quadratic model using
a candidate set that is produced internally:

dRE1 = rowexch(2,5,'purequadratic','tries',10) dRE1 = -1 1 0 0 1 -1 1 0 1 1

The same thing can be done using `candgen`

and `candexch`

in sequence:

[dC,C] = candgen(2,'purequadratic') % Candidate set, C dC = -1 -1 0 -1 1 -1 -1 0 0 0 1 0 -1 1 0 1 1 1 C = 1 -1 -1 1 1 1 0 -1 0 1 1 1 -1 1 1 1 -1 0 1 0 1 0 0 0 0 1 1 0 1 0 1 -1 1 1 1 1 0 1 0 1 1 1 1 1 1 treatments = candexch(C,5,'tries',10) % D-opt subset treatments = 2 1 7 3 4 dRE2 = dC(treatments,:) % Display design dRE2 = 0 -1 -1 -1 -1 1 1 -1 -1 0

You can replace `C`

in this example with a
design matrix evaluated at your own candidate set. For example, suppose
your experiment is constrained so that the two factors cannot have
extreme settings simultaneously. The following produces a restricted
candidate set:

constraint = sum(abs(dC),2) < 2; % Feasible treatments my_dC = dC(constraint,:) my_dC = 0 -1 -1 0 0 0 1 0 0 1

Use the `x2fx`

function to
convert the candidate set to a design matrix:

my_C = x2fx(my_dC,'purequadratic') my_C = 1 0 -1 0 1 1 -1 0 1 0 1 0 0 0 0 1 1 0 1 0 1 0 1 0 1

Find the required design in the same manner:

my_treatments = candexch(my_C,5,'tries',10) % D-opt subset my_treatments = 2 4 5 1 3 my_dRE = my_dC(my_treatments,:) % Display design my_dRE = -1 0 1 0 0 1 0 -1 0 0