Factorizations
Introduction
All three of the matrix factorizations discussed in this section make use of triangular matrices, where all the elements either above or below the diagonal are zero. Systems of linear equations involving triangular matrices are easily and quickly solved using either forward or back substitution.
Cholesky Factorization
The Cholesky factorization expresses a symmetric matrix as the product of a triangular matrix and its transpose
A = R′R,
where R is an upper triangular matrix.
Not all symmetric matrices can be factored in this way; the matrices that have such a factorization are said to be positive definite. This implies that all the diagonal elements of A are positive and that the off-diagonal elements are “not too big.” The Pascal matrices provide an interesting example. Throughout this chapter, the example matrix A
has been the 3-by-3 Pascal matrix. Temporarily switch to the 6-by-6:
A = pascal(6) A = 1 1 1 1 1 1 1 2 3 4 5 6 1 3 6 10 15 21 1 4 10 20 35 56 1 5 15 35 70 126 1 6 21 56 126 252
The elements of A
are binomial coefficients. Each element is the sum of its north and west neighbors. The Cholesky factorization is
R = chol(A) R = 1 1 1 1 1 1 0 1 2 3 4 5 0 0 1 3 6 10 0 0 0 1 4 10 0 0 0 0 1 5 0 0 0 0 0 1
The elements are again binomial coefficients. The fact that R'*R
is equal to A
demonstrates an identity involving sums of products of binomial coefficients.
Note
The Cholesky factorization also applies to complex matrices. Any complex matrix that has a Cholesky factorization satisfies
A′ = A
The Cholesky factorization allows the linear system
Ax = b
to be replaced by
R′Rx = b.
Because the backslash operator recognizes triangular systems, this can be solved in the MATLAB® environment quickly with
x = R\(R'\b)
If A
is n-by-n, the computational complexity of chol(A)
is O(n3), but the complexity of the subsequent backslash solutions is only O(n2).
LU Factorization
LU factorization, or Gaussian elimination, expresses any square matrix A as the product of a permutation of a lower triangular matrix and an upper triangular matrix
A = LU,
where L is a permutation of a lower triangular matrix with ones on its diagonal and U is an upper triangular matrix.
The permutations are necessary for both theoretical and computational reasons. The matrix
cannot be expressed as the product of triangular matrices without interchanging its two rows. Although the matrix
can be expressed as the product of triangular matrices, when ε is small, the elements in the factors are large and magnify errors, so even though the permutations are not strictly necessary, they are desirable. Partial pivoting ensures that the elements of L are bounded by one in magnitude and that the elements of U are not much larger than those of A.
For example:
[L,U] = lu(B) L = 1.0000 0 0 0.3750 0.5441 1.0000 0.5000 1.0000 0 U = 8.0000 1.0000 6.0000 0 8.5000 -1.0000 0 0 5.2941
The LU factorization of A
allows the linear system
A*x = b
to be solved quickly with
x = U\(L\b)
Determinants and inverses are computed from the LU factorization using
det(A) = det(L)*det(U)
and
inv(A) = inv(U)*inv(L)
You can also compute the determinants using det(A) = prod(diag(U))
, though the signs of the determinants might be reversed.
QR Factorization
An orthogonal matrix, or a matrix with orthonormal columns, is a real matrix whose columns all have unit length and are perpendicular to each other. If Q is orthogonal, then
QTQ = I,
where I is the identity matrix.
The simplest orthogonal matrices are two-dimensional coordinate rotations:
For complex matrices, the corresponding term is unitary. Orthogonal and unitary matrices are desirable for numerical computation because they preserve length, preserve angles, and do not magnify errors.
The orthogonal, or QR, factorization expresses any rectangular matrix as the product of an orthogonal or unitary matrix and an upper triangular matrix. A column permutation might also be involved:
A = QR
or
AP = QR,
where Q is orthogonal or unitary, R is upper triangular, and P is a permutation.
There are four variants of the QR factorization—full or economy size, and with or without column permutation.
Overdetermined linear systems involve a rectangular matrix with more rows than columns, that is m-by-n with m > n. The full-size QR factorization produces a square, m-by-m orthogonal Q and a rectangular m-by-n upper triangular R:
C=gallery('uniformdata',[5 4], 0); [Q,R] = qr(C) Q = 0.6191 0.1406 -0.1899 -0.5058 0.5522 0.1506 0.4084 0.5034 0.5974 0.4475 0.3954 -0.5564 0.6869 -0.1478 -0.2008 0.3167 0.6676 0.1351 -0.1729 -0.6370 0.5808 -0.2410 -0.4695 0.5792 -0.2207 R = 1.5346 1.0663 1.2010 1.4036 0 0.7245 0.3474 -0.0126 0 0 0.9320 0.6596 0 0 0 0.6648 0 0 0 0
In many cases, the last m – n columns of Q are not needed because they are multiplied by the zeros in the bottom portion of R. So the economy-size QR factorization produces a rectangular, m-by-n Q with orthonormal columns and a square n-by-n upper triangular R. For the 5-by-4 example, this is not much of a saving, but for larger, highly rectangular matrices, the savings in both time and memory can be quite important:
[Q,R] = qr(C,0) Q = 0.6191 0.1406 -0.1899 -0.5058 0.1506 0.4084 0.5034 0.5974 0.3954 -0.5564 0.6869 -0.1478 0.3167 0.6676 0.1351 -0.1729 0.5808 -0.2410 -0.4695 0.5792 R = 1.5346 1.0663 1.2010 1.4036 0 0.7245 0.3474 -0.0126 0 0 0.9320 0.6596 0 0 0 0.6648
In contrast to the LU factorization, the QR factorization does not require any pivoting or permutations. But an optional column permutation, triggered by the presence of a third output argument, is useful for detecting singularity or rank deficiency. At each step of the factorization, the column of the remaining unfactored matrix with largest norm is used as the basis for that step. This ensures that the diagonal elements of R occur in decreasing order and that any linear dependence among the columns is almost certainly be revealed by examining these elements. For the small example given here, the second column of C
has a larger norm than the first, so the two columns are exchanged:
[Q,R,P] = qr(C) Q = -0.3522 0.8398 -0.4131 -0.7044 -0.5285 -0.4739 -0.6163 0.1241 0.7777 R = -11.3578 -8.2762 0 7.2460 0 0 P = 0 1 1 0
When the economy-size and column permutations are combined, the third output argument is a permutation vector, rather than a permutation matrix:
[Q,R,p] = qr(C,0) Q = -0.3522 0.8398 -0.7044 -0.5285 -0.6163 0.1241 R = -11.3578 -8.2762 0 7.2460 p = 2 1
The QR factorization transforms an overdetermined linear system into an equivalent triangular system. The expression
norm(A*x - b)
equals
norm(Q*R*x - b)
Multiplication by orthogonal matrices preserves the Euclidean norm, so this expression is also equal to
norm(R*x - y)
where y = Q'*b
. Since the last m-n rows of R are zero, this expression breaks into two pieces:
norm(R(1:n,1:n)*x - y(1:n))
and
norm(y(n+1:m))
When A
has full rank, it is possible to solve for x
so that the first of these expressions is zero. Then the second expression gives the norm of the residual. When A
does not have full rank, the triangular structure of R
makes it possible to find a basic solution to the least-squares problem.
Using Multithreaded Computation for Factorization
MATLAB software supports multithreaded computation for a number of linear algebra and element-wise numerical functions. These functions automatically execute on multiple threads. For a function or expression to execute faster on multiple CPUs, a number of conditions must be true:
The function performs operations that easily partition into sections that execute concurrently. These sections must be able to execute with little communication between processes. They should require few sequential operations.
The data size is large enough so that any advantages of concurrent execution outweigh the time required to partition the data and manage separate execution threads. For example, most functions speed up only when the array contains several thousand elements or more.
The operation is not memory-bound; processing time is not dominated by memory access time. As a general rule, complicated functions speed up more than simple functions.
lu
and qr
show significant increase in speed on large double-precision arrays (on order of 10,000 elements).