12.4.6. Schur Decomposition by the Iterative Francis Algorithm¶
The distinctive feature of the Francis algorithm is the implementation of the shift. Like the QR algorithm, the Wilkinson shift helps find complex eigenvalues and facilitates convergence. The first similarity transform incorporates the shift into a bulge in the first column of the matrix. Then, a sequence of similarity transforms chases the bulge down the diagonal and out of the matrix. The Schur vector matrix is the cumulative product of unitary transformation matrices. This method contrasts the QR algorithm, where the shift is applied and later removed external to the algorithm’s application of similarity transforms. The distinction of the shift inducement into the matrix is why the Francis algorithm finds the eigenvectors more efficiently than the QR algorithm. The Francis algorithm finds factors of the eigenvectors concurrent with finding the eigenvalues.
Note
Each introduction of a bulge and chasing the bulge out of the matrix is called a Francis step. Each Francis step diminishes the values of the elements along the subdiagonal. The accomplishment of a Francis step is comparable to a QR step from the QR algorithm, which is why the algorithm has been called the implicitly shifted QR algorithm. The “implicitly shifted” label refers to the implicit scheme for introducing the shifts into the matrix. This is a poor name for the algorithm because the algorithm never performs QR decomposition. The confusion caused by the name prompted David Watkins with his book on matrix computations [WATKINS10] to start a movement to rename the algorithm to Francis’s algorithm after its inventor, John Francis.
12.4.6.1. Implicit Q Theorem¶
The implicit Q theorem establishes that the Francis algorithm has the same potential to converge as the QR algorithm (Iterative QR Algorithm). There are multiple ways to state the implicit Q theorem. One can see with careful analysis that its different expressions are the same. It is listed here as it is essentially stated in the texts of Stewart [STEWART73] and Demmel [DEMMEL97], which seem to state it as simply as possible.
(Implicit Q Theorem)
Proof. The key aspect of the Francis algorithm that makes the implicit Q theorem true comes from using matrices in upper Hessenberg form. Each iteration begins with a unitary similarity transform, , designed to operate on the first column of . The similarity transformation of creates the bulge in the first column below the subdiagonal. Then, the subsequent similarity transforms, (), push the bulge from column to column, always below the subdiagonal, eventually pushing the bulge out of .
Removing the iteration count for clarity, let , , and introduce a new matrix, . We can find an expression for columns 2 through of by examining each column of , which we define as . Each element of comes from an abbreviated sum (stopping at ) since and are both in Hessenberg form.
Column of is then
(12.8)¶
The final term of equation (12.8) uniquely determines column of for from the first column of , , and columns 1 to of .
Each Francis step achieves the same effect as a QR step from the QR algorithm because both the implicit bulge and the explicit shift are derived from .
12.4.6.2. Francis Steps of Degree One¶
The schurFrancis
function finds the Schur decomposition [2] of a matrix
typically using Francis steps of degree one. The matrix is converted to
Hessenberg form before the iterative algorithm begins. Note that because of the
similarity requirements, Hessenberg form is the closest to upper triangular
form possible from a direct algorithm. Since the algorithm works along the
diagonal, chasing the bulge, diminishing the subdiagonal, and placing the
eigenvalues on the diagonal, the unitary transformations are Givens rotation
matrices (Givens Rotation Matrices). Note that the Francis algorithm does not use Givens
rotations for degree two Francis steps where the bulge is larger.
The first similarity transform creates the bulge in position . Then, the following similarity transforms chase the bulge down the diagonal, always two positions below the diagonal. The final similarity transform of the Francis step pushes the bulge below the last row of the matrix. The illustration in the example below shows the bulge locations with plus signs after each similarity transform on a non-symmetric matrix in Hessenberg form. The transform from the left moves the bulge to the right of the diagonal. But then the transposed transform from the right moves the bulge below the diagonal and one column to the right from its previous position.
As with the QR algorithm, the algorithm has converged to find the
eigenvalue at T(k, k)
when the subdiagonal
element, T(k, k-1)
, converges to a value close to zero. Deflation
then reduces the size of the matrix for future eigenvalue calculations.
12.4.6.3. Francis Steps of Degree Two¶
Complex numbers are not a problem for modern computer systems. However, that was not always the case. Due to the limitations of his computer, Francis developed a variant algorithm to minimize the need for storing or processing complex numbers. The algorithm uses a degree two Francis step, which takes advantage of complex eigenvalues in conjugate pairs. Complex conjugate pairs become real when they are either multiplied or added together. As with the degree one Francis step, the Wilkinson shift is calculated from the two eigenvalues of the lower-right elements of the real, upper Hessenberg matrix . A similarity transform from a Householder reflection matrix derived from creates a three-element bulge. The function yields a real vector, so the similarity transforms and are always real.
Since the calculations in the Francis step of degree two use only real
numbers, it is not possible to find complex eigenvalues on the diagonal
of the upper triangular factor. The algorithm leaves real eigenvalues on
the diagonal. However, complex eigenvalues are represented with
real submatrices. The lower-left element of each
submatrix is below the diagonal. So, we do not have a
proper Schur decomposition in the form of an upper triangular matrix
with the eigenvalues on the diagonal. With
complex eigenvalues and a non-triangular Schur decomposition, finding
the eigenvectors would require additional computation. Each eigenvector
could come from the null solution of a full characteristic equation from
the eigenvalues, which is the approach taken with the QR algorithm. In
addition to not requiring complex number storage and arithmetic, the
degree two step converges quickly because each step is the equivalent of
two degree one steps. Also, complex conjugate eigenvalue pairs are
always next to each other, which is not always true with the QR
algorithm or degree one Francis steps. For this reason, the schurFrancis
function uses the degree two algorithm first even when complex eigenvalues
are needed. The degree one algorithm is used after the degree two algorithm
to put the complex eigenvalues on the diagonal, which has minimal work to do
because most of the iterative computations were completed by the degree two
algorithm.
The degree two algorithm option
is also well suited to cases where only the eigenvalues are needed because
the eigenvalue solver can easily find the eigenvalues from the
submatrices.
The following example shows the output from the Schur decomposition with
the mode
option set to ’real’
so that degree two Francis steps
are used for complex eigenvalues. The example shows a quasitriangular
matrix with only real values. Notice the nonzero element on the
subdiagonal indicating a two complex conjugate
eigenvalues. The decomposition is still valid. That is, if the
Schur-vectors, are also requested, then
. Test the Schur decomposition with the
MATLAB command norm(A*Q - Q*T, ’fro’)
, which will return a number
close to zero if the decomposition is correct.
>> T = schurFrancis(A, 'real')
T =
27.1312 -8.5259 -11.5434 -3.4973 -5.0473
0 10.8960 9.3448 5.5289 -10.6843
0 0 -3.3198 17.2476 2.9627
0 0 -8.1861 -0.6561 -7.7879
0 0 0 0 -1.0513
>> eigs = eigFrancis(A)
eigs =
27.1312 + 0.0000i
10.8960 + 0.0000i
-1.9879 +11.8075i
-1.9879 -11.8075i
-1.0513 + 0.0000i
12.4.6.4. Detecting Convergence¶
Convergence to an eigenvalue with degree one Francis steps occurs when the value at is near zero. However, convergence to a pair of complex conjugate eigenvalues using the degree two procedure occurs when the value at is near zero. The locations of eigenvalues relative to zeros on the subdiagonal stem from our original definition of finding eigenvalues from the roots of the polynomial found from the determinant of the characteristic equation (see Finding Eigenvalues and Eigenvectors). We already know from the characteristic equation that every value on the diagonal of a strictly upper triangle matrix is an eigenvalue, which extends to block diagonal matrices (see Eigenvalues of a triangular block matrix).
Note
Some convergence points that the schurFrancis
function encounters
are stubborn and, although small, they do not go below the threshold
for detecting a convergence. So, we also detect convergence when the
subdiagonal value does not change between Francis steps.
12.4.6.5. Zeros on the Subdiagonal¶
The Francis algorithm aims to transition the matrix to the proper
upper triangular form. However, a mixture of zero and nonzero values on
the subdiagonal is problematic. The Francis algorithm requires that the
matrix be properly Hessenberg, which means
that the matrix has all nonzero elements on the subdiagonal. The
implicit Q theorem (Implicit Q Theorem) shows that the bulge in the
first column is applied to the remaining columns because the matrix is
properly Hessenberg. A zero on the subdiagonal prevents moving the bulge
below the subdiagonal when completing the similarity transform. If there
is no bulge, the next transform matrix is an identity matrix, and
convergence stops. The schurFrancis
function splits the matrix into
properly Hessenberg submatrix regions by changing to the variables s
and k
that control the columns where the algorithm starts and ends
the Francis iterations. The similarity transforms must operate on the
entire matrix to maintain the integrity of the Schur vectors so that we
can later use them to build the eigenvectors.
12.4.6.6. The schurFrancis
Function¶
function [Q, T] = schurFrancis(A, mode)
% SCHURFRANCIS Schur decomposition.
%
% Francis's (implicit shift) iterative Schur decomposition
% algorithm.
% [Q,T] = schurFrancis(A, mode);
%
% A - Input Square matrix
% mode - Optional input
%
% If mode is "real" and A is real, degree two steps are
% used for complex eigenvalues, otherwise the degree one
% steps are used.
%
% Q - Schur-Vectors, Unitary matrix, often complex
% T - Upper-triangular with the eigenvalues on the diagonal
% for all real eigenvalues or no 'real' option.
% - Quasitriangular if some eigenvalues are complex and
% mode == 'real'.
% - T is similar to A in the sense of same eigenvalues.
% A = Q*T*Q', likewise T = Q'*A*Q
%
% eigFrancis uses this function to find the eigen-decomposition.
% NOTE: This code is for educational purposes.
% For other purposes, use MATLAB's schur function.
%
% See also: SCHUR, EIG22, EIGFRANCIS, GIVENS, HOUSEHOLDER
[m, n] = size(A);
if m ~= n
error("Matrix must be square for Schur decomposition")
end
% Start with upper Hessenberg - zeros below first subdiagonal,
[Q, H] = hess(A);
% Call degree 2 algorithm first even for complex output because it
% puts complex conjugate eigenvalues next to each other
if isreal(A)
[Q, T] = schurHelper(H, Q, n, true);
if ~(exist('mode', 'var') && isequal(mode, 'real'))
[Q, T] = schurHelper(T, Q, n, false);
end
else
[Q, T] = schurHelper(H, Q, n, false);
end
if nargout <= 1, Q = T; end
end
function [Q, T] = schurHelper(T, Q, n, degree2)
% Implementation of Francis algorithm for Schur decomposition.
% A helper function to do the real work simplifies the implementation.
tol = eps(4); % 1e-14;
% Test subdiagonal for not properly Hessenberg
zeroSub = abs(diag(T, -1)) < tol;
if any(zeroSub)
if all(zeroSub) % upper triangular
return;
else
k = find(~zeroSub, 1, 'last') + 1;
s = find(zeroSub(1:k-2), 1, 'last') + 1;
notProper = true;
end
else
k = n; s = 1; notProper = false;
end
if degree2, p = zeros(n, 1); end
while k > s
[rho1, rho2] = eig22(T(k-1:k,k-1:k));
if degree2 && k >= 2 && ~isreal(rho2) % degree 2 step
if k == 2, break; end % finished
if abs(T(k-1,k-2)) < tol*(abs(T(k,k)) + abs(T(k-1,k-1)))
T(k-1,k-2) = 0;
k = k - 2; % deflation
if notProper && s > 1 && k <= s
s = nextStart(zeroSub(1:k-2), k);
end
continue;
end
lastSub = abs(T(k-1,k-2));
p(1:3) = [(T(s,s)-rho1)*(T(s,s)-rho2) + T(s,s+1)*T(s+1,s); ...
T(s+1,s)*((T(s,s) + T(s+1,s+1)) - (rho1 + rho2)); ...
T(s+2,s+1)*T(s+1,s)];
Hr = householder(real(p(1:k-s+1)), k-s+1, 0); % p is real
H = eye(n); H(s:k,s:k) = Hr;
T = H*T*H'; % create bulge with similarity transform
% Chase the bulge, put T back to Hessenberg form
[P, T] = hess(T); Q = Q*H'*P;
if abs(abs(T(k,k-2)) - lastSub) < tol
T(k-1,k-2) = 0; k = k - 2;
if notProper && s > 1 && k <= s
s = nextStart(zeroSub(1:k-2), k);
end
end
else % degree one step
lastSub = abs(T(k,k-1));
if abs(T(k,k-1)) < tol*(abs(T(k,k)) + abs(T(k-1,k-1)))
T(k, k-1) = 0;
if abs(imag(T(k,k))) < tol
T(k,k) = real(T(k,k)); % no imag round-off
end
k = k - 1; % deflation
if notProper && s > 1 && k <= s
s = nextStart(zeroSub(1:k-2), k);
end
continue;
end
if abs(T(k,k)-rho1) < abs(T(k,k)-rho2)
shift = rho1; else, shift = rho2;
end
G = givens((T(s,s) - shift), T(s+1,s), s, s+1, n);
T = G*T*G'; % create bulge with similarity transform
Q = Q*G'; % due to shift T(2,1) is not 0.
% chase the bulge
for c = s:k-2 % c = current column
G = givens(T(c+1,c), T(c+2,c), c+1, c+2, n);
T = G*T*G'; T(c+2,c) = 0; % no round-off error
Q = Q*G'; % Q = G0'*G1'*G2'*...*Gn'
end
if abs(abs(T(k,k-1)) - lastSub) < tol
T(k,k-1) = 0; k = k - 1;
if notProper && s > 1 && k <= s
s = nextStart(zeroSub(1:k-2), k);
end
end
end
end
end
function s = nextStart(zeroSub, k)
if any(zeroSub)
if all(zeroSub), s = k; return; end
if k > 2, s = find(zeroSub, 1, 'last') + 1;
else, s = 1;
end
else
s = 1;
end
end
[1] | Unreduced simply means that the elements on the subdiagonal are nonzero. |
[2] | MATLAB’s schur function also finds the Schur decomposition. Pass
the ’complex’ option to get results suitable for finding the
eigendecomposition, otherwise degree two Francis steps are used
(Francis Steps of Degree Two). |