12.3.2. The Schur Decomposition

In a 1909 paper, Issai Schur established the existence of a matrix decomposition satisfying the similarity requirement with factors of a unitary matrix, its conjugate (Hermitian) transpose, and an upper triangular matrix [SCHUR09]. His paper was the genesis of orthogonal matrix factoring methods.

Who was Issai Schur?

Picture of Issai Schur

Issai Schur (1875–1941) was from Mogilev, which is in Belarus but was then within the borders of Russia. He spoke both Russian and German from an early age. He studied and worked in Germany from 1894 until 1939. He considered himself German rather than Russian: he was a German citizen, spoke German without an accent, and had lived in Germany most of his life. His doctorate was from the University of Berlin where, except for a brief time at the University of Bonn, he also taught until the Nazi persecution of Jews forced him to leave Germany.

Schur was highly regarded for his research and teaching. Although we are interested in his contributions to linear algebra, he is best known for his fundamental work on the representation theory of groups. He also contributed to several other branches of mathematics.

As mandated by a law passed by the Nazi government, he was forced to retire in 1935, but the outcry from students and fellow faculty was so strong that the university revoked his dismissal. He had employment invitations from universities in the United States and Britain, but he declined all of them. Unable to understand how Jewish Germans could be unwelcome in Germany, he continued as the only remaining Jewish professor at the University of Berlin. He was forced out again in 1938. Schur left Germany for Palestine in 1939, broken in mind and body, after the final humiliation of having to find a sponsor to pay the Reich’s flight tax so that he would be allowed to leave. Without sufficient funds to live in Palestine, he sold his academic books to the Institute for Advanced Study at Princeton. He died two years later on his 66th birthday [OCONNOR98].

12.3.2.1. Schur Triangularization Theorem

Theorem 12.1 (Schur Triangularization Theorem)

If \mathbf{A} \in \mathbb{C}^{n{\times}n}, then there exists a unitary \mathbf{Q} \in \mathbb{C}^{n{\times}n} such that

\mathbf{A} = \mathbf{Q\,T\,Q}^H \quad \text{and} \quad
    \mathbf{T} = \mathbf{Q}^H \mathbf{A\,Q}

where \m{T} is similar to \m{A} (same eigenvalues) and \m{T} is upper triangular with the eigenvalues of \m{A} on its diagonal. Further, if \m{A} is Hermitian, then \m{T} is diagonal and the eigenvectors of \m{A} are held in the columns of \m{Q}.

Proof. Matrix \m{T} is similar to \m{A} by Similar Matrices. We show here that \mathbf{T}_n = \mathbf{Q}^H \mathbf{A\,Q} is upper triangular with the eigenvalues of \bf{A} on its diagonal.

First, note the eigensystem equation for the first eigenpair, \mathbf{A}\,\bm{x}_1 = \lambda_1\,\bm{x}_1. Let \mathbf{Q}_1 be a unitary matrix such that \mathbf{Q}_1^H \bm{x}_1 = k\,\bm{e}_1. A Householder reflector for vector \bm{x}_1 satisfies this requirement for \mathbf{Q}_1^H.

(12.4)\begin{array}{ll}
        \mathbf{Q}_1^H \mathbf{A}\,\bm{x}_1 = \m{Q}_1^H \lambda_1\,\bm{x}_1,
        &\mathbf{Q}_1^H \mathbf{A} \left(\m{Q}_1 \mathbf{Q}_1^H\right) \bm{x}_1
        = \lambda_1\,\mathbf{Q}_1^H \bm{x}_1 \\

         \left(\m{Q}_1^H \m{A}\,\m{Q}_1\right) k\,\bm{e}_1 = \lambda_1\,k\,
          \bm{e}_1, & \text{which establishes the first column of } \m{T}_1. \\
        \mathbf{T}_1 = \mathbf{Q}_1^H \mathbf{A}\,\mathbf{Q}_1 =
            \mat{\lambda_1, \text{*}; \mathbf{0}, \mathbf{B}},
            & \mathbf{B} \in \mathbb{C}^{(n-1){\times}(n-1)}.
    \end{array}

By similarity, \bf{A} and \mathbf{T}_1 have the same eigenvalues, therefore the eigenvalues of \bf{B} are \lambda_2, \ldots, \lambda_n. Next, repeating the steps outlined in equation (12.4), \mathbf{Q}_2 and \mathbf{T}_2 are found using \mathbf{B} instead of \bf{A}.

\mathbf{Q}_2^H \mathbf{B}\,\mathbf{Q}_2 =
        \mat{\lambda_2, \text{*}; \mathbf{0}, \mathbf{C}},
            \quad \mathbf{C} \in \mathbb{C}^{(n-2){\times}(n-2)}

\mathbf{T}_2 replaces \bf{B} from \mathbf{T}_1 with \mathbf{Q}_2^H \mathbf{B}\,\mathbf{Q}_2.

\mathbf{T}_2 =
        \mat{\lambda_1, \text{*}, \text{*}; 0, \lambda_2, \text{*};
            \mathbf{0}, \mathbf{0}, \mathbf{C}}

The procedure continues until upper triangular \mathbf{T}_n is found.

\mathbf{T}_n =
        \mat{\lambda_1, \text{*}, \text{*}, \text{*};
              0, \lambda_2, \text{*}, \text{*};
              \vdots, \vdots, \ddots, \vdots;
              0, 0, \cdots, \lambda_n}

\bf{Q} is the product of matrices formed from identity matrices and the unitary \mathbf{Q}_i matrices. Note that \mathbf{Q}_n is just 1.

\mathbf{Q} = \mathbf{Q}_1 \times
        \mat{1, \mathbf{0}; \mathbf{0}, \mathbf{Q}_2} \times
            \mat{1, 0, \mathbf{0}; 0, 1, \mathbf{0};
            \mathbf{0}, \mathbf{0}, \mathbf{Q}_3} \times \cdots
            \mat{1, 0, 0, \cdots, \mathbf{0}; 0, 1, 0, \cdots, \mathbf{0};
                 0, 0, 1, \cdots, \mathbf{0}; \vdots, {}, {}, \ddots, \vdots;
                 \mathbf{0}, \mathbf{0}, \mathbf{0}, \cdots, \mathbf{Q}_{n-1}}

The construction of \mathbf{T}_n repeats the same calculation on successively smaller submatrices. Since the procedure is identical at each step, if we can complete the reduction for a matrix of order k, we can complete it for a matrix of order k + 1. The proof is established by induction.

The distinction for the Hermitian case arises because the transformations act on the rows to the right of the diagonal in the same way as on the columns below the diagonal, so both are zeroed and \m{T} is diagonal. Proofs relating to the Schur decomposition of symmetric matrices are given in Real Eigenvalues of a Symmetric Matrix and Spectral Decomposition Theorem.

\qed
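One deflation step from the proof can be sketched numerically. This is a minimal sketch using NumPy: the small symmetric test matrix is arbitrary, and the eigenpair is taken from `numpy.linalg.eigh` purely for illustration, since the proof assumes a known eigenpair.

```python
import numpy as np

# Small symmetric test matrix (chosen so the eigenvalues are real).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# The proof assumes one known eigenpair; here we take it from eigh.
lam, X = np.linalg.eigh(A)
lam1, x1 = lam[0], X[:, 0]           # x1 has unit norm

# Householder reflector with Q1 x1 = e1. Q1 is orthogonal and symmetric,
# so Q1^H = Q1. Valid as long as x1 is not already a multiple of e1.
e1 = np.array([1.0, 0.0, 0.0])
v = x1 - e1
Q1 = np.eye(3) - 2.0 * np.outer(v, v) / (v @ v)

# Q1^H A Q1 has first column (lambda_1, 0, 0)^T, as in equation (12.4).
T1 = Q1 @ A @ Q1
```

The lower-right 2×2 block of `T1` plays the role of \m{B}, and the procedure repeats on it.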

You may have observed that the procedure in the proof showing the structure of \mathbf{T}_n and \bf{Q} is a direct sequence of calculations. However, it requires several eigenvalues and eigenvectors to complete, and if we already know the eigenpairs, then we don’t need the Schur decomposition. A proof is not a calculation. In practice, converting a matrix from upper Hessenberg form to Schur triangular form requires an iterative algorithm.
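Such an iterative routine is available as SciPy’s `scipy.linalg.schur`. A short sketch (the random complex test matrix is arbitrary) confirms the properties claimed by the theorem:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# schur computes A = Q T Q^H iteratively; no eigenpairs are supplied by us.
T, Q = schur(A, output='complex')

assert np.allclose(A, Q @ T @ Q.conj().T)        # A = Q T Q^H
assert np.allclose(Q.conj().T @ Q, np.eye(4))    # Q is unitary
assert np.allclose(np.tril(T, -1), 0)            # T is upper triangular
# The diagonal of T holds the eigenvalues of A.
assert np.allclose(np.sort_complex(np.diag(T)),
                   np.sort_complex(np.linalg.eigvals(A)))
```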

A couple of corollaries to the Schur Triangularization Theorem give interesting properties about the eigenvalues of a matrix.

12.3.2.1.1. Sum of Eigenvalues

Corollary 12.1

The trace of a matrix (sum of the diagonal elements) is the sum of the eigenvalues. Further, similarity transformations preserve both the eigenvalues and the trace of a matrix.

Proof. There are two ways to prove this corollary. The first proof follows from the Schur Triangularization Theorem; the second derives from the determinant of the characteristic matrix.

  1. By the Schur Triangularization Theorem, \m{A} = \m{Q\, T\, Q}^H, where the matrices are the same as in Schur Triangularization Theorem. Due to the cyclic property of the trace of a matrix product, \mathrm{trace}(\m{A}) = \mathrm{trace}(\m{Q\, T\, Q}^H)
= \mathrm{trace}(\m{T\, Q}^H \m{Q}) = \mathrm{trace}(\m{T}). Since the eigenvalues are on the diagonal of \m{T},

    \mathrm{trace}(\m{A})
            = \mathrm{trace}(\m{T}) = \sum_{i = 1}^n \lambda_i.

  2. An alternate expression of the characteristic equation is cp(t) = \det(t\m{I} - \m{A}) = t^n + c_{n-1}t^{n-1} + \cdots + c_1t + c_0. The factoring of cp reveals the eigenvalues of \m{A}.

    cp(t) = (t - \lambda_1)(t - \lambda_2)\ldots(t - \lambda_n)

    We find the c_{n-1} term of cp(t) by multiplying the factors, c_{n-1} = -(\lambda_1 + \lambda_2 + \cdots + \lambda_n). Then the Laplace expansion of \det(t\m{I} - \m{A}) shows c_{n-1} = -(a_{11} + a_{22} + \cdots + a_{nn}), which is the negative of the trace of \m{A}. By equating terms we have,

    \mathrm{trace}(\m{A}) = \sum_{i = 1}^n \lambda_i.

Likewise, if \m{C_T} = \m{T} - \m{\Lambda}, where \m{\Lambda} is the diagonal matrix of eigenvalues, then \m{C_T} is strictly upper triangular and \mathrm{trace}(\m{C_T}) = \mathrm{trace}(\m{T}) - \mathrm{trace}(\m{\Lambda}) = 0. Because similarity preserves the trace, \mathrm{trace}(\m{C_B}) = 0 as well, where \m{B} is any matrix that is similar to \m{T}. Note that a zero trace is insufficient proof that a matrix is singular, nor does singularity imply a zero trace. Only a zero determinant or a zero singular value proves singularity.

\qed
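Corollary 12.1 is easy to check numerically. A sketch with arbitrary random matrices (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
P = rng.standard_normal((5, 5))      # assumed invertible (true almost surely)
B = np.linalg.inv(P) @ A @ P         # B is similar to A

# trace(A) equals the sum of the eigenvalues. The sum is real up to
# roundoff, since complex eigenvalues of a real matrix come in
# conjugate pairs.
assert np.isclose(np.trace(A), np.linalg.eigvals(A).sum().real)

# Similarity transformations preserve the trace.
assert np.isclose(np.trace(A), np.trace(B))
```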

12.3.2.1.2. Product of Eigenvalues

Corollary 12.2

The determinant of a matrix is the product of the eigenvalues. Further, similarity transformations preserve both the eigenvalues and the determinant of a matrix.

Proof. By the Schur Triangularization Theorem, \m{A} = \m{Q\, T\, Q}^H, where the matrices are the same as in Schur Triangularization Theorem. The determinant of \m{A} is the product of the determinants of its factors.

\det(\m{A}) = \det(\m{Q})\, \det(\m{T})\, \det(\m{Q}^H)

Because \m{Q} is unitary, \det(\m{Q}) has modulus 1, so \det(\m{Q})\,\det(\m{Q}^H) = \lvert\det(\m{Q})\rvert^2 = 1. Therefore, \det(\m{A}) = \det(\m{T}) = \det(\m{B}), where \m{B} is any matrix that is similar to \m{A} and \m{T}. Since \m{T} is upper triangular, its determinant is the product of its diagonal elements, which are the eigenvalues.

\det(\m{A}) = \det(\m{T})= \prod_{i=1}^n \lambda_i

\qed
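Corollary 12.2 admits the same kind of numerical check (a sketch with an arbitrary random matrix, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((4, 4))

# det(A) equals the product of the eigenvalues; the tiny imaginary part
# left over from conjugate-pair eigenvalues is roundoff.
prod_eigs = np.prod(np.linalg.eigvals(A))
assert np.isclose(np.linalg.det(A), prod_eigs.real)
assert abs(prod_eigs.imag) < 1e-8
```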

Schur’s paper proved the existence of the decomposition but did not provide algorithms for finding the unitary factors. It would be more than fifty years before algorithms would be available to quickly and reliably find the unitary and upper triangular factors of the Schur decomposition.