12.3.2. The Schur Decomposition

In a 1909 paper, Issai Schur established the existence of a matrix decomposition satisfying the similarity requirement, with factors consisting of a unitary matrix, an upper triangular matrix, and the conjugate (Hermitian) transpose of the unitary matrix [SCHUR09]. His paper was the genesis of orthogonal matrix factoring methods.

12.3.2.1. Schur Triangularization Theorem

Theorem 12.1 (Schur Triangularization Theorem)

If \mathbf{A} \in \mathbb{C}^{n{\times}n}, then there exists a unitary \mathbf{Q} \in \mathbb{C}^{n{\times}n} such that

\mathbf{A} = \mathbf{Q\,T\,Q}^H \quad \text{and} \quad
    \mathbf{T} = \mathbf{Q}^H \mathbf{A\,Q}

where \m{T} is similar to \m{A} (same eigenvalues) and \m{T} is upper triangular with the eigenvalues of \m{A} on its diagonal. Further, if \m{A} is Hermitian, then \m{T} is diagonal and the eigenvectors of \m{A} are held in the columns of \m{Q}.
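Before turning to the proof, the claimed factorization can be checked numerically. The following is a minimal sketch using NumPy and SciPy's `schur` function (an assumption of this sketch is that SciPy is available; `output='complex'` requests the complex Schur form):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Complex Schur form: A = Q T Q^H with Q unitary and T upper triangular
T, Q = schur(A, output='complex')

assert np.allclose(Q @ T @ Q.conj().T, A)      # A = Q T Q^H
assert np.allclose(Q.conj().T @ Q, np.eye(4))  # Q is unitary
assert np.allclose(np.tril(T, -1), 0)          # T is upper triangular

# The diagonal of T holds the eigenvalues of A (possibly reordered)
evals = np.linalg.eigvals(A)
dist = np.abs(evals[:, None] - np.diag(T)[None, :])
assert dist.min(axis=1).max() < 1e-8
```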

Proof. Matrix \m{T} is similar to \m{A} by Similar Matrices. We show here that \mathbf{T}_n = \mathbf{Q}^H \mathbf{A\,Q} is upper triangular with the eigenvalues of \m{A} on its diagonal.

First, note the eigensystem equation for the first eigenpair, \mathbf{A}\,\bm{x}_1 = \lambda_1\,\bm{x}_1. Let \mathbf{Q}_1 be a unitary matrix such that \mathbf{Q}_1^H \bm{x}_1 = k\,\bm{e}_1. A Householder reflector for vector \bm{x}_1 satisfies the requirement for \mathbf{Q}_1^H.

(12.4)\begin{array}{ll}
        \mathbf{Q}_1^H \mathbf{A}\,\bm{x}_1 = \m{Q}_1^H \lambda_1\,\bm{x}_1,
        &\mathbf{Q}_1^H \mathbf{A} \left(\m{Q}_1 \mathbf{Q}_1^H\right) \bm{x}_1
        = \lambda_1\,\mathbf{Q}_1^H \bm{x}_1 \\
        \left(\m{Q}_1^H \m{A}\,\m{Q}_1\right) k\,\bm{e}_1 = \lambda_1\,k\,
          \bm{e}_1, & \text{which establishes the first column of } \m{T}_1. \\
        \mathbf{T}_1 = \mathbf{Q}_1^H \mathbf{A}\,\mathbf{Q}_1 =
            \mat{\lambda_1, \text{*}; \mathbf{0}, \mathbf{B}},
            & \mathbf{B} \in \mathbb{C}^{(n-1){\times}(n-1)}.
    \end{array}

By similarity, \m{A} and \mathbf{T}_1 have the same eigenvalues; therefore, the eigenvalues of \m{B} are \lambda_2, \ldots, \lambda_n. Next, we repeat the steps outlined in equation (12.4), finding \mathbf{Q}_2 and \mathbf{T}_2 using \m{B} instead of \m{A}.

\mathbf{Q}_2^H \mathbf{B}\,\mathbf{Q}_2 =
        \mat{\lambda_2, \text{*}; \mathbf{0}, \mathbf{C}},
            \quad \mathbf{C} \in \mathbb{C}^{(n-2){\times}(n-2)}

Matrix \mathbf{T}_2 replaces the \m{B} block of \mathbf{T}_1 with \mathbf{Q}_2^H \mathbf{B}\,\mathbf{Q}_2.

\mathbf{T}_2 =
        \mat{\lambda_1, \text{*}, \text{*}; 0, \lambda_2, \text{*};
            \mathbf{0}, \mathbf{0}, \mathbf{C}}

The procedure continues until upper triangular \mathbf{T}_n is found.

\mathbf{T}_n =
        \mat{\lambda_1, \text{*}, \text{*}, \text{*};
              0, \lambda_2, \text{*}, \text{*};
              \vdots, \vdots, \ddots, \vdots;
              0, 0, \cdots, \lambda_n}

\m{Q} is the product of matrices formed by embedding the unitary \mathbf{Q}_i matrices in identity matrices. Note that the final block is a scalar, so \mathbf{Q}_n is just 1 and the product ends with \mathbf{Q}_{n-1}.

\mathbf{Q} = \mathbf{Q}_1 \times
        \mat{1, \mathbf{0}; \mathbf{0}, \mathbf{Q}_2} \times
            \mat{1, 0, \mathbf{0}; 0, 1, \mathbf{0};
            \mathbf{0}, \mathbf{0}, \mathbf{Q}_3} \times \cdots
            \mat{1, 0, 0, \cdots, \mathbf{0}; 0, 1, 0, \cdots, \mathbf{0};
                 0, 0, 1, \cdots, \mathbf{0}; \vdots, {}, {}, \ddots, \vdots;
                 \mathbf{0}, \mathbf{0}, \mathbf{0}, \cdots, \mathbf{Q}_{n-1}}

The construction of \mathbf{T}_n repeats the same calculation at each step. Since the procedure is identical at every stage, if we can triangularize a k \times k matrix, then we can triangularize a (k+1) \times (k+1) matrix; the 1 \times 1 case is trivially upper triangular. The proof is established by induction.

The distinction in the Hermitian case arises because symmetry causes the transformations to produce the same result on the row entries to the right of the diagonal as on the column entries below the diagonal, so the off-diagonal entries of \m{T} vanish. Proofs relating to the Schur decomposition of symmetric matrices are given in Real Eigenvalues of a Symmetric Matrix and Spectral Decomposition Theorem.

\qed
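The proof's construction can be written out directly as code. The sketch below leans on `numpy.linalg.eig` for the eigenpairs at each deflation step, so it mirrors the proof rather than serving as a practical algorithm; the function names `householder` and `schur_by_deflation` are our own.

```python
import numpy as np

def householder(x):
    """Unitary, Hermitian reflector Q with Q @ x = alpha * e1."""
    n = x.size
    alpha = -np.exp(1j * np.angle(x[0])) * np.linalg.norm(x)
    v = x - alpha * np.eye(n, dtype=complex)[:, 0]
    v = v / np.linalg.norm(v)
    return np.eye(n) - 2.0 * np.outer(v, v.conj())

def schur_by_deflation(A):
    """Triangularize A as in the proof: deflate one eigenpair per step."""
    n = A.shape[0]
    if n == 1:                            # base case: 1x1 is already triangular
        return np.eye(1, dtype=complex), A.astype(complex)
    _, X = np.linalg.eig(A)
    Q1 = householder(X[:, 0])             # Q1^H x1 = k e1 (Q1 is Hermitian)
    T1 = Q1.conj().T @ A @ Q1             # [[lam1, *], [0, B]]
    Qsub, _ = schur_by_deflation(T1[1:, 1:])
    Qemb = np.eye(n, dtype=complex)       # embed the smaller unitary in an identity
    Qemb[1:, 1:] = Qsub
    Q = Q1 @ Qemb
    return Q, Q.conj().T @ A @ Q

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, T = schur_by_deflation(A)
assert np.allclose(Q.conj().T @ Q, np.eye(4))   # Q is unitary
assert np.allclose(np.tril(T, -1), 0)           # T is upper triangular
assert np.allclose(Q @ T @ Q.conj().T, A)       # A = Q T Q^H
```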

You may have observed that the procedure in the proof builds \mathbf{T}_n and \m{Q} through a direct sequence of calculations. However, the process required several eigenvalues and eigenvectors, and if we already know the eigenpairs, then we don't need the Schur decomposition. A proof is not a calculation. In practice, converting a matrix from upper Hessenberg form to the Schur triangular form requires an iterative algorithm.
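As a sketch of what such an iterative algorithm looks like, the unshifted QR iteration below drives an upper Hessenberg matrix toward triangular form. It is only illustrative, since practical codes add shifts and deflation, and the well-separated test spectrum is our own choice, constructed so the unshifted iteration converges quickly:

```python
import numpy as np
from scipy.linalg import hessenberg

# A test matrix with a known, well-separated spectrum (illustrative choice)
rng = np.random.default_rng(0)
Q0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q0 @ np.diag([1.0, 3.0, 9.0, 27.0]) @ Q0.T

H, Q = hessenberg(A, calc_q=True)   # A = Q H Q^T, H upper Hessenberg
T = H.copy()
for _ in range(150):                # unshifted QR iteration
    q, r = np.linalg.qr(T)
    T = r @ q                       # similarity transform: preserves eigenvalues
    Q = Q @ q

assert np.allclose(np.tril(T, -1), 0, atol=1e-8)   # T is now triangular
assert np.allclose(Q @ T @ Q.T, A)                 # A = Q T Q^T
assert np.allclose(np.sort(np.diag(T)), [1.0, 3.0, 9.0, 27.0])
```

The subdiagonal entries decay at a rate governed by the eigenvalue ratios, which is why practical implementations accelerate convergence with shifts.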

A couple of corollaries to the Schur Triangularization Theorem give interesting properties about the eigenvalues of a matrix.

12.3.2.1.1. Sum of Eigenvalues

Corollary 12.1

The trace of a matrix (sum of the diagonal elements) is the sum of the eigenvalues. Further, similarity transformations preserve both the eigenvalues and the trace of a matrix.

Proof. There are two ways to prove this corollary. The first follows from the Schur Triangularization Theorem; the second derives from the determinant of the characteristic matrix.

  1. By the Schur Triangularization Theorem, \m{A} = \m{Q\, T\, Q}^H, where the matrices are the same as in the theorem. Because the trace of a matrix product is invariant under cyclic permutation of the factors, \mathrm{trace}(\m{A}) = \mathrm{trace}(\m{Q\, T\, Q}^H) = \mathrm{trace}(\m{Q}^H \m{Q\, T}) = \mathrm{trace}(\m{T}). Since the eigenvalues are on the diagonal of \m{T},

    \mathrm{trace}(\m{A})
            = \mathrm{trace}(\m{T}) = \sum_{i = 1}^n \lambda_i.

  2. An alternate expression of the characteristic equation is cp(t) = \det(t\m{I} - \m{A}) = t^n + c_{n-1}t^{n-1} + \cdots + c_1t + c_0. Factoring cp(t) reveals the eigenvalues of \m{A}.

    cp(t) = (t - \lambda_1)(t - \lambda_2)\ldots(t - \lambda_n)

    We find the c_{n-1} term of cp(t) by multiplying the factors, c_{n-1} = -(\lambda_1 + \lambda_2 + \cdots + \lambda_n). Then the Laplace expansion of \det(t\m{I} - \m{A}) shows c_{n-1} = -(a_{11} + a_{22} + \cdots + a_{nn}), which is the negative of the trace of \m{A}. By equating terms we have,

    \mathrm{trace}(\m{A}) = \sum_{i = 1}^n \lambda_i.

Likewise, if \m{C_T} = \m{T} - \m{\Lambda} is the characteristic matrix of \m{T} evaluated at its eigenvalues, then \mathrm{trace}(\m{C_T}) = \mathrm{trace}(\m{T}) - \mathrm{trace}(\m{\Lambda}) = 0, and \mathrm{trace}(\m{C_B}) = 0, where \m{B} is any matrix that is similar to \m{T}. Note, however, that the trace says nothing about singularity: a singular matrix need not have a zero trace, and a matrix with a zero trace need not be singular. Only a zero determinant or a zero singular value proves singularity.

\qed
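A quick numerical check of the corollary (a sketch; the random test matrices are our own choice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))

# trace(A) equals the sum of the eigenvalues
lam = np.linalg.eigvals(A)
assert np.isclose(np.trace(A), lam.sum().real)

# A similarity transformation B = P A P^{-1} preserves the trace
P = rng.standard_normal((5, 5))
B = P @ A @ np.linalg.inv(P)
assert np.isclose(np.trace(B), np.trace(A))
```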

12.3.2.1.2. Product of Eigenvalues

Corollary 12.2

The determinant of a matrix is the product of the eigenvalues. Further, similarity transformations preserve both the eigenvalues and the determinant of a matrix.

Proof. By the Schur Triangularization Theorem, \m{A} = \m{Q\, T\, Q}^H, where the matrices are the same as in Schur Triangularization Theorem. The determinant of \m{A} is the product of the determinants of its factors.

\det(\m{A}) = \det(\m{Q})\, \det(\m{T})\, \det(\m{Q}^H)

Because \m{Q} is unitary, \det(\m{Q}) is a complex number of modulus 1, so \det(\m{Q})\,\det(\m{Q}^H) = \left|\det(\m{Q})\right|^2 = 1. Therefore, \det(\m{A}) = \det(\m{T}) = \det(\m{B}), where \m{B} is any matrix that is similar to \m{A} and \m{T}. Since \m{T} is upper triangular, its determinant is the product of its diagonal elements, which are the eigenvalues.

\det(\m{A}) = \det(\m{T})= \prod_{i=1}^n \lambda_i

\qed
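As with the trace, this corollary is easy to confirm numerically (a sketch with a random test matrix of our own choosing):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

# det(A) equals the product of the eigenvalues
lam = np.linalg.eigvals(A)
assert np.isclose(np.linalg.det(A), np.prod(lam).real)

# A similarity transformation B = P A P^{-1} preserves the determinant
P = rng.standard_normal((4, 4))
B = P @ A @ np.linalg.inv(P)
assert np.isclose(np.linalg.det(B), np.linalg.det(A))
```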

Schur’s paper proved the existence of the decomposition but did not provide algorithms for finding the unitary factors. It would be more than fifty years before algorithms would be available to quickly and reliably find the unitary and upper triangular factors of the Schur decomposition.