9.2. The Classic SVD

We discuss the classic definition of the relationship between a matrix and it SVD factors in this section. The objective is to understand the relationship. We will consider the requirements and the difficulties associated with implementing the classic algorithm to compute the SVD.

As with diagonalization factoring, a square matrix has n equations like equation (9.1). Similarly, the n vector equations can be combined into a matrix equation. For a square, full rank matrix it looks as follows.

\begin{array}{ll}
  \mathbf{A\,V} &= \mathbf{A}
          \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
          \bm{v}_1 \bm{v}_2 \cdots{} \bm{v}_n;
          \vertbar{} \vertbar{} {} \vertbar{}}
    = \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
    \sigma_1\,\bm{u}_1 \sigma_2\,\bm{u}_2 \cdots{}
              \sigma_n\,\bm{u}_n;
              \vertbar{} \vertbar{} {} \vertbar{}} \\ {} \\
    &= \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
                   \bm{u}_1 \bm{u}_2 \cdots{} \bm{u}_n;
                   \vertbar{} \vertbar{} {} \vertbar{} }
       \spalignmat{\sigma_1 0 \cdots{} 0;
                   0 \sigma_2 \cdots{} 0;
                   \vdots{} \vdots{} \ddots{}  \vdots{};
                   0  0 \cdots{} \sigma_n} \\ {} \\
    &= \mathbf{U\,\Sigma}
\end{array}

Note

This matrix equation is written as for a square matrix (m = n). For rectangular matrices, the \bf{\Sigma} matrix has either rows or columns of zeros to accommodate the matrix multiplication requirements.

Then a factoring of \bf{A} is \mathbf{A} = \mathbf{U\, \Sigma\,
V}^{-1}, but a requirement of the \bm{v} vectors is that they are orthogonal unit vectors such that \mathbf{V}^{-1} = \mathbf{V}^T, so the SVD factoring of \bf{A} is

(9.2)\boxed{ \mathbf{A} = \mathbf{U\,\Sigma\,V}^T.}

The factoring finds two orthogonal rotation matrices (\bf{U} and \mathbf{V}^{T}) and a diagonal stretching matrix (\bf{\Sigma}). The size of each matrix is: \bf{A}: m{\times}n, \bf{U}: m{\times}m, \bf{\Sigma}: m{\times}n, and \bf{V}: n{\times}n.

You may have correctly guessed that finding the SVD factors from \bf{A} makes use of eigenvalues and eigenvectors. However, the SVD factorization works with rectangular matrices as well as square matrices, but eigenvalues and eigenvectors are only found for square matrices. So a square matrix derived from the values of \bf{A} is needed.

Consider matrices \mathbf{A}^T\,\bf{A} and \mathbf{A\,A}^T, which are always square, symmetric, and have real, orthogonal eigenvectors. The size of \mathbf{A}^T \bf{A} is n{\times}n, while the size of \mathbf{A}\,\mathbf{A}^T is m{\times}m. When we express \mathbf{A}^T \bf{A} and \mathbf{A\,A}^T with the SVD factors, a pair of matrices reduce to the identity matrix and two \bf{\Sigma} matrices combine into a diagonal matrix of squared singular values.

\begin{array}{ll}
\mathbf{A}^T \mathbf{A} &= \left(\mathbf{U\, \Sigma\, V}^T\right)^T\,
   \mathbf{U\, \Sigma\, V}^T \\
     &= \mathbf{V \, \Sigma}^T \, \mathbf{U}^T \, \mathbf{U\, \Sigma\, V}^T \\
     &= \mathbf{V \, \Sigma}^T \, \mathbf{\Sigma\, V}^T \\
     &= \mathbf{V \, \Sigma}^2 \, \mathbf{V}^T \\
     &= \mathbf{V} \spalignmat{\sigma_1^2 0 \cdots{} 0;
                         0 \sigma_2^2 \cdots{} 0;
                         \vdots{} \vdots{} \ddots{}  \vdots{};
                         0  0 \cdots{} \sigma_n^2} \mathbf{V}^T
      \end{array}

This factoring is the diagonalization of a symmetric matrix (Diagonalization of a Symmetric Matrix). It follows that the \bf{V} matrix comes from the eigenvectors of \mathbf{A}^T \bf{A}. Likewise, the \bf{\Sigma} matrix is the square root of the diagonal eigenvalue matrix of \mathbf{A}^T \bf{A}.

Similarly, the \bf{U} matrix is the eigenvector matrix of \mathbf{A\,A}^T.

\begin{array}{ll}
\mathbf{A \,A}^T &= \mathbf{U\, \Sigma\, V}^T\,
        \left(\mathbf{U\, \Sigma\, V}^T\right)^T \\
           &= \mathbf{U \, \Sigma}^T \, \mathbf{V}^T \,
                \mathbf{V\, \Sigma\, U}^T \\
           &= \mathbf{U \, \Sigma}^T \, \mathbf{\Sigma\, U}^T \\
           &= \mathbf{U \, \Sigma}^2 \, \mathbf{U}^T \\
           &= \mathbf{U} \spalignmat{\sigma_1^2 0 \cdots{} 0;
                         0 \sigma_2^2 \cdots{} 0;
                         \vdots{} \vdots{} \ddots{}  \vdots{};
                         0  0 \cdots{} \sigma_m^2} \mathbf{U}^T
      \end{array}

Although \bf{U} comes from the eigenvectors of \mathbf{A\,A}^T, calculating \bf{U} as such is a poor choice for square matrices and may not be necessary for rectangular matrices.

9.2.1. Ordered Columns of the SVD

The columns of the sub-matrices are ordered according to the values of the singular values (\sigma_1 > \sigma_2 >
\cdots > \sigma_n). The columns of \bf{U} and \bf{V} are ordered to match the singular values.

9.2.2. SVD of Square Matrices

If \bf{V} and \bf{\Sigma} are known, \bf{U} may be found directly from equation (9.2). Since \mathbf{V}^T is orthogonal, its inverse is just \bf{V}. The diagonal structure of \bf{\Sigma} makes its inverse the diagonal matrix with the reciprocals of the \sigmas on the diagonal.

\mathbf{U} = \mathbf{A\,V\,\Sigma}^{-1}

In MATLAB, \mathbf{\Sigma}^{-1} may be found with either the pseudo-inverse, pinv, function or the right-divide operator. For full rank matrices the diag function could quickly find the inverse of \bf{\Sigma} as (diag(1./diag(Sigma))), but care would be needed to prevent a division by zero for singular matrices that have singular values of zero.

U = A*V*pinv(Sigma);
  %  or
U = (A*V)/Sigma;

9.2.3. SVD of Rectangular Matrices

To satisfy the size requirements for multiplying the SVD factors, the \bf{\Sigma} matrix contains rows of zeros for over-determined matrices and columns of zeros for under-determined matrices. Figures Fig. 9.1, and Fig. 9.2 show the related sizes of the sub-matrices for over-determined and under-determined matrices.

../_images/OverD-SVD.png

Fig. 9.1 Shape of the SVD factors of an over-determined matrix.

../_images/UnderD-SVD.png

Fig. 9.2 Shape of the SVD factors of an under-determined matrix.

9.2.4. The Economy SVD

Note that in the multiplication of the factors to yield the original matrix, (m - n) columns of \bf{U} for an over-determined matrix and (n - m) rows of \mathbf{V}^T for an under-determined matrix are multiplied by zeros from \bf{\Sigma}. They are not needed to recover \bf{A} from its factors. Many applications of the SVD do not require the unused columns of \bf{U} or the unused rows of \mathbf{V}^T. So the economy SVD is often used instead of the full SVD. The economy SVD removes the unused elements.

Figure Fig. 9.3 shows the related sizes of the economy sub-matrices for over-determined matrices. The primary difference to be aware of when applying the economy SVD is a degradation of the unitary properties of \bf{U} and \bm{V}. For an over-determined matrix \mathbf{\tilde{U}}^T\mathbf{\tilde{U}} = \mathbf{I}, but \mathbf{\tilde{U}}\mathbf{\tilde{U}}^T \ne \mathbf{I}. Similarly, for an under-determined matrix \mathbf{\tilde{V}}^T\mathbf{\tilde{V}} =
\mathbf{I}, but \mathbf{\tilde{V}}\mathbf{\tilde{V}}^T \ne \mathbf{I}.

../_images/Econ-SVD.png

Fig. 9.3 Shape of the economy SVD factors of an over-determined matrix.

We will resume the discussion of the economy SVD in Projection and the Economy SVD in the context of vector projections.

9.2.5. Implementation Difficulties

Finding the SVD using the classic algorithm is problematic.

  • A difficulty arising from the rows or columns of zeros in \bf{\Sigma} for the full SVD is that the full \bf{U} matrix for an over-determined matrix can not be directly found from \bf{A}, \bf{V}, and \bf{\Sigma} as may be done for square matrices or using the economy SVD of a rectangular matrix. The same problem exists for finding the full \bf{V} matrix for an under-determine matrix. However, finding both \bf{U} and \bf{V} from both \mathbf{AA}^T and \mathbf{A}^T\mathbf{A} is fraught with problems. If the eigenvectors of \mathbf{A}^T\mathbf{A} and \mathbf{AA}^T are computed independent of each other, there can be a problem with certain columns of \bf{U} or \bf{V} needing to be multiplied by -1 for the factoring to be correct. It is best to do only one eigenvector calculation.

  • Some applications may have large \bf{A} matrices (even hundreds of thousands of rows and columns), so calculating \mathbf{AA}^T for an over-determined matrix might take a long time and risks having unacceptable round-off errors.

  • Small singular values of \bf{A} can not be computed accurately with this approach. A singular value of 10^{-5} for \bf{A} corresponds to an eigenvalue of 10^{-10} for \mathbf{A}^T\mathbf{A}, which can not be expected to be computed to the level of needed accuracy. This is known as the “loss of information through squaring” phenomenon [WATKINS10].

    Golub and Reinsch [GOLUB70] offer a simple experiment to observe the lack of accuracy. Of course, the error that we see now with double precision floating point arithmetic is less than they would have observed using 1970 computers. Consider the matrix

    \mathbf{A} = \mat{1 1; b 0; 0 b},

    then

    \mathbf{A}^T\mathbf{A} = \mat{{1+b^2} 1; 1 {1+b^2}}.

    The singular values are: \sigma_1 = \sqrt{2 + b^2}, and \sigma_2 = \abs{b}. As the b variable is reduced from 0.01 by powers of 10 to 10^{-8}, then with the classic SVD algorithm small errors in \sigma_2 are returned from the beginning. The errors grow as b is reduced. At b = 10^{-8}, the error is complete and \sigma_2 = 0.

The difficulties associated with finding the SVD using the classic approach motivates the search for alternative algorithms, which are explored in Calculating the SVD with Unitary Transformations.

To complete the discussion of the classic SVD, econSVD is a simple function that uses the classic algorithm to find the SVD of a square matrix and the economy SVD of a rectangular matrix. The code shown here is not how the SVD should be computed, but it reinforces the relationship between a matrix and its SVD factors.

function [U, S, V] = econSVD(A)
% ECONSVD - an implementation of Singular Value Decomposition (SVD)
%           using the classic algorithm.
%           It finds the full SVD when A is square and the economy
%           SVD when A is rectangular.
% [U,S,V] = econSVD(A) ==> A = U*S*V'
%
% This function is for educational purposes only.

% Use the smaller of A'*A or A*A' for eig calculation.
    [m, n] = size(A);
    if m < n             % under-determined
        [U, S] = order_eigs(A*A');
        V = (A'*U)/S;
    else                 % square or over-determined
        [V, S] = order_eigs(A'*A);
        U = (A*V)/S;
    end
end

function [X, S] = order_eigs(B)
% helper function, B is either A'*A or A*A'
    [X2, S2] = eig(B);
    [S2, order] = sort(diag(S2), 'descend');
    X = X2(:, order);
    % There are no negative eigenvalues but we still need abs()
    % in the next line just because a 0 can be -0, giving an undesired
    % complex result.
    S = diag(sqrt(abs(S2)));
end