.. _classicSVD:

The Classic SVD
===============

In this section, we discuss the classic definition of the relationship between a
matrix and its SVD factors. The objective is to understand that relationship. We
will also consider the requirements and difficulties associated with implementing
the classic algorithm to compute the SVD.

As with diagonalization factoring, a square matrix has :math:`n` equations like
equation :eq:`eq-svd-vect`. Similarly, the :math:`n` vector equations can be
combined into a matrix equation. For a square, full rank matrix it looks as
follows.

.. math::

    \begin{array}{ll}
      \mathbf{A\,V} &= \mathbf{A}
        \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
                    \bm{v}_1 \bm{v}_2 \cdots{} \bm{v}_n;
                    \vertbar{} \vertbar{} {} \vertbar{}}
        = \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
                      \sigma_1\,\bm{u}_1 \sigma_2\,\bm{u}_2 \cdots{} \sigma_n\,\bm{u}_n;
                      \vertbar{} \vertbar{} {} \vertbar{}} \\
      {} \\
        &= \spalignmat{\vertbar{} \vertbar{} {} \vertbar{};
                       \bm{u}_1 \bm{u}_2 \cdots{} \bm{u}_n;
                       \vertbar{} \vertbar{} {} \vertbar{}}
           \spalignmat{\sigma_1 0 \cdots{} 0;
                       0 \sigma_2 \cdots{} 0;
                       \vdots{} \vdots{} \ddots{} \vdots{};
                       0 0 \cdots{} \sigma_n} \\
      {} \\
        &= \mathbf{U\,\Sigma}
    \end{array}

.. note::

    This matrix equation is written for a square matrix (:math:`m = n`). For
    rectangular matrices, the :math:`\bf{\Sigma}` matrix has either rows or
    columns of zeros to accommodate the matrix multiplication requirements.

Then a factoring of :math:`\bf{A}` is
:math:`\mathbf{A} = \mathbf{U\, \Sigma\, V}^{-1}`. Because the :math:`\bm{v}`
vectors are required to be orthogonal unit vectors,
:math:`\mathbf{V}^{-1} = \mathbf{V}^T`, so the SVD factoring of :math:`\bf{A}` is

.. math::
    :label: eq-svd-factor

    \boxed{ \mathbf{A} = \mathbf{U\,\Sigma\,V}^T.}

The factoring finds two orthogonal rotation matrices (:math:`\bf{U}` and
:math:`\mathbf{V}^{T}`) and a diagonal stretching matrix (:math:`\bf{\Sigma}`).
The size of each matrix is: :math:`\bf{A}`: :math:`m{\times}n`, :math:`\bf{U}`:
:math:`m{\times}m`, :math:`\bf{\Sigma}`: :math:`m{\times}n`, and :math:`\bf{V}`:
:math:`n{\times}n`.

You may have correctly guessed that finding the SVD factors from :math:`\bf{A}`
makes use of eigenvalues and eigenvectors. However, the SVD factorization works
with rectangular as well as square matrices, while eigenvalues and eigenvectors
are only defined for square matrices. So a square matrix derived from the values
of :math:`\bf{A}` is needed. Consider the matrices :math:`\mathbf{A}^T\,\bf{A}`
and :math:`\mathbf{A\,A}^T`, which are always square, symmetric, and have real,
orthogonal eigenvectors. The size of :math:`\mathbf{A}^T \bf{A}` is
:math:`n{\times}n`, while the size of :math:`\mathbf{A}\,\mathbf{A}^T` is
:math:`m{\times}m`.

When we express :math:`\mathbf{A}^T \bf{A}` and :math:`\mathbf{A\,A}^T` with the
SVD factors, a pair of matrices reduces to the identity matrix and the two
:math:`\bf{\Sigma}` matrices combine into a diagonal matrix of squared singular
values.

.. math::

    \begin{array}{ll}
      \mathbf{A}^T \mathbf{A}
        &= \left(\mathbf{U\, \Sigma\, V}^T\right)^T\, \mathbf{U\, \Sigma\, V}^T \\
        &= \mathbf{V \, \Sigma}^T \, \mathbf{U}^T \, \mathbf{U\, \Sigma\, V}^T \\
        &= \mathbf{V \, \Sigma}^T \, \mathbf{\Sigma\, V}^T \\
        &= \mathbf{V \, \Sigma}^2 \, \mathbf{V}^T \\
        &= \mathbf{V} \spalignmat{\sigma_1^2 0 \cdots{} 0;
                                  0 \sigma_2^2 \cdots{} 0;
                                  \vdots{} \vdots{} \ddots{} \vdots{};
                                  0 0 \cdots{} \sigma_n^2} \mathbf{V}^T
    \end{array}

This factoring is the diagonalization of a symmetric matrix (:ref:`symdiag`).
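This relationship is easy to verify numerically. The following MATLAB sketch
uses an arbitrary full rank matrix of our choosing; the square roots of the
eigenvalues of :math:`\mathbf{A}^T\mathbf{A}` match the singular values reported
by the built-in ``svd`` function. Because ``eig`` does not necessarily return
the eigenvalues in descending order, they are sorted before the comparison.

::

    A = [2 0 1; -1 3 2; 0 1 4];      % an arbitrary full rank matrix
    [V, D] = eig(A'*A);              % eigen-decomposition of the symmetric A'*A
    sqrt(sort(diag(D), 'descend'))   % square roots of the eigenvalues ...
    svd(A)                           % ... match the singular values of A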
It follows that the :math:`\bf{V}` matrix comes from the eigenvectors of
:math:`\mathbf{A}^T \bf{A}`. Likewise, the :math:`\bf{\Sigma}` matrix is the
square root of the diagonal eigenvalue matrix of :math:`\mathbf{A}^T \bf{A}`.
Similarly, the :math:`\bf{U}` matrix is the eigenvector matrix of
:math:`\mathbf{A\,A}^T`.

.. math::

    \begin{array}{ll}
      \mathbf{A \,A}^T
        &= \mathbf{U\, \Sigma\, V}^T\, \left(\mathbf{U\, \Sigma\, V}^T\right)^T \\
        &= \mathbf{U \, \Sigma} \, \mathbf{V}^T \, \mathbf{V\, \Sigma}^T \, \mathbf{U}^T \\
        &= \mathbf{U \, \Sigma} \, \mathbf{\Sigma}^T \, \mathbf{U}^T \\
        &= \mathbf{U \, \Sigma}^2 \, \mathbf{U}^T \\
        &= \mathbf{U} \spalignmat{\sigma_1^2 0 \cdots{} 0;
                                  0 \sigma_2^2 \cdots{} 0;
                                  \vdots{} \vdots{} \ddots{} \vdots{};
                                  0 0 \cdots{} \sigma_m^2} \mathbf{U}^T
    \end{array}

Although :math:`\bf{U}` comes from the eigenvectors of :math:`\mathbf{A\,A}^T`,
calculating :math:`\bf{U}` that way is a poor choice for square matrices and may
not be necessary for rectangular matrices.

.. _order-columns-of-the-svd:

Ordered Columns of the SVD
--------------------------

The columns of the sub-matrices are ordered according to the values of the
singular values (:math:`\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n`). The
columns of :math:`\bf{U}` and :math:`\bf{V}` are ordered to match the singular
values.

SVD of Square Matrices
----------------------

If :math:`\bf{V}` and :math:`\bf{\Sigma}` are known, :math:`\bf{U}` may be found
directly from equation :eq:`eq-svd-factor`. Since :math:`\mathbf{V}^T` is
orthogonal, its inverse is just :math:`\bf{V}`. The diagonal structure of
:math:`\bf{\Sigma}` makes its inverse the diagonal matrix with the reciprocals
of the :math:`\sigma`\ s on the diagonal.

.. math::

    \mathbf{U} = \mathbf{A\,V\,\Sigma}^{-1}

.. index:: pinv

.. index:: diag

In MATLAB, :math:`\mathbf{\Sigma}^{-1}` may be found with either the
pseudo-inverse function, ``pinv``, or the right-divide operator. For full rank
matrices, the ``diag`` function could quickly find the inverse of
:math:`\bf{\Sigma}` as ``diag(1./diag(Sigma))``, but care would be needed to
prevent a division by zero for singular matrices that have singular values of
zero.

::

    U = A*V*pinv(Sigma);    % or: U = (A*V)/Sigma;

.. _SVDrectangular:

SVD of Rectangular Matrices
---------------------------

To satisfy the size requirements for multiplying the SVD factors, the
:math:`\bf{\Sigma}` matrix contains rows of zeros for over-determined matrices
and columns of zeros for under-determined matrices. Figures
:numref:`fig-OverD-SVD` and :numref:`fig-UnderD-SVD` show the related sizes of
the sub-matrices for over-determined and under-determined matrices.

.. _fig-OverD-SVD:

.. figure:: OverD-SVD.png
    :align: center
    :width: 40%

    Shape of the SVD factors of an over-determined matrix.

.. _fig-UnderD-SVD:

.. figure:: UnderD-SVD.png
    :align: center
    :width: 40%

    Shape of the SVD factors of an under-determined matrix.

.. _economy-SVD:

The Economy SVD
---------------

.. index:: economy SVD

.. index:: SVD!economy

Note that in the multiplication of the factors to yield the original matrix,
:math:`(m - n)` columns of :math:`\bf{U}` for an over-determined matrix and
:math:`(n - m)` rows of :math:`\mathbf{V}^T` for an under-determined matrix are
multiplied by zeros from :math:`\bf{\Sigma}`. They are not needed to recover
:math:`\bf{A}` from its factors. Many applications of the SVD do not require the
unused columns of :math:`\bf{U}` or the unused rows of :math:`\mathbf{V}^T`, so
the *economy* SVD is often used instead of the full SVD. The economy SVD removes
the unused elements.
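In MATLAB, the built-in ``svd`` function returns the economy factors when called
as ``svd(A, 'econ')``. The following sketch, using an arbitrary over-determined
matrix, compares the shapes of the full and economy factors; both products
recover :math:`\bf{A}`.

::

    A = rand(6, 3);                  % an arbitrary over-determined matrix
    [U, S, V]    = svd(A);           % full SVD: U is 6x6, S is 6x3, V is 3x3
    [Ue, Se, Ve] = svd(A, 'econ');   % economy SVD: Ue is 6x3, Se and Ve are 3x3
    norm(A - U*S*V')                 % both products recover A ...
    norm(A - Ue*Se*Ve')              % ... to within round-off error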
Figure :numref:`fig-Econ-SVD` shows the related sizes of the economy
sub-matrices for over-determined matrices. The primary difference to be aware of
when applying the economy SVD is a degradation of the unitary properties of
:math:`\bf{U}` and :math:`\bf{V}`. For an over-determined matrix,
:math:`\mathbf{\tilde{U}}^T\mathbf{\tilde{U}} = \mathbf{I}`, but
:math:`\mathbf{\tilde{U}}\mathbf{\tilde{U}}^T \ne \mathbf{I}`. Similarly, for an
under-determined matrix,
:math:`\mathbf{\tilde{V}}^T\mathbf{\tilde{V}} = \mathbf{I}`, but
:math:`\mathbf{\tilde{V}}\mathbf{\tilde{V}}^T \ne \mathbf{I}`.

.. _fig-Econ-SVD:

.. figure:: Econ-SVD.png
    :align: center
    :width: 40%

    Shape of the economy SVD factors of an over-determined matrix.

We will resume the discussion of the economy SVD in :ref:`econSVD` in the
context of vector projections.

Implementation Difficulties
---------------------------

Finding the SVD using the classic algorithm is problematic.

- A difficulty arising from the rows or columns of zeros in :math:`\bf{\Sigma}`
  for the full SVD is that the full :math:`\bf{U}` matrix of an over-determined
  matrix cannot be found directly from :math:`\bf{A}`, :math:`\bf{V}`, and
  :math:`\bf{\Sigma}`, as may be done for square matrices or with the economy
  SVD of a rectangular matrix. The same problem exists for finding the full
  :math:`\bf{V}` matrix of an under-determined matrix. However, finding both
  :math:`\bf{U}` and :math:`\bf{V}` from :math:`\mathbf{AA}^T` and
  :math:`\mathbf{A}^T\mathbf{A}` is fraught with problems. If the eigenvectors
  of :math:`\mathbf{A}^T\mathbf{A}` and :math:`\mathbf{AA}^T` are computed
  independently of each other, certain columns of :math:`\bf{U}` or
  :math:`\bf{V}` may need to be multiplied by :math:`-1` for the factoring to be
  correct. It is best to do only one eigenvector calculation.

- Some applications may have large :math:`\bf{A}` matrices (even hundreds of
  thousands of rows and columns), so calculating :math:`\mathbf{AA}^T` for an
  over-determined matrix might take a long time and risks unacceptable round-off
  errors.

- Small singular values of :math:`\bf{A}` cannot be computed accurately with
  this approach. A singular value of :math:`10^{-5}` for :math:`\bf{A}`
  corresponds to an eigenvalue of :math:`10^{-10}` for
  :math:`\mathbf{A}^T\mathbf{A}`, which cannot be expected to be computed to the
  needed level of accuracy. This is known as the “loss of information through
  squaring” phenomenon [WATKINS10]_.

  Golub and Reinsch [GOLUB70]_ offer a simple experiment to observe the lack of
  accuracy. Of course, the error that we see now with double precision floating
  point arithmetic is less than they would have observed using 1970 computers.
  Consider the matrix

  .. math::

      \mathbf{A} = \mat{1 1; b 0; 0 b},

  then

  .. math::

      \mathbf{A}^T\mathbf{A} = \mat{{1+b^2} 1; 1 {1+b^2}}.

  The singular values are :math:`\sigma_1 = \sqrt{2 + b^2}` and
  :math:`\sigma_2 = \abs{b}`. As :math:`b` is reduced from 0.01 by powers of 10
  down to :math:`10^{-8}`, the classic SVD algorithm returns :math:`\sigma_2`
  with small errors from the beginning, and the errors grow as :math:`b` is
  reduced. At :math:`b = 10^{-8}`, the error is complete and the computed
  :math:`\sigma_2` is 0. A short MATLAB sketch reproducing this experiment
  follows the next paragraph.

The difficulties associated with finding the SVD using the classic approach
motivate the search for alternative algorithms, which are explored in
:ref:`SVDcalc`.
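Here is a minimal MATLAB sketch of the Golub and Reinsch experiment. Only the
test matrix comes from the experiment described above; the loop range and the
output format are our own choices.

::

    % Observe the loss of accuracy in the small singular value, sigma_2 = |b|,
    % when it is computed from the eigenvalues of A'*A.
    for b = 10.^(-2:-1:-8)
        A = [1 1; b 0; 0 b];
        % abs() guards against a tiny negative eigenvalue from round-off.
        sigma2 = sqrt(abs(min(eig(A'*A))));
        fprintf('b = %.0e,  computed sigma_2 = %.10e,  error = %.2e\n', ...
            b, sigma2, abs(sigma2 - b));
    end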
To complete the discussion of the classic SVD, ``econSVD`` is a simple function
that uses the classic algorithm to find the SVD of a square matrix and the
economy SVD of a rectangular matrix. The code shown here is not how the SVD
should be computed, but it reinforces the relationship between a matrix and its
SVD factors.

::

    function [U, S, V] = econSVD(A)
    % ECONSVD - an implementation of Singular Value Decomposition (SVD)
    %           using the classic algorithm.
    %           It finds the full SVD when A is square and the economy
    %           SVD when A is rectangular.
    %
    %   [U,S,V] = econSVD(A)  ==>  A = U*S*V'
    %
    % This function is for educational purposes only.

        % Use the smaller of A'*A or A*A' for the eig calculation.
        [m, n] = size(A);
        if m < n        % under-determined
            [U, S] = order_eigs(A*A');
            V = (A'*U)/S;
        else            % square or over-determined
            [V, S] = order_eigs(A'*A);
            U = (A*V)/S;
        end
    end

    function [X, S] = order_eigs(B)
    % Helper function, B is either A'*A or A*A'.
        [X2, S2] = eig(B);
        [S2, order] = sort(diag(S2), 'descend');
        X = X2(:, order);
        % There are no negative eigenvalues in exact arithmetic, but round-off
        % can replace a 0 with a tiny negative value, so abs() is needed in the
        % next line to prevent an undesired complex result from sqrt().
        S = diag(sqrt(abs(S2)));
    end
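As a quick check, ``econSVD`` can be compared against MATLAB's built-in ``svd``.
The matrix below is an arbitrary under-determined example; its singular values
are 5 and 3.

::

    A = [3 2 2; 2 3 -2];      % an arbitrary under-determined matrix
    [U, S, V] = econSVD(A);
    norm(A - U*S*V')          % near machine epsilon if the factoring is correct
    diag(S)'                  % singular values: 5 and 3
    svd(A)'                   % agrees with the built-in function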