12.4.1. The Power Method

The power method is a technique for computing the dominant eigenvalue of a matrix, that is, the eigenvalue of largest absolute value. Although it is an old and simple algorithm, it is still used in applications that require only one eigenpair. The theory behind the algorithm rests on a change of basis that occurs when a matrix is raised to a power and multiplied by a vector, \mathbf{A}^k\bm{v}_0. Diagonalization of the matrix simplifies the exponent calculation. It changes the basis of the coordinate system from a Euclidean coordinate system to one that uses the matrix’s eigenvectors as the basis vectors. Writing the starting vector in the eigenvector basis as \bm{v}_0 = \mathbf{X}\,\bm{c}, where the columns of \mathbf{X} are the eigenvectors of \mathbf{A} and \mathbf{\Lambda} is the diagonal matrix of eigenvalues, repeated multiplication by \mathbf{A} gives equation (12.5). This approach has several applications in addition to the power method. See Change of Basis and Difference Equations for an explanation of using a change of basis strategy in the analysis of difference equations.

(12.5)\begin{array}{rl}
           \bm{v}_1 = \mathbf{A}\,\bm{v}_0  &= \mathbf{A\,X}\,\bm{c} \\
                            &= \mathbf{X\,\Lambda}\,\bm{c} \\
   &= \lambda_1\,c_1\,\bm{x}_1 + \lambda_2\,c_2\,\bm{x}_2
               + \cdots + \lambda_n\,c_n\,\bm{x}_n \\
           \bm{v}_2 = \mathbf{A}\,\bm{v}_1 =
          \mathbf{A}^2\bm{v}_0 &= \mathbf{A\,X\,\Lambda}\,\bm{c} \\
                            &= \mathbf{X\,\Lambda}^2\,\bm{c} \\
   &= \lambda_1^2\,c_1\,\bm{x}_1 + \lambda_2^2\,c_2\,\bm{x}_2
               + \cdots + \lambda_n^2\,c_n\,\bm{x}_n \\
   \vdots & \\
           \bm{v}_k = \mathbf{A}\,\bm{v}_{k-1} =
          \mathbf{A}^k\bm{v}_0 &= \mathbf{A\,X\,\Lambda}^{k-1}\,\bm{c} \\
                            &= \mathbf{X\,\Lambda}^k\,\bm{c} \\
   &= \lambda_1^k\,c_1\,\bm{x}_1 + \lambda_2^k\,c_2\,\bm{x}_2
               + \cdots + \lambda_n^k\,c_n\,\bm{x}_n
        \end{array}

Solving for the coefficients \bm{c} is just a matter of solving a system of linear equations: \bm{c} = \mathbf{X}^{-1}\bm{v}_0, or in MATLAB: c = X\v0. However, the power method itself never needs to compute these coefficients.
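The change of basis in equation (12.5) is easy to check numerically. The following MATLAB sketch uses the 2{\times}2 matrix from the example at the end of this section and confirms that \mathbf{A}^k\bm{v}_0 and \mathbf{X\,\Lambda}^k\,\bm{c} agree; it assumes \mathbf{A} is diagonalizable.

% Verify A^k*v0 = X*Lambda^k*c for a diagonalizable matrix A.
A = [0.6353 0.9412; 0.3412 1.7647];
v0 = [1; 0];
[X, Lambda] = eig(A);       % eigenvectors X, diagonal eigenvalue matrix Lambda
c = X \ v0;                 % coefficients of v0 in the eigenvector basis
k = 5;
direct = A^k * v0;          % left side: repeated multiplication
basis  = X * Lambda^k * c;  % right side: change of basis form
disp(norm(direct - basis))  % difference is near zero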

With the change of basis, the eigenvalues are the only quantities raised to the power k, so the term with the largest-magnitude eigenvalue, \lambda_1, dominates the calculation. Relative to the dominant term, the contribution of each remaining eigenvalue shrinks by the factor (\lambda_i/\lambda_1)^k, which goes to zero as k becomes large. Since \abs{\lambda_1} > \abs{\lambda_2}
> \cdots > \abs{\lambda_n}, only one eigenvalue survives in the limit.

\lim_{k\to\infty} \frac{\mathbf{A}^k\bm{v}_0}{\lambda_1^k} = c_1\,\bm{x}_1,
\qquad \text{so for large } k, \quad \mathbf{A}^k\bm{v}_0 \approx \lambda_1^k\,c_1\,\bm{x}_1

The vector \bm{v}_0 can be almost any vector with as many elements as \mathbf{A} has columns; the only requirement is that its coefficient c_1 along the dominant eigenvector is nonzero. If, for example, \m{A} is a 2{\times}2 matrix, then \bm{v}_0 = [1, 0]^T usually works. Compute \bm{v}_k for increasing values of k. When k is large, each \bm{v}_k is approximately the dominant eigenvector \bm{x}_1 scaled by \lambda_1^k\,c_1. You will know whether k is large enough, or needs to be increased, by comparing the eigenvalue estimates for successive values of k. A simple equation determines the eigenvalue estimates. Equation (12.6) makes the substitution \mathbf{A}\,\bm{v}_k \approx \lambda_1\,\bm{v}_k to show that the calculation finds the dominant eigenvalue.

(12.6)\lambda_{max,k} = \frac{\bm{v}_k^T \mathbf{A}\,\bm{v}_k}{\bm{v}_k^T \bm{v}_k}
      \approx \frac{ \lambda_1 \norm{\bm{v}_k}^2 }{ \norm{\bm{v}_k}^2 } = \lambda_1

Note

Equation (12.6) is called the Rayleigh quotient after the English physicist John William Strutt, Lord Rayleigh (1842–1919).
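In MATLAB, the Rayleigh quotient of equation (12.6) is a one-line computation. Here v is assumed to hold the current iterate \bm{v}_k:

lambda_est = (v'*A*v) / (v'*v);   % Rayleigh quotient, equation (12.6)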

Here is an example of the eigenvalue calculation.

\mathbf{A} = \mat{0.6353 0.9412; 0.3412 1.7647}, \quad \lambda_1 = 2.0,
\:\lambda_2 = 0.4

The first ten estimates of \lambda_1 are {1.4226, 1.905, 1.9839, 1.9969, 1.9994, 1.9999, 2, 2, 2, 2}. So, even without knowing the value of the dominant eigenvalue in advance, we can observe that the iterative algorithm converged to 2.0. The rate of convergence is governed by the ratio of the two largest eigenvalue magnitudes, \abs{\lambda_2}/\abs{\lambda_1}; the error shrinks by roughly this factor at each iteration. In this small example, the ratio is 0.2 and convergence was fast, but for large matrices \lambda_1 and \lambda_2 may have similar magnitudes, so quite a few more iterations might be required before an accurate value is found.
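A minimal power method sketch that reproduces the estimates above is shown below. One detail not shown in the derivation: normalizing \bm{v}_k at each step is standard practice to keep the iterate from overflowing when \abs{\lambda_1} > 1, and the rescaling does not change the Rayleigh quotient.

A = [0.6353 0.9412; 0.3412 1.7647];
v = [1; 0];                  % starting vector, c_1 must be nonzero
for k = 1:10
    v = A * v;                           % next iterate
    lambda = (v' * A * v) / (v' * v);    % Rayleigh quotient, eq. (12.6)
    v = v / norm(v);                     % rescale to prevent overflow
    fprintf('k = %2d, lambda = %.4f\n', k, lambda)
end

Running the loop prints estimates matching the sequence above: 1.4226, 1.9051, 1.9839, and so on.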