8.3. Finding Eigenvalues and Eigenvectors

We will begin with the equation for eigenvectors and eigenvalues and insert an identity matrix so that we have a matrix multiplied by a vector on both sides of the equality.

\begin{aligned}
    \mathbf{A}\,\bm{x} &= \lambda\,\bm{x} \\
    \mathbf{A}\,\bm{x} &= \lambda\mathbf{I}\,\bm{x}
\end{aligned}

Then we rearrange the equation to find what is called the characteristic eigenvalue equation.

\mathbf{A}\,\bm{x} - \lambda\mathbf{I}\,\bm{x} = \bm{0} \\

(8.2)\boxed{\left(\mathbf{A} - \lambda\mathbf{I}\right)\bm{x} = \bm{0}}

The case where \bm{x} = \bm{0} is a trivial solution that is not of interest to us. Eigenvectors are defined to be nonzero vectors. The solution to equation (8.2) only exists when the columns of matrix \mathbf{A} - \lambda\,\bf{I} form a linear combination with \bm{x} yielding the zero vector. This linear dependence of the columns of the characteristic equation means that it is singular—having a zero determinant.

8.3.1. Finding Eigenvalues

The n scalar eigenvalues, \{\lambda_1, \lambda_2, \ldots,
\lambda_n\}, can be viewed as the shift of the matrix’s main diagonal that will make the characteristic matrix singular. The Eigenvalues are found by subtracting \lambda along the main diagonal and finding the set of \lambda for which the determinant is zero.

det(\mathbf{A} - \lambda\,\mathbf{I}) = 0

\spaligndelims\vert\vert
\spalignmat{{a_{11} - \lambda} a_{12} {\cdots} a_{1n};
    a_{21} {a_{22} - \lambda} {\cdots} a_{2n};
    {\vdots}  {\vdots} {}  {\vdots};
    a_{n1} a_{n2} {\cdots} {a_{nn} - \lambda}} = 0

Why singular requirement?

You may be wondering why it is required that \left(\mathbf{A} -
\lambda\,\mathbf{I}\right) be a singular matrix. Let us call this matrix \bf{C}, and the columns of \bf{C} will be labeled as \bm{c}_i. For \mathbf{C}\,\bm{x} = \bm{0}, it must be that either:

  1. \bm{c}_1 = \bm{c}_2 = \ldots = \bm{c}_n = \bm{0}, and hence \bf{C} = \bm{0}, which is not the case we are looking for.
  2. \bm{x} = \bm{0}, which is also not what we are looking for.
  3. The columns of \bf{C} are linearly dependent and a null solution, \bm{x}, exists such that \mathbf{C}\,\bm{x} = \bm{0}. It is the linear dependent property of the columns of \bf{C} that makes \bf{C} a singular matrix. Recall from the discussion about the column view of matrix multiplication in The Row and Column View that a matrix must be singular to have a null solution.

Note

We will use the determinant here on 2{\times}2 matrices because it keeps things simple. But as noted in Determinant, calculating determinants is computationally slow. So it is not used for large matrices. Programs such as MATLAB use iterative algorithms. Computational algorithms for finding eigenvalues are described in Orthogonal Matrix Factoring. That material is very interesting, but it is not required for this course.

The determinant yields a degree-n polynomial, which can be factored to find the eigenvalue roots.

\mathbf{A} = \spalignmat{2 2;2 -1}

\begin{array}{rl} \spaligndelims\vert\vert
\spalignmat{(2-\lambda) 2;2 (-1-\lambda)} & = 0 \\ \\
(2 - \lambda)(-1 - \lambda) - 4 & = 0 \\
\lambda^2 - \lambda - 6 & = 0 \\
(\lambda - 3)(\lambda + 2) & = 0 \end{array}

\lambda_1 = -2, \hspace{0.5in} \lambda_2 = 3

For the generalized 2{\times}2 matrix, the coefficient of the \lambda term in the quadratic equation is the negative of the sum of the matrix diagonal (the trace), while the constant term is the determinant of the matrix.

\begin{array}{rl} \spaligndelims\vert\vert
\spalignmat{(a-\lambda) b;c (d-\lambda)} &= 0 \\ \hfill \\
(a - \lambda)(d - \lambda) - bc & = 0 \\
\lambda^2 - (a + d)\lambda + (ad - bc) & = 0 \\
\end{array}

This result for 2{\times}2 matrices is further simplified by the quadratic equation. We will define variables m as the mean of the diagonal elements and p as the determinant of the matrix. Then we have a simple equation for the two eigenvalues.

\begin{array}{rl} &m = \frac{a + d}{2} \\
&p = ad - bc \\ \hfill \\
&\boxed{\lambda_1, \lambda_2 = m \pm \sqrt{m^2 - p}}
\end{array}

Here is a quick example, which is verified with MATLAB’s eig function.

>> A
A =
    -6    -1
    -8    -4
>> eig(A)
ans =
    -8
    -2
>> m = -5;
>> p = 24 - 8;            %  p = 16
>> l1 = m - sqrt(m^2 - p) %  l1 = -5 - 3
l1 =
    -8
>> l2 = m + sqrt(25 - 16)
l2 =
    -2

8.3.2. Roots of a Polynomial by Eigenvalues

As we saw above, finding the eigenvalues of a matrix is equivalent to finding the roots of the determinant of the characteristic equation. But as noted, the algorithms that computer programs, such as MATLAB, use to find eigenvalues neither calculates a determinate nor finds the roots of a polynomial.

Rather than finding polynomial roots to calculate eigenvalues, finding the roots of a polynomial is instead an application of the eigenvalue algorithm. This is how the MATLAB roots function finds the roots of a polynomial. To take advantage of the eigenvalue algorithm, a matrix is cleverly found that has eigenvalues equivalent to the roots of the polynomial.

Figure Fig. 8.2 shows the algorithmic relationship between finding the roots of a polynomial and finding the eigenvalues of a matrix.

../_images/roots-eigs.png

Fig. 8.2 The roots of a polynomial and the eigenvalues of a corresponding matrix are the same. It is faster and simpler to find the eigenvalues using an iterative algorithm than to factor a polynomial.

An example should illustrate how this works. Consider the following polynomial equation.

f(x) = x^3 - 4x^2 + x + 6

The argument passed to the roots function is a row vector containing the coefficients of the polynomial.

>> r = roots([1 -4 1 6])
r =
    3.0000
    2.0000
   -1.0000

The poly function is the inverse of roots:

>> poly(r)
ans =
    1.0000   -4.0000    1.0000    6.0000

The algorithm used by the roots function is short and quite clever.

>> n = 3;               % degree of the polynomial
>> p = [1 -4 1 6];      % coefficients
>> A = diag(ones(n-1,1),-1)
A =
    0     0     0
    1     0     0
    0     1     0
>> A(1,:) = -p(2:n+1)./p(1)
A =
    4    -1    -6
    1     0     0
    0     1     0
>> r = eig(A)
r =
    3.0000
    2.0000
   -1.0000

The determinant of the characteristic equation of \bf{A} has the same coefficients and thus the same roots as f(x).

\begin{array}{rl} \spaligndelims\vert\vert
\spalignmat{(4-\lambda) -1 -6;1 (-\lambda) 0;
    0 1 (-\lambda)} & = 0 \\ \\
\lambda^3 - 4\lambda^2 + \lambda + 6 & = 0
\end{array}

See also

  • If analytic roots are needed, then the Symbolic Math Toolbox can help (Solve).
  • Numeric methods may be used to find the numeric roots of a non-polynomial function (Numerical Roots of a Function).

8.3.3. Finding Eigenvectors

The eigenvectors, \bm{x}_i (one per eigenvalue) lie on the same line as \mathbf{A}\,\bm{x}_i.

(\mathbf{A} - \lambda_i\,\mathbf{I})\bm{x}_i = 0

The solution to the above equation is called the null solution because we are looking for a vector, \bm{x}_i, that sets the equation to zero. The eigenvectors can be found with elimination or with the null function (Null Space). If the null function fails to find a solution, which can happen for small eigenvalues, then the SVD of the characteristic equation finds our closest approximation to the null solution as the last column of the \bf{V} matrix (The Singular Value Decomposition).

We now continue the previous example with elimination to find the eigenvectors.

\lambda_1 = -2\text{:    }
\spalignaugmat{4, 2, 0; 2 1 0}

Add -1/2 of row 1 to row 2 and then divide row 1 by 4:

\spalignaugmat{1, 1/2, 0; 0 0 0}

x_{1a} + 1/2\,x_{1b} = 0

The second row of zeros occurs because it is a singular matrix. This means that we have a free variable, so we can set one variable to any desired value (usually 1).

\bm{x}_1 = \vector{1; -2}

\lambda_2 = 3\text{:    }
\spalignaugmat{-1, 2, 0; 2 -4 0}

Add 2 of row 1 to row 2 and then divide row 1 by -1:

\spalignaugmat{1, -2, 0; 0 0 0}

x_{2a} - 2\,x_{2b} = 0

\bm{x}_2 = \vector{2; 1}

Note that MATLAB’s null and eig functions normalizes the eigenvector so that it has length of one. The following example shows that eigenvectors may be multiplied by a scalar to match the manual calculations.

>> A = [2 2; 2 -1];
% the eigenvalues
>> l1 = -2; l2 = 3;
>> C1 = A - l1*eye(2)
C1 =
    4     2
    2     1
% normalized eigenvector
>> x1 = null(C1)
x1 =
   -0.4472
    0.8944

% scaled eigenvector
% NOTE: Do not do this next step on homework assignments. I am just
% scaling the eigenvector so it will match what we found from
% elimination in the example above. Remember that  eigenvectors
% are invarient to scaling operations.

>> x1 = x1/x1(1)
x1 =
    1
   -2
>> C2 = A - l2*eye(2)
C2 =
   -1     2
    2    -4
% normalized eigenvector
>> x2 = null(C2)
x2 =
   -0.8944
   -0.4472

% scaled eigenvector
% Again, do not do this on homework assignments. See the note above.

>> x2 = x2/x2(2)
x2 =
    2
    1

MATLAB has a function called eig that calculates both the eigenvalues and eigenvectors of a matrix. The results are returned as matrices. The eigenvectors are the columns of X. The eigenvalues are on the diagonal of L. The MATLAB command diag(L) will return the eigenvalues as a row vector.

>> A = [2 2;2 -1];
>> [X, L] = eig(A)
X =
    0.4472   -0.8944
   -0.8944   -0.4472
L =
   -2     0
    0     3
>> diag(L)
ans =
    -2
     3

Passing the matrix as a symbolic math variable to eig will show eigenvectors that are not normalized.

>> [X, L] = eig(sym(A))
X =
[ -1/2, 2]
[    1, 1]
L =
[ -2, 0]
[  0, 3]