6.11.3. Projections Onto a Hyperplane¶
We can extend projections to $\mathbb{R}^3$ and still visualize the projection as projecting a vector onto a plane. Here, the column space of matrix $A$ consists of two 3-dimensional vectors, $a_1$ and $a_2$.
The span of two vectors in $\mathbb{R}^3$ forms a plane. As before, we have an approximate solution to $A\,x = b$, which can be written as $A\,\hat{x} \approx b$. The error is $e = b - A\,\hat{x}$. We want to find $\hat{x}$ such that $e$ is orthogonal to the plane, so again we set the dot products equal to zero:

$$A^T \left(b - A\,\hat{x}\right) = 0, \qquad \text{which gives the normal equations} \qquad A^T A\,\hat{x} = A^T b.$$
The left matrix is just $A^T A$, which is size $2 \times 2$. The size of $A^T b$ is $2 \times 1$, so the size of $\hat{x}$, like $x$, is $2 \times 1$.
The orthogonal solution makes use of QR decomposition. Because $A$ is over-determined ($m \times n$ with $m > n$), the last $m - n$ rows of $R$ will be all zeros, so we can reduce the size of $Q$ and $R$ with the economy QR option, where only the first $n$ columns of $Q$ are returned ($Q$ is now $m \times n$), and $R$ is returned as an $n \times n$ square matrix. After finding $Q$ and $R$, computing the solution to $R\,\hat{x} = Q^T b$ only requires one multiplication with a vector and then a back-substitution of an upper triangular matrix.
In MATLAB, we can find $\hat{x}$ as:
x_hat = (A'*A) \ (A'*b);
% or
[Q, R] = qr(A, 0); % Economy QR
x_hat = R \ (Q'*b);
Note
(A \ b) finds the same answer. The left-divide operator handles under-determined and over-determined systems as well as square matrix systems (Left-Divide Operator).
The projection vector is then [1]:

$$p = A\,\hat{x} = A \left(A^T A\right)^{-1} A^T\, b$$
The projection matrix is:

$$P = A \left(A^T A\right)^{-1} A^T$$
Note
$A$ is not a square matrix, and so is not invertible. If it were, then $P = A\,A^{-1}\left(A^T\right)^{-1} A^T = I$ and there would be no need to do the projection. Try to verify why this is the case. Recall that $\left(A\,B\right)^{-1} = B^{-1} A^{-1}$.
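The properties of $P$ are easy to check numerically. The following sketch uses NumPy as a cross-check outside of MATLAB; the vectors are taken from the projectR3.m example later in this section. It confirms that $P$ is idempotent ($P^2 = P$), that $P$ is not the identity, and that the error is orthogonal to the columns of $A$:

```python
import numpy as np

# Column space from the projectR3.m example: a plane in R^3.
A = np.array([[1.0, -1.0],
              [1.0,  2.0],
              [0.0,  1.0]])
b = np.array([1.0, 1.0, 3.0])

# Projection matrix P = A (A^T A)^{-1} A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T

p = P @ b    # projection of b onto the plane
e = b - p    # error vector

print(np.allclose(P @ P, P))       # True: P is idempotent
print(np.allclose(P, np.eye(3)))   # False: P is not the identity
print(np.allclose(A.T @ e, 0))     # True: e is orthogonal to the columns of A
```

Projecting a vector that is already in the plane leaves it unchanged, which is why $P^2 = P$.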
6.11.3.1. Over-determined Pseudo-inverse¶
The matrix $\left(A^T A\right)^{-1} A^T$ has a special name and symbol. It is called the Moore–Penrose pseudo-inverse of an over-determined matrix. It is used when the simpler inverse, $A^{-1}$, can not be used. A superscript plus sign ($A^+$) is used as a short-hand math symbol for the pseudo-inverse. MATLAB has a function called pinv(A) that returns the pseudo-inverse of $A$.
MATLAB uses the SVD to calculate the pseudo-inverse of over-determined systems and under-determined systems (The Preferred Under-determined Solution), which is more accurate and usually faster than the direct matrix calculation (Over-determined Pseudo-inverse).
Note
Unlike the matrix equations, the same equation of the SVD sub-matrices, $A^+ = V\,\Sigma^+\,U^T$, finds the pseudo-inverse for both over-determined and under-determined matrices. Recall that since $U$ and $V$ are orthogonal, their inverses are just their transposes.
Since $\Sigma$ is a diagonal matrix, its pseudo-inverse $\Sigma^+$ is found by taking the reciprocal of the nonzero elements on the diagonal. The zero elements remain zero.
>> S = diag([2 3 0])
S =
2 0 0
0 3 0
0 0 0
>> pinv(S)
ans =
0.5000 0 0
0 0.3333 0
0 0 0
Here is an example of using the pseudo-inverse on an over-determined matrix equation.
>> A=randi(20,15,4)-8;
>> y=randi(10,4,1)-4
y =
4
-1
2
3
>> b = A*y;
>> [U, S, V] = svd(A);
>> pinvA=V*pinv(S)*U';
>> x = pinvA*b
x =
4.0000
-1.0000
2.0000
3.0000
>> x = pinv(A)*b
x =
4.0000
-1.0000
2.0000
3.0000
6.11.3.2. Projection Example¶
The script program in projectR3.m
and the resulting plot in
figure Fig. 6.26 demonstrate the calculation and
display of the projection of a vector onto the column space of a matrix.
The dot product is also calculated to verify orthogonality.
Two vectors, $a_1$ and $a_2$, define a plane. To find the equation for the plane, we first determine a vector, $n$, which is orthogonal to both vectors. The cross product provides this. The vector is defined as $n = a_1 \times a_2$. Then given one point on the plane, $(x_0, y_0, z_0)$, we can calculate the equation for the plane:

$$n_1 \left(x - x_0\right) + n_2 \left(y - y_0\right) + n_3 \left(z - z_0\right) = 0$$
In this case, we know that the point $(0, 0, 0)$ is on the plane, so the plane equation is simplified.
The points of the plane are determined by first defining a region for $x$ and $y$ and then using the plane equation to calculate the corresponding points for $z$. A simple helper function called vector3 was used to plot the vectors. A plot of the projection is shown in figure Fig. 6.26.
% File: projectR3.m
% Projection of the vector B onto the column space of A.
% Note that with three sample points, we can visualize
% the projection in R^3.
%% Find the projection
% Define the column space of A, which forms a plane in R^3.
a1 = [1 1 0]'; % vector from (0, 0, 0) to (1, 1, 0).
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
x_hat = (A'*A)\(A'*b);
p = A*x_hat; % projection vector
e = b - p; % error vector
fprintf('Error is orthogonal to the plane if close to zero: %f\n', p'*e);
%% Plot the projection
% To make the equation for a plane, we need n, which is the
% orthogonal normal vector to the column space, found from
% the cross product.
n = cross(a1,a2);
% Plane is: a(x-x0) +b(y-y0) + c(z-z0) + d = 0
% Variables a, b, and c come from the normal vector.
% Keep it simple and use point (0, 0, 0) as the point on
% the plane (x0, y0, z0).
a=n(1); bp=n(2); c=n(3); d=0;
[x, y] = meshgrid(-1:0.25:3); % Generate x and y data
z = -1/c*(a*x + bp*y + d); % Generate z data
figure;
surf(x,y,z); % Plot the plane
daspect([1 1 1]); % consistent aspect ratio
hold on;
z0 = [0 0 0]'; % vector starting point
vector3(z0, a1, 'k'); % column 1 of A
vector3(z0, a2, 'k'); % column 2 of A
vector3(z0, b, 'g'); % target vector
vector3(z0, p, 'b'); % projection
vector3(p, b, 'r'); % error from p to b
hold off;
xlabel('X');
ylabel('Y');
zlabel('Z');
title('Projection of a vector onto a plane');
% helper function
function vector3( v1, v2, clr )
% VECTOR3 Plot a vector from v1 to v2 in R^3.
plot3([v1(1) v2(1)], [v1(2) v2(2)], ...
[v1(3) v2(3)], 'Color', clr,'LineWidth',2);
end
6.11.3.3. Alternate Projection Equation¶
Not all linear algebra textbooks present the same equations for projection of a vector onto a vector space. The approach that we have used might be described as projecting a vector onto the column space of a matrix. An alternate approach projects a vector onto an equivalent vector space that is spanned by an orthonormal set of basis vectors. So the first step in the alternate approach is to find the orthonormal basis vectors of the matrix columns. (See Vector Spaces for help with some of the terminology.)
Let $\{u_1, u_2, \ldots, u_n\}$ be an orthonormal basis for the column space of matrix $A$. Then the projection of vector $b$ onto the column space is defined by

$$p = \left(b \cdot u_1\right) u_1 + \left(b \cdot u_2\right) u_2 + \cdots + \left(b \cdot u_n\right) u_n$$
There is a trade-off between the two algorithms. The alternate approach finds the orthonormal basis vectors to the columns of $A$ using either the Gram–Schmidt process (Finding Orthogonal Basis Vectors), QR factorization (QR Decomposition), or the economy SVD method (Projection and the Economy SVD), which is used by the orth function. Whereas projecting onto the column space requires solving the normal equations, $A^T A\,\hat{x} = A^T b$.
The geometry of the alternate algorithm provides a simple way to see that it also projects a vector onto a vector space.
- Any vector in the vector space must be a linear combination of the orthonormal basis vectors.
- The coefficients in the sum are scalars from the dot products of $b$ and each basis vector. Since the basis vectors are unit length, each term in the sum is $\left(\lVert b \rVert \cos\theta\right) u_i$, where $\theta$ is the angle between $b$ and the basis vector; thus, each term in the sum is the projection of $b$ onto a basis vector.
The example in the file alt_project.m
finds the projection using the
two methods. The mod_gram_schmidt
function from Implementation of Modified Gram–Schmidt is
used to find the orthogonal basis vectors.
% File: alt_project.m
%% Comparison of two projection equations
% Define the column space of A, which forms a plane in R^3.
a1 = [1 1 0]'; % vector from (0, 0, 0) to (1, 1, 0).
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
%% Projection onto the column space of matrix A.
x_hat = (A'*A)\(A'*b);
p = A*x_hat; % projection vector
disp('Projection onto column space: ')
disp(p)
%% Alternate projection
% The projection is the vector sum of projections onto
% the orthonormal basis vectors of the column space of A.
[U, ~] = mod_gram_schmidt(A); % Could use the orth function
u1 = U(:,1);
u2 = U(:,2);
p_alt = b'*u1*u1 + b'*u2*u2;
disp('Projection onto basis vectors: ')
disp(p_alt)
The output from the vector projection example follows.
>> alt_project
Projection onto column space:
0.1818
1.8182
0.5455
Projection onto basis vectors:
0.1818
1.8182
0.5455
6.11.3.4. Higher Dimension Projection¶
If the matrix $A$ has more than three rows, the vectors are in a higher dimension than $\mathbb{R}^3$ and we cannot visually plot a projection as we did above. However, the equations still hold. One of the nice things about linear algebra is that we can visually see what is happening when the vectors are in $\mathbb{R}^2$ or $\mathbb{R}^3$, but higher dimensions are fine too. The projection equations are used in least squares regression, where we want as many rows of data as possible to accurately fit an equation to the data.
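The regression idea can be sketched as follows. This is a hypothetical NumPy example, not from the text: the model $y = c_0 + c_1 t$, the data, and the noise level are all assumptions chosen for illustration. An over-determined system with 50 rows and 2 unknowns is solved with the pseudo-inverse, just like the smaller examples above:

```python
import numpy as np

# Hypothetical least squares regression: fit y = c0 + c1*t
# to 50 noisy samples of the line y = 2 + 0.5*t.
rng = np.random.default_rng(7)
t = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * t + 0.05 * rng.standard_normal(t.size)

# Each row of A is [1, t_k], so A is 50x2 (over-determined).
A = np.column_stack([np.ones_like(t), t])

# x_hat = A^+ b, equivalent to pinv(A)*b or A\b in MATLAB.
x_hat = np.linalg.pinv(A) @ y
print(x_hat)   # close to [2.0, 0.5]

# The residual is orthogonal to the column space of A.
print(np.allclose(A.T @ (y - A @ x_hat), 0))  # True
```

Geometrically nothing has changed: the fitted values $A\,\hat{x}$ are the projection of the data vector $y$ onto the 2-dimensional column space of $A$ inside $\mathbb{R}^{50}$.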
Footnote:

[1] It is tempting to replace $\left(A^T A\right)^{-1}$ here with $A^{-1} \left(A^T\right)^{-1}$, but then $A\,A^{-1}\left(A^T\right)^{-1} A^T$ is just $I$. Remember that our projection equation is an estimate of $b$ when $b$ is not in the column space of $A$.