6.11.3. Projections Onto a Hyperplane¶
We can extend projections to $\mathbb{R}^3$ and still visualize the
projection as projecting a vector onto a plane. Here, the column space of
matrix $\mathbf{A}$ is two 3-dimensional vectors, $\mathbf{a}_1$ and
$\mathbf{a}_2$. The span of two vectors in $\mathbb{R}^3$ forms a plane. As
before, we have an approximation solution to $\mathbf{A}\,\mathbf{x} = \mathbf{b}$,
which can be written as $\mathbf{A}\,\hat{\mathbf{x}} \approx \mathbf{b}$.
The error is $\mathbf{e} = \mathbf{b} - \mathbf{A}\,\hat{\mathbf{x}}$. We
want to find $\hat{\mathbf{x}}$ such that $\mathbf{e}$ is orthogonal to the plane, so
again we set the dot products equal to zero.

$$\mathbf{A}^T \left( \mathbf{b} - \mathbf{A}\,\hat{\mathbf{x}} \right) = \mathbf{0}
\qquad\Longrightarrow\qquad
\mathbf{A}^T \mathbf{A}\,\hat{\mathbf{x}} = \mathbf{A}^T \mathbf{b}$$

The left matrix is just $\mathbf{A}^T$, which is size $2{\times}3$. The size of
$\mathbf{A}^T \mathbf{A}$ is $2{\times}2$, so the size of
$\hat{\mathbf{x}}$, like $\mathbf{x}$, is $2{\times}1$.
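As a quick size check, here is a minimal sketch using the same two column vectors that appear later in projectR3.m:

% Sketch: verify the sizes in the normal equations for a 3x2 system.
a1 = [1 1 0]';  a2 = [-1 2 1]';   % columns used later in projectR3.m
A = [a1 a2];                      % 3x2
b = [1 1 3]';                     % 3x1
size(A')                          % 2x3
size(A'*A)                        % 2x2
x_hat = (A'*A) \ (A'*b);          % solve the normal equations
size(x_hat)                       % 2x1, same size as x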
The orthogonal solution makes use of QR decomposition. Because
$\mathbf{A}$ ($m{\times}n$, with $m > n$) is over-determined, the last
$m - n$ rows of $\mathbf{R}$ will be all zeros, so we can reduce the size of
$\mathbf{Q}$ and $\mathbf{R}$ with the economy QR option, where only the
first $n$ columns of $\mathbf{Q}$ are returned ($\mathbf{Q}$ is now
$m{\times}n$), and $\mathbf{R}$ is returned as an $n{\times}n$
square matrix. After finding $\mathbf{Q}$ and $\mathbf{R}$, computing the
solution only requires one multiplication of $\mathbf{Q}^T$ with the
$m{\times}1$ vector $\mathbf{b}$ and then a back-substitution of the upper
triangular $n{\times}n$ matrix $\mathbf{R}$.

$$\mathbf{R}\,\hat{\mathbf{x}} = \mathbf{Q}^T \mathbf{b}$$
In MATLAB, we can find $\hat{\mathbf{x}}$ as:
x_hat = (A'*A) \ (A'*b);
% or
[Q, R] = qr(A, 0); % Economy QR
x_hat = R \ (Q'*b);
Note
(A \ b)
finds the same answer. The left-divide operator handles
under-determined and over-determined systems as well as square matrix
systems (Left-Divide Operator).
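A minimal sketch confirming this claim, using an arbitrary made-up over-determined system (not from the text):

% Sketch: the normal equations, economy QR, and left-divide all give the
% same least squares solution, to within floating point round-off.
A = randi(10, 8, 3);        % arbitrary 8x3 over-determined matrix
b = randi(10, 8, 1);        % arbitrary right-hand side
x1 = (A'*A) \ (A'*b);       % normal equations
[Q, R] = qr(A, 0);          % economy QR
x2 = R \ (Q'*b);
x3 = A \ b;                 % left-divide operator
fprintf('%g  %g\n', norm(x1 - x2), norm(x1 - x3))   % both near zero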
The projection vector is then [1]:

$$\mathbf{p} = \mathbf{A}\,\hat{\mathbf{x}}
 = \mathbf{A} \left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T \mathbf{b}$$

The projection matrix is:

$$\mathbf{P} = \mathbf{A} \left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T$$
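A short sketch of these equations using the plane example from this section; the idempotency check ($\mathbf{P}\,\mathbf{P} = \mathbf{P}$) is an added illustration, not from the text:

% Sketch: the projection vector and projection matrix for the plane
% example, with checks that e is orthogonal to the column space and
% that P is idempotent (projecting twice changes nothing).
a1 = [1 1 0]';  a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
P = A * ((A'*A) \ A');      % projection matrix
p = P * b;                  % projection vector
e = b - p;                  % error vector
disp(A'*e)                  % ~zero: e is orthogonal to the columns of A
disp(norm(P*P - P))         % ~zero: P is idempotent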
Note
$\mathbf{A}$ is not a square matrix, and is not invertible. If it
were, $\mathbf{P} = \mathbf{A}\,\mathbf{A}^{-1} \left( \mathbf{A}^T \right)^{-1} \mathbf{A}^T = \mathbf{I}$
and there would be no need to do
the projection. Try to
verify why this is the case. Recall that
$\left( \mathbf{A}\,\mathbf{B} \right)^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1}$.
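To see the note's point numerically, here is a small sketch with a made-up square, invertible matrix, for which the projection matrix collapses to the identity:

% Sketch: if A is square and invertible, the projection matrix reduces
% to the identity, so p = b and no projection is needed.
A = [1 -1 2; 1 2 0; 0 1 1];      % arbitrary invertible 3x3 matrix
P = A * ((A'*A) \ A');
disp(norm(P - eye(3)))           % ~zero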
6.11.3.1. Over-determined Pseudo-inverse¶
The matrix $\left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T$ has a
special name and symbol. It is called the Moore–Penrose pseudo-inverse
of an over-determined matrix. It is used when the simpler matrix inverse,
$\mathbf{A}^{-1}$, can not be used. A superscript plus sign
($\mathbf{A}^{+}$) is used as a short-hand math symbol for the
pseudo-inverse.

$$\mathbf{A}^{+} = \left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T$$

MATLAB has a function called
pinv(A)
that returns
the pseudo-inverse of $\mathbf{A}$.
MATLAB uses the SVD to calculate the pseudo-inverse of over-determined systems and under-determined systems (The Preferred Under-determined Solution), which is more accurate and usually faster than the direct matrix calculation (Over-determined Pseudo-inverse).
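For a full-rank over-determined matrix, the two calculations agree. A minimal sketch, using a random test matrix in the style of the example below:

% Sketch: pinv(A), computed with the SVD, matches the direct
% pseudo-inverse calculation (A'A)^{-1} A' for a full-rank matrix.
A = randi(20, 15, 4) - 8;               % random over-determined matrix
Aplus_direct = (A'*A) \ A';             % (A'A)^{-1} A'
Aplus_svd = pinv(A);                    % SVD-based pseudo-inverse
disp(norm(Aplus_direct - Aplus_svd))    % ~zero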
Note
Unlike the matrix equations, the same equation of the SVD
sub-matrices finds the pseudo-inverse for both over-determined and
under-determined matrices. Recall that since $\mathbf{U}$ and
$\mathbf{V}$ are orthogonal, their inverses are just their
transposes.

$$\mathbf{A}^{+} = \mathbf{V}\,\mathbf{\Sigma}^{+}\,\mathbf{U}^T$$

Since $\mathbf{\Sigma}$ is a diagonal matrix, its pseudo-inverse is the
reciprocal of the nonzero elements on the diagonal. The zero elements
remain zero.
>> S = diag([2 3 0])
S =
2 0 0
0 3 0
0 0 0
>> pinv(S)
ans =
0.5000 0 0
0 0.3333 0
0 0 0
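The same result can be built by hand, as this sketch shows for the square diagonal case above:

% Sketch: pseudo-inverse of a diagonal matrix -- take reciprocals of the
% nonzero diagonal elements and leave the zero elements alone.
S = diag([2 3 0]);
d = diag(S);                     % diagonal elements as a vector
d(d ~= 0) = 1 ./ d(d ~= 0);      % invert only the nonzero elements
Sp = diag(d);
disp(norm(Sp - pinv(S)))         % ~zero, matches pinv(S)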
Here is an example of using the pseudo-inverse on an over-determined matrix equation.
>> A=randi(20,15,4)-8;
>> y=randi(10,4,1)-4
y =
4
-1
2
3
>> b = A*y;
>> [U, S, V] = svd(A);
>> pinvA=V*pinv(S)*U';
>> x = pinvA*b
x =
4.0000
-1.0000
2.0000
3.0000
>> x = pinv(A)*b
x =
4.0000
-1.0000
2.0000
3.0000
6.11.3.2. Projection Example¶
The script program in projectR3.m
and the resulting plot in
figure Fig. 6.26 demonstrate the calculation and
display of the projection of a vector onto the column space of a matrix.
The dot product is also calculated to verify orthogonality.
Two vectors, $\mathbf{a}_1$ and $\mathbf{a}_2$, define a plane. To
find the equation for a plane, we first determine a vector,
$\mathbf{n} = (a, b, c)$, which is orthogonal to both vectors. The cross product
provides this. The vector $\mathbf{n}$
is defined as
$\mathbf{n} = \mathbf{a}_1 \times \mathbf{a}_2$. Then given one point on the plane,
$(x_0, y_0, z_0)$, we can calculate the equation for the plane.

$$a\,(x - x_0) + b\,(y - y_0) + c\,(z - z_0) = 0$$

In this case, we know that the point $(0, 0, 0)$ is on the plane,
so the plane equation is simplified.

$$a\,x + b\,y + c\,z = 0$$

The points of the plane are determined by first defining a region for
$x$ and $y$ and then using the plane equation to calculate
the corresponding points for $z$. A simple helper function called
vector3
was used to plot the vectors. A plot of the projection is
shown in figure Fig. 6.26.
% File: projectR3.m
% Projection of the vector B onto the column space of A.
% Note that with three sample points, we can visualize
% the projection in R^3.
%% Find the projection
% Define the column space of A, which forms a plane in R^3.
a1 = [1 1 0]'; % vector from (0, 0, 0) to (1, 1, 0).
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
x_hat = (A'*A)\(A'*b);
p = A*x_hat; % projection vector
e = b - p; % error vector
fprintf('Error is orthogonal to the plane if close to zero: %f\n', p'*e);
%% Plot the projection
% To make the equation for a plane, we need n, which is the
% orthogonal normal vector to the column space, found from
% the cross product.
n = cross(a1,a2);
% Plane is: a(x-x0) +b(y-y0) + c(z-z0) + d = 0
% Variables a, bp, and c come from the normal vector (bp is used
% so that the vector b is not overwritten).
% Keep it simple and use point (0, 0, 0) as the point on
% the plane (x0, y0, z0).
a=n(1); bp=n(2); c=n(3); d=0;
[x, y] = meshgrid(-1:0.25:3); % Generate x and y data
z = -1/c*(a*x + bp*y + d); % Generate z data
figure;
surf(x,y,z); % Plot the plane
daspect([1 1 1]); % consistent aspect ratio
hold on;
z0 = [0 0 0]'; % vector starting point
vector3(z0, a1, 'k'); % column 1 of A
vector3(z0, a2, 'k'); % column 2 of A
vector3(z0, b, 'g'); % target vector
vector3(z0, p, 'b'); % projection
vector3(p, b, 'r'); % error from p to b
hold off;
xlabel('X');
ylabel('Y');
zlabel('Z');
title('Projection of a vector onto a plane');
% helper function
function vector3( v1, v2, clr )
% VECTOR3 Plot a vector from v1 to v2 in R^3.
plot3([v1(1) v2(1)], [v1(2) v2(2)], ...
[v1(3) v2(3)], 'Color', clr,'LineWidth',2);
end
6.11.3.3. Alternate Projection Equation¶
Not all linear algebra textbooks present the same equations for projection of a vector onto a vector space. The approach that we have used might be described as projecting a vector onto the column space of a matrix. An alternate approach projects a vector onto an equivalent vector space that is spanned by an orthonormal set of basis vectors. So the first step in the alternate approach is to find the orthonormal basis vectors of the matrix columns. (See Vector Spaces for help with some of the terminology.)
Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be
an orthonormal basis for the column space of matrix
$\mathbf{A}$. Then
the projection of vector $\mathbf{b}$
onto the column space of $\mathbf{A}$
is defined by

$$\mathbf{p} = \left( \mathbf{b} \cdot \mathbf{u}_1 \right) \mathbf{u}_1
 + \left( \mathbf{b} \cdot \mathbf{u}_2 \right) \mathbf{u}_2
 + \cdots
 + \left( \mathbf{b} \cdot \mathbf{u}_n \right) \mathbf{u}_n.$$

There is a trade-off between the two algorithms. The alternative
approach finds the orthonormal basis vectors to the columns of $\mathbf{A}$
using either the Gram–Schmidt process (Finding Orthogonal Basis Vectors),
QR factorization (QR Decomposition), or the economy SVD
method (Projection and the Economy SVD), which is used by the
orth
function.
Whereas, projecting onto the column space requires solving an
over-determined least squares problem for $\hat{\mathbf{x}}$.
The geometry of the alternate algorithm provides a simple way to see that it also projects a vector onto a vector space.
- Any vector in the vector space must be a linear combination of the orthonormal basis vectors.
- The coefficients in the sum are scalars from the dot products of $\mathbf{b}$ and each basis vector. Since the basis vectors are unit length, each term in the sum is $\left( \mathbf{b} \cdot \mathbf{u}_i \right) \mathbf{u}_i = \lVert \mathbf{b} \rVert \cos(\theta_i)\,\mathbf{u}_i$, where $\theta_i$ is the angle between $\mathbf{b}$ and the basis vector; thus, each term in the sum is the projection of $\mathbf{b}$ onto a basis vector (see the sketch after this list).
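A brief sketch of the second point, using a made-up vector and a unit-length basis vector:

% Sketch: each term (b . u) u equals ||b|| cos(theta) u when u has unit
% length, so each term is the projection of b onto that basis vector.
b = [1 1 3]';
u = [1 1 0]' / norm([1 1 0]');           % a unit-length basis vector
term = (b'*u) * u;                       % one term of the projection sum
costheta = (b'*u) / (norm(b)*norm(u));   % cosine of the angle
disp(norm(term - norm(b)*costheta*u))    % ~zero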
The example in the file alt_project.m
finds the projection using the
two methods. The mod_gram_schmidt
function from Implementation of Modified Gram–Schmidt is
used to find the orthogonal basis vectors.
% File: alt_project.m
%% Comparison of two projection equations
% Define the column space of A, which forms a plane in R^3.
a1 = [1 1 0]'; % vector from (0, 0, 0) to (1, 1, 0).
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
%% Projection onto the column space of matrix A.
x_hat = (A'*A)\(A'*b);
p = A*x_hat; % projection vector
disp('Projection onto column space: ')
disp(p)
%% Alternate projection
% The projection is the vector sum of projections onto
% the orthonormal basis vectors of the column space of A.
[U, ~] = mod_gram_schmidt(A); % Could use the orth function
u1 = U(:,1);
u2 = U(:,2);
p_alt = b'*u1*u1 + b'*u2*u2;
disp('Projection onto basis vectors: ')
disp(p_alt)
The output from the vector projection example follows.
>> alt_project
Projection onto column space:
0.1818
1.8182
0.5455
Projection onto basis vectors:
0.1818
1.8182
0.5455
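The term-by-term sum can also be written in matrix form as $\mathbf{p} = \mathbf{U}\,\mathbf{U}^T \mathbf{b}$, where the columns of $\mathbf{U}$ are the orthonormal basis vectors. A sketch using the orth function mentioned above:

% Sketch: the sum of projections onto the orthonormal basis vectors,
% written in matrix form with the basis returned by orth.
a1 = [1 1 0]';  a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
U = orth(A);            % orthonormal basis of the column space of A
p_alt = U * (U'*b);     % same projection: 0.1818, 1.8182, 0.5455
disp(p_alt)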
6.11.3.4. Higher Dimension Projection¶
If the matrix $\mathbf{A}$ is larger than $3{\times}2$, then we
can not visually plot a projection as we did above. However, the
equations still hold. One of the nice things about linear algebra is
that we can visually see what is happening when the vectors are in
$\mathbb{R}^2$ or $\mathbb{R}^3$, but higher dimensions are
fine too. The projection equations are used in least squares regression
where we want as many rows of data as possible to accurately fit an
equation to the data.
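As a hint of that use, here is a minimal least squares line-fitting sketch; the data points are made up for illustration:

% Sketch: least squares regression as a projection problem -- fit a line
% y = m*t + c to noisy made-up data using the normal equations.
t = (0:9)';                        % independent variable samples
y = 2*t + 1 + 0.5*randn(10,1);     % noisy samples of the line y = 2t + 1
A = [t ones(10,1)];                % over-determined 10x2 system A*x = y
x_hat = (A'*A) \ (A'*y);           % [slope; intercept], same as A \ y
plot(t, y, 'o', t, A*x_hat, '-');  % data points and the fitted line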
Footnote:
[1] It is tempting to replace $\left( \mathbf{A}^T \mathbf{A} \right)^{-1}$ with $\mathbf{A}^{-1} \left( \mathbf{A}^T \right)^{-1}$ so that the terms of $\mathbf{p} = \mathbf{A} \left( \mathbf{A}^T \mathbf{A} \right)^{-1} \mathbf{A}^T \mathbf{b}$ cancel, leaving $\mathbf{p} = \mathbf{b}$. But because $\mathbf{A}$ is not square, it is not invertible, so the substitution is not valid.