9.4. Dimensionality Reduction
When a matrix contains important information that is obscured by noise
or less significant details, the SVD factors can reveal the most
important aspects of the data while removing the less significant
details. We began our discussion of the SVD with the simple
matrix–vector relationship $A\,\mathbf{v}_i = \sigma_i\,\mathbf{u}_i$.
Each set of vectors, $\mathbf{u}_i$ and $\mathbf{v}_i$, along with the
corresponding singular value, $\sigma_i$, is combined to build an
outer product matrix of $\sigma_i\,\mathbf{u}_i \mathbf{v}_i^T$. The singular
values are our guide to finding the most significant information. For
most matrices, a few singular values will be much larger than the other
singular values. Thus, in the complete matrix multiplication by outer
products, the product terms with larger singular values hold the
significant information and the lesser terms can be discarded with only
minimal loss of information.
The $\sigma_i$ values are scalar multiples of each $\mathbf{u}_i \mathbf{v}_i^T$ matrix
(Matrix Multiplication Properties).
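The expansion of a matrix as a sum of rank-1 outer products is easy to verify numerically. The following sketch is illustrative only (the example matrix and variable names are not from the text's scripts): summing all of the $\sigma_i\,\mathbf{u}_i \mathbf{v}_i^T$ terms rebuilds the original matrix.

% Sketch: rebuild a matrix from the sum of its rank-1 SVD outer products.
A = magic(4);                    % any small example matrix
[U, S, V] = svd(A);
Asum = zeros(size(A));
for i = 1:min(size(A))
    Asum = Asum + S(i,i) * U(:,i) * V(:,i)';   % add one outer product term
end
disp(norm(A - Asum))             % near zero: the terms reproduce A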
The Eckart–Young theorem developed in 1936 says that the closest rank
$k$ matrix to the original comes from the
SVD [BRUNTON19], [ECKART36]. The choice of $k$
should be based on the singular values. It might be chosen such that the
sum of the first $k$ singular values constitutes 95%, or another
percentage, of the sum of all singular values. Gavish and
Donoho [GAVISH14] suggest that if the noise can be
modeled independently of the data, then $k$ should be the number of
singular values from the data that are larger than the largest singular
value of the noise. They also suggest algorithms and equations for
picking $k$ when the noise cannot be modeled. The
rank-$k$ approximation of $A$ is given by the
following sum of rank-1 matrices.

$$\hat{A}_k = \sum_{i=1}^{k} \sigma_i\,\mathbf{u}_i \mathbf{v}_i^T$$
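One way to apply the 95% criterion is to take the cumulative sum of the singular values and find the first index that reaches the chosen fraction. This is only a sketch with assumed variable names ($S$ from a previous svd call):

% Sketch: choose k so the first k singular values account for at
% least 95% of the sum of all singular values.
sigma = diag(S);                         % singular values from [U,S,V] = svd(A)
fracSum = cumsum(sigma) / sum(sigma);    % cumulative fraction of the total
k = find(fracSum >= 0.95, 1);            % smallest k reaching 95%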
Eckart–Young Theorem

Consider two different matrices, $A$ and $B$, of the same size.
$A$ has rank $r$. Define $A_k$ as the sum of the first $k$ rank-1 outer
product matrices from the SVD. Both $A_k$ and $B$
have rank $k$. Then the Eckart–Young theorem
tells us that $A_k$ is a closer match to $A$
than any other rank $k$ matrix, $B$ [ECKART36]. We express this as the
norm (magnitude) of differences between matrices:

$$\|A - A_k\| \le \|A - B\|$$

This relationship holds for all matrix norms listed in Matrix Norms.
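The inequality is easy to spot-check numerically. The sketch below uses arbitrary sizes chosen for illustration; it compares the error of the rank-$k$ SVD approximation against a randomly generated rank-$k$ matrix of the same size. The SVD approximation should always give the smaller norm.

% Sketch: spot-check the Eckart-Young inequality with a random rank-k B.
A = randn(8, 6);
k = 2;
[U, S, V] = svd(A);
Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';      % rank-k SVD approximation
B  = randn(8, k) * randn(k, 6);              % some other rank-k matrix
disp(norm(A - Ak, 'fro') <= norm(A - B, 'fro'))   % displays 1 (true)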
The bestKsvd
script builds a simple pattern in a matrix and then adds noise to it.
We see that the best representation of the original matrix is found in
the first three terms of the SVD outer product. The remaining two terms
restore the noise, which we could just as well do without.
% File: bestKsvd.m
% Demonstrate the Eckart-Young theorem that says that the
% closest rank k matrix to the original comes from the SVD.
% Because of the random noise added, you may want to run this
% more than once. You should see that after about 3 SVD terms,
% the SVD matrix is closest (lowest mean-squared error) to the
% clean matrix before the noise was added. The later terms
% mostly add the noise back into the matrix. We can see why a
% rank k (k < r) matrix from SVD may be preferred to the
% full rank r matrix.
A1 = 5*eye(5) + 5*rot90(eye(5));
Aclean = [A1 A1]; % A test pattern
A = [A1 A1] + 0.5*randn(5, 10); % add some noise
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
Ak = zeros(5, 10);
figure
% Look at each rank k matrix from SVD
for k = 1:5
    disp(['SVD terms: ', num2str(k)])
    Ak = Ak + U(:,k) * S(k,k) * V(:,k)';
    disp( Ak )
    subplot(2, 3, k), imshow(Ak, []), title(['rank = ', num2str(k)])
    disp(['rank: ', num2str(rank(Ak))])
    disp(['Mean-squared Error: ', num2str(immse(A, Ak))])
    disp(['No noise Mean-squared Error: ', num2str(immse(Aclean, Ak))])
    disp(' ')
end
subplot(2,3,6), imshow(A, []), title('Original')
disp('Original matrix with noise')
disp(A)
% NOTE:
% >> rank(A1)
% ans =
% 3
% >> rank(A)
% ans =
% 5
The bestKsvdImage
script shows another dimensionality reduction
example. This one examines the SVD of an image. With the sum of a few
rank-1 matrices from the SVD, the major parts of the image start to take
shape. The higher terms add fine details. Something to think about: how
is a low-rank image from the SVD similar to, or different from, a low-pass
filtered image? Fig. 9.6 shows the singular
values of the image, and the images with progressive rank are shown in
Fig. 9.7.
% File: bestKsvdImage.m
% Use an image to demonstrate the Eckart-Young theorem that says
% that the closest rank k matrix to the original comes from the SVD.
% Use a public domain image from the Image Processing Toolbox.
A = im2double(imread('cameraman.tif'));
[m, n] = size(A);
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
%%
Ak = zeros(m, n);
figure
% Look at each rank k matrix from SVD
for k = 1:15
    disp(['SVD terms: ', num2str(k)])
    Ak = Ak + U(:,k) * S(k,k) * V(:,k)';
    subplot(4, 4, k), imshow(Ak, []), title(['rank = ', num2str(k)])
    disp(['rank: ', num2str(rank(Ak))])
    disp(['Mean-squared Error: ', num2str(immse(A, Ak))])
    disp(' ')
end
subplot(4,4,16), imshow(A, []), title('Original')