9.4. Dimensionality Reduction
When a matrix contains important information that is obscured by noise or less significant details, the SVD factors can reveal the most important aspects of the data while removing the less significant details. We began our discussion of the SVD with the simple matrix–vector relationship $A\,\mathbf{v}_i = \sigma_i\,\mathbf{u}_i$. Each pair of vectors, $\mathbf{u}_i$ and $\mathbf{v}_i$, along with the corresponding singular value, $\sigma_i$, is combined to build an outer product matrix of $\sigma_i\,\mathbf{u}_i \mathbf{v}_i^T$,

$$A = \sum_{i=1}^{r} \sigma_i\,\mathbf{u}_i \mathbf{v}_i^T.$$

The singular values are our guide to finding the most significant information. For most matrices, a few singular values will be much larger than the other singular values. Thus, in the complete matrix multiplication by outer products, the product terms with larger singular values hold the significant information and the lesser terms can be discarded with only minimal loss of information.
The $\sigma_i$ values are scalar multiples of each $\mathbf{u}_i \mathbf{v}_i^T$ matrix (Matrix Multiplication Properties).
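As a quick numerical check of this expansion, the following short sketch (added here for illustration, using an arbitrary small test matrix rather than any matrix from the text) rebuilds the original matrix from its outer product terms to machine precision.

% Rebuild a matrix from the outer product terms of its SVD.
% The test matrix is arbitrary; any small matrix works.
A = magic(4);
[U, S, V] = svd(A);
Asum = zeros(size(A));
for i = 1:min(size(A))
    Asum = Asum + S(i,i) * U(:,i) * V(:,i)';   % sigma_i * u_i * v_i'
end
disp(norm(A - Asum))    % essentially zero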
The Eckart–Young theorem developed in 1936 says that the closest rank $k$ matrix to the original comes from the SVD [BRUNTON19], [ECKART36]. The choice of $k$ should be based on the singular values. It might be chosen such that the sum of the first $k$ singular values constitutes 95%, or another percentage, of the sum of all singular values. Gavish and Donoho [GAVISH14] suggest that if the noise can be modeled independently of the data, then $k$ should be the number of singular values from the data that are larger than the largest singular value of the noise. They also suggest algorithms and equations for picking $k$ when the noise cannot be modeled. The rank-$k$ approximation of $A$ is given by the following sum of rank-1 matrices.

$$A_k = \sum_{i=1}^{k} \sigma_i\,\mathbf{u}_i \mathbf{v}_i^T$$
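A minimal sketch of the percentage criterion mentioned above, assuming a 95% threshold and an arbitrary low-rank test matrix (neither comes from the text's scripts):

% Pick k so the first k singular values account for 95% of the sum
% of all singular values. The test matrix is only illustrative.
A = rand(50, 3) * rand(3, 80) + 0.01*randn(50, 80);
s = svd(A);                            % singular values, largest first
k = find(cumsum(s)/sum(s) >= 0.95, 1);
fprintf('Using k = %d of %d singular values\n', k, length(s))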
Eckart–Young Theorem
Consider two different matrices, $A$ and $B$, of the same size. $A$ has rank $r$. Define $A_k$ as the sum of the first $k$ rank-1 outer product matrices from the SVD. Both $A_k$ and $B$ have rank $k$. Then the Eckart–Young theorem tells us that $A_k$ is a closer match to $A$ than any other rank $k$ matrix, $B$ [ECKART36]. We express this as the norm (magnitude) of differences between matrices, $\left\|A - A_k\right\| \leq \left\|A - B\right\|$. This relationship holds for all matrix norms listed in Matrix Norms.
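The inequality is easy to check numerically. The following sketch, which is only an illustration with arbitrary matrix sizes and rank, compares the rank-$k$ SVD truncation against a randomly chosen matrix of the same rank:

% Compare ||A - Ak|| to ||A - B|| for a random rank k matrix B.
% The Eckart-Young theorem says Ak from the SVD is never worse.
A = randn(20, 30);
k = 5;
[U, S, V] = svd(A);
Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';    % best rank k approximation
B  = randn(20, k) * randn(k, 30);          % some other rank k matrix
fprintf('||A - Ak|| = %.4f, ||A - B|| = %.4f\n', norm(A - Ak), norm(A - B))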
The bestKsvd script builds a simple pattern in a test matrix and then adds noise to it. We see that the best representation of the original matrix is found in the first three terms of the SVD outer product expansion. The remaining two terms mostly restore the noise, which we would just as well do without.
% File: bestKsvd.m
% Demonstrate the Eckart-Young theorem that says that the
% closest rank k matrix to the original comes from the SVD.
% Because of the random noise added, you may want to run this
% more than once. You should see that after about 3 SVD terms,
% the SVD matrix is closest (lowest mean-squared error) to the
% clean matrix before the noise was added. The later terms
% mostly add the noise back into the matrix. We can see why a
% rank k (k < r) matrix from SVD may be preferred to the
% full rank r matrix.
A1 = 5*eye(5) + 5*rot90(eye(5));
Aclean = [A1 A1]; % A test pattern
A = [A1 A1] + 0.5*randn(5, 10); % add some noise
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
Ak = zeros(5, 10);
figure
% Look at each rank k matrix from SVD
for k = 1:5
    disp(['SVD terms: ', num2str(k)])
    Ak = Ak + U(:,k) * S(k,k) * V(:,k)';
    disp( Ak )
    subplot(2, 3, k), imshow(Ak, []), title(['rank = ', num2str(k)])
    disp(['rank: ', num2str(rank(Ak))])
    disp(['Mean-squared Error: ', num2str(immse(A, Ak))])
    disp(['No noise Mean-squared Error: ', num2str(immse(Aclean, Ak))])
    disp(' ')
end
subplot(2,3,6), imshow(A, []), title('Original')
disp('Original matrix with noise')
disp(A)
% NOTE:
% >> rank(A1)
% ans =
% 3
% >> rank(A)
% ans =
% 5
The bestKsvdImage script shows another dimensionality reduction example. This one examines the SVD of an image. With the sum of a few rank-1 matrices from the SVD, the major parts of the image start to take shape. The higher terms add fine details. Something to think about: how is a low-rank image from the SVD similar to or different from a low-pass filtered image? (A brief comparison sketch follows the script below.) Fig. 9.6 shows the singular values of the image, and the images with progressive rank are shown in Fig. 9.7.
% File: bestKsvdImage.m
% Use an image to demonstrate the Eckart-Young theorem that says
% that the closest rank k matrix to the original comes from the SVD.
% Use a public domain image from the Image Processing Toolbox.
A = im2double(imread('cameraman.tif'));
[m, n] = size(A);
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
%%
Ak = zeros(m, n);
figure
% Look at each rank k matrix from SVD
for k = 1:15
    disp(['SVD terms: ', num2str(k)])
    Ak = Ak + U(:,k) * S(k,k) * V(:,k)';
    subplot(4, 4, k), imshow(Ak, []), title(['rank = ', num2str(k)])
    disp(['rank: ', num2str(rank(Ak))])
    disp(['Mean-squared Error: ', num2str(immse(A, Ak))])
    disp(' ')
end
subplot(4,4,16), imshow(A, []), title('Original')
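One way to explore the question raised before the script is to place a low-rank reconstruction next to a low-pass filtered copy of the same image. The sketch below is only an illustration, not one of the text's scripts; it assumes the Image Processing Toolbox function imgaussfilt is available, and the rank and filter width are arbitrary choices.

% Compare a rank 15 SVD reconstruction to a Gaussian low-pass filtered
% image. Both discard fine detail, but the SVD keeps the directions
% (columns of U and V) carrying the most energy, while the filter blurs
% uniformly in space.
A = im2double(imread('cameraman.tif'));
[U, S, V] = svd(A);
k = 15;
Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';
Alp = imgaussfilt(A, 4);                 % Gaussian low-pass, sigma = 4
figure
subplot(1,3,1), imshow(A), title('Original')
subplot(1,3,2), imshow(Ak, []), title('Rank 15 from SVD')
subplot(1,3,3), imshow(Alp), title('Gaussian low-pass')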