13.1. Norms

MATLAB includes a function called norm for the purpose of find the length of vectors or matrices. The most frequent usage is to find the Euclidean length of a vector, which we call a l_2-norm and comes direct from the Pythagorean theorem – the square root of the sum of the squares. As always, the norm function is well documented in the MATLAB documentation. However, there are different measures of length – some not covered by the norm function. The length of a vector is a familiar concept, but the length of a matrix feels somewhat mysterious. Moreover, the names, symbols, background, and application of the various norms could use some clarification.

In technical literature, the most common mathematics symbol for a norm or length is a pair of double bars around the variable name with a subscript for the type of norm, \norm{\bm{v}}_2. If the subscript is left off, then it is assumed to be 2. The generic name given to the type of norm is the italics letter l with a subscript of the type, l_2. You may sometimes see the type given as a super-script instead of a subscript.

Cardinality
SYMBOLS
\norm{\bm{v}}_0, \quad\norm{\mathbf{A}}_0, \quad l_0
MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_card = nnz(v)
v_card =
      3

DESCRIPTION

The l_0 norm is the number of non-zero elements of either a vector or a matrix. The l_0 norm does not fit the properties that are normally expected of a norm, so it is not always classified as being a norm. It has utility for sparse (a lot of zeros) vectors and matrices.

13.1.1. Vector Norms

General Vector Norm, p-Norm

The norm function takes a second argument, p that specifies the order of the calculation as follows.

\norm{\bm{v}}_p = \text{norm(v, p)} =
\left[ \sum_{k=1}^N  \left | \bm{v}_k \right | ^p \right]^{1/p}

The following plots show the x and y values that satisfy \norm{\mat{x, y}}_p = 1 for various values of p. The l_1, l_2, and l_{\infty} norms are described below. Keep these plots in mind as you read the description of each. We see straight lines in the p=1 plot because the value of the norm is the sum of the absolute values of the elements. The plot of a circle for p=2 relates to the l_2 norm being the length of a vector by Pythagorean theorem. As the p values get larger, the plots approach the shape of a square, which models the l_{\infty} norm where the norm takes the value of the largest absolute value of the elements.

../_images/norms.png

Fig. 13.1 Plots of \norm{\vector{x; y}}_p) = 1.

Taxicab Norm, Manhattan Norm
SYMBOLS
\norm{\bm{v}}_1, \quad l_1

DESCRIPTION

The sum of the absolute values of the elements. It is the distance that a taxi would drive between two points on a rectangular grid.

\norm{\bm{v}}_1 = \sum_{k=1}^N | \bm{v}_k |

The l_1 norm has application to compressed sensing, which seeks to combine data collection and compression into a single algorithm ([BRUNTON19], pages 88-91). The l_1 norm has also shown to improve linear regression, compared to the l_2 norm, when the data contains outlier values.

MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_l1 = norm(v, 1)
v_l1 =
      7
Euclidean Norm
SYMBOLS
\norm{\bm{v}}, \quad\norm{\bm{v}}_2, \quad l_2

DESCRIPTION

The l_2 norm of a vector is by far the most frequent application of the norm function. It calculates the length of a vector by the same means that the Pythagorean theorem finds the length of the hypotenuse of a right triangle.

\norm{\bm{v}}_2 = \sqrt{\sum_{k=1}^N \bm{v}_k^2 }
= \sqrt{\bm{v}_1^2 + \bm{v}_2^2 + \cdots + \bm{v}_N^2}

MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_len = norm(v)
v_len =
    4.5826
>> norm(v, 2)
ans =
    4.5826
>> sqrt(sum(v.^2))
ans =
    4.5826
Infinity Norm
SYMBOLS
\norm{\bm{v}}_\infty, \quad\norm{\bm{v}}_{-\infty}, \quad l_\infty \quad l_{-\infty}

DESCRIPTION

The l_{\pm \infty} is the maximum or minimum absolute value of the vector.

\norm{\bm{v}}_\infty = \max \left( | \bm{v} | \right)

\norm{\bm{v}}_{-\infty} = \min \left( | \bm{v} | \right)

MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_infty = norm(v, Inf)
v_infty =
     4
>> max(abs(v))
ans =
    4
>> norm(v, -Inf)
ans =
    0
>> min(abs(v))
ans =
    0

13.1.2. vecnorm function

MATLAB has a handy function called vecnorm that will compute the p-norms of the columns of a matrix. It takes the matrix variable name as an argument. Optional arguments specify p (l_2-norm is the default), and the dimension for computing the norm (columns is the default). The command to compute the l_2-norm of the rows is: vecnorm(A, 2, 2).

Here is an example that computes the length of the columns of a matrix.

>> A = randi(20, 3) - randi(10, 3)
A =
     7     9     4
    17     8     6
    -7    -7    10
>> Alen = vecnorm(A)
Alen =
    19.6723   13.9284   12.3288

>> A1 = A./Alen
A1 =
    0.3558    0.6462    0.3244
    0.8642    0.5744    0.4867
   -0.3558   -0.5026    0.8111
>> vecnorm(A1)
ans =
    1     1     1

13.1.3. Matrix Norms

The concept of the magnitude of a matrix from its norm is quite ambiguous compared to the length of a vector from its norm. Different applications have shown specific matrix norms to perform better than other norm calculations.

Maximum Absolute Column Sum
SYMBOLS
\norm{\mathbf{A}}_1, \quad \norm{\cdot}_1

DESCRIPTION

The \norm{\cdot}_1-norm is the maximum l_1 norm of the column vectors of the matrix.
MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, 1)
ans =
    17
>> max(sum(abs(A)))
ans =
    17
>> max(vecnorm(A, 1))
ans =
    17
2-Norm of a Matrix
SYMBOLS
\norm{\mathbf{A}}_2, \quad \norm{\cdot}

DESCRIPTION

The focus of the \norm{\cdot}_2 matrix norm is on the ability of a matrix to stretch a vector rather than on the values of the elements in the matrix. The calculation is the maximum ratio of l_2 vector norms.

\norm{\mathbf{A}}_2 = \max_{\bm{v} \neq 0}{}
\frac{\norm{\mathbf{A}\,\bm{v}}}{\norm{\bm{v}}}

One might think of the eigenvector corresponding to the largest eigenvalue for the \bm{v} vector that maximizes the ratio of vector lengths. That is correct if \bf{A} is a symmetric matrix, but the general purpose best choice for \bm{x} comes from the SVD. Recall that for each singular value, we have the relationship \mathbf{A}\,\bm{v}_i =
\sigma_i\,\bm{u}_i. So we want to consider the largest singular value, which is the first value since the SVD sorts the singular values.

\norm{\mathbf{A}}_2
= \frac{\norm{\mathbf{A}\,\bm{v}_1}}{\norm{\bm{v}_1}}
= \frac{\sigma_1\,\norm{\bm{u}_1}}{\norm{\bm{v}_1}}
= \sigma_1

The final simplification stems from the fact that both \bm{u} and \bm{v} are unit vectors. Thus, the \norm{\cdot}_2 matrix norm is the largest (first) singular value from the SVD.

As we learned in Singular Value Decomposition (SVD), the singular values are the eigenvalues of \mathbf{A}^T\mathbf{A}. If we use the notation \lambda_i(\mathbf{A}) to mean the i^{th} eigenvalue of \mathbf{A}, then the l_2 matrix norm is also given by

\norm{\mathbf{A}}_2 = \max_i
\sqrt{\abs{\lambda_i(\mathbf{A}^T\mathbf{A})}}.

For symmetric matices, this simplifies to

\norm{\mathbf{A}}_2 = \max_i \abs{\lambda_i(\mathbf{A})}.

Other Properties of Matrix 2-Norms

  • \norm{\mathbf{A}}_2 > 0
  • \norm{\mathbf{I}}_2 = 1
  • \norm{c \mathbf{A}}_2 = \abs{c}\,\norm{\mathbf{A}}_2, from which we also have \norm{-\mathbf{A}}_2 = \norm{\mathbf{A}}_2.
  • \norm{\mathbf{A} + \mathbf{B}}_2 \leq
\norm{\mathbf{A}}_2 + \norm{\mathbf{B}}_2.
  • \norm{\mathbf{A}\,\mathbf{B}}_2 \leq
\norm{\mathbf{A}}_2\,\norm{\mathbf{B}}_2.
MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_l2 = norm(A, 2)
A_l2 =
    11.3568
>> A_l2 = norm(A)
A_l2 =
    11.3568
>> S = svd(A);
>> A_l2 = S(1)
A_l2 =
    11.3568
Frobenius Norm
SYMBOLS
\norm{\mathbf{A}}_F, \quad \norm{\cdot}_F

DESCRIPTION

The Frobenius norm is similar to the l_2 vector norm in that it is the square root of the sum of the squared elements. It is also can be found from the singular values.

\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^m{} \sum_{j=1}^n
a_{ij}^2} =  \sqrt{\text{trace}(\mathbf{A}^T\,\mathbf{A})}

\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^{\min{(m, n)}}
\sigma_i^2}

MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_f = norm(A, 'fro')
A_f =
   14.1067
>> A_f = sqrt(sum(A(:).^2))
A_f =
   14.1067
>> A_f = sqrt(trace(A'*A))
A_f =
   14.1067
>> A_f = sqrt(sum(svd(A).^2))
A_f =
   14.1067
Maximum Absolute Row Sum
SYMBOLS
\norm{\mathbf{A}}_{\infty}, \quad\norm{\cdot}_{\infty}

DESCRIPTION

The \norm{\cdot}_{\infty} is similar to the \norm{\cdot}_1 except it uses the l_1 norm of the rows instead of the columns.
MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, Inf)
ans =
    13
>> max(sum(abs(A')))
ans =
    13
Nuclear Norm
SYMBOLS
\norm{\mathbf{A}}_N, \quad\norm{\cdot}_N

DESCRIPTION

The nuclear norm is the sum of the singular values from the SVD, which is related to the rank because most higher order singular values of matrices representing, such as an image, are close to zero. It has shown to give good results at improving the robustness of Principal Component Analysis (PCA) (RPCA) to outlier data points ([BRUNTON19], pages 107-108).

The nuclear norm gained recent recognition when Netflix held a challenge competition to find the best possible movie recommendation algorithm. It was no surprise that the PCA process was key to the algorithms that contestants developed. However, an interesting discovery was that the nuclear norm performed better than the 2-norm.

\norm{\mathbf{A}}_N = \sum_{k=1}^r \sigma_k

MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_nuc_norm = sum(svd(A))
A_nuc_norm =
   20.4799