13.1. Norms

MATLAB includes a function called norm that finds the length of a vector or the magnitude of a matrix. Its most frequent use is to find the Euclidean length of a vector, which is called the \(l_2\)-norm. That calculation comes directly from the Pythagorean theorem: it is the square root of the sum of the squares of the elements. The length of a vector is a familiar concept, but the length of a matrix may seem new. MATLAB’s documentation explains all of the options of the norm function. This section reviews the different measures of length and their properties, and it clarifies the names, symbols, and applications of the various norms.

In technical literature, the most common mathematical symbol for a norm is a pair of double bars around the variable name with a subscript for the type of norm, \(\norm{\bm{v}}_2\). If the subscript is omitted, it is assumed to be 2. The generic name for a type of vector norm is the italic letter l with the type as a subscript, \(l_2\). You may sometimes see the type given as a superscript instead of a subscript.

13.1.1. Cardinality

SYMBOLS

\(\norm{\bm{v}}_0\), \(\:\norm{\mathbf{A}}_0\), \(\: l_0\)

DESCRIPTION

The \(l_0\)-norm is the number of nonzero elements of a vector or a matrix. It does not satisfy all of the properties expected of a norm (for example, scaling a vector by a nonzero constant does not change its number of nonzero elements), so it is not always classified as a norm. It is useful for describing sparse vectors and matrices, which contain many zeros.

MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_card = nnz(v)
v_card =
      3
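
The following minimal sketch illustrates one property that the \(l_0\)-norm fails. It assumes the usual scaling property of a norm, \(\norm{c\,\bm{v}} = |c|\,\norm{\bm{v}}\), and uses an arbitrary constant c = 3 chosen only for illustration.

```matlab
% Check the scaling property norm(c*v) == abs(c)*norm(v) for c = 3.
v = [1; 2; 0; 4];
c = 3;                  % arbitrary nonzero constant (illustrative)

% The l_2 norm scales with abs(c) ...
l2_scales = abs(norm(c*v) - abs(c)*norm(v)) < 1e-12;   % true

% ... but scaling does not change the count of nonzero elements.
card_v  = nnz(v);       % 3
card_cv = nnz(c*v);     % still 3, not abs(c)*3 = 9
```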

13.1.2. Vector Norms

13.1.3. General Vector Norm, p-Norm

The norm function takes a second argument, p, that specifies the order of the calculation as follows.

\[\norm{\bm{v}}_p = \text{norm(v, p)} = \left[ \sum_{k=1}^N \left | \bm{v}_k \right | ^p \right]^{1/p}\]
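
As a quick check of the formula, the sketch below evaluates the sum directly for \(p = 3\) and compares it with the norm function. The variable names direct, builtin, and agree are only illustrative.

```matlab
v = [1; 2; 0; 4];
p = 3;

% Direct evaluation: (1 + 8 + 0 + 64)^(1/3) = 73^(1/3), about 4.18
direct  = sum(abs(v).^p)^(1/p);
builtin = norm(v, p);

agree = abs(direct - builtin) < 1e-12;   % true, up to round-off
```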

Fig. 13.1 shows plots of the \(x\) and \(y\) values that satisfy \(\norm{\left[x\:y\right]}_p = 1\) for various values of \(p\). Keep these plots in mind as you read the description of the \(l_1\), \(l_2\), and \(l_{\infty}\) norms. We see straight lines in the \(p=1\) plot because the norm’s value is the sum of the absolute values of the elements. The plot of a circle for \(p=2\) relates to the \(l_2\) norm being the length of a vector by the Pythagorean theorem. As the \(p\) values get larger, the plots approach the shape of a square, which models the \(l_{\infty}\) norm where the norm takes the value of the largest absolute value of the elements.

Fig. 13.1 Plots of \(\norm{[x \: y]}_p = 1\).
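
Curves like these can be drawn with a short script. The sketch below is one possible approach, not necessarily how the figure was produced: it takes points on the ordinary unit circle and rescales each one so that its \(p\)-norm is exactly 1. The list of orders in pvals is an assumed choice.

```matlab
% Draw the curves norm([x y], p) = 1 for several orders p.
theta = linspace(0, 2*pi, 400);
x = cos(theta);
y = sin(theta);
pvals = [1 2 4 16];            % assumed set of orders to plot

figure; hold on; axis equal
for p = pvals
    % p-norm of each direction (x, y); dividing by it puts the
    % point on the curve where the p-norm equals 1.
    s = (abs(x).^p + abs(y).^p).^(1/p);
    plot(x./s, y./s, 'DisplayName', sprintf('p = %d', p));
end
legend show
hold off
```

Rescaling directions avoids solving for \(y\) as a function of \(x\), which would require treating the upper and lower halves of each curve separately.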

13.1.4. Taxicab Norm, Manhattan Distance, or City Block Distance

SYMBOLS

\(\norm{\bm{v}}_1\), \(\: l_1\)

DESCRIPTION

The \(l_1\) norm is the sum of the absolute values of the elements. It is the distance that a car would drive between two points on a rectangular grid of streets.

\[\norm{\bm{v}}_1 = \sum_{k=1}^N | \bm{v}_k |\]

The \(l_1\) norm has applications to compressed sensing, which seeks to combine data collection and compression into a single algorithm ([BRUNTON19], pages 88-91). Compared to the \(l_2\) norm, the \(l_1\) norm has also been shown to make linear regression less sensitive to outlier values in the data.
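
The effect of outliers can be seen in the simplest regression problem, fitting a single constant to data. The sketch below uses made-up data and a brute-force search over candidate constants, which is only for illustration and is not a practical regression algorithm. The \(l_1\) fit lands on the median, while the \(l_2\) fit lands on the mean and is pulled toward the outlier.

```matlab
d = [1; 2; 3; 4; 100];          % made-up data; 100 is the outlier

c = linspace(0, 100, 10001);    % candidate constants, spacing 0.01
R = d - c;                      % residuals, one column per candidate (5-by-10001)

[~, i1] = min(vecnorm(R, 1));   % candidate with the smallest l_1 residual norm
[~, i2] = min(vecnorm(R, 2));   % candidate with the smallest l_2 residual norm

c_l1 = c(i1);    % 3, the median of d
c_l2 = c(i2);    % 22, the mean of d
```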

MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_l1 = norm(v, 1)
v_l1 =
      7
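
To connect the calculation to the street-grid picture, the following sketch computes the driving distance between two hypothetical intersections, a and b, and compares it with the straight-line distance.

```matlab
a = [1; 2];                 % hypothetical starting intersection (east, north)
b = [4; 6];                 % hypothetical destination

d_taxi = norm(b - a, 1);    % |4 - 1| + |6 - 2| = 7 blocks
d_line = norm(b - a);       % straight line: sqrt(3^2 + 4^2) = 5
```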

13.1.5. Euclidean Norm

SYMBOLS

\(\norm{\bm{v}}\), \(\:\norm{\bm{v}}_2\), \(\: l_2\)

DESCRIPTION

The \(l_2\) norm of a vector is by far the most frequently used norm calculation. It finds the length of a vector by the same means that the Pythagorean theorem finds the length of the hypotenuse of a right triangle.

\[\norm{\bm{v}}_2 = \sqrt{\sum_{k=1}^N \bm{v}_k^2 } = \sqrt{\bm{v}_1^2 + \bm{v}_2^2 + \cdots + \bm{v}_N^2}\]
MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_len = norm(v)
v_len =
    4.5826
>> norm(v, 2)
ans =
    4.5826
>> sqrt(sum(v.^2))
ans =
    4.5826
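
Two related calculations come up often. For a real vector, the sum of the squares is the dot product of the vector with itself, and dividing a vector by its \(l_2\) norm gives a unit vector in the same direction. A brief sketch, with illustrative variable names:

```matlab
v = [1; 2; 0; 4];

len_dot = sqrt(v' * v);     % sqrt(21), the same value as norm(v)
u = v / norm(v);            % unit vector in the direction of v
u_len = norm(u);            % 1, up to round-off
```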

13.1.6. Infinity Norm

SYMBOLS

\(\norm{\bm{v}}_\infty\), \(\:\norm{\bm{v}}_{-\infty}\), \(\: l_\infty\), \(\: l_{-\infty}\)

DESCRIPTION

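The \(l_{\infty}\)-norm is the maximum absolute value of the vector elements, and \(l_{-\infty}\) is the minimum absolute value. The \(l_{-\infty}\) measure is not a true norm because it can be zero for a nonzero vector, which happens with the vector in the example that follows.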

\[\norm{\bm{v}}_\infty = \max \left( | \bm{v} | \right)\]
\[\norm{\bm{v}}_{-\infty} = \min \left( | \bm{v} | \right)\]
MATLAB EXAMPLE
>> v = [1; 2; 0; 4];
>> v_infty = norm(v, Inf)
v_infty =
     4
>> max(abs(v))
ans =
    4
>> norm(v, -Inf)
ans =
    0
>> min(abs(v))
ans =
    0
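
The name of the \(l_{\infty}\) norm reflects that it is the limit of the \(p\)-norm as \(p\) grows. The sketch below shows this numerically for a few orders; the list in pvals is an arbitrary choice.

```matlab
v = [1; 2; 0; 4];
pvals = [1 2 4 8 16 64];                   % arbitrary increasing orders

% p-norm for each order; the values decrease toward max(abs(v)) = 4
vals = arrayfun(@(p) norm(v, p), pvals);

gap = vals(end) - norm(v, Inf);            % close to zero for p = 64
```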

13.1.7. vecnorm function

MATLAB has a handy function called vecnorm that computes the \(p\)-norm of each column of a matrix. Its first argument is the matrix. Optional arguments specify \(p\) (the \(l_2\)-norm is the default) and the dimension along which to compute the norm (the default gives the norm of each column). The command to compute the \(l_2\)-norm of each row is: vecnorm(A, 2, 2).

Here is an example that computes the length of the columns of a matrix.

>> A = randi(20, 3) - randi(10, 3)
A =
     7     9     4
    17     8     6
    -7    -7    10
>> Alen = vecnorm(A)
Alen =
    19.6723   13.9284   12.3288

>> A1 = A./Alen;  % unit length
>> vecnorm(A1)
ans =
    1     1     1
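
The same steps apply to rows by passing the dimension argument. This sketch reuses the matrix values printed in the example above, entered directly so that the result does not depend on randi.

```matlab
A = [7 9 4; 17 8 6; -7 -7 10];    % values from the example above

Arow = vecnorm(A, 2, 2);          % 3-by-1 column holding the length of each row
A2 = A ./ Arow;                   % divide each row by its own length
row_check = vecnorm(A2, 2, 2);    % all ones, up to round-off
```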

13.1.8. Matrix Norms

The magnitude of a matrix, as measured by a norm, is less intuitive than the length of a vector. The \(\norm{\cdot}_2\) matrix norm is the most frequently used, but some applications have found that other matrix norms perform better.

13.1.9. Maximum Absolute Column Sum

SYMBOLS

\(\norm{\mathbf{A}}_1\), \(\: \norm{\cdot}_1\)

DESCRIPTION

The \(\norm{\cdot}_1\) matrix norm is the maximum \(l_1\) norm of the column vectors of the matrix.

MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, 1)
ans =
    17
>> max(sum(abs(A)))
ans =
    17
>> max(vecnorm(A, 1))
ans =
    17
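
The matrix 1-norm can also be read as a stretch factor measured with the \(l_1\) vector norm: it is the largest value of \(\norm{\mathbf{A}\,\bm{v}}_1 / \norm{\bm{v}}_1\), and that maximum is reached by the unit vector that selects the column with the largest absolute sum. A small sketch, with illustrative variable names:

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];

[colmax, j] = max(vecnorm(A, 1));      % largest column l_1 norm (17) and its index (3)
e = zeros(3, 1);
e(j) = 1;                              % unit vector that selects column j

stretch = norm(A*e, 1) / norm(e, 1);   % 17, the same as norm(A, 1)
```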

13.1.10. 2-Norm of a Matrix

SYMBOLS

\(\norm{\mathbf{A}}_2\), \(\: \norm{\cdot}_2\)

DESCRIPTION

The \(\norm{\cdot}_2\) matrix norm focuses on the ability of a matrix to stretch a vector rather than on the values of the elements in the matrix. The calculation is the maximum ratio of \(l_2\) vector norms.

\[\norm{\mathbf{A}}_2 = \max_{\bm{v} \neq 0}{} \frac{\norm{\mathbf{A}\,\bm{v}}}{\norm{\bm{v}}}\]

The best choice for \(\bm{v}\) comes from the SVD. Recall that for each singular value, we have the relationship \(\mathbf{A}\,\bm{v}_i = \sigma_i\,\bm{u}_i\), where the singular vectors \(\bm{v}_i\) and \(\bm{u}_i\) have unit length. So, the largest singular value, \(\sigma_1\), is the largest stretch of a unit-length vector from multiplication with the matrix.

\[\norm{\mathbf{A}}_2 = \frac{\norm{\mathbf{A}\,\bm{v}_1}}{\norm{\bm{v}_1}} = \frac{\sigma_1\,\norm{\bm{u}_1}}{\norm{\bm{v}_1}} = \sigma_1\]
Other Properties of Matrix 2-Norms
  • \(\norm{\mathbf{A}}_2 \geq 0\), with equality only when \(\mathbf{A}\) is the zero matrix.

  • \(\norm{\mathbf{I}}_2 = 1\)

  • \(\norm{c \mathbf{A}}_2 = \abs{c}\,\norm{\mathbf{A}}_2\), from which we also have \(\norm{-\mathbf{A}}_2 = \norm{\mathbf{A}}_2\).

  • \(\norm{\mathbf{A} + \mathbf{B}}_2 \leq \norm{\mathbf{A}}_2 + \norm{\mathbf{B}}_2\).

  • \(\norm{\mathbf{A}\,\mathbf{B}}_2 \leq \norm{\mathbf{A}}_2\,\norm{\mathbf{B}}_2\).
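
These properties can be spot-checked numerically. The sketch below is only a check on one pair of matrices, not a proof; the matrix B, the constant c, and the tolerance tol are arbitrary choices made for illustration.

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];
B = [2 0 1; -3 4 0; 1 1 -6];      % arbitrary second matrix
c = -2.5;                         % arbitrary scalar
tol = 1e-12;                      % allowance for round-off

identity_ok = abs(norm(eye(3)) - 1) < tol;
scaling_ok  = abs(norm(c*A) - abs(c)*norm(A)) < tol;
triangle_ok = norm(A + B) <= norm(A) + norm(B) + tol;
product_ok  = norm(A*B)  <= norm(A)*norm(B) + tol;
```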


MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_l2 = norm(A, 2)
A_l2 =
    11.3568
>> A_l2 = norm(A)
A_l2 =
    11.3568
>> S = svd(A);
>> A_l2 = S(1)
A_l2 =
    11.3568
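
The stretch interpretation can be checked directly. In the sketch below, the first right singular vector achieves a stretch equal to the largest singular value, and a batch of random directions never exceeds it. The random test with 1000 vectors is illustrative only.

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];
[~, S, V] = svd(A);

v1 = V(:, 1);                           % first right singular vector
stretch_v1 = norm(A*v1) / norm(v1);     % equals S(1,1), which is norm(A)

% Stretch ratios for random directions, one per column of X
X = randn(3, 1000);
ratios = vecnorm(A*X) ./ vecnorm(X);
none_larger = all(ratios <= S(1,1) + 1e-12);   % true
```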

13.1.11. Frobenius Norm

SYMBOLS

\(\norm{\mathbf{A}}_F\), \(\: \norm{\cdot}_F\)

DESCRIPTION

The Frobenius norm is similar to the \(l_2\) vector norm. It is the square root of the sum of the squared elements. It can also be found from the singular values.

\[\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^m{} \sum_{j=1}^n a_{ij}^2} = \sqrt{\text{trace}(\mathbf{A}^T\,\mathbf{A})}\]
\[\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^{\min{(m, n)}} \sigma_i^2}\]

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_f = norm(A, 'fro')
A_f =
   14.1067
>> A_f = sqrt(sum(A(:).^2))
A_f =
   14.1067
>> A_f = sqrt(trace(A'*A))
A_f =
   14.1067
>> A_f = sqrt(sum(svd(A).^2))
A_f =
   14.1067
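
Because the Frobenius norm treats the matrix as a collection of elements, it matches the \(l_2\) norm of the matrix stacked into a single vector, and it can also be built from the column lengths returned by vecnorm. A brief sketch, with illustrative variable names:

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];

f_builtin = norm(A, 'fro');      % sqrt(199)
f_vec  = norm(A(:));             % l_2 norm of all nine elements as one vector
f_cols = norm(vecnorm(A));       % l_2 norm of the three column lengths
```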

13.1.12. Maximum Absolute Row Sum

SYMBOLS

\(\norm{\mathbf{A}}_{\infty}\), \(\:\norm{\cdot}_{\infty}\)

DESCRIPTION

The \(\norm{\cdot}_{\infty}\) matrix norm is similar to the \(\norm{\cdot}_1\) matrix norm, except that it is the maximum \(l_1\) norm of the row vectors instead of the column vectors.

MATLAB EXAMPLE
>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, Inf)
ans =
    13
>> max(sum(abs(A')))
ans =
    13
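
Since rows of a matrix are columns of its transpose, the infinity norm of a matrix equals the 1-norm of its transpose. The sketch below shows this, along with a row sum that uses the dimension argument of sum rather than a transpose.

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];

row_sums = sum(abs(A), 2);       % absolute row sums: [13; 12; 12]
n_inf = max(row_sums);           % 13

same_as_transpose = (norm(A, Inf) == norm(A', 1));   % true
```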

13.1.13. Nuclear Norm

SYMBOLS

\(\norm{\mathbf{A}}_N\), \(\:\norm{\cdot}_N\)

DESCRIPTION

The nuclear norm is the sum of the singular values from the SVD. It relates to the rank of a matrix because many of the later, smaller singular values of matrices holding data, such as an image, are close to zero. It has been shown to give good results at improving the robustness of principal component analysis (PCA) to outlier data points, an approach known as robust PCA (RPCA) ([BRUNTON19], pages 107-108).

The nuclear norm gained recognition when Netflix held a competition to find the best possible movie recommendation algorithm. It was no surprise that the PCA process was key to the algorithms that contestants developed. However, an interesting discovery was that the nuclear norm performed better than the 2-norm.

\[\norm{\mathbf{A}}_N = \sum_{k=1}^r \sigma_k\]

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_nuc_norm = sum(svd(A))
A_nuc_norm =
   20.4799
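
The 2-norm, the Frobenius norm, and the nuclear norm can all be computed from the same list of singular values, which makes their ordering easy to see: the largest singular value cannot exceed the root of the sum of squares, which in turn cannot exceed the plain sum. A sketch with illustrative variable names:

```matlab
A = [-1 7 -5; 4 -3 -5; -5 0 7];
s = svd(A);               % singular values, largest first

n2   = max(s);            % 2-norm:         about 11.3568
nfro = sqrt(sum(s.^2));   % Frobenius norm: about 14.1067
nnuc = sum(s);            % nuclear norm:   about 20.4799

ordered = (n2 <= nfro) && (nfro <= nnuc);   % true
```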