13.1. Norms¶

MATLAB includes a function called norm for the purpose of find the length of vectors or matrices. The most frequent usage is to find the Euclidean length of a vector, which we call a $l_2$ -norm and comes direct from the Pythagorean theorem – the square root of the sum of the squares. As always, the norm function is well documented in the MATLAB documentation. However, there are different measures of length – some not covered by the norm function. The length of a vector is a familiar concept, but the length of a matrix feels somewhat mysterious. Moreover, the names, symbols, background, and application of the various norms could use some clarification.

In technical literature, the most common mathematics symbol for a norm or length is a pair of double bars around the variable name with a subscript for the type of norm, $\norm{\bm{v}}_2$ . If the subscript is left off, then it is assumed to be 2. The generic name given to the type of norm is the italics letter l with a subscript of the type, $l_2$ . You may sometimes see the type given as a super-script instead of a subscript.

Cardinality

SYMBOLS

$\norm{\bm{v}}_0$ , $\quad\norm{\mathbf{A}}_0$ , $\quad l_0$

MATLAB EXAMPLE

>> v = [1; 2; 0; 4];
>> v_card = nnz(v)
v_card =
      3

DESCRIPTION

The $l_0$ norm is the number of non-zero elements of either a vector or a matrix. The $l_0$ norm does not fit the properties that are normally expected of a norm, so it is not always classified as being a norm. It has utility for sparse (a lot of zeros) vectors and matrices.

13.1.1. Vector Norms¶

General Vector Norm, p-Norm

The norm function takes a second argument, p that specifies the order of the calculation as follows.

$\norm{\bm{v}}_p = \text{norm(v, p)} = \left[ \sum_{k=1}^N \left | \bm{v}_k \right | ^p \right]^{1/p}$

The following plots show the $x$ and $y$ values that satisfy $\norm{\mat{x, y}}_p = 1$ for various values of p. The $l_1$ , $l_2$ , and $l_{\infty}$ norms are described below. Keep these plots in mind as you read the description of each. We see straight lines in the $p=1$ plot because the value of the norm is the sum of the absolute values of the elements. The plot of a circle for $p=2$ relates to the $l_2$ norm being the length of a vector by Pythagorean theorem. As the $p$ values get larger, the plots approach the shape of a square, which models the $l_{\infty}$ norm where the norm takes the value of the largest absolute value of the elements.

Fig. 13.1 Plots of $\norm{\vector{x; y}}_p) = 1$ .

Taxicab Norm, Manhattan Norm

SYMBOLS: $\norm{\bm{v}}_1$ , $\quad l_1$

DESCRIPTION

The sum of the absolute values of the elements. It is the distance that a taxi would drive between two points on a rectangular grid.

$\norm{\bm{v}}_1 = \sum_{k=1}^N | \bm{v}_k |$

The $l_1$ norm has application to compressed sensing, which seeks to combine data collection and compression into a single algorithm ([BRUNTON19], pages 88-91). The $l_1$ norm has also shown to improve linear regression, compared to the $l_2$ norm, when the data contains outlier values.

MATLAB EXAMPLE

>> v = [1; 2; 0; 4];
>> v_l1 = norm(v, 1)
v_l1 =
      7

Euclidean Norm

SYMBOLS: $\norm{\bm{v}}$ , $\quad\norm{\bm{v}}_2$ , $\quad l_2$

DESCRIPTION

The $l_2$ norm of a vector is by far the most frequent application of the norm function. It calculates the length of a vector by the same means that the Pythagorean theorem finds the length of the hypotenuse of a right triangle.

$\norm{\bm{v}}_2 = \sqrt{\sum_{k=1}^N \bm{v}_k^2 } = \sqrt{\bm{v}_1^2 + \bm{v}_2^2 + \cdots + \bm{v}_N^2}$

MATLAB EXAMPLE

>> v = [1; 2; 0; 4];
>> v_len = norm(v)
v_len =
    4.5826
>> norm(v, 2)
ans =
    4.5826
>> sqrt(sum(v.^2))
ans =
    4.5826

Infinity Norm

SYMBOLS: $\norm{\bm{v}}_\infty$ , $\quad\norm{\bm{v}}_{-\infty}$ , $\quad l_\infty$ $\quad l_{-\infty}$

DESCRIPTION

The $l_{\pm \infty}$ is the maximum or minimum absolute value of the vector.

$\norm{\bm{v}}_\infty = \max \left( | \bm{v} | \right)$

$\norm{\bm{v}}_{-\infty} = \min \left( | \bm{v} | \right)$

MATLAB EXAMPLE

>> v = [1; 2; 0; 4];
>> v_infty = norm(v, Inf)
v_infty =
     4
>> max(abs(v))
ans =
    4
>> norm(v, -Inf)
ans =
    0
>> min(abs(v))
ans =
    0

13.1.2. vecnorm function¶

MATLAB has a handy function called vecnorm that will compute the p-norms of the columns of a matrix. It takes the matrix variable name as an argument. Optional arguments specify $p$ ( $l_2$ -norm is the default), and the dimension for computing the norm (columns is the default). The command to compute the $l_2$ -norm of the rows is: vecnorm(A, 2, 2).

Here is an example that computes the length of the columns of a matrix.

>> A = randi(20, 3) - randi(10, 3)
A =
     7     9     4
    17     8     6
    -7    -7    10
>> Alen = vecnorm(A)
Alen =
    19.6723   13.9284   12.3288

>> A1 = A./Alen
A1 =
    0.3558    0.6462    0.3244
    0.8642    0.5744    0.4867
   -0.3558   -0.5026    0.8111
>> vecnorm(A1)
ans =
    1     1     1

13.1.3. Matrix Norms¶

The concept of the magnitude of a matrix from its norm is quite ambiguous compared to the length of a vector from its norm. Different applications have shown specific matrix norms to perform better than other norm calculations.

Maximum Absolute Column Sum

SYMBOLS: $\norm{\mathbf{A}}_1$ , $\quad \norm{\cdot}_1$

DESCRIPTION

The $\norm{\cdot}_1$ -norm is the maximum $l_1$ norm of the column vectors of the matrix.

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, 1)
ans =
    17
>> max(sum(abs(A)))
ans =
    17
>> max(vecnorm(A, 1))
ans =
    17

2-Norm of a Matrix

SYMBOLS: $\norm{\mathbf{A}}_2$ , $\quad \norm{\cdot}$

DESCRIPTION

The focus of the $\norm{\cdot}_2$ matrix norm is on the ability of a matrix to stretch a vector rather than on the values of the elements in the matrix. The calculation is the maximum ratio of $l_2$ vector norms.

$\norm{\mathbf{A}}_2 = \max_{\bm{v} \neq 0}{} \frac{\norm{\mathbf{A}\,\bm{v}}}{\norm{\bm{v}}}$

One might think of the eigenvector corresponding to the largest eigenvalue for the $\bm{v}$ vector that maximizes the ratio of vector lengths. That is correct if $\bf{A}$ is a symmetric matrix, but the general purpose best choice for $\bm{x}$ comes from the SVD. Recall that for each singular value, we have the relationship $\mathbf{A}\,\bm{v}_i = \sigma_i\,\bm{u}_i$ . So we want to consider the largest singular value, which is the first value since the SVD sorts the singular values.

$\norm{\mathbf{A}}_2 = \frac{\norm{\mathbf{A}\,\bm{v}_1}}{\norm{\bm{v}_1}} = \frac{\sigma_1\,\norm{\bm{u}_1}}{\norm{\bm{v}_1}} = \sigma_1$

The final simplification stems from the fact that both $\bm{u}$ and $\bm{v}$ are unit vectors. Thus, the $\norm{\cdot}_2$ matrix norm is the largest (first) singular value from the SVD.

As we learned in Singular Value Decomposition (SVD), the singular values are the eigenvalues of $\mathbf{A}^T\mathbf{A}$ . If we use the notation $\lambda_i(\mathbf{A})$ to mean the $i^{th}$ eigenvalue of $\mathbf{A}$ , then the $l_2$ matrix norm is also given by

$\norm{\mathbf{A}}_2 = \max_i \sqrt{\abs{\lambda_i(\mathbf{A}^T\mathbf{A})}}.$

For symmetric matices, this simplifies to

$\norm{\mathbf{A}}_2 = \max_i \abs{\lambda_i(\mathbf{A})}.$

Other Properties of Matrix 2-Norms

$\norm{\mathbf{A}}_2 > 0$

$\norm{\mathbf{I}}_2 = 1$

$\norm{c \mathbf{A}}_2 = \abs{c}\,\norm{\mathbf{A}}_2$ , from which we also have $\norm{-\mathbf{A}}_2 = \norm{\mathbf{A}}_2$ .

$\norm{\mathbf{A} + \mathbf{B}}_2 \leq \norm{\mathbf{A}}_2 + \norm{\mathbf{B}}_2$ .

$\norm{\mathbf{A}\,\mathbf{B}}_2 \leq \norm{\mathbf{A}}_2\,\norm{\mathbf{B}}_2$ .

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_l2 = norm(A, 2)
A_l2 =
    11.3568
>> A_l2 = norm(A)
A_l2 =
    11.3568
>> S = svd(A);
>> A_l2 = S(1)
A_l2 =
    11.3568

Frobenius Norm

SYMBOLS: $\norm{\mathbf{A}}_F$ , $\quad \norm{\cdot}_F$

DESCRIPTION

The Frobenius norm is similar to the $l_2$ vector norm in that it is the square root of the sum of the squared elements. It is also can be found from the singular values.

$\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^m{} \sum_{j=1}^n a_{ij}^2} = \sqrt{\text{trace}(\mathbf{A}^T\,\mathbf{A})}$

$\norm{\mathbf{A}}_F = \sqrt{\sum_{i=1}^{\min{(m, n)}} \sigma_i^2}$

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_f = norm(A, 'fro')
A_f =
   14.1067
>> A_f = sqrt(sum(A(:).^2))
A_f =
   14.1067
>> A_f = sqrt(trace(A'*A))
A_f =
   14.1067
>> A_f = sqrt(sum(svd(A).^2))
A_f =
   14.1067

Maximum Absolute Row Sum

SYMBOLS: $\norm{\mathbf{A}}_{\infty}$ , $\quad\norm{\cdot}_{\infty}$

DESCRIPTION

The $\norm{\cdot}_{\infty}$ is similar to the $\norm{\cdot}_1$ except it uses the $l_1$ norm of the rows instead of the columns.

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> norm(A, Inf)
ans =
    13
>> max(sum(abs(A')))
ans =
    13

Nuclear Norm

SYMBOLS: $\norm{\mathbf{A}}_N$ , $\quad\norm{\cdot}_N$

DESCRIPTION

The nuclear norm is the sum of the singular values from the SVD, which is related to the rank because most higher order singular values of matrices representing, such as an image, are close to zero. It has shown to give good results at improving the robustness of Principal Component Analysis (PCA) (RPCA) to outlier data points ([BRUNTON19], pages 107-108).

The nuclear norm gained recent recognition when Netflix held a challenge competition to find the best possible movie recommendation algorithm. It was no surprise that the PCA process was key to the algorithms that contestants developed. However, an interesting discovery was that the nuclear norm performed better than the 2-norm.

$\norm{\mathbf{A}}_N = \sum_{k=1}^r \sigma_k$

MATLAB EXAMPLE

>> A
A =
    -1     7    -5
     4    -3    -5
    -5     0     7
>> A_nuc_norm = sum(svd(A))
A_nuc_norm =
   20.4799

13.1. Norms¶

13.1.1. Vector Norms¶

13.1.2. vecnorm function¶

13.1.3. Matrix Norms¶

Table of Contents

Previous topic

Next topic

This Page