10.3. Homography

When points in the world lie on a plane and we have some calibration location information about certain points, then we can use a technique called homography to find the locations of other points from an image. That is, we can find a geometric Transformation Matrix in homogeneous coordinates to map points from the image to their world coordinates on the plane.

The coordinate system for the world that we will use lies on the plane so that every point that lies on this plane has a Z value of zero. We can show this in the Camera Matrix as:

\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix}
= \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\
            C_{21} & C_{22} & C_{23} & C_{24} \\
            C_{31} & C_{32} & C_{33} & C_{34} \end{bmatrix}
            \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}

The Z multiples all of the elements in the third column of our camera matrix. But because it is zero, we can effectively remove that column from the matrix and we can remove that row from the world coordinate vector.

\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix}
= \begin{bmatrix} C_{11} & C_{12} &  & C_{14} \\
C_{21} & C_{22} &  & C_{24} \\
C_{31} & C_{32} &  & C_{34} \end{bmatrix}
\begin{bmatrix} X \\ Y \\ \\ 1 \end{bmatrix}

What we’re left with now is a 3-by-3 camera matrix for objects that are on a plane. The inverse of the camera matrix is called a planar homography (\bf{H}) matrix that can find the coordinates of points on the plane from the (u, v) locations in the image. We can estimate the homography matrix if we have four world points and the corresponding position of those points in the image.

Note that the homography matrix gives us a geometric transformation between two planes. We will consider it here as a mapping from the image plane to a physical plane, but it could map between two image planes. The inverse of a homography provides the reverse mapping between the two planes.

We can apply homographies in two ways. First we can do perspective rectification, which is a warping of the image to the coordinates of the physical world, much like the Geometric Operations that we considered before. The resulting image looks as if the picture was taken with the camera held orthogonal to the plane.

When we learn about finding objects in an image from their Image Features, we can also apply homography to find the location of specific objects from an image.

Look up the documentation of the following functions provided by the Machine Vision Toolbox:

  • homography
  • homwarp
  • homtrans

Consider four points in an image making matrix \bf{P} and the corresponding points in the world in matrix \bf{Q} . The points in \bf{Q} might be calibration points and \bf{P} represents where those calibration points were found in the image. With a homography matrix, we can easily find the world location of other points found in the image.

The homography mapping from \bf{P} to \bf{Q} is in homogeneous coordinates (\mathbf{\tilde{P}} and \mathbf{\tilde{Q}}). The transformation is \mathbf{H\,\tilde{P}} = \mathbf{\tilde{Q}}. We might hope to find \bf{H} using a similar linear algebra calculation as used to solve a classic system of linear equations of the form \mathbf{A}\,\bm{x} = \bm{b}. However, this problem is more complex.

We have the following for each pair of corresponding points between image points and the world points on a plane.

\mat{h_1, h_2, h_3; h_4, h_5, h_6; h_7, h_8, h_9}
\vector{u, v, 1} = \vector{\tilde{X}, \tilde{Y}, \tilde{Z}}

\spalignsys{h_1 u + h_2 v + h_3 = \tilde{X};
h_4 u + h_5 v + h_6 = \tilde{Y};
h_7 u + h_8 v + h_9 = \tilde{Z}}

Additionally, the homogeneous coordinate requirement is that X =
\tilde{X}/\tilde{Z} and Y = \tilde{Y}/\tilde{Z}. Letting \tilde{Z} = 1, the equation for \tilde{Z} is combined with equations for X = \tilde{X} and Y = \tilde{Y}.

X = \frac{h_1 u + h_2 v + h_3}{h_7 u + h_8 v + h_9} \qquad \qquad \\

Y = \frac{h_4 u + h_5 v + h_6}{h_7 u + h_8 v + h_9} \qquad \qquad

\begin{cases}
X\left(h_7 u + h_8 v + h_9\right) = h_1 u + h_2 v + h_3 \\
Y\left(h_7 u + h_8 v + h_9\right) = h_4 u + h_5 v + h_6
\end{cases}

Then we have the following two equations for each pair of corresponding points.

\spalignsys{h_1 u + h_2 v + h_3 - h_7 u X - h_8 v X - h_9 X = 0;
h_4 u + h_5 v + h_6 - h_7 u Y - h_8 v Y - h_9 Y = 0}

With n = 4 pair of corresponding points, which is the minimum, we have eight equations of the form above. The coefficients of \bf{H} are put into a column vector and the multipliers make an \bf{A} matrix of eight rows (the rank of \bf{A} is 8).

\mat{u_1,  v_1, 1, 0, 0, 0,  -X_1u_1, -X_1v_1 -X_1;
  0, 0, 0, u_1, v_1, 1, -Y_1u_1, -Y_1v_1, -Y_1;
  u_2, v_2, 1, 0, 0, 0, -X_2u_2, -X_2v_2, -X_2;
  0, 0, 0, u_2, v_2, 1, -Y_2u_2, -Y_2v_2, -Y_2;
  \vdots, {}, {}, {}, {}, {}, {}, {}, \vdots;
  u_n, v_n, 1, 0, 0, 0, -X_nu_n, -X_nv_n, -X_n;
  0, 0, 0, u_n, v_n, 1, -Y_nu_n, -Y_nv_n, -Y_n}
\vector{h_1, h_2, h_3, h_4, h_5, h_6, h_7, h_8, h_9}
= \vector{0, 0, 0, 0, 0, 0, 0, 0, 0}

There are then two ways to solve the system of equations.

  1. As shown above, the \bf{A} matrix is {n \times 9} and the \bm{h} vector is {1 \times 9}, \mathbf{A} \bm{h} =
\bm{0}. So the null solution of \bf{A} gives the \bm{h} values. A numerically stable way to solve for \bm{h} is to find the singular value decomposition (SVD) factoring of \bf{A} ([U,S,V] = svd(A)), and then the null solution is the last column of the \bf{V} matrix (h = V(:, end)).

  2. Since the system of equations has eight degrees of freedom, the h_9 term can be set to 1 and the X and Y values moved to the other side of the equal sign giving a \bf{b} vector of 8 values. Now we have a system of equations of the form \mathbf{A} \bm{h} =
\bm{b}. But in case there are more than four pair of corresponding points, the first 8 h values are found from either using the pseudo-inverse of \bf{A} (h = pinv(A)*b) or as follows.

    \bm{h} =
\left(\mathbf{A}^T\,\mathbf{A}\right)^{-1} \mathbf{A}^T \bf{b}

To find \bf{H} directly from \bf{P} and \bf{Q} in Euclidean coordinates, the MVTB has an homography function. The location of any points on the plane are then found with the homtrans function from the corresponding points in the image.

>> P = [10 12 80 95; 120 450 130 500]
P =
    10    12    80    95
   120   450   130   500
>> Q = [10 10 90 90;10 500 10 500]
Q =
    10    10    90    90
    10   500    10   500

>> H = homography(P, Q)
maximum residual 1.589e-14
H =
    1.3672   -0.0029   -2.5446
   -0.2518    1.8681 -210.8748
    0.0014    0.0005    1.0000

>> Q1 = homtrans(H, P)  % Test if the homography applied to P gives Q.
Q1 =
    10    10    90    90
    10   500    10   500
% Same result from: h2e( H * e2h(P) )
% e2h - Euclidean to homogeneous, h2e - homogeneous to Euclidean

Take a picture from the side of or below an object with points that lie on a plane. Find the homography matrix mapping points from the image to world coordinates. Use the homwarp function to transform the image with perspective rectification.

Here is example of perspective rectification.

../_images/homography_warp.png

Homography image warp for perspective rectification