Let’s take \(\left[\matrix{u&v}\right]\) as the coordinates of a point in the input image and \(\left[\matrix{x&y}\right]\) as the coordinates of that same point in the output image^{164}.
The simplest form of coordinate transformation (or warping) is the scaling of the coordinates, let’s assume we want to scale the first axis by \(M\) and the second by \(N\), the output coordinates of that point can be calculated by

$$\left[\matrix{x\cr y}\right]= \left[\matrix{Mu\cr Nv}\right]= \left[\matrix{M&0\cr0&N}\right]\left[\matrix{u\cr v}\right]$$

Note that these are matrix multiplications. We thus see that we can represent any such grid warping as a matrix. Another thing we can do with this \(2\times2\) matrix is to rotate the output coordinate around the common center of both coordinates. If the output is rotated anticlockwise by \(\theta\) degrees from the positive (to the right) horizontal axis, then the warping matrix should become:

$$\left[\matrix{x\cr y}\right]= \left[\matrix{ucos\theta-vsin\theta\cr usin\theta+vcos\theta}\right]= \left[\matrix{cos\theta&-sin\theta\cr sin\theta&cos\theta}\right] \left[\matrix{u\cr v}\right] $$

We can also flip the coordinates around the first axis, the second axis and the coordinate center with the following three matrices respectively:

$$\left[\matrix{1&0\cr0&-1}\right]\quad\quad \left[\matrix{-1&0\cr0&1}\right]\quad\quad \left[\matrix{-1&0\cr0&-1}\right]$$

The final thing we can do with this definition of a \(2\times2\) warping matrix is shear. If we want the output to be sheared along the first axis with \(A\) and along the second with \(B\), then we can use the matrix:

$$\left[\matrix{1&A\cr B&1}\right]$$

To have one matrix representing any combination of these steps, you use matrix multiplication, see Merging multiple warpings. So any combinations of these transformations can be displayed with one \(2\times2\) matrix:

$$\left[\matrix{a&b\cr c&d}\right]$$

The transformations above can cover a lot of the needs of most coordinate transformations. However they are limited to mapping the point \([\matrix{0&0}]\) to \([\matrix{0&0}]\). Therefore they are useless if you want one coordinate to be shifted compared to the other one. They are also space invariant, meaning that all the coordinates in the image will receive the same transformation. In other words, all the pixels in the output image will have the same area if placed over the input image. So transformations which require varying output pixel sizes like projections cannot be applied through this \(2\times2\) matrix either (for example, for the tilted ACS and WFC3 camera detectors on board the Hubble space telescope).

To add these further capabilities, namely translation and projection, we use the homogeneous coordinates.
They were defined about 200 years ago by August Ferdinand Möbius (1790 – 1868).
For simplicity, we will only discuss points on a 2D plane and avoid the complexities of higher dimensions.
We cannot provide a deep mathematical introduction here, interested readers can get a more detailed explanation from Wikipedia^{165} and the references therein.

By adding an extra coordinate to a point we can add the flexibility we need. The point \([\matrix{x&y}]\) can be represented as \([\matrix{xZ&yZ&Z}]\) in homogeneous coordinates. Therefore multiplying all the coordinates of a point in the homogeneous coordinates with a constant will give the same point. Put another way, the point \([\matrix{x&y&Z}]\) corresponds to the point \([\matrix{x/Z&y/Z}]\) on the constant \(Z\) plane. Setting \(Z=1\), we get the input image plane, so \([\matrix{u&v&1}]\) corresponds to \([\matrix{u&v}]\). With this definition, the transformations above can be generally written as:

$$\left[\matrix{x\cr y\cr 1}\right]= \left[\matrix{a&b&0\cr c&d&0\cr 0&0&1}\right] \left[\matrix{u\cr v\cr 1}\right]$$

We thus acquired 4 extra degrees of freedom.
By giving non-zero values to the zero valued elements of the last column we can have translation (try the matrix multiplication!).
In general, any coordinate transformation that is represented by the matrix below is known as an affine transformation^{166}:

$$\left[\matrix{a&b&c\cr d&e&f\cr 0&0&1}\right]$$

We can now consider translation, but the affine transform is still spatially invariant.
Giving non-zero values to the other two elements in the matrix above gives us the projective transformation or Homography^{167} which is the most general type of transformation with the \(3\times3\) matrix:

$$\left[\matrix{x'\cr y'\cr w}\right]= \left[\matrix{a&b&c\cr d&e&f\cr g&h&1}\right] \left[\matrix{u\cr v\cr 1}\right]$$

So the output coordinates can be calculated from:

$$x={x' \over w}={au+bv+c \over gu+hv+1}\quad\quad\quad\quad y={y' \over w}={du+ev+f \over gu+hv+1}$$

Thus with Homography we can change the sizes of the output pixels on the input plane, giving a ‘perspective’-like visual impression.
This can be quantitatively seen in the two equations above.
When \(g=h=0\), the denominator is independent of \(u\) or \(v\) and thus we have spatial invariance.
Homography preserves lines at all orientations.
A very useful fact about Homography is that its inverse is also a Homography.
These two properties play a very important role in the implementation of this transformation.
A short but instructive and illustrated review of affine, projective and also bi-linear mappings is provided in Heckbert 1989^{168}.

These can be any real number, we are not necessarily talking about integer pixels here.

http://en.wikipedia.org/wiki/Homogeneous_coordinates

http://en.wikipedia.org/wiki/Affine_transformation

http://en.wikipedia.org/wiki/Homography

Paul S. Heckbert. 1989. *Fundamentals of Texture mapping and Image Warping*, Master’s thesis at University of California, Berkeley.
Note that since points are defined as row vectors there, the matrix is the transpose of the one discussed here.

JavaScript license information

GNU Astronomy Utilities 0.20 manual, April 2023.