Something that I found simple yet profound is the idea of overflow encountered in image processing, which is really a by-product of how image pixels are represented in computation.
Typically the intensity of a pixel is represented by 8 bits, or 1 byte, which can represent \(2^8 = 256\) values, ranging from \(0\) to \(255\).
One example of an issue that arises as a result is apparent when implementing an operation as simple as averaging.
For the purpose of discussion, let's say we have a grayscale image represented by a \(2\times2\) matrix as such:
\[\begin{bmatrix} 240 & 240 \\ 10 & 120 \end{bmatrix}\]
Averaging two identical copies of this image should obviously return the same matrix, which makes the operation equivalent to applying the identity.
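The post's own snippet for this step is not preserved here, so the following is only a minimal sketch of what the calculation might look like. The variable names and the use of integer division (`//`) are assumptions.

```python
import numpy as np

# The 2x2 grayscale example, stored as 64-bit signed integers.
image = np.array([[240, 240], [10, 120]], dtype=np.int64)

# Average the image with an identical copy of itself.
average = (image + image) // 2

print(average)
# [[240 240]
#  [ 10 120]]
print(np.array_equal(average, image))  # True
```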
This is not particularly surprising, and one might wonder what the point of this post is by now, which is fine.
First, the above matrix calculation used values represented by the type int64,
which has 64 bits to work with and can represent a very large range of values,
specifically \(-2^{63}\) to \(2^{63}-1\). That is overkill for this exercise,
but it serves the purpose.
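The two ranges mentioned so far can be checked directly. This is a small sketch using NumPy's `np.iinfo`, assuming the 8-bit pixel type corresponds to NumPy's `uint8`.

```python
import numpy as np

# Machine limits for the two integer types discussed in the text.
u8 = np.iinfo(np.uint8)
i64 = np.iinfo(np.int64)

print(u8.bits, u8.min, u8.max)   # 8 0 255
print(i64.bits)                  # 64
print(i64.min == -2**63)         # True
print(i64.max == 2**63 - 1)      # True
```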
For images represented by 8 bits, or specifically uint8 in NumPy, the problem
of overflow becomes apparent very quickly. Working through the same exercise as
above, but constraining the capacity to that of uint8, we discover that the
calculations fail to produce the correct results.
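As a rough illustration of the failure (a sketch with assumed variable names, not necessarily the code originally used), the same average on a `uint8` array silently wraps around:

```python
import numpy as np

# Same 2x2 example, now stored as 8-bit unsigned integers.
image = np.array([[240, 240], [10, 120]], dtype=np.uint8)

# The sum is also uint8, so 240 + 240 wraps around before the division.
average = (image + image) // 2

print(average)
# [[112 112]
#  [ 10 120]]
print(np.array_equal(average, image))  # False
```

Only the two bright pixels are affected: \(10 + 10\) and \(120 + 120\) still fit in 8 bits, so those entries come out correct.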
What happened? If we break down the calculations, the issue becomes clear.
First the averaging function adds two unsigned integers represented by 8 bits.
\(240 + 240\) would equal \(480\), but because of the capacity constraint, the
operation is equivalent to applying a modulo 256 after the addition,
\((240 + 240) \bmod 256 = 224\). The result wraps around to 224, and dividing
this by 2 gives 112. A completely expected but undesired result. This can be
easily resolved by increasing the capacity of the representation before the
addition, by converting from uint8 to int64, for example.
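The breakdown can be reproduced step by step. The sketch below uses a single pixel value and assumed variable names, and then applies the conversion to int64 before adding.

```python
import numpy as np

pixel = np.array([240], dtype=np.uint8)

# Step 1: the addition stays in uint8 and wraps modulo 256.
total = pixel + pixel
print(total)        # [224]

# Step 2: halving the wrapped sum gives the wrong average.
print(total // 2)   # [112]

# Fix: widen the type first, so the sum 480 is representable.
wide = pixel.astype(np.int64)
fixed = (wide + wide) // 2
print(fixed)        # [240]
```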
Why does this matter? Simply put, if left unattended, the error will be visually apparent, because the intensity of the affected pixels is artificially undervalued.
A visualization exercise will make the result clearer.
We start by introducing the original. As a disclaimer, this is a private picture of an original Heather Brown painting that I purchased a while back.
To make the difference more apparent, we take two identical images and compute their average, which should result in the same image and is equivalent to applying the identity. It's clear even visually that the result is what we want.
The image below applies the same operation, but this time using an unsigned 8-bit integer representation. The difference is clear if we compare it with the desired output above, and we can visually confirm the undervaluation of the pixel intensities.
This can easily be averted if we pay attention to the intermediate operations and remain sensitive to the types involved.
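One way to keep the intermediate type in view is to wrap the averaging in a small helper. The function `average_images` below is hypothetical (it is not from the post or from NumPy); it is a sketch that accumulates in int64 and only casts back to uint8 at the end, using floor division.

```python
import numpy as np

def average_images(*images):
    """Hypothetical helper: average same-shaped uint8 images without overflow."""
    # Stack the inputs and widen before summing, so the sum cannot wrap.
    stack = np.stack(images).astype(np.int64)
    total = stack.sum(axis=0)
    # The average of values in 0..255 is again in 0..255, so this cast is safe.
    return (total // len(images)).astype(np.uint8)

image = np.array([[240, 240], [10, 120]], dtype=np.uint8)
result = average_images(image, image)

print(result.dtype)                    # uint8
print(np.array_equal(result, image))   # True
```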
A very simple concept, easily overlooked, but worth knowing.