Overflow in Image Processing

Something I found simple yet profound is the overflow encountered in image processing, which is really a by-product of how image pixels are represented in memory.

Typically the intensity of a pixel is represented by 8 bits (1 byte), which can encode \(2^8 = 256\) values, ranging from \(0\) to \(255\).

One example of an issue that arises as a result is apparent when implementing an operation as simple as averaging.

For the purpose of discussion, let's say we have a grayscale image represented by a \(2\times2\) matrix:

\[\begin{bmatrix} 240 & 240 \\ 10 & 120 \end{bmatrix}\]

Averaging two identical images should obviously return the same matrix; the operation acts as the identity mapping.

>>> import numpy as np
>>> image = np.array([[240, 240], [10, 120]])
>>> (image + image) // 2  # integer division keeps the integer dtype in Python 3
array([[240, 240],
       [ 10, 120]])

This is not particularly surprising, and one might wonder what the point of this post is, which is fine.

First, the above matrix calculation used values of type int64 (NumPy's default integer type on most 64-bit platforms), which has 64 bits and can represent values from \(-2^{63}\) to \(2^{63}-1\). A bit of overkill for this exercise, but it serves the purpose.

For images represented by 8 bits, specifically uint8 in NumPy, the problem of overflow becomes apparent very quickly. Working through the same exercise as above, but constraining the capacity to that of uint8, we discover that the calculation fails to produce the correct result.
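To make the difference in capacity concrete, the following minimal sketch compares the 8-bit unsigned range with the int64 range using `np.iinfo`. The array is created with an explicit `dtype=np.int64` here, which is an assumption made so the snippet does not depend on the platform's default integer type.

```python
import numpy as np

# Explicit dtype (an assumption for portability; the default integer type
# depends on the platform and NumPy version).
image = np.array([[240, 240], [10, 120]], dtype=np.int64)
print(image.dtype)  # int64

# Representable range of each integer type.
small = np.iinfo(np.uint8)
large = np.iinfo(np.int64)
print(small.min, small.max)                            # 0 255
print(large.min == -2**63, large.max == 2**63 - 1)     # True True
```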

>>> image = np.array([[240, 240], [10, 120]], dtype=np.uint8)
>>> (image + image) // 2  # the sum wraps around before the division
array([[112, 112],
       [ 10, 120]], dtype=uint8)

What happened? Breaking down the calculation makes the issue clear. First, the averaging operation adds two 8-bit unsigned integers. \(240 + 240\) should equal \(480\), but a uint8 can hold at most \(255\), so the addition is effectively followed by a modulo \(256\): \((240 + 240) \bmod 256 = 224\). The result wraps around to \(224\), and dividing this by 2 gives \(112\). The other two pixels are unaffected because their sums, \(20\) and \(240\), still fit in 8 bits. A completely expected but undesired result. This is easily resolved by increasing the capacity of the representation before the addition, for example by converting from uint8 to int64.
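The wraparound and the widening fix can be checked directly. This is a minimal sketch rather than a recommended recipe; it widens to `uint16` instead of int64, which is a choice made here because 16 bits are already enough to hold the sum of two 8-bit values.

```python
import numpy as np

image = np.array([[240, 240], [10, 120]], dtype=np.uint8)

# uint8 addition wraps modulo 256: 240 + 240 = 480, and 480 % 256 = 224.
wrapped = image + image
print(wrapped.tolist())   # [[224, 224], [20, 240]]

# Widen before adding so the sum fits, then narrow back once the
# averaged values are guaranteed to be in the 0..255 range again.
widened = image.astype(np.uint16)
average = ((widened + widened) // 2).astype(np.uint8)
print(average.tolist())   # [[240, 240], [10, 120]]
```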

Why does this matter? Simply put, if left unattended the error is visually apparent, and the intensity of the affected pixels is artificially undervalued.

A visualization exercise makes the result clearer.

We start by introducing the original. As a disclaimer, this is a private picture I took of an original Heather Brown painting that I purchased a while back.

To make the difference more apparent, we take two identical copies of the image and average them, which should return the same image. It's clear even visually that the result is what we want.

No overflow with int64

The image below applies the same operation, but this time using the 8-bit unsigned integer representation. The difference is obvious when compared with the desired output above, and we can visually confirm that the intensity of the overflowed pixels is undervalued.

Overflow with uint8

This can easily be averted if we pay attention to the intermediate operations and remain sensitive to the types involved.

A very simple concept, easily overlooked, but worth knowing.
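As one illustration of minding the intermediate values, the sketch below averages two uint8 images without ever leaving uint8. Note that this is a different technique from the type conversion described earlier: each input is halved before the addition, and the lost half-units are added back when both values are odd. The helper name `average_uint8` is hypothetical and not part of NumPy.

```python
import numpy as np

def average_uint8(first, second):
    """Floor-average two uint8 arrays with no intermediate overflow.

    Hypothetical helper for illustration, not a NumPy function.
    """
    # Halving first keeps each term at most 127, so the sum is at most 254.
    halves = (first >> 1) + (second >> 1)
    # When both inputs are odd, the two dropped halves add up to one.
    carry = first & second & 1
    return halves + carry

image = np.array([[240, 240], [10, 120]], dtype=np.uint8)
result = average_uint8(image, image)
print(result.tolist())  # [[240, 240], [10, 120]]
```

The result matches the floor of the true average, which is also what integer division of a widened sum produces.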
