## Phone Vision 16 – Binary Images

9 03 2011

When we are creating a binary image we are defining objects of interest.  Defining these objects is very specific to the task at hand.  In some cases a simple threshold will work while others require a more complex approach such as color segmentation, edge detection, or something even more sinister.

Thresholding

The most common way to create a binary image from a grayscale image is to pick an intensity threshold (also known as gray-level segmentation).  Everything below the threshold becomes a pixel of interest (black).  Everything else is not (white).  Let’s explore a few methods for picking that threshold.
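If you want to play with the idea outside the phone project, here's a tiny standalone sketch (Python, with made-up pixel values) of what a fixed threshold does:

```python
# Gray-level segmentation: intensities below the threshold become
# pixels of interest (black, 0); everything else becomes white (255).
def to_binary(pixels, threshold):
    return [0 if p < threshold else 255 for p in pixels]

row = [12, 200, 180, 30, 255, 90]
print(to_binary(row, 100))  # [0, 255, 255, 0, 255, 0]
```

The hard part, as we'll see, is not applying the threshold but picking it.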

The Image

Interestingly, this picture is a white piece of poster board (not blue) with some dark puzzle pieces on it.  If you look closely you can see the famous HD7 pink tint.  The image on the right is simply converted to grayscale using the average of the RGB values.

Mean Threshold

Almost every time an image processing problem arises and a simple average is a possible answer my gut tells me to go for it.  In this case, my gut is going to lead me astray, but first some code:

    WriteableBitmap ToAverageBinary(WriteableBitmap grayscale)
    {
        WriteableBitmap binary =
            new WriteableBitmap(grayscale.PixelWidth,
                                grayscale.PixelHeight);

        int[] histogramData = new int[256];
        int maxCount = 0;

        //first we will determine the histogram
        //for the grayscale image
        for (int pixelIndex = 0;
             pixelIndex < grayscale.Pixels.Length;
             pixelIndex++)
        {
            byte intensity = (byte)grayscale.Pixels[pixelIndex];

            //simply add another to the count
            //for that intensity
            histogramData[intensity]++;
            if (histogramData[intensity] > maxCount)
            {
                maxCount = histogramData[intensity];
            }
        }

        //now we need to figure out the average intensity
        long average = 0;
        for (int intensity = 0; intensity < 256; intensity++)
        {
            average += intensity * histogramData[intensity];
        }

        //this is our threshold
        average /= grayscale.Pixels.Length;

        for (int pixelIndex = 0;
             pixelIndex < grayscale.Pixels.Length;
             pixelIndex++)
        {
            byte intensity = (byte)grayscale.Pixels[pixelIndex];

            //now we're going to set the pixels
            //greater than or equal to the average
            //to white and everything else to black
            if (intensity >= average)
            {
                unchecked { binary.Pixels[pixelIndex] = (int)0xFFFFFFFF; }
            }
            else
            {
                unchecked { binary.Pixels[pixelIndex] = (int)0xFF000000; }
            }
        }

        //note that this is the ORIGINAL histogram
        //not the histogram for the binary image
        PlotHistogram(histogramData, maxCount);

        //this is a line to show where the
        //average is relative to the rest of
        //the histogram
        Line averageLine = new Line();
        averageLine.X1 = HistogramPlot.Width * average / 255;
        averageLine.X2 = HistogramPlot.Width * average / 255;
        averageLine.Y1 = 0;
        averageLine.Y2 = HistogramPlot.Height;
        averageLine.Stroke = new SolidColorBrush(Colors.Magenta);
        averageLine.StrokeThickness = 2;
        averageLine.StrokeDashArray = new DoubleCollection() { 5, 2.5 };
        //don't forget to add the line to the plot
        HistogramPlot.Children.Add(averageLine);

        return binary;
    }

While the puzzle pieces came through loud and clear so did a lot of garbage.  Notice how the average line (magenta) splits the peak?  This is going to lead to a large number of false black pixels.  What we want is an automated technique for shifting the threshold left.

Two Peak

If you look at the original grayscale image you’ll notice that the puzzle pieces seem to be close to the same intensity while the background clusters around another. This is represented in the histogram by the large peak (the background) and an almost imperceptible peak toward the dark end of the spectrum (the puzzle pieces).

Finding the large peak is trivial.  It’s just the highest occurring intensity.  The small peak on the other hand is a little trickier.  It’s not the second most frequent intensity – that’s right next to the largest peak.  A little trick is to give a higher weight to intensities that are far from the highest peak.  To do this we multiply the intensity count at each point by the square of its distance to the peak.  That’s a mouthful, but it’s pretty easy.  Here’s the code:

    //maxPeak is the first peak: the most frequent
    //intensity (the highest bar in the histogram)
    int secondPeak = 0;
    long secondPeakCount = 0;
    for (int intensity = 0; intensity < 256; intensity++)
    {
        //weight each count by the square of its
        //distance from the first peak
        long adjustedCount =
            (long)(histogramData[intensity] * Math.Pow(intensity - maxPeak, 2));
        if (adjustedCount > secondPeakCount)
        {
            secondPeakCount = adjustedCount;
            secondPeak = intensity;
        }
    }
    //the threshold goes between the two peaks
    int threshold = (maxPeak + secondPeak) / 2;

So we calculate an adjusted count for each intensity.  By multiplying it by the square of the distance we give higher counts to those further from the first peak.  Amazingly, this works even in this case where the second peak is so small.
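Outside the phone code, the whole trick fits in a few lines. This is a standalone Python sketch with an invented toy histogram; splitting the difference between the two peaks at the end is my assumption about how they get combined into a threshold:

```python
def two_peaks_threshold(histogram):
    # first peak: the most frequent intensity
    max_peak = max(range(len(histogram)), key=lambda i: histogram[i])
    # second peak: weight each count by the square of its
    # distance from the first peak
    second_peak = max(range(len(histogram)),
                      key=lambda i: histogram[i] * (i - max_peak) ** 2)
    # put the threshold halfway between the peaks
    return (max_peak + second_peak) // 2

# toy histogram: a small dark peak at 2, a big bright peak at 8
hist = [0, 1, 5, 1, 0, 0, 2, 30, 90, 40]
print(two_peaks_threshold(hist))  # 5
```

Note that intensity 7 occurs far more often than intensity 2, but the squared-distance weighting still picks 2 as the second peak.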

Wow!  Those results are so much better.  Notice I marked the two peaks with green and the new threshold is in magenta.

Summary

Don’t use the mean.  You might get lucky in a few cases, but in general you’ll end up in a messy situation.  That said, the two-peak solution will work well in cases where your objects are of a consistent color and that color is separate from the background.  I created this photo because I knew it would have good results.  I highly recommend you try these techniques out on your own photos so you can get a feel for when they will work and when they won’t.

We will most likely revisit different thresholding techniques for different situations when they arise, but for now we have a binary image so we’re going to see what we can do with it.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2016%20-%20Binary%20Images.zip

Up Next: Sets

## Phone Vision 15 – Median Filters

1 03 2011

Filters, filters, filters… Tired of them yet?  Me too.  After this we’ll leave them alone for a while.  I promise.  Until then, we have work to do.

What is a median filter?  If you understand the average filter (or mean filter) then the median filter should give you no troubles at all – assuming you know what the median is.  We’ll start there.

What is the Median?

The median is the middle number of a sample.  To find it, you sort your list and pick the middle item.  Yes, it really is that simple.

Here is a random set of 9 numbers:

Sort them:

Pick the middle value:

Like I said, it’s simple.

Why use median instead of mean?

The median is particularly effective against certain types of noise – specifically “salt and pepper” noise (black and white specks).  Let’s assume this is the neighborhood we’re dealing with:

Performing the mean gives:

While the median returns:

If we expect that the 0 values are noise then the median is probably a much closer estimate.
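To see the difference concretely, here is a standalone Python sketch using a made-up 3×3 neighborhood where two pixels were knocked down to 0 by noise:

```python
# a 3x3 neighborhood (flattened) with two "pepper" pixels
neighborhood = [0, 97, 95, 98, 0, 96, 99, 94, 97]

mean = sum(neighborhood) // len(neighborhood)
median = sorted(neighborhood)[len(neighborhood) // 2]

print(mean, median)  # 75 96
```

The two zeros drag the mean down to 75, while the median stays at 96, right in line with the real intensities.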

Code

    private WriteableBitmap MedianFilter(WriteableBitmap grayscale, int radius)
    {
        // we are still going to create a new image
        // because we don’t want to modify the
        // old image as we are processing it
        WriteableBitmap filtered =
            new WriteableBitmap(
                grayscale.PixelWidth,
                grayscale.PixelHeight);

        // boiler plate code for our
        // histogram stuff
        int[] histogram = new int[256];
        int maxIntensity = 0;

        // the math is still easier if we create two loops
        for (int y = 0; y < grayscale.PixelHeight; y++)
        {
            for (int x = 0; x < grayscale.PixelWidth; x++)
            {
                //here’s the pixel we’re centered on
                int pixel = x + y * grayscale.PixelWidth;
                byte intensity = (byte)grayscale.Pixels[pixel];

                // if we are on an edge we are going to leave it
                // as the original intensity.  you will see the
                // edges increasingly unsmoothed as the window
                // size increases.  here we are using the radius
                // to determine our bounds
                if (y <= radius - 1 ||
                    x <= radius - 1 ||
                    y >= grayscale.PixelHeight - radius ||
                    x >= grayscale.PixelWidth - radius)
                {
                    filtered.Pixels[pixel] = (255 << 24)
                        | intensity << 16
                        | intensity << 8
                        | intensity;
                    histogram[intensity]++;
                    if (histogram[intensity] > maxIntensity)
                    {
                        maxIntensity = histogram[intensity];
                    }
                    continue;
                }

                /////////////////////////////////////////////////////////
                // IMPORTANT PART ///////////////////////////////////////
                /////////////////////////////////////////////////////////

                // this list is the key
                // it contains all of the neighboring pixels
                List<byte> localIntensities = new List<byte>();
                for (int yoffset = -radius; yoffset <= radius; yoffset++)
                {
                    for (int xoffset = -radius; xoffset <= radius; xoffset++)
                    {
                        localIntensities.Add(
                            (byte)grayscale.Pixels[(x + xoffset)
                                + (y + yoffset) * grayscale.PixelWidth]);
                    }
                }

                //sort the intensities
                localIntensities.Sort();

                //pick the middle value
                int medianLocalIntensity =
                    localIntensities[localIntensities.Count / 2];

                /////////////////////////////////////////////////////////
                // END IMPORTANT PART ///////////////////////////////////
                /////////////////////////////////////////////////////////

                // and now just set the color
                filtered.Pixels[pixel] = (255 << 24)
                    | (byte)medianLocalIntensity << 16
                    | (byte)medianLocalIntensity << 8
                    | (byte)medianLocalIntensity;

                histogram[(byte)medianLocalIntensity]++;
                if (histogram[(byte)medianLocalIntensity] > maxIntensity)
                {
                    maxIntensity = histogram[(byte)medianLocalIntensity];
                }
            }
        }

        PlotHistogram(histogram, maxIntensity);
        return filtered;
    }

Results

Taking a simple gray image, I added salt and pepper noise, then applied both a median filter and an average filter.  The results are stunning.

flat gray image                              10% salt and pepper noise

after median filtering                      after mean filtering

There are a few specks left after applying the median filter, but the noise is removed pretty well.  The average filter performed dismally, to say the least.

Summary

If you expect salt and pepper noise then the median filter is a great tool to have in your toolbox.  If you want, you can explore max, min, and mode filters.  I don’t think we’ll cover them here unless we have a specific application for them.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2015%20-%20Median%20Filter.zip

(code includes salt and pepper noise generation)

Up Next: Binary Images

## Phone Vision 14 – Sobel Operators

25 02 2011

Last time we explored how we can use the rate of change between adjacent pixels to enhance a boundary between two features in an image.  Today we are going to take that one step further and isolate edges using the Sobel operators.  Hold on tight…

The Sobel Operators

A search for the Sobel operators will quickly turn up these two filters:
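The kernel images didn’t survive the formatting; the two filters referred to here are the standard Sobel kernels:

```latex
G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
```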

I created a rather colorful graphic to demonstrate some of the concepts.

What do you notice?  Here’s what I notice:

• All of the non-edges turn black.
• Diagonal edges show up in both filters.
• The first filter misses horizontal edges.
• The second filter misses vertical edges.

The union of these two images will give you a more complete view of the edges in the system.  If that’s all you wanted to know then you’re done and you can stop here.

I’m curious how they work so I’m going to dig deeper.

When you are driving down a mountain you might see a sign (like the one to the right) indicating a steep grade.  The grade in the context of a mathematical function is no different.  We are trying to find the slope of the function at a given point.  This is called the gradient.

Last time we looked at how fast a function was changing by subtracting adjacent intensities.    If you think about it, it’s easy to imagine areas with a lot of change having a steep slope, right?  So have we already found the gradient?  Yes and no.  Hold on, this is going to get rough.

Subtracting the adjacent intensities, we determined the slope from one pixel to the next, but we didn’t determine the slope at the pixel.  Because we are dealing with discrete values (i.e. there are no ‘in-between’ pixels) the best we can hope for is an approximation of the slope at a pixel.  I am hoping that an example will clear this up a bit.

Imagine the sin function represents some hilly terrain you are driving over.

It should be fairly apparent that at the top of the hill our grade is flat (slope = 0).  Let’s imagine we have a GPS recording our position every so often so we get an approximation of the terrain.

Using the same technique as the last lesson, let’s try to approximate the slope at the top of the hill.  Last time we said the change was f(x+1) – f(x).

Well, that’s not very good.  It turns out that if we adjust our formula ever so slightly we get much better results.  What if we use the position right before we got to the top of the hill instead of the position at the top?

That is a much better approximation, but what now?
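A quick standalone Python check (using a sample spacing of 0.5, an arbitrary choice) shows how much better the straddling difference does at the hilltop, where the true slope is 0:

```python
import math

# sample the sin "terrain" around its hilltop at x = pi/2
h = 0.5
x = math.pi / 2

# f(x+1) - f(x): the difference from the last lesson
forward = (math.sin(x + h) - math.sin(x)) / h
# f(x+1) - f(x-1): straddle the top instead
central = (math.sin(x + h) - math.sin(x - h)) / (2 * h)

print(abs(forward), abs(central))  # the straddling estimate is far closer to 0
```

Because sin is symmetric about its peak, the straddling difference comes out exactly 0 here; on real images it won’t be exact, but it is consistently the better estimate.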

Let’s convert our new difference function to our filter representation.

Remember,

• f(x+1) = next position (right after the top)
• f(x) = current position (at the top)
• f(x-1) = previous position (right before the top)

so our gradient function looks like:

-1*f(x-1) + 0*f(x) + 1*f(x+1)

or, written as a filter, [-1, 0, 1].

Following the same process in the y direction we end up with

-1*f(y-1) + 0*f(y) + 1*f(y+1)

or the same filter, [-1, 0, 1], oriented vertically.

If we apply these we end up with:

It’s a much weaker signal, but I can still spot the edges.

If these filters make sense to you then you should be able to answer the following questions:

• Why are the horizontal lines still missing in the first image?
• And the vertical lines still missing in the second image?
• Why do the diagonals show up?

Separable Filters

You might be thinking, “um, I guess this is nice, but what does it have to do with the Sobel?”  Everything.

We don’t have the tools yet to completely analyze this, but the Sobel operator is a separable filter.  This means that it is a combination of two 1D filters.  Sobel is a smooth filter multiplied with the gradient filter we just built.

A couple of notes:

• [1,2,1] is a 1D weighted smoothing filter
• This is matrix multiplication.  If you need a refresher, check out the Khan Academy video on the subject.  I love that place.


Whoa!  Mind blown.
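You can check the separability claim with a quick standalone Python sketch: the outer product of the [1, 2, 1] smoothing column and the [-1, 0, 1] gradient row reproduces the first Sobel kernel.

```python
# Sobel = (1D smoothing column) outer-product (1D gradient row)
smooth = [1, 2, 1]     # column vector
gradient = [-1, 0, 1]  # row vector

sobel_x = [[s * g for g in gradient] for s in smooth]
print(sobel_x)  # [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
```

Swapping the roles (gradient as the column, smoothing as the row) produces the other Sobel kernel the same way.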

Summary

We are starting to add some useful tools to our toolbox, but we are dancing around some really complex concepts.  Over the next few lessons we will continue working with a few more simple blur, sharpen, and edge detection routines before we jump head first into frequency analysis.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2014%20-%20Sobel%20Operator.zip

Up Next: Median Filter

## Phone Vision – 13 Sharpening Filters

21 02 2011

Smoothing an image was pretty intuitive.  When we are smoothing an image we are actually trying to suppress details in the image.  When we sharpen an image we are trying to do the opposite.  So, how do we do that?

Rates of Change

When we are sharpening an image we are trying to enhance the differences.  What are the differences?  Well, let’s take a look at a 1D strip of intensities:

If we take a cursory look at these pixels we’ll see that the intensities gradually decrease from left to right before jumping back up dramatically then dropping back down.  The dramatic ‘jumps’ are what will ultimately be enhanced when we sharpen the image.  Here they are highlighted.

We want to know how fast the intensities are changing.  If ∆ represents the rate of change between adjacent pixels then our rate of change looks like:

We just subtract the value of the next pixel from the value of the current pixel.  A little more formally, this looks like

∆(x) = f(x+1) – f(x)

Under normal conditions, some natural variation will occur pixel by pixel.  We want to ignore that natural variation and only concern ourselves when the change is drastic.  For that we will have to perform the subtraction one more time.

Now we can see how fast each intensity is changing.  Once again, let’s formalize this.

∆²(x) = ∆(x) – ∆(x-1)

∆²(x) = (f(x+1) – f(x)) – (f(x) – f(x-1))

∆²(x) = f(x-1) – 2*f(x) + f(x+1)

Implementing the 1D Case

Then the filter for our 1D case looks like [1, -2, 1].

If we apply this to our image we end up with this from above:

If we subtract this from our original image something magical happens:

We have sharpened the image!

In order to achieve that result, we subtracted ∆² from the original image f.  Our new formula looks like:

sharp = f – ∆²

so in the x direction,

sharp(x) = f(x) – ∆²(x)

sharp(x) = f(x) – (f(x-1) – 2*f(x) + f(x+1))

sharp(x) = f(x) – f(x-1) + 2*f(x) – f(x+1)

sharp(x) = –f(x-1) + 3*f(x) – f(x+1)

Hence, our final 1D filter: [-1, 3, -1].
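Here is a standalone Python sketch of that filter applied to a made-up 1D strip; notice how the jump from 100 to 50 gets exaggerated on both sides (an overshoot to 150 and an undershoot to 0):

```python
def sharpen_1d(f):
    # apply the [-1, 3, -1] filter; leave the endpoints alone
    out = f[:]
    for x in range(1, len(f) - 1):
        out[x] = -f[x - 1] + 3 * f[x] - f[x + 1]
    return out

strip = [100, 100, 100, 50, 50, 50]
print(sharpen_1d(strip))  # [100, 100, 150, 0, 50, 50]
```

In a real image you would clamp the results back to 0..255, just like the filtering code does with Math.Max/Math.Min.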

Expanding to 2D

If all images were just a single strip of pixels we’d be done.  Unfortunately they aren’t, but accounting for this is a pretty simple adjustment.  The first thing to notice is that the 1D case works in any direction.  If we wanted to apply the filter vertically instead of horizontally it would look like this:

So, how do we expand this to two dimensions?  It’s pretty easy, actually:

Now you have a sharpening filter that will work in two dimensions.  If you want to include the diagonals, there is no reason you can’t.  Using the exact same method the final sharpening filter becomes:

And we’re done!
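Since the kernel images didn’t survive the formatting, here they are as arrays; these are the standard masks this derivation produces.  A quick sanity check: each one sums to 1, so flat regions pass through unchanged.

```python
# 2D sharpening filter (horizontal + vertical)
sharpen = [[ 0, -1,  0],
           [-1,  5, -1],
           [ 0, -1,  0]]

# 2D sharpening filter including the diagonals
sharpen_diag = [[-1, -1, -1],
                [-1,  9, -1],
                [-1, -1, -1]]

# the weights sum to 1, so a flat neighborhood is unchanged
print(sum(sum(row) for row in sharpen),
      sum(sum(row) for row in sharpen_diag))  # 1 1
```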

Summary

With some luck, you will have a solid grasp of how the sharpening filter works.  If you are interested in researching this topic more, we walked through creating the Laplacian operator.  If you are comfortable with calculus, the difference functions we discussed were actually just the first derivative (∆) and the second derivative (∆2).

Having discussed the filter code at length, the code is not posted here.  The code from previous lessons will work fantastically.

Up Next: Sobel Operators

## Phone Vision 12–Smoothing Filters

15 02 2011

What does it mean to smooth an image?  When I imagine a smooth surface it doesn’t have a lot of bumps or pits.  If we were to try to smooth out a rough piece of wood we would probably sand down the peaks and fill in holes.  If you apply that analogy to an image it sounds a lot like averaging a neighborhood.

Average Filter

If you remember from last time we described a neighborhood filter using a number of weights as follows:

A simple average would allow all of the pixels in the neighborhood to contribute equally to the average.  In that case we would represent the filter as:

Using our code from last time, the filter looks like

    int[,] filter = new int[,]
    {
        { 1, 1, 1 },
        { 1, 1, 1 },
        { 1, 1, 1 }
    };

And here are the results when we run it:

The original is on the left and the smoothed version is on the right.  Notice how the large objects are relatively unchanged, but the small objects have lost a lot of information.  In addition to removing noise, if we are trying to segment large objects, we can use smoothing to blur out the small objects.

Modified Filter Code

If we run this filter enough times we will eventually get every pixel to be some sort of gray.  It would be more efficient just to increase the size of the window.  Let’s modify the filtering code to accept a square filter of arbitrary size.

    private WriteableBitmap Filter(WriteableBitmap grayscale, int[,] filter)
    {
        // we are still going to create a new image
        // because we don’t want to modify the
        // old image as we are processing it
        WriteableBitmap filtered =
            new WriteableBitmap(
                grayscale.PixelWidth,
                grayscale.PixelHeight);

        // boiler plate code for our
        // histogram stuff
        int[] histogram = new int[256];
        int maxIntensity = 0;

        //here’s where the magic starts
        //assume the filter is square
        int length = (int)Math.Sqrt(filter.Length);
        int filterMagnitude = 0;
        for (int x = 0; x < length; x++)
        {
            for (int y = 0; y < length; y++)
            {
                filterMagnitude += filter[x, y];
            }
        }

        // this will be used for our loops
        // it’s important that our size is odd
        // so that our current pixel can be the
        // center point.
        int radius = length / 2;

        // the math is still easier if we create two loops
        for (int y = 0; y < grayscale.PixelHeight; y++)
        {
            for (int x = 0; x < grayscale.PixelWidth; x++)
            {
                //here’s the pixel we’re centered on
                int pixel = x + y * grayscale.PixelWidth;
                byte intensity = (byte)grayscale.Pixels[pixel];

                // if we are on an edge we are going to leave it
                // as the original intensity.  you will see the
                // edges increasingly unsmoothed as the window
                // size increases.  here we are using the radius
                // to determine our bounds
                if (y <= radius - 1 ||
                    x <= radius - 1 ||
                    y >= grayscale.PixelHeight - radius ||
                    x >= grayscale.PixelWidth - radius)
                {
                    //maintain the original / non-smoothed
                    filtered.Pixels[pixel] = (255 << 24)
                        | intensity << 16
                        | intensity << 8
                        | intensity;
                    histogram[intensity]++;
                    if (histogram[intensity] > maxIntensity)
                    {
                        maxIntensity = histogram[intensity];
                    }
                    continue;
                }

                int newIntensity = 0;
                //going from -radius to radius makes the
                //math easier here too
                for (int yoffset = -radius; yoffset <= radius; yoffset++)
                {
                    for (int xoffset = -radius; xoffset <= radius; xoffset++)
                    {
                        // we loop through each pixel in the neighborhood
                        // and apply the filter. by ‘apply the filter’
                        // I mean multiply it by the appropriate weight
                        newIntensity +=
                            ((byte)grayscale.Pixels
                                [(x + xoffset)
                                 + (y + yoffset) * grayscale.PixelWidth])
                            * filter[(yoffset + radius), (xoffset + radius)];
                    }
                }

                // scale the new intensity back
                newIntensity /= filterMagnitude;
                newIntensity =
                    Math.Max(Math.Min(newIntensity, 255), 0);

                // and now just set the color
                filtered.Pixels[pixel] = (255 << 24)
                    | (byte)newIntensity << 16
                    | (byte)newIntensity << 8
                    | (byte)newIntensity;

                histogram[(byte)newIntensity]++;
                if (histogram[(byte)newIntensity] > maxIntensity)
                {
                    maxIntensity = histogram[(byte)newIntensity];
                }
            }
        }

        PlotHistogram(histogram, maxIntensity);
        return filtered;
    }

Now we can really smooth the image.

A Large Filter

A 21×21 filter is about as large as I can do without having the mask wrap to the next line.

    int[,] filter = new int[,]
    {
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
        { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 },
    };

We coded it so that the edges are unfiltered.  To work around this issue you can perform your average with the pixels you have available.  Notice here that the large objects are still relatively unchanged, while the smaller objects have lost a substantial amount of information.

Summary

This technique is pretty easy.  More important however, is that the masks can be changed easily and the effects can be seen instantly.  Once again I recommend that you play around with the masks on your own.  As we’ll see soon they can be used for things far more powerful than blurring an image.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2012%20-%20Smoothing%20Filters.zip

Up next: Sharpening Filters

## Phone Vision 11–Intro to Spatial Filters

11 02 2011

When working with images there is literally an infinite number of operations we can perform on a single pixel.  However, blindly manipulating the image will rarely, if ever, produce a desirable result.  For example, if an image is already too bright, we wouldn’t want to run a ‘brighten’ function on it.  We need to rely on context to achieve sensible results.  In the case of spatial filters our context is the neighborhood of the pixel.

Neighborhoods

What is a neighborhood?  Neighborhoods are simply pixels adjacent to the pixel we are working with.

Neighborhoods are typically (but not always) squares with an odd dimension like 3×3 or 5×5.  While this is not a requirement, for our purposes we will stick with 3×3.  Later we may throw in a 5×5.  Incidentally, per pixel operations are a special case of neighborhood operations where the neighborhood is 1×1.

Filters

The term filter actually arises from frequency analysis (something we will get to later).  Some other terms that you might see used are convolution mask or kernel.  For now, we will think of a filter as a set of weights that correspond to pixels in the neighborhood.  If we have a 3×3 filter with weights w then we often express the weights as:

When we “apply a filter” we are just calculating the average using the specified weights.  Let’s look at an example.

Imagine our neighborhood…

and our filter…

So our transformed pixel becomes:

This identity* filter is pretty useless for (hopefully) obvious reasons, but it demonstrates the point.  If we apply this filter for every pixel in the image we will get the same image back.

* in mathematics, the ‘identity’ returns the original value for a given operation: e.g. 0 is the identity for addition so 1+0 = 1.
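To make the weighted-average idea concrete, here is a standalone Python sketch (made-up pixel values, not the phone code) applying the identity filter to a single 3×3 neighborhood:

```python
def apply_filter(neighborhood, weights):
    # multiply each pixel by its weight, then divide by the
    # sum of the weights to scale the result back
    total = sum(p * w
                for row_p, row_w in zip(neighborhood, weights)
                for p, w in zip(row_p, row_w))
    magnitude = sum(sum(row) for row in weights)
    return total // magnitude

neighborhood = [[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]]

identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

print(apply_filter(neighborhood, identity))  # 50 -- the center pixel, unchanged
```

Swap in a filter of all ones and you get the simple average of the neighborhood instead.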

Code

Finally, let’s take a look at a filter function.

    private WriteableBitmap Filter(WriteableBitmap grayscale, int[,] filter)
    {
        // we are going to create a new image
        // because we don’t want to modify the
        // old image as we are processing it
        WriteableBitmap filtered =
            new WriteableBitmap(
                grayscale.PixelWidth,
                grayscale.PixelHeight);

        // boiler plate code for our
        // histogram stuff
        int[] histogram = new int[256];
        int maxIntensity = 0;

        // this is the magnitude
        // of the weights |w|
        // we will divide by this later
        int filterMagnitude = 0;
        for (int x = 0; x < 3; x++)
        {
            for (int y = 0; y < 3; y++)
            {
                filterMagnitude += filter[x, y];
            }
        }

        // the math is easier if we create two loops
        for (int y = 0; y < grayscale.PixelHeight; y++)
        {
            for (int x = 0; x < grayscale.PixelWidth; x++)
            {
                //here’s the pixel we’re centered on
                int pixel = x + y * grayscale.PixelWidth;
                byte intensity = (byte)grayscale.Pixels[pixel];

                //if we are on an edge we are going to skip it
                if (y == 0 ||
                    x == 0 ||
                    y == grayscale.PixelHeight - 1 ||
                    x == grayscale.PixelWidth - 1)
                {
                    histogram[intensity]++;
                    if (histogram[intensity] > maxIntensity)
                    {
                        maxIntensity = histogram[intensity];
                    }
                    continue;
                }

                int newIntensity = 0;
                //going from -1 to 1 makes the math easier here too
                for (int yoffset = -1; yoffset <= 1; yoffset++)
                {
                    for (int xoffset = -1; xoffset <= 1; xoffset++)
                    {
                        // we loop through each pixel in the neighborhood
                        // and apply the filter. by ‘apply the filter’
                        // I mean multiply it by the appropriate weight
                        newIntensity +=
                            ((byte)grayscale.Pixels
                                [(x + xoffset)
                                 + (y + yoffset) * grayscale.PixelWidth])
                            * filter[(yoffset + 1), (xoffset + 1)];
                    }
                }

                // here we are scaling the new intensity back
                newIntensity /= filterMagnitude;
                newIntensity =
                    Math.Max(Math.Min(newIntensity, 255), 0);

                // and now just set the color to the
                // new intensity
                filtered.Pixels[pixel] = (255 << 24)
                    | (byte)newIntensity << 16
                    | (byte)newIntensity << 8
                    | (byte)newIntensity;

                histogram[(byte)newIntensity]++;
                if (histogram[(byte)newIntensity] > maxIntensity)
                {
                    maxIntensity = histogram[(byte)newIntensity];
                }
            }
        }

        PlotHistogram(histogram, maxIntensity);
        return filtered;
    }

Here’s a simple way to call this with the identity filter described above:

    // create the identity filter
    // important note: this is ROW by COLUMN (y, x)
    int[,] filter = new int[,] { { 0, 0, 0 }, { 0, 1, 0 }, { 0, 0, 0 } };

    WriteableBitmap filtered = Filter(grayscale, filter);

As expected, using the identity filter the results are exactly the same.  This is a good test to make sure we didn’t mess anything up.  Next time we will use this code to apply some useful filters.

Summary

Though the formulas might look a little daunting when you write them down, the concept of spatial filtering is pretty easy.  Now that we have some code that makes it trivial to test different filters, I suggest you do just that.  Play around with this and over the next few lessons we will talk about some specific filters.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2011%20-%20Intro%20to%20Spatial%20Filters.zip

Up next: Smoothing Filters

## Phone Vision 10 – Histogram Equalization

7 02 2011

Last time we stretched the intensities of a specific range of values.  The method for determining the bounds was somewhat arbitrary and we simply spread the intensities equally.  Stretching uses more of the intensity spectrum and, hence, increases contrast, but it may not be the best approach to preserve information.  Could we be a little more intelligent about how we distribute the intensities?

Uniform Distribution

In an image with perfect, uniformly distributed intensities, the probability of any intensity would be the same as the probability of any other intensity.  This generally isn’t possible on a per intensity basis, however, if we consider ranges of intensities, we can get close.  For instance, a random pixel in a high contrast image would have roughly the same probability of being in the light or dark ranges of intensities.  This is achievable through, what I like to call, mathemagic.

Probability

When we roll a die we have six possibilities: 1,2,3,4,5, or 6.  What are the odds that 1 will come up?  1 in 6 right?  In this case our probability is P(1)= .1667.  What about 2? P(2)=.1667 as well, right?  What is the probability of 1 or 2 coming up?  P(1 or 2) = P(1) + P(2) = .3333.  Note that the chances of any side coming up = P(1)+P(2)+P(3)+P(4)+P(5)+P(6)=1.

Back to our situation, what is the probability of a randomly selected pixel having a specific intensity?   Using the histogram, this is trivial to calculate.  Simply take the total number of pixels at the intensity divided by the total number of pixels.

Okay, so what?  This is where things get a little tricky.

Probability Density Function

If you were to divide every value in a histogram by the number of pixels you would end up with the probability for every intensity.  This is a discrete probability density function (PDF).

The PDF for our wintery tree from last time:

Cumulative Distribution Function

If we have a given intensity k, what is the probability that a pixel will have an intensity between 0 and k?  I hope it is fairly apparent that P(intensity≤k)=P(0)+P(1)+…+P(k).  In our case, the possible intensities range from 0 to 255.

So, for 0≤k≤255 the CDF(k) = P(intensity≤k).

Why does this matter?  Well, if we map the current intensities to the CDF, it will evenly distribute our intensities across the spectrum which is exactly what we want.
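The whole pipeline is easy to try standalone.  This Python sketch (with a made-up four-pixel “image”) mirrors the C# code that follows: histogram, then CDF, then map each old intensity through the CDF and scale to 0..255:

```python
def equalize(pixels):
    n = len(pixels)
    # histogram of the intensities
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    # running sum of probabilities: the CDF
    cdf = []
    running = 0.0
    for count in hist:
        running += count / n
        cdf.append(running)
    # map each old intensity through the CDF, scaled up to 255
    return [int(cdf[p] * 255 + 0.5) for p in pixels]

# four dark, bunched-up intensities spread out across the range
print(equalize([10, 10, 20, 30]))  # [128, 128, 191, 255]
```

Notice how the doubled-up intensity (10) claims a proportionally bigger jump in the output range than the single occurrences do.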

The Code

    private double[] CalculateCDF(WriteableBitmap grayscale)
    {
        int[] histogramData = new int[256];

        // here we are just calculating the histogram
        // this should be old hat by now
        for (int pixelIndex = 0;
             pixelIndex < grayscale.Pixels.Length;
             pixelIndex++)
        {
            byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);
            histogramData[intensity]++;
        }

        double[] cumulativeDistributionFunction = new double[256];

        //the first value is just the probability of the first intensity
        //notice we divide the histogram value by the total number of pixels
        //to get the probability
        cumulativeDistributionFunction[0] =
            ((double)histogramData[0]) / grayscale.Pixels.Length;

        for (int intensity = 1; intensity < 256; intensity++)
        {
            //now we just add the current probability
            //to the previous CDF
            cumulativeDistributionFunction[intensity] =
                (((double)histogramData[intensity]) / grayscale.Pixels.Length)
                + cumulativeDistributionFunction[intensity - 1];
        }

        return cumulativeDistributionFunction;
    }

    public WriteableBitmap Equalize(
        WriteableBitmap grayscale,
        double[] cumulativeDistFunction)
    {
        WriteableBitmap equalized =
            new WriteableBitmap(grayscale.PixelWidth, grayscale.PixelHeight);

        int[] histogramData = new int[256];
        int maxCount = 0;

        for (int pixelIndex = 0;
             pixelIndex < grayscale.Pixels.Length;
             pixelIndex++)
        {
            //this is where the magic happens
            //first we get the old intensity
            //then we map the old intensity to
            //the CDF and scale it up to 255
            byte oldIntensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);
            byte newIntensity =
                (byte)(cumulativeDistFunction[oldIntensity] * 255 + .5);

            equalized.Pixels[pixelIndex] =
                255 << 24
                | newIntensity << 16
                | newIntensity << 8
                | newIntensity;

            histogramData[newIntensity]++;
            if (histogramData[newIntensity] > maxCount)
            {
                maxCount = histogramData[newIntensity];
            }
        }

        PlotHistogram(histogramData, maxCount);
        return equalized;
    }

Notice how the CDF is now a straight line?  This means that we have done a pretty good job of distributing the intensities.  If we were to run this procedure again, the image would not change.  While these results are similar to our contrast-stretched tree, notice how the spacing is not equal.  In fact, frequently occurring intensities are spaced further apart than rarely occurring ones.

Summary

The math involved here is a little more intense than the other image enhancement techniques we performed, and it is one procedure that I did not find well explained on the internet.  If you have any difficulties understanding it, get in touch with me @azzlsoft on Twitter or rich@azzlsoft.com.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2010%20-%20Histogram%20Equalization.zip

Up next: Intro to Spatial Filters

## Phone Vision – 09 Contrast Stretching

1 02 2011

Now that we have a histogram, it’s time to have some fun.  Michael Reichmann at Luminous Landscape uses the following picture for a great post on Understanding Histograms for digital photographers.  He graciously allowed me to use it here, and it’s perfect for what we want to accomplish.

This is a fantastic snow scene, but it would be nice if we could increase the contrast to help us pick out features.  In other words, can we ‘stretch’ the intensities to use the whole spectrum?  Of course we can!

Determining Boundaries

Which intensities do we want to become the new black and white?  You could probably eyeball it and come up with a pretty good approximation, but we’d rather automate this process whenever possible.

Let’s pick a threshold, say, 5% of the count of the highest occurring intensity, and use it to determine our lower (black) and upper (white) bounds.  If the highest occurring intensity has 100 pixels, then any intensity with fewer than 5 occurrences will be below our threshold.  Allow the code to clear up any confusion.

```csharp
public Point DetermineBounds(WriteableBitmap grayscale)
{
    //we will use the x value for
    //the left boundary and the
    //y value for the right boundary
    Point bounds = new Point();
    int maxCount = 0;
    int threshold = 0;

    //once again, we are determining
    //the histogram
    int[] histogramData = new int[256];
    for (int pixelIndex = 0;
         pixelIndex < grayscale.Pixels.Length;
         pixelIndex++)
    {
        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);
        histogramData[intensity]++;
        if (histogramData[intensity] > maxCount)
        {
            maxCount = histogramData[intensity];
        }
    }

    // set the threshold = 5% of the highest count
    threshold = maxCount / 20;

    //first, let’s find the left bound
    //we start at one to simplify our math
    //if we hit the boundary straight away
    //it will be recorded as zero
    //which is what we want
    for (int intensity = 1;
         intensity < histogramData.Length;
         intensity++)
    {
        //if the number of occurrences of an intensity
        //is greater than our threshold then we’ve found our boundary
        if (histogramData[intensity] >= threshold)
        {
            //the current intensity is where we
            //crossed the threshold
            //we need to store the previous
            //intensity
            bounds.X = intensity - 1;
            break;
        }
    }

    //now let’s find the right bound
    //so we’ll start on the right side
    for (int intensity = histogramData.Length - 1;
         intensity > 0;
         intensity--)
    {
        //same as before, but this time we store
        //the intensity just past the crossing
        if (histogramData[intensity] >= threshold)
        {
            bounds.Y = intensity + 1;
            break;
        }
    }

    return bounds;
}
```
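The same rule is easy to check on a made-up histogram.  This Python sketch (toy counts, nine bins instead of 256) mirrors the two loops above:

```python
# Bounds = first intensity from each end whose count reaches
# 5% of the highest count, stepped one bin outward.
hist = [0, 2, 3, 100, 40, 60, 8, 1, 0]   # hypothetical counts; max is 100
threshold = max(hist) // 20              # 5% of the max count -> 5

left = next(i for i, c in enumerate(hist) if c >= threshold) - 1
right = next(i for i in range(len(hist) - 1, -1, -1)
             if hist[i] >= threshold) + 1

print(left, right)  # 2 7
```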

And just so we can see what we’re doing, we’ll put in a method to draw the boundaries.

```csharp
public void DrawBounds(Point bounds)
{
    Line lowIntensity = new Line();
    lowIntensity.Height = HistogramPlot.Height;
    lowIntensity.Width = HistogramPlot.Width;
    lowIntensity.Stroke = new SolidColorBrush(Colors.Blue);
    lowIntensity.StrokeThickness = 2;
    lowIntensity.Y1 = 0;
    lowIntensity.Y2 = HistogramPlot.Height;
    lowIntensity.X1 = (bounds.X / 256.0) * HistogramPlot.Width;
    lowIntensity.X2 = (bounds.X / 256.0) * HistogramPlot.Width;

    Line highIntensity = new Line();
    highIntensity.Height = HistogramPlot.Height;
    highIntensity.Width = HistogramPlot.Width;
    highIntensity.Stroke = new SolidColorBrush(Colors.Blue);
    highIntensity.StrokeThickness = 2;
    highIntensity.Y1 = 0;
    highIntensity.Y2 = HistogramPlot.Height;
    highIntensity.X1 = (bounds.Y / 256.0) * HistogramPlot.Width;
    highIntensity.X2 = (bounds.Y / 256.0) * HistogramPlot.Width;

    //don’t forget to add the lines to the canvas
    //or nothing will show up
    HistogramPlot.Children.Add(lowIntensity);
    HistogramPlot.Children.Add(highIntensity);
}
```

That looks like it accurately portrays the boundaries to me.

Shifting the Intensities

The next two steps should be combined, but it’s a little easier to understand when broken into two separate components.  The first step is to make the left bound the new zero; in other words, we shift all the intensities toward the dark end of the spectrum.

```csharp
public WriteableBitmap ShiftLeft(WriteableBitmap grayscale, Point bounds)
{
    WriteableBitmap leftShifted =
        new WriteableBitmap(grayscale.PixelWidth, grayscale.PixelHeight);
    int maxCount = 0;
    int[] histogramData = new int[256];

    for (int pixelIndex = 0;
         pixelIndex < grayscale.Pixels.Length;
         pixelIndex++)
    {
        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);

        //here we just subtract the left boundary
        //from every intensity, being careful not to
        //let any negative intensities slip through
        intensity = (byte)Math.Max(intensity - bounds.X, 0);

        //the rest of this stuff is just fluff
        leftShifted.Pixels[pixelIndex] =
            255 << 24
            | intensity << 16
            | intensity << 8
            | intensity;

        histogramData[intensity]++;
        if (histogramData[intensity] > maxCount)
        {
            maxCount = histogramData[intensity];
        }
    }

    PlotHistogram(histogramData, maxCount);
    return leftShifted;
}
```
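Stripped of the bitmap plumbing, the heart of the loop is a clamped subtraction.  In Python, using the example image’s left bound of 184:

```python
LEFT_BOUND = 184                        # left bound from the example image

def shift_left(intensity):
    # subtract the left bound, clamping negatives to black
    return max(intensity - LEFT_BOUND, 0)

print(shift_left(200))  # 16
print(shift_left(150))  # 0 -- anything darker than the bound becomes black
```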

That was easy and already I feel I can pick out more detail, but we aren’t using any more of the spectrum than we were before.

Stretching the Contrast

The final piece is to use the full gamut of intensities.  Note that we are not adding information here.  We are just going to spread out the information we already have.  Let’s get to it.

```csharp
public WriteableBitmap Stretch(
    WriteableBitmap leftShifted,
    Point bounds)
{
    WriteableBitmap stretched =
        new WriteableBitmap(
            leftShifted.PixelWidth,
            leftShifted.PixelHeight);
    int maxCount = 0;
    int[] histogramData = new int[256];

    for (int pixelIndex = 0;
         pixelIndex < leftShifted.Pixels.Length;
         pixelIndex++)
    {
        byte intensity = (byte)(leftShifted.Pixels[pixelIndex] & 0xFF);

        //here’s where the magic happens
        //in our example, our low bound was
        //184 and our high bound was 245
        //we have already shifted everything left
        //now we just need to spread it out
        //our target range is 255 intensities,
        //but our current intensity range is only
        //61 (245 - 184).  So we want to scale up
        //our intensities appropriately.  Our scaling
        //factor becomes 255 / 61 ~ 4.18.
        //    0 -> 0
        //    i -> i * 4.18
        //   61 -> 255
        // > 61 -> 255
        intensity = (byte)Math.Min(
            intensity * 255 / (bounds.Y - bounds.X), 255);

        stretched.Pixels[pixelIndex] =
            255 << 24
            | intensity << 16
            | intensity << 8
            | intensity;

        histogramData[intensity]++;
        if (histogramData[intensity] > maxCount)
        {
            maxCount = histogramData[intensity];
        }
    }

    PlotHistogram(histogramData, maxCount);
    return stretched;
}
```

Now, that is a picture!  Notice how the histogram bars have spaces between them?  This is because we aren’t actually adding information.  We’re just making it easier to see.
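The gaps fall straight out of the integer scaling: a factor greater than 1 skips output levels.  A Python sketch using the example bounds of 184 and 245:

```python
LOW, HIGH = 184, 245                    # bounds from the example image

def stretch(intensity):
    # after the left shift, intensities run 0..(HIGH - LOW)
    return min(intensity * 255 // (HIGH - LOW), 255)

used = sorted({stretch(i) for i in range(HIGH - LOW + 1)})
print(used[:6])   # [0, 4, 8, 12, 16, 20] -- only every ~4th level is hit
print(len(used))  # 62 distinct levels spread across 0..255
```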

Summary

I really like this picture.  It works out great for a project like this and we are finally starting to do some useful image processing.  The beauty of this lesson is that it took no human interaction to perform and the results are pretty spectacular.

If you have any questions find me on Twitter (@azzlsoft).

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2009%20-%20Contrast%20Stretching.zip

Up Next: Histogram Equalization

## Phone Vision 08 – Intro to Histograms

27 01 2011

Now that we’ve finally manipulated a few images, we’re starting to get somewhere.   Today we are going to take some strides toward classifying images.  It would be nice if we had an objective way to say “that image is really bright”.  It shouldn’t come as a surprise that this is actually pretty easy.

What is a Histogram?

Histogram is really just a fancy word for counting.  In our case, we are counting how many pixels are at each intensity level between black and white.  Let’s revisit our friendly orangutan.

The chart below the picture is his histogram.  The horizontal-axis represents the intensity while the vertical-axis represents the number of pixels with that intensity.

Creating the Histogram

When you are converting an image to grayscale you can trivially create the histogram data with it.  The process is as simple as looping through every pixel (which we have to do to convert to grayscale anyway) and counting how many pixels land at each intensity.

```csharp
//we are going to use the
//average of the r, g, and b
//values for our grayscale conversion
WriteableBitmap ToAverageGrayscale(WriteableBitmap bmp)
{
    WriteableBitmap grayscale =
        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    //there are 256 intensities
    //so a simple array of ints
    //should suffice for holding the counts
    int[] histogramData = new int[256];

    //we will use this to track the count
    //of the highest occurring intensity
    //we are simply going to use this value
    //to scale our plot
    int maxCount = 0;

    for (int pixelIndex = 0;
         pixelIndex < bmp.Pixels.Length;
         pixelIndex++)
    {
        int pixelColor = bmp.Pixels[pixelIndex];
        byte red = (byte)((pixelColor & 0xFF0000) >> 16);
        byte green = (byte)((pixelColor & 0xFF00) >> 8);
        byte blue = (byte)(pixelColor & 0xFF);

        //this is not rocket science
        byte intensity = (byte)((red + green + blue) / 3);

        //once again we repeat the intensity
        grayscale.Pixels[pixelIndex] = 255 << 24
            | intensity << 16
            | intensity << 8
            | intensity;

        //simply add another to the count
        //for that intensity
        histogramData[intensity]++;
        if (histogramData[intensity] > maxCount)
        {
            maxCount = histogramData[intensity];
        }
    }

    //this would typically be a completely
    //separate function, but it’s easier
    //in this case to perform the operation
    //with the grayscale conversion
    PlotHistogram(histogramData, maxCount);
    return grayscale;
}
```

```csharp
public void PlotHistogram(int[] counts, int maxCount)
{
    //in MainPage we have a canvas that will
    //hold the histogram rectangles
    HistogramPlot.Children.Clear();

    //our tallest rectangle will be the height
    //of the canvas, so we want to scale
    //them down from there
    double scale = HistogramPlot.Height / maxCount;

    //in our case we will always have 256 counts
    //but better safe than sorry
    double rectWidth = HistogramPlot.Width / counts.Length;

    for (int intensity = 0; intensity < counts.Length; intensity++)
    {
        //there is no magic going on here
        //for each intensity we are going to build
        //a rectangle with a height proportional to
        //its count
        Rectangle histRect = new Rectangle();
        histRect.Width = rectWidth;
        histRect.Height = counts[intensity] * scale;
        histRect.SetValue(
            Canvas.LeftProperty,
            intensity * rectWidth);
        histRect.SetValue(
            Canvas.TopProperty,
            HistogramPlot.Height - histRect.Height);

        //a gradient from the intensity at the top
        //to a darker shade at the bottom
        //it just gives the fill a little
        //softer look
        LinearGradientBrush rectFill = new LinearGradientBrush();
        rectFill.GradientStops.Add(new GradientStop()
        {
            Color = Color.FromArgb(
                255,
                (byte)intensity,
                (byte)intensity,
                (byte)intensity),
            Offset = 0
        });
        rectFill.GradientStops.Add(new GradientStop()
        {
            Color = Color.FromArgb(
                255,
                (byte)(.25 * intensity),
                (byte)(.25 * intensity),
                (byte)(.25 * intensity)),
            Offset = 1
        });
        histRect.Fill = rectFill;

        //finally, let’s add it to the canvas
        HistogramPlot.Children.Add(histRect);
    }
}
```

It really is pretty simple.

What now?

The histogram offers us an objective method for classifying images.  For instance, based on the histogram below we might say the crane is not too dark and not too light.  Because it doesn’t use the entire range, however, we might classify it as having low contrast.  Trust me, this will come in handy in future lessons.
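One simple way to put a number on “low contrast” is the fraction of the 0–255 range the histogram actually occupies.  A hypothetical Python sketch (the metric and the toy histogram are my own illustration, not part of the lesson’s code):

```python
def used_range(hist):
    """Fraction of the intensity range with at least one pixel."""
    nonzero = [i for i, count in enumerate(hist) if count > 0]
    if not nonzero:
        return 0.0
    return (nonzero[-1] - nonzero[0]) / (len(hist) - 1)

# A toy 256-bin histogram occupying only intensities 100..180.
hist = [0] * 256
for i in range(100, 181):
    hist[i] = 1

print(round(used_range(hist), 2))  # 0.31 -- well under half: low contrast
```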

Summary

Creating a histogram is really easy, and it gives us some basic statistics.  It doesn’t take a wild imagination to come up with some ways to use this information. Over the next few lessons we will do just that.

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2008%20-%20Intro%20to%20Histograms.zip

Up next: Contrast Stretching

## Phone Vision – 07 Grayscale

24 01 2011

Many useful image processing concepts are simpler and faster without color information, so now that you have a solid grasp of color, let’s remove it and take a look at the world in grayscale.

Conversion

Grayscale is a measure of the intensity of the total light at a given point.  Because red, green, and blue in the WriteableBitmap each range from 0 to 255, it is convenient to define grayscale similarly, where the absence of light (black) is 0 and maximum intensity (white) is 255.  We just need a method to extract this intensity from the color values.

I always go with my gut and it says we should be able to average the color intensities.  Time to put on the coding gloves.

```csharp
WriteableBitmap ToAverageGrayscale(WriteableBitmap bmp)
{
    WriteableBitmap grayscale =
        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    for (int pixelIndex = 0;
         pixelIndex < bmp.Pixels.Length;
         pixelIndex++)
    {
        int pixelColor = bmp.Pixels[pixelIndex];
        byte red = (byte)((pixelColor & 0xFF0000) >> 16);
        byte green = (byte)((pixelColor & 0xFF00) >> 8);
        byte blue = (byte)(pixelColor & 0xFF);

        byte intensity = (byte)((red + green + blue) / 3);

        //notice how we repeat the intensity
        //across each color channel
        //this makes the output "color neutral"
        //i.e. gray
        grayscale.Pixels[pixelIndex] = 255 << 24
            | intensity << 16
            | intensity << 8
            | intensity;
    }

    return grayscale;
}
```

Easy peasy lemon squeezy.

This looks like mission accomplished to me.

However, we are going from over 16 million colors to 256 intensities.  Obviously we are going to lose information.  How do we know that we are keeping the best information?

Any three distinct numbers can be arranged into six separate colors, yet they all share a single average intensity.  The following example uses the numbers 255, 128, and 64:

Simply averaging the numbers will work fine in many situations, but in this situation it breaks down.
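A quick Python check makes the point: every ordering of those three values is a different color, yet they all collapse to the same average gray.

```python
from itertools import permutations

values = (255, 128, 64)
colors = set(permutations(values))       # six distinct (r, g, b) orderings
grays = {sum(color) // 3 for color in colors}

print(len(colors))  # 6
print(grays)        # {149} -- six colors, one intensity
```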

If we are trying to approximate the brightness a human perceives, mimicking human physiology is a better approach.  In the last lesson, the Bayer filter had twice as many green filters as red or blue because humans are more sensitive to green frequencies.  Green colors should then appear brighter than other colors, right?

We can modify our code to be slightly more intelligent by weighting the average toward the green colors.

```csharp
WriteableBitmap ToWeightedGrayscale(
    WriteableBitmap bmp,
    double redWeight,
    double greenWeight,
    double blueWeight)
{
    WriteableBitmap grayscale =
        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    for (int pixelIndex = 0;
         pixelIndex < bmp.Pixels.Length;
         pixelIndex++)
    {
        int pixelColor = bmp.Pixels[pixelIndex];
        byte red = (byte)((pixelColor & 0xFF0000) >> 16);
        byte green = (byte)((pixelColor & 0xFF00) >> 8);
        byte blue = (byte)(pixelColor & 0xFF);

        //notice that we are dividing by the sum of the weights
        //this can be avoided by ensuring the weights add to 1
        byte intensity = (byte)((red * redWeight
            + green * greenWeight
            + blue * blueWeight)
            / (redWeight + greenWeight + blueWeight));

        grayscale.Pixels[pixelIndex] =
            255 << 24
            | intensity << 16
            | intensity << 8
            | intensity;
    }

    return grayscale;
}
```

Weights that work well for human vision are red = .2, green = .7, and blue =.1 (rounded from Frequently Asked Questions about Color by Charles Poynton).

As you should expect, the green colors are brighter.  If you want a grayscale image to match the perceived brightness of the color image, this technique does the job.  Is it better than the simple average?  In our contrived example, yes.  In many cases, the simple average will suffice.  Here is the grayscale orangutan using both techniques.  Can you guess which is simple and which is weighted?

(hint: the bottom one is weighted)
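Comparing the two conversions on pure green shows how big the difference can be (weights rounded from Poynton, as above):

```python
def simple_gray(r, g, b):
    return (r + g + b) // 3

def weighted_gray(r, g, b, wr=0.2, wg=0.7, wb=0.1):
    # the weights sum to 1, so no normalizing divide is needed
    return int(r * wr + g * wg + b * wb)

print(simple_gray(0, 255, 0))    # 85
print(weighted_gray(0, 255, 0))  # 178 -- much brighter, matching perception
```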

Summary

Bottom line: there is no “right” answer.  If you’re trying to make your grayscale image “look” right, the weights used above may be a good path.  If speed is your game, you might try using the green channel alone.  No technique is right for every situation.  Remember, our end goal is for the phone, not a person, to understand the image.  Have fun and play around with the numbers.  Think about the advantages and disadvantages of different techniques.

I highly recommend the Frequently Asked Questions about Color by Charles Poynton.  Warning: some of it is pretty math intensive.