Phone Vision 11–Intro to Spatial Filters

11 02 2011

When working with images there is literally an infinite number of operations we can perform on a single pixel.  However, blindly manipulating the image will rarely, if ever, produce a desirable result.  For example, if an image is already too bright, we wouldn’t want to run a ‘brighten’ function on it.  We need to rely on context to achieve sensible results.  In the case of spatial filters our context is the neighborhood of the pixel.

Neighborhoods

What is a neighborhood?  Neighborhoods are simply pixels adjacent to the pixel we are working with.

image

Neighborhoods are typically (but not always) squares with an odd dimension like 3×3 or 5×5.  While this is not a requirement, for our purposes we will stick with 3×3.  Later we may throw in a 5×5.  Incidentally, per pixel operations are a special case of neighborhood operations where the neighborhood is 1×1.

Filters

The term filter actually arises from frequency analysis (something we will get to later).  Some other terms that you might see used are convolution mask or kernel.  For now, we will think of a filter as a set of weights that correspond to pixels in the neighborhood.  If we have a 3×3 filter with weights w then we often express the weights as:

image

When we “apply a filter” we are just calculating the average using the specified weights.  Let’s look at an example. 

Imagine our neighborhood…

image

and our filter…

image

So our transformed pixel becomes:

image

This identity* filter is pretty useless for (hopefully) obvious reasons, but it demonstrates the point.  If we apply this filter for every pixel in the image we will get the same image back.

* in mathematics, the ‘identity’ returns the original value for a given operation: e.g. 0 is the identity for addition so 1+0 = 1.

Code

Finally, let’s take a look at a filter function.

private WriteableBitmap Filter(WriteableBitmap grayscale, int[,] filter)

{

    // we are going to create a new image

    // because we don’t want to modify the

    // old image as we are processing it

    WriteableBitmap filtered =

        new WriteableBitmap(

            grayscale.PixelWidth,

            grayscale.PixelHeight);

 

    // boiler plate code for our

    // histogram stuff

    int[] histogram = new int[256];

    int maxIntensity = 0;

 

    // this is the magnitude

    // of the weights |w|

    // we will divide by this later

    int filterMagnitude = 0;

    for (int x = 0; x < 3; x++)

    {

        for (int y = 0; y < 3; y++)

        {

            filterMagnitude += filter[x, y];

        }

    }

 

    // the math is easier if we create two loops

    // instead of one

    for (int y = 0; y < grayscale.PixelHeight; y++)

    {

        for (int x = 0; x < grayscale.PixelWidth; x++)

        {

            //here’s the pixel we’re centered on

            int pixel = x + y * grayscale.PixelWidth;

            byte intensity = (byte)grayscale.Pixels[pixel];

 

            //if we are on an edge we are going to skip it

            if (y == 0 ||

                x == 0 ||

                y == grayscale.PixelHeight – 1 ||

                x == grayscale.PixelWidth – 1)

            {

                histogram[intensity]++;

                if (histogram[intensity] > maxIntensity)

                {

                    maxIntensity = histogram[intensity];

                }

                continue;

            }

 

            int newIntensity = 0;

            //going from -1 to 1 makes the math easier here too

            for (int yoffset = -1; yoffset <= 1; yoffset++)

            {

                for (int xoffset = -1; xoffset <= 1; xoffset++)

                {

                    // we loop through each pixel in the neighborhood

                    // and apply the filter. by ‘apply the filter’

                    // I mean multiply it by the appropriate weight

                    newIntensity +=

                        ((byte)grayscale.Pixels

                            [(x + xoffset)

                            + (y + yoffset) * grayscale.PixelWidth])

                        * filter[(yoffset + 1), (xoffset + 1)];

                }

            }

 

            // here we are scaling the new intensity back

            newIntensity /= filterMagnitude;

            newIntensity =

                Math.Max(Math.Min(newIntensity, 255), 0);

 

            // and now just set the color to the

            // new intensity

            filtered.Pixels[pixel] = (255 << 24)

                | (byte)newIntensity << 16

                | (byte)newIntensity << 8

                | (byte)newIntensity;

 

            histogram[(byte)newIntensity]++;

            if (histogram[(byte)newIntensity] > maxIntensity)

            {

                maxIntensity = histogram[(byte)newIntensity];

            }

        }

    }

 

    PlotHistogram(histogram, maxIntensity);

    return filtered;

}

Here’s a simple way to call this with the identity filter described above:

// create the identity filter

// important note: this is ROW by COLUMN (y, x)

int[,] filter = new int[,] {{0,0,0}, {0,1,0}, {0,0,0}};

WriteableBitmap filtered = Filter(grayscale, filter);

As expected, using the identity filter the results are exactly the same.  This is a good test to make sure we didn’t mess anything up.  Next time we will use this code to apply some useful filters.

image

image

Summary

Though the formulas might look a little daunting when you write them down, the concept of spatial filtering is pretty easy.  Now that we have some code that makes it trivial to test different filters, I suggest you do just that.  Play around with this and over the next few lessons we will talk about some specific filters.

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2011%20-%20Intro%20to%20Spatial%20Filters.zip

Up next: Smoothing Filters

Advertisements




Phone Vision 10 – Histogram Equalization

7 02 2011

Last time we stretched the intensities of a specific range of values.  The method for determining the bounds was somewhat arbitrary and we simply spread the intensities equally.  Stretching uses more of the intensity spectrum and, hence, increases contrast, but it may not be the best approach to preserve information.  Could we be a little more intelligent about how we distribute the intensities?

Uniform Distribution

In an image with perfect, uniformly distributed intensities, the probability of any intensity would be the same as the probability of any other intensity.  This generally isn’t possible on a per intensity basis, however, if we consider ranges of intensities, we can get close.  For instance, a random pixel in a high contrast image would have roughly the same probability of being in the light or dark ranges of intensities.  This is achievable through, what I like to call, mathemagic.

Probability

When we roll a die we have six possibilities: 1,2,3,4,5, or 6.  What are the odds that 1 will come up?  1 in 6 right?  In this case our probability is P(1)= .1667.  What about 2? P(2)=.1667 as well, right?  What is the probability of 1 or 2 coming up?  P(1 or 2) = P(1) + P(2) = .3333.  Note that the chances of any side coming up = P(1)+P(2)+P(3)+P(4)+P(5)+P(6)=1.

Back to our situation, what is the probability of a randomly selected pixel having a specific intensity?   Using the histogram, this is trivial to calculate.  Simply take the total number of pixels at the intensity divided by the total number of pixels.

 

image

Okay, so what?  This is where things get a little tricky.

Probability Density Function

If you were to divide every value in a histogram by the number of pixels you would end up with the probability for every intensity.  This is a discrete probability density function (PDF).

The PDF for our wintery tree from last time:

image

Cumulative Distribution Function

If we have a given intensity k, what is the probability that a pixel will have an intensity between 0 and k?  I hope it is fairly apparent that P(intensity≤k)=P(0)+P(1)+…+P(k).  In our case, the possible intensities range from 0 to 255. 

So, for 0≤k≤255 the CDF(k) = P(intensity≤k).

image

Why does this matter?  Well, if we map the current intensities to the CDF, it will evenly distribute our intensities across the spectrum which is exactly what we want.

The Code

private double[] CalculateCDF(WriteableBitmap grayscale)

{

    int[] histogramData = new int[256];

 

    // here we are just calculating the histogram

    // this should be old hat by now

    for (int pixelIndex = 0;

        pixelIndex < grayscale.Pixels.Length;

        pixelIndex++)

    {

        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF) ;

        histogramData[intensity]++;

    }

 

 

    double[] cumulativeDistributionFunction = new double[256];

    //the first value is just the probability of the first intensity

    //notice we divide the histogram value by the total number of pixels

    //to get the probability

    cumulativeDistributionFunction[0] =

        (((double)histogramData[0]) / grayscale.Pixels.Length);

    for (int intensity = 1; intensity < 256; intensity++)

    {

        //now we just add the current probability

        //to the previous CDF

        cumulativeDistributionFunction[intensity] =

            (((double)histogramData[intensity]) / grayscale.Pixels.Length)

            + cumulativeDistributionFunction[intensity – 1];

    }

 

    return cumulativeDistributionFunction;

}

 

public WriteableBitmap Equalize(

    WriteableBitmap grayscale,

    double[] cumulativeDistFunction)

{

    WriteableBitmap equalized = new WriteableBitmap(grayscale.PixelWidth, grayscale.PixelHeight);

    int[] histogramData = new int[256];

    for (int pixelIndex = 0;

        pixelIndex < grayscale.Pixels.Length;

        pixelIndex++)

    {

        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);

        histogramData[intensity]++;

    }

 

    histogramData = new int[256];

    int maxCount = 0;

    for (int pixelIndex = 0;

        pixelIndex < grayscale.Pixels.Length;

        pixelIndex++)

    {

        //this is where the magic happens

        //first we get the old intensity

        //then we map the old intensity to

        //the CDF and scale it up to 255

        byte oldintensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);

        byte newIntensity =

            (byte)(cumulativeDistFunction[oldintensity] * 255 + .5);

 

        equalized.Pixels[pixelIndex] =

            255 << 24

            | newIntensity << 16

            | newIntensity << 8

            | newIntensity;

 

        histogramData[newIntensity]++;

               

        if (histogramData[newIntensity] > maxCount)

        {

            maxCount = histogramData[newIntensity];

        }

    }

 

    PlotHistogram(histogramData, maxCount);

    return equalized;

}

 

image

image

Notice how the CDF is now a straight line?  This means that we have done a pretty good job of distributing the intensities.  If we were to run this procedure again it would not change.  While these results are similar to our contrast stretched tree notice how the spacing is not equal.  In fact higher occurring intensities are spaced further apart than lower occurring intensities.

Summary

The math involved here is a little more intense than the other image enhancement techniques we performed.  This is one procedure that I did not find well explained on the internet.  If you have any difficulties understanding it get in touch with me @azzlsoft on Twitter or rich@azzlsoft.com.

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2010%20-%20Histogram%20Equalization.zip

Up next: Intro to Spatial Filters





Phone Vision – 09 Contrast Stretching

1 02 2011

Now that we have a histogram, it’s time to have some fun.  Michael Reichmann at Luminous Landscape uses the following picture for a great post on Understanding Histograms for digital photographers.  He graciously allowed to me to use it here and it’s perfect for what we want to accomplish.

image 

This is a fantastic snow scene, but it would be nice if we could increase the contrast to help us pick out features.  In other words, can we ‘stretch’ the intensities to use the whole spectrum?  Of course we can!

Determining Boundaries

Which intensities do we want to become the new black and white?  You could probably eyeball it and come up with a pretty good approximation, but we’d rather automate this process whenever possible.

Let’s pick a threshold, say, 5% of the highest occurring intensity and use it to determine our lower (black) and upper (white) bounds.  If the highest occurring intensity has 100 pixels, then any intensities with fewer than 5 occurrences will be below our threshold.  Allow the code to clear up any confusion.

public Point DetermineBounds(WriteableBitmap grayscale)

{

    //we will use the x value for

    //the left boundary and the

    //y value for right the boundary

    Point bounds = new Point();

 

    int maxIntensity = 0;

    int threshold = 0;

 

    //once again, we are determining

    //the histogram

    int[] histogramData = new int[256];

    for (int pixelIndex = 0;

        pixelIndex < grayscale.Pixels.Length;

        pixelIndex++)

    {

        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);

        histogramData[intensity]++;

        if (histogramData[intensity] > maxIntensity)

        {

            maxIntensity = histogramData[intensity];

        }

    }

 

    // set the threshold = 5% of max

    threshold = maxIntensity / 20;

 

    //first, let’s find the left bound

    //we start at one to simplify our math

    //if we hit the boundary straight away

    //it will be recorded as zero

    //which is what we want

    for (int intensity = 1;

        intensity < histogramData.Length;

        intensity++ )

    {

        //if the number of occurrences of an intensity

        //is greater than our threshold then we’ve found our boundary

        if (histogramData[intensity] >= threshold)

        {

            //the current intensity is where we

            //crossed the threshold

            //we need to store the previous

            //intensity

            bounds.X = intensity – 1;

            break;

        }

    }

 

    //now let’s find the right bound

    //so we’ll start on the right side.

    for (int intensity = histogramData.Length – 1;

        intensity > 0;

        intensity–)

    {

        //same as before

        if (histogramData[intensity] >= threshold)

        {

            //add 1 this time instead of

            //subtract

            bounds.Y = intensity + 1;

            break;

        }

    }

 

    return bounds;

}

 

And just so we can see what we’re doing, we’ll put in a method to draw the boundaries. 

 

public void DrawBounds(Point bounds)

{

    //double threshold = maxCount * .05;

 

    Line lowIntensity = new Line();

    lowIntensity.Height = HistogramPlot.Height;

    lowIntensity.Width = HistogramPlot.Width;

    lowIntensity.Stroke = new SolidColorBrush(Colors.Blue);

    lowIntensity.StrokeThickness = 2;

    lowIntensity.Y1 = 0;

    lowIntensity.Y2 = HistogramPlot.Height;

    lowIntensity.X1 = (bounds.X / 256.0) * HistogramPlot.Width;

    lowIntensity.X2 = (bounds.X / 256.0) * HistogramPlot.Width;

    HistogramPlot.Children.Add(lowIntensity);

 

    Line highIntensity = new Line();

    highIntensity.Height = HistogramPlot.Height;

    highIntensity.Width = HistogramPlot.Width;

    highIntensity.Stroke = new SolidColorBrush(Colors.Blue);

    highIntensity.StrokeThickness = 2;

    highIntensity.Y1 = 0;

    highIntensity.Y2 = HistogramPlot.Height;

    highIntensity.X1 = (bounds.Y / 256.0) * HistogramPlot.Width;

    highIntensity.X2 = (bounds.Y / 256.0) * HistogramPlot.Width;

    HistogramPlot.Children.Add(highIntensity);

}

 

image

That looks like it accurately portrays the boundaries to me.

Shifting the Intensities

The next two steps should be combined, but it’s a little easier to understand when it’s broken into two separate components.  The first step is to make the left bound the new zero.  This means let’s shift the intensities to the dark side of the spectrum.

public WriteableBitmap ShiftLeft(WriteableBitmap grayscale, Point bounds)

{

    WriteableBitmap leftShifted =

        new WriteableBitmap(grayscale.PixelWidth,grayscale.PixelHeight);

 

    int maxIntensity = 0;

    int[] histogramData = new int[256];

 

    for (int pixelIndex = 0;

        pixelIndex < grayscale.Pixels.Length;

        pixelIndex++)

    {

        byte intensity = (byte)(grayscale.Pixels[pixelIndex] & 0xFF);

 

        //here we just subtract every intensity by the

        //left boundary being careful not to let any

        //negative intensities slip through

        intensity = (byte)Math.Max(intensity – bounds.X, 0);

 

        //the rest of this stuff is just fluff

        leftShifted.Pixels[pixelIndex] =

            255 << 24

            | intensity << 16

            | intensity << 8

            | intensity;

 

        histogramData[intensity]++;

        if (histogramData[intensity] > maxIntensity)

        {

            maxIntensity = histogramData[intensity];

        }

    }

    PlotHistogram(histogramData, maxIntensity);

    return leftShifted;

}

 

image

That was easy and already I feel I can pick out more detail, but we aren’t using any more of the spectrum than we were before.

Stretching the Contrast

The final piece is to use the gamut of intensities.  Note that we are not adding information here.  We are just going to spread out the information we already have.  Let’s get to it.

public WriteableBitmap Stretch(

    WriteableBitmap leftShifted,

    Point bounds)

{

    WriteableBitmap stretched =

        new WriteableBitmap(

            leftShifted.PixelWidth,

            leftShifted.PixelHeight);

 

    int maxIntensity = 0;

    int[] histogramData = new int[256];

 

    for (int pixelIndex = 0;

        pixelIndex < leftShifted.Pixels.Length;

        pixelIndex++)

    {

        byte intensity = (byte)(leftShifted.Pixels[pixelIndex] & 0xFF);

 

        //here’s where the magic happens

        //in our example, our low bound was

        //184 and our high bound was 245

        //we have already shifted everything left

        //now we just need to spread it out

        //our target range is 255 intensities,

        //but our current intensity range is only

        //61 (245 – 184).  So we want to scale up

        //our intensities appropriately.  Our scaling

        //factor becomes 255 / 61 ~ 4.18.

        // 0 -> 0

        // i -> i * 4.18

        // 61 -> 255

        // > 61 -> 255

        intensity = (byte)Math.Min

            (intensity * 255 / (bounds.Y – bounds.X), 255);

 

        stretched.Pixels[pixelIndex] =

            255 << 24

            | intensity << 16

            | intensity << 8

            | intensity;

 

        histogramData[intensity]++;

        if (histogramData[intensity] > maxIntensity)

        {

            maxIntensity = histogramData[intensity];

        }

    }

    PlotHistogram(histogramData, maxIntensity);

    return stretched;

}

 

See that’s not so bad.

image

Now, that is a picture!  Notice how the histogram bars have spaces between them?  This is because we aren’t actually adding information.  We’re just making it easier to see.

Summary

I really like this picture.  It works out great for a project like this and we are finally starting to do some useful image processing.  The beauty of this lesson is that it took no human interaction to perform and the results are pretty spectacular.

If you have any questions find me on Twitter (@azzlsoft).

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2009%20-%20Contrast%20Stretching.zip

Up Next: Histogram Equalization

 





Phone Vision 08 – Intro to Histograms

27 01 2011

Now that we’ve finally manipulated a few images we’re starting to get somewhere.   Today we are going to take some strides in classifying images.  It would be nice if we had an objective way to say “that image is really bright”.  It shouldn’t come as a surprise that this is actually pretty easy.

What is a Histogram?

Histogram is really just a fancy word for counting.  In our case, we are counting how many pixels are at each intensity level between black and white.  Let’s revisit our friendly orangutan.

image

The chart below the picture is his histogram.  The horizontal-axis represents the intensity while the vertical-axis represents the number of pixels with that intensity. 

Creating the Histogram

When you are converting an image to grayscale you can trivially create the histogram data with it.  The process is as simple as looping through every pixel (which we have to do to convert to grayscale anyway) and adding up the intensities.

//we are going to use the

//average of the r, g, and b

//values for our grayscale conversion

WriteableBitmap ToAverageGrayscale(WriteableBitmap bmp)

{

    WriteableBitmap grayscale =

        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

 

    //there are 256 intensities

    //so a simple array of the ints

    //should suffice for holding the counts

    int[] histogramData = new int[256];

 

    //we will use this to determine

    //what the number of the highest

    //occurring intensity is

    //we are simply going to use this value

    //to scale our plot

    int maxCount = 0;

 

    for (int pixelIndex = 0;

        pixelIndex < bmp.Pixels.Length;

        pixelIndex++)

    {

        int pixelColor = bmp.Pixels[pixelIndex];

        byte red = (byte)((pixelColor & 0xFF0000) >> 16);

        byte green = (byte)((pixelColor & 0xFF00) >> 8);

        byte blue = (byte)(pixelColor & 0xFF);

 

        //this is not rocket science

        byte intensity = (byte)((red + green + blue) / 3);

 

        //once again we repeat the intensity

        grayscale.Pixels[pixelIndex] = 255 << 24

            | intensity << 16

            | intensity << 8

            | intensity;

 

        //simply add another to the count

        //for that intensity

        histogramData[intensity]++;

 

        if (histogramData[intensity] > maxCount)

        {

            maxCount = histogramData[intensity];

        }

    }

 

    //this would typically be a completely

    //separate function, but it’s easier

    //in this case to perform the operation

    //with the grayscale conversion

    PlotHistogram(histogramData, maxCount);

    return grayscale;

}

 

public void PlotHistogram(int[] counts, int maxCount)

{

    //in MainPage we have a canvas that will

    //hold the histogram rectangles

    HistogramPlot.Children.Clear();

 

    //our tallest rectangle will be the height

    //of the canvas, so we want to scale the

    //them down from there

    double scale = HistogramPlot.Height / maxCount;

 

    //in our case we will always have 256 counts

    //but better safe than sorry

    double rectWidth = HistogramPlot.Width / counts.Length;

 

    for (int intensity = 0; intensity < counts.Length; intensity++)

    {

        //there is no magic going on here

        //for each intensity we are going to build

        //a rectangle with a height proportional to

        //its count

        Rectangle histRect = new Rectangle();

        histRect.Width = rectWidth;

        histRect.Height = counts[intensity] * scale;

        histRect.SetValue

            (Canvas.LeftProperty,

            intensity * rectWidth);

 

        histRect.SetValue

            (Canvas.TopProperty,

            HistogramPlot.Height – histRect.Height);

 

        //the gradient isn’t necessary

        //it just gives the fill a little

        //softer look

        LinearGradientBrush rectFill = new LinearGradientBrush();

        rectFill.GradientStops.Add(

            new GradientStop() {

                Color =

                Color.FromArgb(

                255,

                (byte)intensity,

                (byte)intensity,

                (byte)intensity),

            Offset = 0 });

 

        rectFill.GradientStops.Add(new GradientStop() {

            Color = Color.FromArgb(

            255,

            (byte)(.25 * intensity),

            (byte)(.25 * intensity),

            (byte)(.25 * intensity)),

            Offset = 1 });

 

        histRect.Fill = rectFill;

 

        //finally, let’s add it to the canvas

        HistogramPlot.Children.Add(histRect);

    }

}

It really is pretty simple. 

What now?

The histogram offers us an objective method for classifying images.  For instance, based on the histogram below we might say the crane is not too dark and not too light. Because it doesn’t use the entire range, however,  we might classify this as having low contrast.  Trust me, this will come in handy in future lessons.

image

Summary

Creating a histogram is really easy, and it gives us some basic statistics.  It doesn’t take a wild imagination to come up with some ways to use this information. Over the next few lessons we will do just that.

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2008%20-%20Intro%20to%20Histograms.zip

Up next: Contrast Stretching

follow @azzlsoft

 





Phone Vision – 07 Grayscale

24 01 2011

Many useful image processing concepts are simpler and faster without color information, so now that you have a solid grasp of color, let’s remove it and take a look at the world in grayscale. 

Conversion

Grayscale is a measure of the intensity of the total light at a given point.  Because the ranges for red, green, and blue in the WriteableBitmap are from 0 to 255 it is convenient for us to set the grayscale ranges similarly where the absence of light (black) is 0 and maximum intensity (white) is 255.  We just need a method to extract this intensity from the color values.

I always go with my gut and it says we should be able to average the color intensities.  Time to put on the coding gloves.

WriteableBitmap ToAverageGrayscale(WriteableBitmap bmp)

{

    WriteableBitmap grayscale =

        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    for (int pixelIndex = 0;

        pixelIndex < bmp.Pixels.Length;

        pixelIndex++)

    {

        int pixelColor = bmp.Pixels[pixelIndex];

        byte red = (byte)((pixelColor & 0xFF0000) >> 16);

        byte green = (byte)((pixelColor & 0xFF00) >> 8);

        byte blue = (byte)(pixelColor & 0xFF);

 

        byte intensity = (byte)((red + green + blue) / 3);

 

        //notice how we repeat the intensity

        //across each color channel

        //this makes the output "color neutral"

        //i.e. gray

        grayscale.Pixels[pixelIndex] = 255 << 24

            | intensity << 16

            | intensity << 8

            | intensity;

    }

    return grayscale;

}

Easy peasy lemon squeezy.

image

This looks like mission accomplished to me.

However, we are going from over 16 million colors to 256 intensities.  Obviously we are going to lose information.  How do we know that we are keeping the best information? 

Any three numbers can represent six separate colors, but only one intensity.  the following example uses the numbers 255, 128, and 64:

image

six_colorssix_straight_gray

Simply averaging the numbers will work fine in many situations, but in this situation it breaks down. 

If we are trying to approximate the brightness a human perceives, mimicking human physiology is a better approach.  In the last lesson, the Bayer filter had twice as many green filters as red or blue because humans are more sensitive to green frequencies.   Green colors should then appear brighter than other colors right?

We can modify our code to be slightly more intelligent by weighting the average toward the green colors.

WriteableBitmap ToWeightedGrayscale(

    WriteableBitmap bmp,

    double redWeight,

    double greenWeight,

    double blueWeight)

{

    WriteableBitmap grayscale =

        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    for (int pixelIndex = 0;

        pixelIndex < bmp.Pixels.Length;

        pixelIndex++)

    {

        int pixelColor = bmp.Pixels[pixelIndex];

        byte red = (byte)((pixelColor & 0xFF0000) >> 16);

        byte green = (byte)((pixelColor & 0xFF00) >> 8);

        byte blue = (byte)(pixelColor & 0xFF);

 

        //notice that we are dividing by the sum of the weights

        //this can be avoided by ensuring the weights add to 1

        byte intensity = (byte)((red * redWeight

            + green * greenWeight

            + blue * blueWeight)

        / (redWeight + greenWeight + blueWeight));

 

        grayscale.Pixels[pixelIndex] =

            255 << 24

            | intensity << 16

            | intensity << 8

            | intensity;

    }

    return grayscale;

}

Weights that work well for human vision are red = .2, green = .7, and blue =.1 (rounded from Frequently Asked Questions about Color by Charles Poynton).

six_colors

six_weighted_gray

As you should expect, the green colors are brighter.  If you want a grayscale image to match the perceived brightness of the color image, this technique does the job.  Is it better than the simple average?  In our contrived example, yes.  In many cases, the simple average will suffice.  Here is the grayscale orangutan using both techniques.  Can you guess which is simple and which is weighted?

 

image

(hint: the bottom one is weighted)

 

Summary

 

Bottom line: there is no “right” answer.  If you’re trying to make your grayscale image “look” right, the weights used above may be a good path.  If speed is your game, you might try just using the  green channel alone.  No technique is right for every situation.  Remember, our end goal is for the phone, not a person, to understand the image.  Have fun and play around with the numbers.  Think about the advantages and disadvantages of different techniques. 

 

I highly recommend the Frequently Asked Questions about Color by Charles Poynton.  Warning: some of it is pretty math intensive.

 

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2007%20-%20Grayscale.zip

 

Up next: Intro to Histograms

follow @azzlsoft

 





Phone Vision 06 – RGB Color Intensities

21 01 2011

I enjoyed our last blog’s bit of history, but how can we make something more useful than a spinning top?  It turns out that the fundamental principle Maxwell described for color photography is still in use today.  Though the medium is different, for all intents and purposes we are still taking three pictures – one with a red filter, one with a green filter, and one with a blue filter.  Those images are then combined to form a color photograph. 

Color Intensity

When we take a picture, we are just measuring the intensity of light at a given point.  With a sensor alone we would only be capable of creating a map of intensities.  This is definitely useful in some situations, but it ignores oodles of information humans can decipher naturally – color.

Using a filter that only lets through a specific color, we  can measure the intensity of that color.  Remember, we need at least three distinct colors to create the gamut of colors our eyes can distinguish.  To capture three different colors we need three different filters.

In the tender, early years of color photography, three separate pictures had to be taken.  I can’t imagine it was easy to capture a moving target.  Some enterprising people came up with a way to split the light into the three different wavelengths and then aim those beams at three separate light sensitive plates.  Clever as it may be, it can also be expensive and bulky.  There has to be a better way.

Bayer Filter

Conceptually, this next idea is very simple.  From a manufacturing standpoint, I’m not so certain.   Regardless, most modern digital cameras use a single image sensor covered by a grid of color filters arranged in what’s called a Bayer pattern (named after its inventor Bryce Bayer).

BayerFilter

The astute observer might notice that there are twice as many green filters as there are red or blue.  Wikipedia says this is because humans have a heck of a lot of rod cells (light sensing) and they are most sensitive to green light (~498 nm if you care).

In its raw form, the output of the camera above could be thought of as 3 separate images with with missing information:

red_mosaic_question

green_mosaic_questionblue_mosaic_question

Converting this raw data into a human friendly image is a process called demosaicing.  There are lots and lots and lots of ways to accomplish this.  They range from the very simple – like treating a 4×4 group as one pixel by averaging the green values – to a little more complex – like filling in the gaps by averaging neighboring pixels – to way beyond the scope of this blog.

Artificial Bayer Filter

The images we deal with are already demosaiced.  For this lesson we want to mess around with a raw image, but we don’t have one.  With that in mind, we are just going to have to “de-demosaic” an image.  For the color [255,128,64], the “de-demosaic” transform looks something like this:

pixel-255-128-64_to_bayer

 

Notice that our width will be twice the original width and our height will be twice the original height.  This will quadruple the size of the image.  Another important piece is the original coordinate (x,y) becomes 4 separate coordinates in the new image:

image

Let’s see what we can do.

//This assumes you already have the image in

//WriteableBitmap form. Refer to earlier lessons

//if you’re not sure how to do this.

WriteableBitmap ToBayerRaw(WriteableBitmap bmp)

{

    //we need to double the size of the target bitmap

    WriteableBitmap bayerRaw =

        new WriteableBitmap(bmp.PixelWidth * 2, bmp.PixelHeight * 2);

           

    //we loop through every column, row by row.

    //the two loops are not necessary, but

    //they demonstrate the concept easier

    //than a single loop

    for (int y = 0; y < bmp.PixelHeight; y++)

    {

        for(int x = 0; x < bmp.PixelWidth; x++)

        {

            //first recover the RGB pixel data

            int pixel = bmp.Pixels[x + y * bmp.PixelWidth];

                   

            // remember that (x,y) translates to 4 coordinates

            // g1 = (2x, 2y)

            // b = (2x+1, 2y)

            // r = (2x, 2y+1)

            // g2 = (2x+1, 2y+1)

            // also note that we are using the new pixel width

            // as our offset

            // notice the masks in use as well

 

            //upper left green

            bayerRaw.Pixels

                [2*x + 2 * y * (bayerRaw.PixelWidth)]

                = (int)(pixel & 0xFF00FF00);

                   

            //upper right blue

            bayerRaw.Pixels

                [2 * x + 1 + 2 * y * (bayerRaw.PixelWidth)]

                = (int)(pixel & 0xFF0000FF);

                   

            //lower left red

            bayerRaw.Pixels

                [2*x + (2 * y + 1) * (bayerRaw.PixelWidth)]

                = (int)(pixel & 0xFFFF0000);

 

            //lower right green

            bayerRaw.Pixels

                [2*x + 1 + (2 * y + 1) * (bayerRaw.PixelWidth)]

                = (int)(pixel & 0xFF00FF00);

        }

    }

    return bayerRaw;

}

 

image

image

Since half of the pixels are green and only a quarter are red and a quarter are blue, we would expect a green tint.  Matrixesque, am I right?

Interpolation

So now we have a raw file to work with, let’s see if we can fill in some of the “gaps” we’ve created.  There are quite a few techniques to accomplish this, but we are going to ignore them and try to figure them out on our own.   Let’s start with the red pixels:

red_mosaic_question

The first thing I notice is that some of ? pixels have immediate neighbors with values so let’s tackle those first.  My gut says that averaging them would yield a decent approximation:

red_average_01

This gives us value for 8 of the missing pixels leaving 4 behind:

image

If we ran through the image one more time, we could fill in the remaining gaps.  On the second run the new top, bottom, left, and right values would determine the new value of the missing pixels:

image

Wait a minute… Top, bottom, left and right were also the average of the original pixels.  Can we use that?  Let’s label a set of original pixels w, x, y, and z.

image

That means that top, bottom, left, and right are defined as:

image

image

image

image

This simplifies rather nicely (I left out a couple steps):

image

What I like to call the average of the original four pixels.  This is intuitive for a lot of us, but it’s nice to have some math to back it up.

Putting it all Together

WriteableBitmap Demosaic(WriteableBitmap bmp)

{

    WriteableBitmap interpolatedBmp =

        new WriteableBitmap(bmp.PixelWidth, bmp.PixelHeight);

    // we are going to cheat and ignore the boundaries

    // dealing with the boundaries is not difficult,

    // but the code is long enough as it is

    for (int y = 1; y < bmp.PixelHeight – 1; y++)

    {

        for (int x = 1; x < bmp.PixelWidth – 1; x++)

        {

            //first we are going to recover the pixel neighborhood.

            //the mask at the end is simply to get rid of the alpha

            //channel

            //middlecenter is the pixel we are working with (x,y)

            int topleft =

                bmp.Pixels[x – 1 + (y – 1) * bmp.PixelWidth] & 0xFFFFFF;

            int topcenter =

                bmp.Pixels[x + (y – 1) * bmp.PixelWidth] & 0xFFFFFF;

            int topright =

                bmp.Pixels[x + 1 + (y – 1) * bmp.PixelWidth] & 0xFFFFFF;

 

            int middleleft =

                bmp.Pixels[x – 1 + y * bmp.PixelWidth] & 0xFFFFFF;

            int middlecenter =

                bmp.Pixels[x + y * bmp.PixelWidth] & 0xFFFFFF;

            int middleright =

                bmp.Pixels[x + 1 + y * bmp.PixelWidth] & 0xFFFFFF;

 

            int bottomleft =

                bmp.Pixels[x – 1 + (y + 1) * bmp.PixelWidth] & 0xFFFFFF;

            int bottomcenter =

                bmp.Pixels[x + (y + 1) * bmp.PixelWidth] & 0xFFFFFF;

            int bottomright =

                bmp.Pixels[x + 1 + (y + 1) * bmp.PixelWidth] & 0xFFFFFF;

 

            int blue = 0;

            int red = 0;

            int green = 0;

 

            // if we are on an even row and an even column

            // like (2, 2) then we are on a green pixel

            // we need to average top center and bottom

            // center for red we need to average left

            // middle and right middle for blue

            if (y % 2 == 0 && x % 2 == 0)

            {

                red = (((topcenter >> 16) +

                    (bottomcenter >> 16)) / 2) << 16;

                blue = (middleleft + middleright) / 2;

                green = middlecenter;

            }

 

            // if we are on an even row and an odd column

            // like (2, 1) then we are on a blue pixel

            // red is an average of top left, top right,

            // bottom left, and bottom right

            // green is top center, bottom center,

            // left middle, and right middle

            else if (y % 2 == 0 && x % 2 == 1)

            {

                red = (((topleft >> 16) +

                    (topright >> 16) +

                    (bottomleft >> 16) +

                    (bottomright >> 16)) / 4) << 16;

                blue = middlecenter;

                green = (((topcenter >> 8 ) +

                    (bottomcenter >> 8 ) +

                    (middleleft >> 8 ) +

                    (middleright >> 8 )) / 4) << 8;

            }

 

            // if we are on an odd row and an even column

            // like (1, 2) then we are on a red pixel

            // blue is an average of top left, top right,

            // bottom left, and bottom right

            // green is top center, bottom center,

            // left middle, and right middle

            else if (y % 2 == 1 && x % 2 == 0)

            {

                red = middlecenter;

                blue = (topleft +

                    topright +

                    bottomleft +

                    bottomright) / 4;

                green = (((topcenter >> 8 ) +

                    (bottomcenter >> 8 ) +

                    (middleleft >> 8 ) +

                    (middleright >> 8 )) / 4) << 8;

            }

 

            // otherwise we are on an odd row and odd column

            // like (1,1). this is a green pixel

            // red left middle + right middle

            // blue is top center + bottom center

            else

            {

                red = (((middleleft >> 16) +

                    (middleright >> 16)) / 2) << 16;

                blue = (topcenter + bottomcenter) / 2;

                green = middlecenter;

            }

 

            interpolatedBmp.Pixels[x + y * interpolatedBmp.PixelWidth]

                = (255 << 24) | red | green | blue;

        }

    }

    return interpolatedBmp;

}

 

With a little luck, our image should look pretty close to the original.

image

image

image

On the left you’ll find the original followed by the Bayer filter in the middle and, finally the demosaiced is on the right.  I think we did alright, eh?  This image is large, much larger than the viewing space.  If, however, you were to perform the same procedure on an image that is smaller than the viewing space, you would most likely see artifacts from the resizing.

Summary

Hopefully you learned a little about how cameras actually work and maybe a little math.  It wasn’t too hard, but trying to keep all those colors straight can make the code messy.

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2006%20-%20RGBColorIntensities.zip

Up next: Grayscale

follow @azzlsoft





Phone Vision 05 – A Brief History of Color

18 01 2011

You should have a pretty solid grasp of how to effectively manipulate pixels with images.  I guess now we should talk a little bit about color.  Why, after all, do we pick red, green, and blue as our primary colors from which to build things?

 

History

  • 1666 – Isaac Newton demonstrates white light is a combination of colors.
  • 1802 – Thomas Young postulates the existence of three types of photoreceptors in human eyes.
  • 1849 – James David Forbes experiments with rapidly spinning tops creating the illusion of a single color out of a mixture.
  • 1850 – Herman von Helmholtz classifies human color receptors as short, medium, and long corresponding to the wavelengths they are sensitive to.
  • 1861 – James Clerk Maxwell demonstrates a color photographic process using red, green, and blue light.

It turns out that the long, medium, and short photoreceptors (cones) in our eyes roughly correspond to red, green, and blue, respectively.  When emitting light, these three colors can be combined in various ways to produce all of the colors most humans are capable of seeing.  There are several color spaces we can use to generate human-friendly colors, but today we will focus on RGB.

 

The Forbes Top
(I made this name up)

Clearly, techniques for capturing and displaying color have been refined over the last few centuries.   We now enjoy color televisions, LCD screens, web cams, and so-called ‘retina displays’ to name a few.  Today, we are going to take a step back and try to simulate the rapidly spinning tops that James Forbes first experimented with.  The results will be somewhat crude, but they should demonstrate the concepts.

The most important component of this application is actually the design of the colorized circle that will represent our top.  I tried several different approaches before settling on 36 different slices (12 per color) with the colors alternating.  Coupling this with an adjustable rotation, an illusion of color can be established.

Here is the final product with red = 0, blue = 255, and green = 128:

image

 

My attempts at a screen capture video failed.  You’ll just have to download the code and try it yourself.  The only tricky component is actually drawing the wheel so I will walk through that piece.  As always, the rest of the code will be downloadable below.

 

Draw a Pie

private void DrawPie()

{

    //drawing board is just a canvas

    //let’s start with a clean slate

    DrawingBoard.Children.Clear();

 

 

    //red, green, and blue are member

    //variables adjusted with sliders

    double colorsum = red + green + blue;

    // a circle has 360 degrees or 2*Pi radians

    // we need to divide that into the different color

    // components.

    double redAngle = 2* Math.PI * (red / colorsum);

    double greenAngle = 2* Math.PI * (green / colorsum);

    double blueAngle = 2* Math.PI * (blue / colorsum);

 

    // this part is tricky

    // I am essentially doing a gamma function

    // to try to keep the intensities close

    // I eyeballed it so it’s not 100% accurate

    byte intensity =

        (byte)Math.Min(

        Math.Pow((red * .33

            + green * .34

            + blue * .33), 1.2), 255);

 

 

    Point start = new Point(0, 100);

    Random random = new Random();

    // we are going to divide our top

    // into 36 slices — 12 per color

    const int numSlices = 36;

    for (int i = 0; i < numSlices; i++)

    {

        SolidColorBrush brush = new SolidColorBrush(Colors.Black);

        double angle = 0;

        switch(i % 3)

        {

            case 0: //red

                angle = redAngle / (numSlices / 3);

                brush = new SolidColorBrush(

                    Color.FromArgb(255, intensity, 0, 0));

                break;

            case 1: //green

                angle = greenAngle / (numSlices / 3);

                brush = new SolidColorBrush(

                    Color.FromArgb(255, 0, intensity, 0));

                break;

            case 2: //blue

                angle = blueAngle / (numSlices / 3);

                brush = new SolidColorBrush(

                    Color.FromArgb(255, 0, 0, intensity));

                break;

        }

 

 

        // this code was essentially lifted from the Silverlight forums

        // http://forums.silverlight.net/forums/t/9013.aspx

        // Yi-Lun Luo posted it as a method for drawing pie graphs

        // I modified it for the color top

        // I will warn you that if your angle is > 90 degrees

        // you will most likely run into trouble

        Path path = new Path();

        PathGeometry geometry = new PathGeometry();

        PathFigure figure = new PathFigure();

        // The center point is 0,0.

        figure.StartPoint = new Point(0, 0);

        LineSegment line1 = new LineSegment();

        // Draw a line from the center of the circle to the start

        // point of the sector’s arc.

        line1.Point = start;

        figure.Segments = new PathSegmentCollection();

        figure.Segments.Add(line1);

        ArcSegment arc = new ArcSegment();

        arc.Size = new Size(100, 100);

        arc.SweepDirection = SweepDirection.Clockwise;

        Point p = start;

        //Perform a rotate on the start point of the arc

        // to compute the end point.

        p.X = start.X * Math.Cos(angle) – start.Y * Math.Sin(angle);

        p.Y = start.X * Math.Sin(angle) + start.Y * Math.Cos(angle);

        //The next sector’s arc will begin from this one’s end point.

        start = p;

        arc.Point = p;

        figure.Segments.Add(arc);

        LineSegment line2 = new LineSegment();

        // Draw another line from the sector’s end point

        // to the center of the circle.

        line2.Point = new Point(0, 0);

        figure.Segments.Add(line2);

        geometry.Figures = new PathFigureCollection();

        geometry.Figures.Add(figure);

        path.Data = geometry;

        // Assign a random color to this sector.

        path.Fill = brush;

        DrawingBoard.Children.Add(path);

        // Since our circle ranges from -100,-100 to 100,100,

        // we’ll set the Path’s Left and Top to show all the graph.

        path.SetValue(Canvas.LeftProperty, 100.0);

        path.SetValue(Canvas.TopProperty, 100.0);

    }

}

 

 

Summary

 

The most useful piece from this lesson might be the code for drawing a pie graph, but a bit of history is always fun.  The bottom line is that RGB comes from biology.  Our eyes are sensitive to those colors, so if we use them to mix we should be able to create images that mimic real life.

 

Download Code

http://cid-88e82fb27d609ced.office.live.com/embedicon.aspx/Blog%20Files/PhoneVision/PhoneVision%2005%20-%20HistoryOfColor.zip

 

Up Next: RGB Color Intensities