C# Performance

24 03 2011

I have been working with C# for nearly 8 years and have never lost sleep over performance. Sure, there are times when things run slowly, but the bottleneck is usually a database query or a web service call. I have never really needed to look into C# performance because it was always fast enough.

A couple weeks ago, Danny Tuppeny wrote “Why I’m Close to Giving Up on Windows Phone 7, as a User and a Developer”.  He raised some pretty valid points, but of course this was a call to arms for the “Anything But Microsoft” crowd.  A side thread about C# performance caught my eye.  Here are a couple of the comments:

  • “My favorite feature of .Net, in general, is sluggish performance, but C# is the best language to write sluggish software in by far.”
  • “ObjC performance is also usually much better than even unsafe C# code, since it is essentially C.”

As I said before, I have never had reason to consider C# “sluggish”, but I hadn’t really looked into its performance either.  Let’s cut to the chase.

The Results

These are the results (in seconds) for a naïve nth prime finder.  11, 101, 1001, 10001, and 100001 are the “n’s” in nth.  31, 547, 7927, 104743, and 1299721 are the actual values for the nth prime.

Language  Version  Trial         11    101   1001    10001    100001
nth prime                        31    547   7927   104743   1299721
C++       1        1          0.000  0.000  0.017    2.101   263.429
                   2          0.000  0.000  0.021    2.100   263.472
                   3          0.000  0.000  0.016    2.103   264.520
                   4          0.000  0.000  0.016    2.109   263.482
                   5          0.000  0.000  0.015    2.098   265.413
                   Average    0.000  0.000  0.017    2.102   264.063
C#        1        1          0.001  0.001  0.017    2.172   264.570
                   2          0.001  0.001  0.017    2.133   264.378
                   3          0.001  0.001  0.017    2.114   264.246
                   4          0.001  0.001  0.016    2.133   264.401
                   5          0.001  0.001  0.017    2.128   264.654
                   Average    0.001  0.001  0.017    2.136   264.450

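The difference is negligible: at n = 100001 the averages are 264.063 seconds for C++ and 264.450 seconds for C#, a gap of about 0.15%.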

The Code

I chose the nth prime problem for a couple of reasons.  The code can be written nearly identically in C++ and C#, which removes any ambiguity about language semantics.  It also has a wide range of optimizations that can be applied.  Version 1 is an extremely naïve (i.e. not optimized) approach.

C++

#include "stdafx.h"
#include <time.h>
#include <iostream>

const int maxPrimeIndex = 100001;

int _tmain(int argc, _TCHAR* argv[])
{
    clock_t start, finish;
    start = clock();

    int currentPrime = 2;   // the 1st prime
    int primeCount = 1;

    while (primeCount < maxPrimeIndex)
    {
        bool isPrime = false;
        int candidate = currentPrime + 1;
        while (!isPrime)
        {
            isPrime = true;
            // Naive trial division: try every factor below the candidate.
            for (int factor = 2; factor < candidate; factor++)
            {
                if (candidate % factor == 0)
                {
                    isPrime = false;
                    break;
                }
            }
            if (!isPrime)
            {
                candidate++;
            }
        }
        currentPrime = candidate;
        primeCount++;
    }

    finish = clock();
    double elapsed = ((double)(finish - start)) / CLOCKS_PER_SEC;
    std::cout << elapsed << std::endl << currentPrime << std::endl;
    return 0;
}

C#

using System;

namespace SpeedTest
{
    class Program
    {
        const int maxPrimeIndex = 100001;

        static void Main(string[] args)
        {
            DateTime start, finish;
            start = DateTime.Now;

            int currentPrime = 2;   // the 1st prime
            int primeCount = 1;

            while (primeCount < maxPrimeIndex)
            {
                bool isPrime = false;
                int candidate = currentPrime + 1;
                while (!isPrime)
                {
                    isPrime = true;
                    // Naive trial division: try every factor below the candidate.
                    for (int factor = 2; factor < candidate; factor++)
                    {
                        if (candidate % factor == 0)
                        {
                            isPrime = false;
                            break;
                        }
                    }
                    if (!isPrime)
                    {
                        candidate++;
                    }
                }
                currentPrime = candidate;
                primeCount++;
            }

            finish = DateTime.Now;
            TimeSpan elapsed = finish - start;
            Console.WriteLine(elapsed.ToString());
            Console.WriteLine(currentPrime);
        }
    }
}
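The tables report five trials per n plus an average, but the post does not show how the runs were repeated. The following is only a sketch of one way to collect such numbers in C++. The names nthPrimeNaive, timeTrials, and average are made up for illustration, and it uses std::chrono::steady_clock rather than the clock() call in the listing above, so it is not the harness that produced the published figures.

```cpp
#include <cassert>
#include <chrono>
#include <numeric>
#include <vector>

// Naive nth-prime search, the same algorithm as Version 1 (hypothetical name).
int nthPrimeNaive(int n)
{
    assert(n >= 1);
    int currentPrime = 2;
    for (int primeCount = 1; primeCount < n; primeCount++)
    {
        int candidate = currentPrime + 1;
        bool isPrime = false;
        while (!isPrime)
        {
            isPrime = true;
            for (int factor = 2; factor < candidate; factor++)
            {
                if (candidate % factor == 0)
                {
                    isPrime = false;
                    break;
                }
            }
            if (!isPrime)
            {
                candidate++;
            }
        }
        currentPrime = candidate;
    }
    return currentPrime;
}

// Runs the search `trials` times and returns each elapsed time in seconds.
std::vector<double> timeTrials(int n, int trials)
{
    std::vector<double> seconds;
    volatile int sink = 0; // assumption: a volatile store keeps the work from being optimized away
    for (int t = 0; t < trials; t++)
    {
        auto start = std::chrono::steady_clock::now();
        sink = nthPrimeNaive(n);
        auto finish = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(finish - start).count());
    }
    return seconds;
}

// Arithmetic mean, as in the "Average" rows of the tables.
double average(const std::vector<double>& values)
{
    assert(!values.empty());
    return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
}
```

Calling average(timeTrials(10001, 5)) would give one number comparable to an "Average" cell; the absolute values will of course depend on the machine and compiler settings.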

Why?

Why are the results so similar?  Because both C++ and C# are compiled.  When comparing these programs we are really comparing their compilers.  This program is pretty straightforward (no function calls, memory allocation, array bounds checking, etc.), so the compiler optimizations are probably pretty similar.  I used the Microsoft C++ compiler, but I also tested it with g++ and the difference was negligible.

If you really need to see the C++ compiler “beat” the C# compiler, change candidate from an int to a long.  The C++ code will run a little more than twice as fast.  My guess is that the % operator is much less efficient on longs in C#, but that’s just a guess.
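One thing worth checking before reading too much into that gap: a C# long is always 64 bits, while the width of a C++ long depends on the platform. Microsoft's C++ compiler keeps long at 32 bits even in 64-bit builds, whereas g++ on 64-bit Linux makes it 64 bits. The post does not say which C++ type was used, so this may or may not explain the difference. The sketch below templates the search on the integer type so the widths can be pinned explicitly with std::int32_t and std::int64_t; the name nthPrime is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same naive search as Version 1, parameterized on the integer type used
// for the candidate and the factor (hypothetical helper, not from the post).
template <typename Int>
Int nthPrime(int n)
{
    assert(n >= 1);
    Int currentPrime = 2;
    for (int primeCount = 1; primeCount < n; primeCount++)
    {
        Int candidate = currentPrime + 1;
        bool isPrime = false;
        while (!isPrime)
        {
            isPrime = true;
            for (Int factor = 2; factor < candidate; factor++)
            {
                // With Int = std::int64_t this is a 64-bit remainder,
                // which is what C#'s long would use.
                if (candidate % factor == 0)
                {
                    isPrime = false;
                    break;
                }
            }
            if (!isPrime)
            {
                candidate++;
            }
        }
        currentPrime = candidate;
    }
    return currentPrime;
}
```

Timing nthPrime<std::int32_t>(n) against nthPrime<std::int64_t>(n) on the C++ side, and int against long on the C# side, would compare like with like.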

C# is typically JIT compiled, but it doesn’t have to be.  You can use a tool called ngen to compile the image before running it.  You should have a really good reason before doing this, though, because it can cause headaches when managing updates, and in many cases the results will not be dramatic.

Optimization

As I said before, this code was intentionally not optimized.  We are going to apply a very simple but very effective optimization by only checking possible factors up to (and including) the square root of the candidate.  Here are the results:

Language  Version  Trial         11    101   1001    10001    100001
nth prime                        31    547   7927   104743   1299721
C++       2        1          0.000  0.000  0.000    0.016     0.483
                   2          0.000  0.000  0.000    0.017     0.479
                   3          0.000  0.000  0.000    0.015     0.472
                   4          0.000  0.000  0.000    0.015     0.468
                   5          0.000  0.000  0.000    0.014     0.481
                   Average    0.000  0.000  0.000    0.015     0.477
C#        2        1          0.001  0.001  0.002    0.016     0.484
                   2          0.001  0.001  0.001    0.018     0.474
                   3          0.001  0.001  0.001    0.016     0.470
                   4          0.001  0.001  0.002    0.018     0.473
                   5          0.001  0.001  0.002    0.019     0.470
                   Average    0.001  0.001  0.002    0.017     0.474
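The Version 2 source is not included in the post, so the following is only a sketch of what the square-root cutoff could look like in C++. The names isPrimeSqrt and nthPrimeSqrt are hypothetical, and pulling the check into a helper function is a choice made here for readability, not something the original code is known to do.

```cpp
#include <cassert>

// Trial division that stops at the square root of the candidate.
// "factor <= candidate / factor" is equivalent to "factor * factor <= candidate"
// for positive ints, but cannot overflow.
bool isPrimeSqrt(int candidate)
{
    if (candidate < 2)
    {
        return false;
    }
    for (int factor = 2; factor <= candidate / factor; factor++)
    {
        if (candidate % factor == 0)
        {
            return false;
        }
    }
    return true;
}

// Returns the nth prime (1-based) using the bounded check above.
int nthPrimeSqrt(int n)
{
    assert(n >= 1);
    int currentPrime = 2;
    for (int primeCount = 1; primeCount < n; primeCount++)
    {
        int candidate = currentPrime + 1;
        while (!isPrimeSqrt(candidate))
        {
            candidate++;
        }
        currentPrime = candidate;
    }
    return currentPrime;
}
```

The cutoff works because any composite number has at least one factor no larger than its square root, so a candidate that survives every check up to that point must be prime. For a candidate near 1299721 that means about 1,140 divisions at most instead of roughly 1.3 million.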

With one simple optimization, the run time for the 100001st prime drops from about 264 seconds to under half a second, and the savings grow as n increases.  This raises another question: how fast is fast enough?  For the 100001st it was definitely worth our while.  For the 10001st we save a couple of seconds.  For the 1001st and below, it took us more time to write this simple optimization than we’ll ever save.  Context is important.

Conclusion

The point is that people who make blanket statements about performance often have no idea what they are talking about.  Good programmers take the time to understand bottlenecks.  Their gut reaction isn’t “throw more hardware at it” or “should have written it in C”.  A great compiler isn’t going to save your application from a bad coder.

If you disagree, let me know.  Maybe you’re curious how JavaScript or Python compares?  I’d be happy to oblige.  You can find me on Twitter (@azzlsoft) or email (rich@azzlsoft.com).