By Liviu Tudor
Since the early days of the Java programming language, some have argued that Java, because it runs on a virtual machine rather than compiling directly to native code, is inferior to the likes of C and C++ in terms of performance. Of course, C++ devotees would never, ever even consider Java a “proper” language, while the Java crowd always throws “write once, run everywhere” in the faces of the C++ programmers.
First things first, how well does Java perform when it comes to basic integer arithmetic? If I asked you how much 2 x 3 is, you would probably answer in no time. How long would it take a program? To check, here’s a basic test:
- Generate X random integers.
- Multiply each of those numbers by every number from 2 to Y.
- Compute how long it takes to perform the whole set.
Because you’re not interested in how long it takes to generate a random number, it is important that you generate the random numbers upfront, before you start measuring.
Putting Java to the Test
In Java, generating the random numbers is a very simple task:
private void generateRandoms()
{
    randoms = new int[N_GENERATED];
    for( int i = 0; i < N_GENERATED; i++ )
    {
        randoms[i] = (int)(i * Math.random());
    }
}
The computations are just as easy:
private void javaCompute()
{
    int result = 0;
    for( int i = 2; i < N_MULTIPLY; i++ )
    {
        for( int j = 0; j < N_GENERATED; j++ )
        {
            result = randoms[j] * i;
            result++;
        }
    }
}
Now, you could simply measure the time it takes to execute the javaCompute method shown above and call that a valid test. However, that wouldn’t be fair, for a few reasons. First, during the execution of a program, the JVM loads classes on demand via its classloaders. The OS, too, prepares various data structures as the JVM requests them. So chances are the first execution of the function above spends a lot of its time just preparing for execution. That's why a one-off measurement is probably not a good idea.
You could instead run the program a few times and average the results. However, the first execution of the function will still suffer from the same start-up costs, which will skew the average. On the other hand, the more you execute the method above, the greater the chances that some data will get cached—some by the JVM, some by the OS. As a result, you will end up with timings that reflect not so much the raw execution time as the effects of JVM- and OS-level caching and optimization, which again will distort your average.
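Another common mitigation, not used in the harness below but worth noting, is to run a few untimed warm-up iterations so that class loading and JIT compilation happen before measurement starts. A minimal sketch of the idea (the class name, the N_WARMUP constant, and the busyWork placeholder are all hypothetical, standing in for the article's javaCompute):

```java
// Hypothetical sketch: run untimed warm-up passes before measuring,
// so class loading and JIT compilation happen outside the timed region.
public class WarmupSketch
{
    static final int N_WARMUP = 5; // assumed warm-up count, not from the article

    // Placeholder workload standing in for javaCompute().
    static int busyWork()
    {
        int result = 0;
        for( int i = 0; i < 1_000_000; i++ )
            result += i * 3;
        return result;
    }

    public static void main(String[] args)
    {
        for( int i = 0; i < N_WARMUP; i++ )
            busyWork(); // results discarded; we only want the JIT to kick in

        long start = System.currentTimeMillis();
        int result = busyWork();
        long end = System.currentTimeMillis();
        System.out.println( "Warmed-up run took " + (end - start) + "ms, result=" + result );
    }
}
```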
To counterbalance these problems, you can run the method above a few times and simply eliminate the “freakiest” cases: the lowest and the highest times. That should get you closer to the true average. And assuming you store the length of each time you ran the javaCompute method above in an array, here’s a simple method to achieve this:
private static long testTime( long diffs[] )
{
    long shortest = Long.MAX_VALUE;
    long longest = 0;
    long total = 0;
    for( int i = 0; i < diffs.length; i++ )
    {
        if( shortest > diffs[i] )
            shortest = diffs[i];
        if( longest < diffs[i] )
            longest = diffs[i];
        total += diffs[i];
    }
    total -= shortest;
    total -= longest;
    total /= ( diffs.length - 2 );
    return total;
}
Now, putting all of this together, you get the following code (also found in IntMaths.java):
public static void main(String[] args)
{
    IntMaths maths = new IntMaths();
    maths.generateRandoms();
    //compute in Java
    long timeJava[] = new long[N_ITERATIONS];
    long start, end;
    for( int i = 0; i < N_ITERATIONS; i++ )
    {
        start = System.currentTimeMillis();
        maths.javaCompute();
        end = System.currentTimeMillis();
        timeJava[i] = (end - start);
    }
    System.out.println( "Java computing took " + testTime(timeJava) );
}
Notice that you’re measuring time in milliseconds here.
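Millisecond resolution is fairly coarse for fast operations. If you need finer granularity, System.nanoTime() (available since Java 5) provides a higher-resolution timer intended exactly for measuring elapsed time. A minimal sketch of the same timing loop using it (the NanoTiming class and timeOnce helper are illustrative, not part of the article's code):

```java
// Sketch: the timing loop rewritten around System.nanoTime().
public class NanoTiming
{
    static final int N_ITERATIONS = 10; // plays the same role as in the article

    // Times a single run of the given workload, in milliseconds.
    static long timeOnce( Runnable work )
    {
        long start = System.nanoTime();
        work.run();
        long end = System.nanoTime();
        return (end - start) / 1_000_000; // nanoseconds -> milliseconds
    }

    public static void main(String[] args)
    {
        long total = 0;
        for( int i = 0; i < N_ITERATIONS; i++ )
            total += timeOnce( () -> { /* javaCompute() would go here */ } );
        System.out.println( "Total: " + total + "ms" );
    }
}
```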
On my laptop (which is not quite high-spec but new enough to be representative), running the above code shows an average of about 25 milliseconds. That is, it takes my laptop about 25 milliseconds to multiply 10,000 random integers by each number from 2 to 1,000—roughly 10,000,000 integer arithmetic operations in 25ms.
Putting C++ to the Test
Now let’s try something similar in C++:
void generate_randoms( int randoms[] )
{
    for( int i = 0; i < N_GENERATED; i++ )
        randoms[i] = rand();
}
Note that the numbers generated in Java will not be the same numbers generated in C++, so the two implementations will not operate on identical inputs. However, the effect on the timings should be quite small.
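If you want reproducible inputs on the Java side, you can seed the generator explicitly: java.util.Random constructed with a fixed seed produces the same sequence on every run. A small sketch (the class name and the N_GENERATED constant mirror the article; the seed value is an arbitrary choice):

```java
import java.util.Random;

// Sketch: generate a reproducible set of random ints by fixing the seed.
public class SeededRandoms
{
    static final int N_GENERATED = 10000; // matches the article's input size

    static int[] generateRandoms( long seed )
    {
        Random rnd = new Random( seed ); // same seed -> same sequence every run
        int[] randoms = new int[N_GENERATED];
        for( int i = 0; i < N_GENERATED; i++ )
            randoms[i] = rnd.nextInt();
        return randoms;
    }
}
```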
The C++ computation is similar to the Java one:
void nativeCompute( int randoms[] )
{
    int result = 0;
    for( int i = 2; i < N_MULTIPLY; i++ )
    {
        for( int j = 0; j < N_GENERATED; j++ )
        {
            result = randoms[j] * i;
            result++;
        }
    }
}
Because standard C++ (at the time of writing) doesn't offer a portable way to time operations with millisecond precision, this code is specifically targeted at Windows, where the QueryPerformanceCounter function provides access to high-resolution timers. The code wraps it in a simple “stopwatch” class with just two methods (Start and Stop): the class records when each method is called and returns the difference in milliseconds:
// Stopwatch class.
class CStopWatch
{
public:
    // Constructor.
    CStopWatch()
    {
        // Ticks per second.
        QueryPerformanceFrequency( &liPerfFreq );
    }
    // Start counter.
    void Start()
    {
        liStart.QuadPart = 0;
        QueryPerformanceCounter( &liStart );
    }
    // Stop counter.
    void Stop()
    {
        liEnd.QuadPart = 0;
        QueryPerformanceCounter( &liEnd );
    }
    // Get duration in milliseconds.
    long double GetDuration()
    {
        return ( (liEnd.QuadPart - liStart.QuadPart)
                 / (long double)liPerfFreq.QuadPart ) * 1000;
    }
private:
    LARGE_INTEGER liStart;
    LARGE_INTEGER liEnd;
    LARGE_INTEGER liPerfFreq;
};
You can apply a similar pattern for finding the average of the times the stopwatch class took:
long double test_times( long double diffs[] )
{
    long double shortest = diffs[0];
    long double longest = diffs[0];
    long double total = 0;
    for( int i = 0; i < N_ITERATIONS; i++ )
    {
        if( shortest > diffs[i] )
            shortest = diffs[i];
        if( longest < diffs[i] ) // no "else": a value can update both on the same pass
            longest = diffs[i];
        total += diffs[i];
    }
    total -= shortest;
    total -= longest;
    total /= ( N_ITERATIONS - 2 );
    return total;
}
This pattern leaves you with the following implementation (present in IntMaths.c):
int main(int argc, char* argv[])
{
    int randoms[N_GENERATED];
    generate_randoms( randoms );
    CStopWatch watch;
    long double timeNative[N_ITERATIONS];
    for( int i = 0; i < N_ITERATIONS; i++ )
    {
        watch.Start();
        nativeCompute( randoms );
        watch.Stop();
        timeNative[i] = watch.GetDuration();
    }
    printf( "C computing took %Lf\n", test_times(timeNative) ); // %Lf for long double
    return 0;
}
Find the above code in the IntMaths project included in the JavaVsCPP.zip code download.
Running the above code on my laptop returns the following (the measurement is in milliseconds):
C computing took 0.001427
So, it took the C++ code about 1/1000th of a millisecond to perform about 10,000,000 arithmetic operations!
To compare: 25ms in Java, 0.001ms in C++. That's quite a difference! However, bear in mind that the C++ code was compiled with full compiler and linker optimization for speed. Simply disabling the optimization returns the following result:
C computing took 70.179901
Ouch! That is almost three times slower than the Java version! The moral of the story is: yes, C++ performs better at first glance, but (and this is a big but) only if the compiler optimizes the code well!
Most C++ compilers nowadays will perform a decent optimization of the generated code. However, the difference between 1/1000th of a millisecond and 70 milliseconds is left in the hands of the compiler. Always bear that in mind when you switch to C++ for speed reasons!
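One likely explanation for the suspiciously low optimized figure—though the article does not verify it—is dead-code elimination: since nativeCompute never uses result, an optimizing compiler is free to delete the entire loop, so the 0.001ms run may effectively be timing an empty function. The same hazard applies to the JIT on the Java side. A common guard is to accumulate the results and return them, keeping the work observable. A sketch in Java (the ComputeWithSink class and the compute signature are hypothetical, not from the article's code):

```java
// Sketch: keep the computed values "live" so neither the JIT nor an
// optimizing compiler can delete the loop as dead code.
public class ComputeWithSink
{
    static final int N_MULTIPLY = 1000; // matches the article's parameter

    static long compute( int[] randoms )
    {
        long sink = 0; // accumulate so every multiplication feeds the result
        for( int i = 2; i < N_MULTIPLY; i++ )
            for( int j = 0; j < randoms.length; j++ )
                sink += randoms[j] * i;
        return sink; // returning the value forces the work to actually happen
    }
}
```

Any benchmark comparing the two languages should apply the same trick on both sides, so that both measure real work rather than the optimizer's ability to remove it.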