Friday, January 27, 2012

Scope of variables in Java, nulling and why this might be important

While playing with some performance improvements this week, I came across this article. Weird you'd say, but let me show my version of the test scenario.

First of all, I am using Java 1.6.0_24 HotSpot 64-bit and executing the code with the following options (i.e. limiting the heap size to 64Mb):


java -Xms64m -Xmx64m MemTest

Here is the first version of the code:


public class MemTest {
 public static int avg(int[] ns) {
  long sum = 0;
  for (int i = 0; i < ns.length; i++) sum += ns[i];
  return (int) (sum/ns.length);
 }

 public static void main(String[] args) {
  int[] nums = null;

  for (int j = 0; j < 1000000; j++) {
   System.out.print("Iteration: ");
   System.out.println(j);

   int sz = 2<<(j % 30);
   if (sz > 11162611) sz = 11162611;

   System.out.print("Dynamic size: ");
   System.out.println(sz);

   nums = new int[sz];
   for (int i = 0; i < nums.length; i++) nums[i] = i;

   int avr = avg(nums);
   System.out.print("Average: ");
   System.out.println(avr);
  }
 }
}

There is nothing complicated in this code, we allocate an array (int[] nums) outside the main "for" loop (this is the key point of this post by the way) and calculate the average. Everything works fine so far. Now let's remove the "for" loop responsible for assignments:


public class MemTest {
 public static int avg(int[] ns) {
  long sum = 0;
  for (int i = 0; i < ns.length; i++) sum += ns[i];
  return (int) (sum/ns.length);
 }

 public static void main(String[] args) {
  int[] nums = null;

  for (int j = 0; j < 1000000; j++) {
   System.out.print("Iteration: ");
   System.out.println(j);

   int sz = 2<<(j % 30);
   if (sz > 11162611) sz = 11162611;

   System.out.print("Dynamic size: ");
   System.out.println(sz);

   nums = new int[sz];
   int avr = avg(nums);
   System.out.print("Average: ");
   System.out.println(avr);
  }
 }
}

Now the execution fails with the OutOfMemoryError. Odd, isn't it? What do we typically do in such cases? Increase the heap size? Well, but the previous version was working fine with the same heap size. Ok, now let's do something which isn't considered as a good practise by adding the line "nums = null;":


public class MemTest {
 public static int avg(int[] ns) {
  long sum = 0;
  for (int i = 0; i < ns.length; i++) sum += ns[i];
  return (int) (sum/ns.length);
 }

 public static void main(String[] args) {
  int[] nums = null;

  for (int j = 0; j < 1000000; j++) {
   System.out.print("Iteration: ");
   System.out.println(j);

   int sz = 2<<(j % 30);
   if (sz > 11162611) sz = 11162611;

   System.out.print("Dynamic size: ");
   System.out.println(sz);

   nums = new int[sz];
   int avr = avg(nums);
   nums = null; // <<-- HERE IS THE NEW LINE

   System.out.print("Average: ");
   System.out.println(avr);
  }
 }
}

Hey, fantastic, it works again! And now let's do it properly by changing the scope of the "int[] nums":


public class MemTest {
 public static int avg(int[] ns) {
  long sum = 0;
  for (int i = 0; i < ns.length; i++) sum += ns[i];
  return (int) (sum/ns.length);
 }

 public static void main(String[] args) {
  for (int j = 0; j < 1000000; j++) {
   System.out.print("Iteration: ");
   System.out.println(j);

   int sz = 2<<(j % 30);
   if (sz > 11162611) sz = 11162611;

   System.out.print("Dynamic size: ");
   System.out.println(sz);

   int[] nums = new int[sz]; // <<-- LOCAL SCOPE
   int avr = avg(nums);

   System.out.print("Average: ");
   System.out.println(avr);
  }
 }
}

And it works again.

So, why is this happening? I have no idea. It could be a particularity of the JVM implementation, as the author of the article, I mentioned at the top, suggests. But, obviously we can draw a conclusion, variable scope is something worth considering (and probably coding standards aren't always just "cosmetic" features).