Monday, December 27, 2010

Re Java final keyword, immutability and reflection

A few days ago, in a conversation (in the "Sun Certified Java Programmer" group at http://www.linkedin.com/) about the immutability pattern and Java reflection, somebody came up with the following code example (slightly adjusted by me to highlight the order in which class fields are initialized), which I found quite interesting and decided to post.

So, execute the following code:
import java.lang.reflect.Field; 

class Test {
 public Test() {
  doIt(); // invoked before the subclass initializers run
 }
 public void doIt() {}
}

public class TestReflection extends Test { 
 private final String name = "Immutable"; // compile-time constant
 private String n = "n";
 public TestReflection() { } 

 public void doIt() {
  System.out.println("1 = " + name); 
  System.out.println(n); 
  System.out.println("-----");
 }

 public static void main(String[] args) throws Exception { 
  TestReflection abc = new TestReflection();
  Class<?> c1 = Class.forName("TestReflection");
  Field field2 = c1.getDeclaredField("name"); 
  field2.setAccessible(true); 

  System.out.println("2 = " + abc.name); 
  field2.set(abc, "Mutable"); 
  System.out.println("3 = " + abc.name); 
 } 
}

Output: 
1 = Immutable
null
-----
2 = Immutable
3 = Immutable

Now execute the following version of the code:
import java.lang.reflect.Field;

class Test {
 public Test() {
  doIt(); // invoked before the subclass constructor body runs
 }
 public void doIt() {}
}

public class TestReflection extends Test { 
 private final String name; // no longer a compile-time constant
 private String n = "n";

 public TestReflection() { 
  name = "Immutable"; 
 } 

 public void doIt() {
  System.out.println("1 = " + name); 
  System.out.println(n); 
  System.out.println("-----");
 }

 public static void main(String[] args) throws Exception { 
  TestReflection abc = new TestReflection();
  Class<?> c1 = Class.forName("TestReflection"); 
  Field field2 = c1.getDeclaredField("name"); 

  field2.setAccessible(true); 
  System.out.println("2 = " + abc.name); 
  field2.set(abc, "Mutable"); 
  System.out.println("3 = " + abc.name); 
 } 
}

Output: 
1 = null
null
-----
2 = Immutable
3 = Mutable

The explanation for this behaviour can be found in the JLS (http://java.sun.com/docs/books/jls/third_edition/html/j3TOC.html), section 17.5.3:

"Even then, there are a number of complications. If a final field is initialized to a compile-time constant in the field declaration, changes to the final field may not be observed, since uses of that final field are replaced at compile time with the compile-time constant."

In both versions doIt() is invoked from the superclass constructor, before TestReflection's own initializers have run, which is why n is still null at that point. In the first version name is a compile-time constant, so every read of it (inside doIt() as well as in main()) is replaced by the compiler with "Immutable", and even the successful reflective write stays invisible. In the second version name is assigned in the constructor, so it is not a compile-time constant: doIt() sees its default value null, and the reflective write is observed ("3 = Mutable").

Wednesday, December 22, 2010

Java 32bits vs. 64bits

I was recently playing, in Java, with a recursive method that builds all the possible routes through a set of nodes (with some predefined rules for linking each pair of nodes). Rather than persisting the calculated routes to a file, I keep them in a HashSet.

With about 138 nodes, this recursive method generates more than 5 million routes. It is quite an intensive calculation that consumes (and this is the point)
- about 1 GB of RAM with the 32-bit JVM and
- about 2 GB with the 64-bit JVM.

Execution time for building the routes is about 17 seconds with the 32-bit JVM and about 21 seconds with the 64-bit JVM.
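
By the way, a crude way to reproduce the memory part of this measurement is to ask the Runtime how much heap is in use after building the data structure. Below is a minimal sketch (the generated strings are only a hypothetical stand-in for my real route builder); run it with the same, sufficiently large -Xmx on the 32-bit and the 64-bit JVM and compare:

import java.util.HashSet;
import java.util.Set;

public class FootprintCheck {
 public static void main(String[] args) {
  Set<String> routes = new HashSet<String>();
  // Hypothetical stand-in for the recursive route builder described above
  for (int i = 0; i < 5000000; i++) {
   routes.add("route-" + i);
  }
  System.gc(); // request a GC so "used" is closer to the live data set
  Runtime rt = Runtime.getRuntime();
  long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
  System.out.println(routes.size() + " routes, approx. " + usedMb + " MB used");
 }
}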

The obvious question is "what the heck?". Apparently, there are two parts to the answer:
1. 64-bit systems (correctly) use 64-bit pointers (please allow me this C/C++ term) to access the process address space, allowing a process to use/access much more (virtual) memory (i.e. 4 GB with 32 bits vs. 16 TB with 64 bits). But each pointer now takes twice the space: every object reference (and part of every object header) grows from 4 to 8 bytes, so a reference-heavy structure such as a HashSet of millions of routes needs considerably more memory. So 1 GB vs. 2 GB of RAM isn't really a JVM problem.

Conclusion: if at some point you decide to move to a 64-bit JVM, don't forget to adjust the heap size (e.g. -Xmx) accordingly, in order to avoid unpleasant surprises.

2. The problem is also related to the way memory chips (DDR SDRAM) work. Without diving into details (more of them here: http://lwn.net/Articles/250967/), accessing physical memory is a very slow operation, so the CPU tends to keep frequently accessed data in its cache memory (L1, L2 and/or L3, which are far faster, though more expensive, memory chips). Modern compilers also try to help developers from this point of view, as long as some well-known rules are followed (e.g. use arrays, the stack is "cache-able", don't use "volatile" unless you really know what it does, etc.). But larger pointers need different memory alignment, and fewer of them fit into a cache line, so 64-bit pointers aren't as cache efficient as 32-bit ones (think of an array of references). As a result, the extra accesses to physical RAM (i.e. CPU cache misses) explain the 4-second difference.
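
The cache effect is easy to see even in pure Java. The following minimal sketch (my illustration of sequential vs. strided access, not a 32-vs-64-bit benchmark) sums a large 2D array twice: row by row, which walks memory sequentially, and column by column, which misses the cache on almost every access, although both loops perform exactly the same additions.

public class CacheDemo {
 public static void main(String[] args) {
  final int N = 2048;
  int[][] data = new int[N][N];

  long t0 = System.nanoTime();
  long sum = 0;
  for (int i = 0; i < N; i++)  // row-major: sequential, cache-friendly access
   for (int j = 0; j < N; j++)
    sum += data[i][j];

  long t1 = System.nanoTime();
  for (int j = 0; j < N; j++)  // column-major: strided, cache-hostile access
   for (int i = 0; i < N; i++)
    sum += data[i][j];
  long t2 = System.nanoTime();

  System.out.println("row-major:    " + (t1 - t0) / 1000000 + " ms");
  System.out.println("column-major: " + (t2 - t1) / 1000000 + " ms (sum = " + sum + ")");
 }
}

The column-major loop is typically several times slower; the exact ratio depends on the CPU and its cache sizes.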

To conclude: take care when deciding between 32-bit and 64-bit Java.

And a few online resources:
http://benjchristensen.com/2007/02/16/32-bit-versus-64-bit-jdk-memory-usage/
http://portal.acm.org/citation.cfm?id=1107407

Monday, October 4, 2010

On an inequality

A few days ago I recalled an interesting inequality I used in my PhD work. It states that, for any integrable function $f(x) > 0$, $\forall x\in [a,b]$ $$e^{\frac{1}{b-a}\int\limits_{a}^{b}\ln{f(x)}dx}\leq \frac{1}{b-a}\int\limits_{a}^{b}f(x)dx$$

This is, more or less, a generalised form of the inequality of arithmetic and geometric means. E.g. for any $x_i \in[a,b]$, $\forall i=1..n$, $a=x_1$, $b=x_n$ $$\sqrt[n]{\prod\limits_{i=1}^n f(x_i)}\leq \frac{1}{n} \left(\sum\limits_{i=1}^n f(x_i)\right)$$

The natural logarithm is a monotonically increasing function, so taking logarithms of both sides gives $$\frac{1}{n}\left(\sum\limits_{i=1}^n \ln{f(x_i)}\right)\leq \ln{\left(\frac{1}{n} \left(\sum\limits_{i=1}^n f(x_i)\right)\right)}$$

Let's consider $h=\frac{b-a}{n}$: $$\frac{1}{b-a}\cdot\frac{b-a}{n}\left(\sum\limits_{i=1}^n \ln{f(x_i)}\right)\leq \ln{\left(\frac{1}{b-a} \left(\sum\limits_{i=1}^n f(x_i)\cdot\frac{b-a}{n}\right)\right)}$$

Or $$\frac{1}{b-a} \left(\sum\limits_{i=1}^n h\cdot\ln{f(x_i)}\right)\leq \ln{\left(\frac{1}{b-a} \left(\sum\limits_{i=1}^n f(x_i)\cdot h\right)\right)}$$

Now let's take $\lim\limits_{h\to 0}$ and use the fact that passing to the limit preserves a non-strict inequality. The sums are Riemann sums, so $$\lim\limits_{h\to 0} \frac{1}{b-a} \left(\sum\limits_{i=1}^n h\cdot\ln{f(x_i)}\right) = \frac{1}{b-a}\int\limits_{a}^{b}\ln{f(x)} dx$$

The natural logarithm is continuous, so $$\lim\limits_{h\to 0} \ln{\left(\frac{1}{b-a} \left(\sum\limits_{i=1}^n f(x_i)\cdot h\right)\right)}= \ln{\left(\frac{1}{b-a} \cdot \lim\limits_{h\to 0} \left(\sum\limits_{i=1}^n f(x_i)\cdot h\right)\right)}= \ln{\left(\frac{1}{b-a} \cdot \int\limits_{a}^{b}f(x)dx\right)}$$

As a result $$\frac{1}{b-a}\int\limits_{a}^{b}\ln{f(x)} dx \leq \ln{\left(\frac{1}{b-a} \cdot \int\limits_{a}^{b}f(x)dx\right)}$$

Since the natural exponential function is also monotonically increasing, applying it to both sides recovers the original inequality.
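
As a quick sanity check (my example, not part of the original argument), take $f(x)=x$ on $[1,2]$, so that $b-a=1$: $$e^{\int\limits_{1}^{2}\ln{x}\,dx}=e^{2\ln{2}-1}=\frac{4}{e}\approx 1.47\leq\frac{3}{2}=\int\limits_{1}^{2}x\,dx$$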

Another proof uses the fact that the natural logarithm is concave, i.e. for all $\alpha_i$ such that $\sum\limits_{i} \alpha_i=1$ $$\sum\limits_{i} \alpha_i \cdot \ln{f(x_i)} \leq \ln{\left(\sum\limits_{i} \alpha_i \cdot f(x_i)\right)}$$ Now, if we take $\alpha_i=\frac{\Delta x_i}{b-a}=\frac{h}{b-a}$ we arrive at the same result.

Tuesday, March 23, 2010

Re boxed primitives in Java and .NET

One of my colleagues once posted the following question: "why does the <i = i++> expression return different results in C++ and Java/.NET?". For instance, try this:

C/C++
int i = 1;
i = i++; 
printf("%d\n", i);

result is 2

Java
int i = 1;
i = i++; 
System.out.println(i);

result is 1

Many C++ developers will argue that the behaviour of <i = i++> is undefined in C++ (i.e. compiler implementers are free to produce whatever result they consider most appropriate). However, at a more practical level, <i = i++> on a primitive int type in C/C++ boils down to something as simple as (optimised ASM code, avoiding all the formalities of moving to/from registers):

mov i, i
inc i

at the same "address location" (simplistically speaking). So the result is 2.

Now, let's look at what C++ suggests about the postfix operator++(int): make a copy of the current instance, increment the current instance, and return the copy. Here is an integer-like class that follows this rule:

#include <stdio.h>
#include <tchar.h>

class MyInt {
private:
 int i;
public:
 MyInt(int iVal) { i = iVal; };
 int val() const { return i; };
 MyInt(const MyInt& t) { i = t.val(); };

 MyInt& operator=(const MyInt& t) {
  i = t.val();
  return *this;
 };

 MyInt operator++(int) { // postfix ++: copy, increment, return the old value
  MyInt t = *this;
  i++;
  return t;
 };
};

MyInt func() {
 MyInt i = MyInt(1);
 i = i++;
 return i;
}

int _tmain(int argc, _TCHAR* argv[])
{
 int i = 1;
 i = i++;
 _tprintf(_T("%d\n"), i);

 MyInt t = func();
 _tprintf(_T("%d\n"), t.val());
 return 0;
}

Result is
2
1

But this is exactly what Java/.NET return. From this, it is tempting to conclude that Java/.NET primitives are boxed. In fact, the specification gets there more directly: in Java the value of i++ is defined to be the old value of i, saved in a temporary before the increment, so the outer assignment simply writes that old value back, exactly as the copy-returning C++ class above does. Either way, the "copy, increment, return the copy" semantics is fixed by the language, which is what keeps the behaviour identical across platforms (unlike C++, where it is left undefined). Also, this means C/C++ works faster with primitives :)
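
A minimal sketch of the spelled-out Java semantics (my illustration of what the compiled code effectively does for i = i++):

public class PostIncrement {
 public static void main(String[] args) {
  int i = 1;
  int tmp = i;  // the value of i++ is the OLD value of i
  i = i + 1;    // the side effect of the increment
  i = tmp;      // the outer assignment overwrites the increment
  System.out.println(i); // prints 1, just like i = i++
 }
}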

http://www.codeproject.com/Articles/67392/Re-boxed-primitives-in-Java-and-NET.aspx

Friday, February 19, 2010

Re Microsoft HPC Server 2008

Microsoft is targeting financial markets with its new product, Microsoft HPC (High Performance Computing) Server 2008.
http://www.microsoft.com/hpc/en/us/financial-services.aspx

It is indeed new: it is still in "beta", and the latest news on the Microsoft HPC Server 2008 team's blog dates from November 2009:
http://blogs.technet.com/WindowsHPC/

It is also available for free evaluation from the Microsoft web site:
http://www.microsoft.com/hpc/en/us/default.aspx

I have also heard that, while most financial market providers use real-time Linux-based systems, the London Stock Exchange (together with Microsoft) is trying (http://www.theregister.co.uk/2009/11/26/lse_crash_again/) to meet the required real-time/latency targets with Windows-based solutions, and I guess they will switch to Microsoft HPC Server 2008 soon.

In the meantime, Wikipedia suggests that some configurations using Microsoft HPC Server 2008 managed to reach 23rd place in the TOP500 list of the fastest supercomputers:
http://en.wikipedia.org/wiki/Windows_HPC_Server_2008

Thursday, February 18, 2010

Re DirectCompute API (high speed calculations)

This is just a brief introduction to the subject. A few years ago programmers realised that the GPU (graphics processing unit, the chip on NVIDIA and ATI cards) can be used for various calculations (floating point, matrix/vector, etc.). So NVIDIA started its CUDA project:
http://www.ddj.com/cpp/207200659

More recently, Microsoft decided to push this idea in a more standard way, by defining the DirectCompute API as part of DirectX:
http://www.nvidia.com/object/directcompute.html

And Microsoft shipped this addition to DirectX as part of Windows 7:
http://www.ditii.com/2009/08/22/gpu-computing-via-directcompute-in-windows-7/

So it seems that DirectCompute is going to become the standard API for very fast calculations on Windows, especially in finance for real-time pricing computation.