Java Maps and load factors

When writing Java code be that for standalone Java, Domino agents or even XPages you often need a data structure to map keys to values. In Java this is a map (java.util.Map) and it works much like a List in LotusScript. Maps are used a lot but since Map is an interface you need an implementation whih is most often a java.util.HashMap.

The way to construct a HashMap is using one of the constructors and if you are like me a couple of months ago you just use the default constructor that is the one that doesn’t take any arguments. Now this is well and fine if you know what you’re doing. Most times the use of the default constructor is the wrong choice and can / will lead to memory being wasted and even worse is worse performance when adding more than 12 elements.

Lets take each in turn. The wasted memory comes from the classes used internally inside a HashMap to keep track of keys and values. For this post knowing that not filling the map (e.g. creating a HashMap using the default constructor and only stuffing 5 elements in it) will waste memory. A most excellent introduction is From Java Code to Java Heap. On a related note a key take-away from that article is to consider LinkedList instead of ArrayList if you not know the ultimate number of elements and hence cannot size correctly at object construction.

This memory waste can be avoided by always using the constructor that takes an initial size and hence size the Map correctly. Now it turns out that that’s only part of the story as a HashMap also has what’s called a “load factor”. The load factor is a decimal number – i.e. a percentage – specifying how much the map may be filled before an extension and hence a full rehash of the keys take place. This means that with the default load factor of 0.75 you may only put 12 elements in a HashMap before it is extended and hence you take a performance hit. Again size your map correctly from the get go taking the load factor into consideration. The last point is that the size should always a power of 2 meaning that if you specify a size of 20 it will actually be 32.

A most nice discussion that caused me to look deeper into this can be hear on episode 391 of the Java Posse podcast (Java Posse #391 – Newscast for August 10th 2012) from 0:58 to 1:05.