HashMap and TreeMap in Java: Differences and Similarities

The performance of a Java program and the proper use of resources are often depend on a collection a developer chose for storing data. Hence, it is very important to understand the difference between the implementations. That's why questions related to collections are in the top of interviews for Java Junior developer applicants.

In this article, we take a glimpse on two implementations of the Map interface, HashMap and TreeMap, and try to answer the question about their differences and when programmer should use the first and the second.

I hope that the reader is well acquainted with the concepts of interface and implementation, and I will give only the basic definitions to make this reading simpler. I will also allow myself some references to other articles and documentation for those who have forgotten some details.

What is Map

The Map interface is a part of Java Collection framework. You can imagine Map as a kind of dictionary, where each element represents a key-value pair. Both keys and values are objects. Map allows you to search for an object by a given key. An object associated with the key is a value. All keys are unique, while values can be duplicated. Some Map implementations allow null keys and null values. The main operations of any Map are insertion, remove, and search of elements.

So, a key is a unique identifier of an object in Map. For example, Map<String, Student> contains a key as a string — student's unique ID which is connected to some object Student.

Both HashMap and TreeMap are the implementations of Map interfaces. Briefly, HashMap is a data structure that hashes keys, and TreeMap uses natural order of keys to organize a search tree.

What is HashMap

HashMap is a data structure that implements Map<Key,Value> interface and it based on hashing principle. If you've never heard of this structure, try an article for beginners and take a glimpse at docs.

To understand what Hashmap is, first you should know about hashing and hash functions. Algorithmic details are beyond the scope of this article, but I am going to give you a definition of hash function (as well as binary tree for the other subject of this article, TreeMap) and a brief description of HashMap's internal work for better understanding.

Hash Principle

A hash function is a function that converts input data of any (usually large) size to a fixed-size data, usually compact. The result of this function work is called hash code.

Each Java object has a hash code. It is usually a number, and it is calculated using the hashCode method of the Object class. The good idea is to override this method for your own classes along with the equals method associated with it.

Hash codes helps programs run faster. Suppose we compare volume objects s1 and s2 of the Student type and declare that the operation s1.equals(s2) takes about 500 ms. In that case, the comparison of the hash codes s1.hashCode() == s2.hashCode() takes about 20 ms.

Hash functions are widely used in cryptography, and other areas as well. However, the magic is not for software development: you can't put something big in a small vessel without losses.

The main rules of the hash codes:

  • A particular object always has the same hash code.
  • If objects are equal, their hash codes are the same, but not vice versa.
  • If the hash codes are different, then the objects are definitely not equal.
  • Different objects may (although very unlikely) have the same hash codes. Well... here we have found data loss! This situation is called a collision. The "good" hash code should minimize a probability of collisions.

Inside the HashMap

HashMap lets us store keys on the principle of hashing. There are two main methods — put(key, value) and get(key) for storing and retrieving Objects from HashMap. Key-value pairs are stored in so-called "buckets", all the buckets together are a "table", a kind of internal array of linked lists.

So the first element of the linked list is stored in the bucket. This linked list is a chain of objects, and each of them has a link to the next object from the chain. Hence, having the first element you can get to the chain of all the elements of the list. A linked list item is an object of the Entry class that contains a key, a value, and a link to the next Entry.

When we call put(key, value), HashMap calls hashCode method on the key object. Then it applies the hashcode we got into its own hashing function, that helps to find a bucket location for storing an Entry object. HashMap stores key and value objects as a Map.Entry in a bucket.

What is TreeMap

Java TreeMap is a data structure that implements Map<Key,Value> interface and it based on Red-Black tree data structure.

Red-Black Tree

A Tree is a hierarchical data structure that consists of "nodes" and lines that connect nodes ("branches"). The "root" node is at the top of the tree and from the root there can branches and the nodes ("children" of the root). Every child node can have its own children (nodes that lie lower) as well. Nodes without children are called "leaf nodes", "end-nodes", or "leaves".

In a binary tree every node has zero, one, or two children. Every internal node of a binary search tree stores a key (and sometimes an associated value) and has two distinguished sub-trees, commonly denoted "left" and "right". You can imagine this tree as a binary search algorithm realisation.

A self-balancing binary search tree is a binary search tree that automatically keeps its height (or maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.

Red-black tree is a balanced binary tree with next properties:

  • Every node is either red or black
  • The root is always black
  • Every leaf is a NIL node and it is black
  • If a node is red, both of its children are black. Therefore, a red node can't have a red child.
  • Every simple path from a node to a descendant leaf contains the same number of black nodes.

Check out this article for more info on Red-Black trees

TreeMap

TreeMap is a Map implementation that keeps its entries sorted according to the natural ordering of its keys. For numbers it means ascending order, for strings — alphabetical order. However it is possible to use a comparator if you need to change the logic of ordering.

"Cool", you may think... "Now I can call the toString method and get all the object sorted or to iterate them in natural way" and you'll be right. However that's not the main advantage of the TreeMap implementation. The great thing about it is that you can find some objects using different filters and conditions.

For example let's choose all the cats from letters "b" to "s" from a cat collection. We are going to use a subMap() method for this.

public class Solution {  
    public static void main(String[] args) throws Exception {
        String[] cats = new String[]{"Fluffy", "Abby", "Boris", "Ginger", "Grey", "Snowy", "Boss", "Waldo", "Tom", "Garfield"};

        TreeMap<String, Cat> treeMap = addCatsToTreeMap(cats);
        System.out.println(treeMap.subMap("Boris", true,"Snowy",true));
    }

    public static TreeMap<String, Cat> addCatsToTreeMap(String[] cats) {
        TreeMap<String,Cat> myCats = new TreeMap<String, Cat>();
        for (int i = 0; i < cats.length; i++) {
            Cat cat = new Cat(cats[i]);
            myCats.put(cat.name, cat);
        }
        return myCats;
    }

    public static class Cat {
        String name;

        public Cat(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name != null ? name.toUpperCase() : null;
        }
    }
}

The output:

{Boris=BORIS, Boss=BOSS, Fluffy=FLUFFY, Garfield=GARFIELD, Ginger=GINGER, Grey=GREY, Snowy=SNOWY}

Here we've got all sorted Cats from Boris to Snowy in alphabetical order. Sure we can do the same with a HashMap, but we should code all the logic of sorting and so on.

HashMap vs TreeMap: Main Differences

Ordering

HashMap is not ordered, while TreeMap sorts by key. How items are stored depends on the hash function of the keys and seems to be chaotic.

TreeMap, which implements not only Map but also NavigableMap automatically sorts pairs by their keys natural orders (according to their compareTo() method or an externally supplied Comparator).

Example. Let's have two maps, HashMap and TreeMap, where the keys are cats names from a String Array.

import java.util.HashMap;  
import java.util.TreeMap;

public class Test {  
    public static void main(String[] args) throws Exception {
        String[] cats = new String[]{"Fluffy", "Abby", "Boris", "Ginger", "Grey", "Snowy", "Boss", "Waldo", "Tom", "Garfield"};
        Integer age;
        HashMap<String, Integer> hMap = new HashMap<>();
        for (int i = 0; i < cats.length; i++) {
            hMap.put(cats[i], i);
        }
        System.out.println("HashMap ordered by hash:");
        System.out.println(hMap);
        System.out.println();

        TreeMap<String, Integer> tMap = new TreeMap<>();
        for (int i = 0; i < cats.length; i++) {
            tMap.put(cats[i], i);
        }
        System.out.println("TreeMap ordered by keys (alphabetical order of the cats' names:");
        System.out.println(tMap);

    }
}

The output:

HashMap ordered by hash:  
{Fluffy=0, Boss=6, Snowy=5, Tom=8, Garfield=9, Abby=1, Boris=2, Waldo=7, Ginger=3, Grey=4}

TreeMap ordered by keys (alphabetical order of the cats' names):

{Abby=1, Boris=2, Boss=6, Fluffy=0, Garfield=9, Ginger=3, Grey=4, Snowy=5, Tom=8, Waldo=7}

Performance

HashMap is faster and provides average constant time performance O(1) for the basic operations get() and put(), if the hash function disperses the elements properly among the buckets. It usually works as is, but in reality sometimes collisions happen. In this case HashMap handles collision using a linked list to store collided elements and performance reduces up to O(n).

To improve the performance in case of frequent collisions, in JDK 8 is used balanced tree instead of linked list. JDK8 switches to balanced tree in case of more than 8 entries in one bucket, it improves the worst-case performance from O(n) to O(log (n)).

According to its structure, HashMap requires more memory than just to keep its elements. The performance of a hash map depends on two parameters — Initial Capacity and Load Factor. The Initial Capacity is a quantity of buckets of a new created HashMap. The load factor measures a percentage of fullness. The default initial capacity is 16 and default load factor is 0.75. We can change these values.

TreeMap is based on binary tree that provides time performance O(log(n)).

Thus, HashMap almost always works faster than TreeMap. The larger the object that's stored, the faster HashMap will be in comparison to TreeMap. However, a TreeMap uses the optimal amount of memory to hold its items, unlike a HashMap.

Null Keys and Null Values

HashMap allow you to store one null key and multiple null values. It keeps entry with a null key in index[0] of an internal bucket. HashMap also allows storing many null values. Example:

import java.util.HashMap;

public class Test {  
    public static void main(String[] args) throws Exception {

        HashMap<String, Integer> hashMap = new HashMap<>();
        hashMap.put(null, null);
        hashMap.put ("Fluffy", 7);
        hashMap.put("Kid", null);

        System.out.println(hashMap);
    }
}

In output we'll get a HashMap with three elements, first with a null key and value, second is an "ordinary" one, and the third with a null value as well.

{null=null, Fluffy=7, Kid=null}

What if we try to add one more element with a null key?

import java.util.HashMap;

public class Test {  
    public static void main(String[] args) throws Exception {

        HashMap<String, Integer> hashMap = new HashMap<>();
        hashMap.put(null, null);
        hashMap.put(null, 5);
        hashMap.put ("Fluffy", 7);
        hashMap.put("Kid", null);

        System.out.println(hashMap);
    }
}

The new entry keeps in index[0] of an internal bucket, so it will be overwritten:

{null=5, Fluffy=7, Kid=null}

TreeMap sorts elements in natural order and doesn't allow null keys because compareTo() method throws NullPointerException if compared with null.

So if we try to run the next code:

TreeMap<String, Integer> treeMap = new TreeMap<>();  
treeMap.put(null, 5);  
treeMap.put ("Fluffy", 7);  
treeMap.put("Kid", null);

System.out.println(treeMap);  

We've got a java.lang.NullPointerException.

If you are using TreeMap with user-defined Comparator, work with null entries depends on the implementation of compare() method.

What is in Common?

Both TreeMap and HashMap implement the Map interface, so they don't support duplicate keys.

They are not thread-safe, so you can't use them safely in a multi-threaded application.

Conclusions

HashMap is a general purpose Map implementation. It provides a performance of O(1), while TreeMap provides a performance of O(log(n)) to add, search, and remove items. Hence, HashMap is usually faster.

A TreeMap uses memory way more effective so it is a good Map implementation for you if you are not sure of elements quantity that have to be stored in memory.

Use a TreeMap if you need to keep all entries in natural order.

About the Author

John Selawsky is a senior Java developer and Java tutor at Learning Tree International programming courses. Visit his personal Medium blog to read more John's Java thoughts and advices.