Introduction
The Java Collections Framework is a fundamental and essential framework that any strong Java developer should know like the back of their hand.
A Collection in Java is defined as a group or collection of individual objects that act as a single object.
There are many collection classes in Java and all of them extend the java.util.Collection
and java.util.Map
interfaces. These classes mostly offer different ways to formulate a collection of objects within a single object.
Java Collections is a framework that provides numerous operations over a collection - searching, sorting, insertion, manipulation, deletion, etc. It's a high-level framework that provides a consistent API for working with collections.
This is the first part of a series of Java Collections articles:
- The List Interface (you are here)
- The Set Interface
- The Map Interface
- The Queue and Deque Interfaces
Problems with Arrays
Arrays are one of the first things a new baked Java developer is introduced with. They're a simple and straightforward way to store multiple objects in a single object.
An array of objects, very similar to a collection, represents a group of objects as a single object.
Both an array and a collection are an object representing multiple other objects, so why is there a need for both?
Let's consider a collection of products:
Product door = new Product("Wooden Door", 35);
Product floorPanel = new Product("Floor Panel", 25);
We have a wooden door and a door panel, weighing 35 kg and 25 kg respectively. These are POJOs, which means that they only have a couple of getter and setter methods and a toString()
method.
With this, it's quite simple to instantiate an array of these objects:
Product[] products = { door, floorPanel };
Printing Arrays
There are many reasons why someone would want to print an array, including debugging or returning the results:
System.out.println(products);
However, when we try to print it out, we're greeted with a result that isn't very human friendly:
com.demo.collections.Product;@14ae5a5
In fact, we need to rely on the helper class java.util.Arrays
to get a sensible result:
System.out.println(Arrays.toString(products));
This time around, we're seeing something that makes more sense:
[Product{name="Wooden Door", weight=35}, Product{name="Floor Panel", weight=25}]
This is because the Arrays.toString()
method is implemented to print out the elements of the array in a human-readable format, which isn't built into the array itself.
Adding and Removing Elements
Our collection of products just got bigger, and we're supposed to add a window to the array:
final Product window = new Product("Window", 15);
products = add(window, products);
System.out.println(Arrays.toString(products));
public static Object[] add(Object[] array, Object... elements) {
Object[] tempArray = new Object[array.length + elements.length];
System.arrayCopy(array, 0, tempArray, 0, array.length);
for(int i = 0; i < elements.length; i++) {
tempArray[array.length+i] = elements[i];
return tempArray;
}
}
This is exactly the type of situation where you'd probably rather shoot yourself in the leg - because arrays don't resize.
To add an element, we have to make a copy of the array into a new array, instantiate it with the new elements and assign the new array to our reference variable. This is a very inefficient way to add elements to an array. The same goes for removing elements.
Arrays are a more low-level construct and don't give us many features whereas Collections are made to combat that very problem and offer many features and great functionality.
Collections
The Java Collections Framework ships with the JDK itself. It's worth remembering that in the old days, especially for people who wrote C code, developers weren't presented with data structures to choose from. In fact, people used to write their own data structures, which some do even today.
There are legitimate performance reasons why someone might find a custom data structure to be great for a specific project. But, for most developers, relying on the existing framework is a good choice.
Java is used to build big and complex systems and applications. That being said, almost every Java application will end up using the collections framework at one point or another.
All of the collection classes have an underlying data structure that they're implementing - Trees, HashTables, HashMaps, Queues, etc. Implementing these data structures yourself, while potentially fun, can be very hard - there are many corners you have to get right. There's no need to reinvent the wheel if it's already served to you unless you wish to practice and challenge yourself to come up with innovative and alternative solutions.
We'll be taking a look at a few different types of collections in Java:
- Lists - A sequential (ordered) Collection. They keep track of the positions of all elements, like arrays and offer search, iteration and range-view operations of their elements. Lists can have duplicate elements.
- Sets - Enforces uniqueness constraints - can't contain duplicate elements. It doesn't concern itself with the order of iteration within itself, as it models the mathematical set abstraction. Sets offer no additional functionality other than the inherited from Collections.
- Queues - Introduce modification order, that's to say, if you add elements in a certain order, you have to follow a certain order. Queues offer additional insertion, removal and inspection operations upon its elements. It's unique for Queues to follow the FIFO (First in, first out) structure.
- Deques - Similar to queues, double-ended queues (shortened to deques) additionally offer the possibility to perform operations on elements from both sides of the queue.
- Maps - Although implementations of
java.util.Map
aren't considered "true collections", they offer collection-view operations which practically enables collection-level manipulation. This collection isn't a collection of individual values, but pairs. These are associations between unique keys and values (Maps) that can be looked up from those keys. It's important to note that the keys are unique, and each key is associated with a value, but a value can be associated with more than one key.
Interface Collection
As mentioned above, all of the collection interfaces within the Java API extend a common interface - java.util.Collection
. This main interface provides all of the common collections functionality.
Each sub-interface has several implementations, and some of these sub-interfaces offer additional operations:
The key point to understand is that each interface defines behavior and functional characteristics where we can utilize multiple data structures while implementations define performance characteristics, utilize a specific data structure and are instantiable. The interfaces are like the contracts, while the implementations are the concrete classes that fulfill those contracts.
The most commonly used methods in the Collection
interface are:
Method Name | Method Description |
---|---|
size() | Get the number of elements in the Collection |
isEmpty() | True if size() == 0, false otherwise |
add(element) | Add the element at the beginning of this collection |
addAll(collection) | Add all the elements of the argument collection to this collection |
remove(element) | Remove the element from this collection |
removeAll(collection) | Remove all the elements of the argument collection to this collection |
retainAll() | Remove all the elements of this collection not in the argument collection |
contains(element) | True if the element is in this collection, false otherwise |
containsAll(collection) | True if all the elements of the argument collection are in this collection |
clear() | Remove all elements from this collection |
Lists
The first, and probably most commonly used interface - java.util.List
.
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
Each element within the list has an index, an int
value which defines their position. The indexing count starts at 0, the same as the indexing we can encounter with arrays.
The java.util.List
interface also adds a couple of other operations beyond the regular common collection operations:
get(int index)
set(int index, Object object)
These operations are pretty self-explanatory and don't need further explanation. Though, let's take a look at some code examples.
Adding an element
Using the add()
method, we can easily add objects to our list:
List<String> products = new ArrayList<>();
products.add("Mug");
products.add("Wallet");
products.add("Phone");
System.out.println(products);
Output:
[Mug, Wallet, Phone]
Note: We're instantiating the list as its concrete implementation ArrayList
. In most cases, we'd use this implementation for a List.
Another note: You can specify the initial size of the ArrayList via the constructor to avoid resizing if you know a definitive size.
The interface also provides another version of the add()
method, including an index. In this case, we add the element to the given index, and if the index is already taken by another element, all of the elements after the added one shift to the right by one:
products.add(2, "Pen");
System.out.println(products);
Output:
[Mug, Wallet, Pen, Phone]
Retrieving Elements
Using the get()
method with the given index, we can retrieve a specific element in the list:
System.out.println(products.get(0));
Output:
[Mug]
Removing Elements
Using the remove()
method, we can remove an element from the list. Calling this method will return the element as well as shift the elements after it one index back, to fill in the now existent hole in the sequence:
System.out.println(products.remove(1));
Output:
[Wallet]
Setting Elements
Using the set()
method, we can replace an existing element given an index:
products.set(1, "Book");
System.out.println(products);
Output:
[Mug, Book, Phone]
Searching for Elements
Using the indexOf()
method, we can also look up values, given an index. If the search fails, and no object exists with the given index, the list will return -1
. In the case of multiple equal objects, the list will return only the first index.
Using the lastIndexOf()
will return the last index of the given element.
System.out.println(products.indexOf(5));
Output:
-1
Iterating Elements
Although possible to iterate with for
and enhanced-for
loops, the interface provides two new helper classes that allow us to iterate through lists - Iterator
and ListIterator
:
for (Iterator<E> iterator = list.iterator(); iterator.hasNext(); ) {
E element = iterator.next();
element.someMethod();
iterator.remove(element);
//...
}
for (ListIterator<E> iterator = list.listIterator(); iterator.hasNext(); ) {
E element = iterator.next();
element.someMethod();
iterator.remove(element);
//...
}
Note: The ListIterator
offers more control over the list iteration as it allows for traversal in both directions, while Iterator
only allows traversal in one direction.
Additionally, Java 8 introduces us with a really simple way to print out the elements using a method reference:
list.forEach(System.out::println);
Implementations and Differences
ArrayList: implements java.util.List
as a dynamically resizing array:
- Good general-purpose implementation
- Used as default
- More CPU cache sympathetic
LinkedList: implements java.util.List
as a doubly-linked list:
- Worse performance for many operations
- Use when adding elements at the start
- Use when adding/removing a lot
Generally speaking, ArrayList
is much more commonly used than LinkedList
. And to quote Joshua Bloch, the man who wrote LinkedList
:
"Does anyone actually use LinkedList? I wrote it, and I never use it."
Performance Comparison
Due to their different natures, these implementations have different approaches and method runtimes.
Depending on the requirements, you'll have to choose which to use. Generally speaking, due to its doubly-linked nature, LinkedList
is good for frequent addition and removal whereas ArrayList
is good for searching due to random access.
Conclusion
The Java Collections framework is a fundamental framework that every Java developer should know how to use.
In the article, we've talked about collections in general, the problems with arrays and how the framework combats them. Afterwards, we jumped into the implementations of this interface, their advantages, and disadvantages, as well as the operations you'll most certainly use at one point or another.
If you're interested in reading more about the collection interfaces, continue reading - Java Collections: The Set Interface.