Java Collections: The Set Interface

Introduction

The Java Collections Framework is a fundamental framework that every Java developer should know well.

A Collection in Java is a group of individual objects that is treated as a single unit.

There are many collection classes in Java, and each of them implements either the java.util.Collection interface (usually through a subinterface such as List, Set or Queue) or the java.util.Map interface. These classes mostly offer different ways to organize a group of objects within a single object.

The framework also provides numerous operations over a collection: searching, sorting, insertion, manipulation, deletion and so on.

This is the second part of a series of Java Collections articles:

  • The List Interface
  • The Set Interface (you are here)
  • Queues, Deques, Stacks (coming soon)
  • The Map Interface (coming soon)

Sets

The next common interface from the framework is java.util.Set.

Set doesn't add new operations on top of those inherited from the Collection interface; it only tightens their contracts so that duplicates are rejected.

A Set models the mathematical set abstraction and can't contain duplicate elements. It's also worth noting that the Set interface itself doesn't guarantee any particular order of its elements; whether there is one depends on the implementation:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
System.out.println(names);

Set<String> uniqueNames = new HashSet<>(names);  
System.out.println(uniqueNames);  

Running this piece of code would yield:

[David, Scott, Adam, Jane, Scott, David, Usman]
[Adam, David, Jane, Scott, Usman]

As you can see, the list names contains duplicate entries, while the set uniqueNames drops the duplicates and prints the remaining elements in no guaranteed order.

Adding an Element

Using the add() method, just as with Lists, we can add objects to a Set:

Set<String> uniqueNames = new HashSet<>();  
uniqueNames.add("David");  
uniqueNames.add("Scott");  
uniqueNames.add("Adam");  
uniqueNames.add("Jane");  
uniqueNames.add("Scott");  
uniqueNames.add("David");  
uniqueNames.add("Usman");

System.out.println(uniqueNames);  

Running this piece of code will yield:

[Adam, David, Jane, Scott, Usman]
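The add() method also returns a boolean: true if the element was added, false if it was already present. As a minimal sketch, the hypothetical helper countDuplicates() below (not part of the Collections API) uses that return value to count how many of the names were rejected:

```java
import java.util.HashSet;
import java.util.Set;

class SetAddDemo {
    // Hypothetical helper: adds each name to the set and counts
    // how many were rejected because they were already present.
    static int countDuplicates(Set<String> target, String... names) {
        int duplicates = 0;
        for (String name : names) {
            // add() returns false when the set already contains the element
            if (!target.add(name)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Set<String> uniqueNames = new HashSet<>();
        int duplicates = countDuplicates(uniqueNames,
                "David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");

        System.out.println(duplicates);         // 2
        System.out.println(uniqueNames.size()); // 5
    }
}
```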

Removing Elements

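The remove() method removes the specified element from the set if it is present, and returns a boolean indicating whether anything was removed:

Set<Integer> uniqueNumbers = new HashSet<>(Arrays.asList(1, 2, 3));
System.out.println(uniqueNumbers.remove(2));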

System.out.println(uniqueNumbers);  

Output:

true  
[1, 3]

Another option is to use the clear() method to remove all elements of the Set:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
Set<String> uniqueNames = new HashSet<>(names);

uniqueNames.clear();  
System.out.println(uniqueNames);  

Running this piece of code would yield:

[]

Alternatively, we could rely on the removeAll() method:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
List<String> newNames = Arrays.asList("David", "Adam");  
Set<String> uniqueNames = new HashSet<>(names);

uniqueNames.removeAll(newNames);  
System.out.println(uniqueNames);  

Running this piece of code would yield:

[Jane, Scott, Usman]

Note that the removeAll() method accepts any Collection as an argument. This makes it possible to remove the elements two different collections have in common, in this case a List and a Set.
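Because removeAll() modifies the set it is called on, it can be useful to work on a copy when the original collection should stay intact. The following is a sketch under that assumption; difference() is a hypothetical helper name, and a TreeSet is used only so that the printed order is predictable:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetDifferenceDemo {
    // Hypothetical helper: returns a new sorted set holding the elements
    // of source that do not appear in toRemove. The source is not modified.
    static Set<String> difference(Collection<String> source, Collection<String> toRemove) {
        Set<String> result = new TreeSet<>(source); // copy, duplicates are dropped here
        result.removeAll(toRemove);                 // mutates only the copy
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");
        List<String> newNames = Arrays.asList("David", "Adam");

        System.out.println(difference(names, newNames)); // [Jane, Scott, Usman]
        System.out.println(names.size());                // 7, the list is unchanged
    }
}
```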

Also keep in mind that you can use this method to remove all the elements from the Set itself:

uniqueNames.removeAll(uniqueNames);  

This will, of course, leave you with an empty set. However, this approach isn't recommended, as calling removeAll() costs a lot more than calling clear().

This is because removeAll() has to look up and remove every element individually, whereas clear() simply nulls out the internal storage and sets the size to 0.

Checking for an Element

The contains() method returns a boolean that tells us whether the Set contains the specified element:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
Set<String> uniqueNames = new HashSet<>(names);

System.out.println(uniqueNames.contains("David"));  
System.out.println(uniqueNames.contains("Scott"));  
System.out.println(uniqueNames.contains("Adam"));  
System.out.println(uniqueNames.contains("Andrew"));  

Running this code would yield:

true  
true  
true  
false  
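For a HashSet, contains() finds elements through hashCode() and equals(), so custom element types need to override both consistently. The sketch below illustrates this with a hypothetical Person class and containsPerson() helper, neither of which comes from the article or the JDK:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class SetContainsDemo {
    // Hypothetical value class used only for this illustration
    static final class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        // Two Person objects are equal when their names are equal
        @Override
        public boolean equals(Object other) {
            return other instanceof Person && name.equals(((Person) other).name);
        }

        // Must agree with equals(): equal objects produce equal hash codes
        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Looks up a person by building a new, equal instance
    static boolean containsPerson(Set<Person> people, String name) {
        return people.contains(new Person(name));
    }

    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        people.add(new Person("David"));
        people.add(new Person("Scott"));

        System.out.println(containsPerson(people, "David"));  // true
        System.out.println(containsPerson(people, "Andrew")); // false
    }
}
```

Without the two overrides, the lookup for "David" would fall back to identity comparison and return false, since the new Person instance is a different object.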

Iterating Elements

Since a Set has no indexes, a classic index-based for loop isn't an option. An enhanced-for loop works fine for reading, but when elements need to be removed during iteration, an explicit Iterator is the safe choice:

Set<E> set = new TreeSet<E>();  
...
for (Iterator<E> iterator = set.iterator(); iterator.hasNext(); ) {  
    E element = iterator.next();
    element.someMethod();
    // Removes the element last returned by next(); Iterator.remove() takes no arguments
    iterator.remove();
}

Additionally, Java 8 introduced a really simple way to print out the elements, using forEach() with a method reference:

set.forEach(System.out::println);  
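To make the iteration pattern concrete, here is a small runnable sketch that removes elements conditionally through the Iterator, followed by the Java 8 removeIf() method, which expresses the same idea in one call. The removeByPrefix() helper and the sample names are assumptions made for illustration:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

class SetIterationDemo {
    // Hypothetical helper: removes every name that starts with the given
    // prefix while iterating, and returns the same (modified) set.
    static Set<String> removeByPrefix(Set<String> names, String prefix) {
        for (Iterator<String> iterator = names.iterator(); iterator.hasNext(); ) {
            String name = iterator.next();
            if (name.startsWith(prefix)) {
                iterator.remove(); // removes the element last returned by next()
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // TreeSet is used so that the printed order is predictable
        Set<String> names = new TreeSet<>(Arrays.asList("David", "Scott", "Adam", "Jane", "Usman"));

        System.out.println(removeByPrefix(names, "S")); // [Adam, David, Jane, Usman]

        // Java 8 alternative: remove every element matching a predicate
        names.removeIf(name -> name.length() > 4);
        System.out.println(names); // [Adam, Jane]
    }
}
```

Calling names.remove(name) directly inside an enhanced-for loop instead would typically throw a ConcurrentModificationException, which is why the removal goes through the Iterator.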

Retrieving Size

If you'd like to retrieve the size of a set:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
Set<String> uniqueNames = new HashSet<>(names);

System.out.println(uniqueNames.size());  

Running this piece of code would yield:

5  

Checking if Empty

If you'd like to check whether a Set is empty before performing any operations on it:

List<String> names = Arrays.asList("David", "Scott", "Adam", "Jane", "Scott", "David", "Usman");  
Set<String> uniqueNames = new HashSet<>(names);

System.out.println(uniqueNames.isEmpty());  

Running this piece of code would yield:

false  

Implementations and Differences

HashSet:

  • Based upon HashMap (Calls hashCode() on the element and looks up the location)
  • Good general purpose implementation (Resizes when it runs out of space)

TreeSet:

  • Based upon TreeMap (Uses a red-black tree, a self-balancing binary search tree, which requires a sort order)
  • Keeps elements sorted, either by their natural ordering or by a supplied Comparator

EnumSet:

  • Specialized implementation for enums (Uses a bitset based upon the ordinal of the enum)
  • Use when storing sets of enums
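The differences listed above show up most visibly in iteration order. The following sketch puts the three implementations side by side; the Day enum and the helper methods are hypothetical names introduced only for this example:

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetImplementationsDemo {
    // Hypothetical enum used only for this illustration
    enum Day { MONDAY, TUESDAY, WEDNESDAY }

    // TreeSet keeps its elements sorted by natural ordering
    static Set<String> sortedNames(List<String> names) {
        return new TreeSet<>(names);
    }

    // EnumSet iterates in the order the constants are declared in the enum
    static Set<Day> daysOf(Day first, Day... rest) {
        return EnumSet.of(first, rest);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Scott", "David", "Adam", "Scott");

        // HashSet removes duplicates but promises nothing about order,
        // so only the size is printed here
        Set<String> hashed = new HashSet<>(names);
        System.out.println(hashed.size());                     // 3

        System.out.println(sortedNames(names));                // [Adam, David, Scott]
        System.out.println(daysOf(Day.WEDNESDAY, Day.MONDAY)); // [MONDAY, WEDNESDAY]
    }
}
```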

Algorithmic Comparison

[Figure: performance comparison of HashSet, TreeSet and EnumSet]

Conclusion

The Java Collections framework is a fundamental framework that every Java developer should know how to use.

In this article, we've covered the Set interface and its implementations, their advantages and disadvantages, as well as the operations you'll almost certainly use at one point or another.

If you're interested in reading more about the collection interfaces, continue reading - Java Collections: Queues, Deques and Stacks (coming soon).

About Vuk Skobalj