Introduction to Java 8 Streams

Introduction

The main subject of this article is advanced data processing using functionality added in Java 8: the Stream API and the Collector API.

To get the most out of this article, you should already be familiar with the main Java APIs, the Object and String classes, and the Collection API.

Stream API

The java.util.stream package consists of classes and interfaces that support functional-style operations over elements. Java 8 introduces the concept of a Stream, which lets the programmer process data declaratively and take advantage of multi-core architectures without writing any threading code.

What is a Stream?

A Stream represents a sequence of objects derived from a source, over which aggregate operations can be performed.

From a purely technical point of view, a Stream is a typed interface - a stream of T. This means that a stream can be defined for any kind of object: a stream of numbers, a stream of characters, a stream of people, or even a stream of cities.

From a developer's point of view, it is a new concept that might look like a Collection, but it is in fact quite different from one.

There are a few key definitions we need to go through to understand this notion of a Stream and why it differs from a Collection:

A Stream does not Hold any Data

The most common misconception, which I'd like to address first: a stream does not hold any data. It is very important to understand this and keep it in mind.

There is no data in a Stream; the data is held in a Collection.

A Collection is a structure that holds its data. A Stream is just there to process the data and pull it out from the given source, or move it to a destination. The source might be a Collection, though it might also be an array or I/O resource. The stream will connect to the source, consume the data, and process the elements in it in some way.

A Stream shouldn't Modify the Source

A stream should not modify the source of the data it processes. This is not enforced by either the compiler or the JVM, so it is merely a contract. If I am to build my own implementation of a stream, I should not modify the source of the data I am processing. It is perfectly fine, however, to transform the data as it flows through the stream.

Why is that so? Because if we want to process this data in parallel, we are going to distribute it among all the cores of our processors, and we do not want any kind of visibility or synchronization issues that could lead to poor performance or errors. Avoiding this kind of interference means that we shouldn't modify the source of the data while we're processing it.
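To make the contract concrete, here is a minimal sketch. The class and method names are made up for illustration; the point is only that the pipeline reads from the source and writes its results somewhere else:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SourceUntouchedExample {

    // Builds a new list of upper-cased names; the source list is only read.
    static List<String> upperCased(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Scott", "David", "Josh");
        List<String> shouted = upperCased(names);

        System.out.println(names);   // the source still holds the original values
        System.out.println(shouted); // the transformed values live in a new list
    }
}
```

Because nothing writes back into names, the same pipeline could be switched to a parallel stream without adding any synchronization.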

A Source may be Unbounded

This is probably the most powerful of the three points. It means that a stream can process as much data as we want. Unbounded does not mean that a source has to be infinite. In fact, a source may be finite, but we might not know in advance how many elements it contains.

Suppose the source is a simple text file. A text file has a known size even if it is very big. Also suppose that the elements of that source are, in fact, the lines of this text file.

Now, we might know the exact size of this text file but if we do not open it and manually go through the content, we'll never know how many lines it has. This is what unbounded means - we might not always know beforehand the number of elements a stream will process from the source.
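As a rough illustration, the sketch below streams the lines of a piece of text and only learns the line count by consuming them. To stay self-contained it reads from a String instead of a real file, and the class and method names are invented for this example:

```java
import java.io.BufferedReader;
import java.io.StringReader;

class LineCountExample {

    // Counts lines by consuming the stream; the count is not known up front.
    static long countLines(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        return reader.lines().count();
    }

    public static void main(String[] args) {
        // Stand-in for a file's content; with a real file,
        // Files.lines(path) yields a similar stream of lines.
        String text = "first line\nsecond line\nthird line";
        System.out.println(countLines(text)); // prints 3
    }
}
```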

Those are the three defining properties of a stream, and they show that a stream is a very different thing from a collection. A collection holds its data. A collection can modify the data it holds. And of course, a collection holds a known and finite amount of data.

Stream Characteristics

  • Element sequence - Streams provide a sequence of elements of a particular type. The stream gets each element on demand and never stores it.
  • Source - Streams take a collection, an array, or an I/O resource as the source of their data.
  • Aggregate operations - Streams support aggregate operations such as forEach, filter, map, sorted, anyMatch, and others.
  • Pipelining - Most operations on a Stream return a Stream, which means they can be chained. Each of these operations takes the incoming elements, processes them, and passes the result on to the next step. The collect() method is a terminal operation that usually appears at the end of the chain to mark the end of the Stream processing.
  • Automated iterations - Stream operations iterate over the source elements internally, as opposed to collections, where explicit iteration is required.
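The pipelining and internal-iteration points can be sketched as follows. This is only an illustrative example with made-up class and method names, not code from any particular project:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineExample {

    // filter, map, and sorted each return a Stream, so the calls chain;
    // collect is the terminal operation that produces the result.
    static List<String> shortNamesUpperCased(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 5)
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Scott", "David", "Josh", "Alexander");
        System.out.println(shortNamesUpperCased(names)); // [DAVID, JOSH, SCOTT]
    }
}
```

Note that there is no loop anywhere in the method: the stream takes care of walking over the source.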

Creating a Stream

We can generate a stream with the help of a few methods:

stream()

The stream() method returns a sequential stream with a Collection as its source. You can use any collection of objects as a source:

List<String> list = Arrays.asList("Scott", "David", "Josh");
list.stream();

parallelStream()

The parallelStream() method returns a parallel stream with a Collection as its source:

List<String> list = Arrays.asList("Scott", "David", "Josh");
// method(...) is a placeholder for whatever should happen to each element
list.parallelStream().forEach(element -> method(element));

When such an operation executes on a parallel stream, the Java runtime partitions the stream into multiple substreams, runs the aggregate operations on them, and then combines the results. In our case, it calls the method for each element of the stream, potentially in parallel and in no guaranteed order.

This can be a double-edged sword, though. By default, parallel streams run on a shared thread pool (the common ForkJoinPool), so a long-running or blocking operation in one parallel stream ties up those threads and can stall other parallel streams.
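As a small sketch of the difference, the hypothetical helper below computes the same total with a sequential and a parallel stream. The names are invented for this example; the result of a reduction like sum() does not depend on how the work is split up:

```java
import java.util.Arrays;
import java.util.List;

class ParallelSumExample {

    // Sums the lengths of all words, sequentially or in parallel.
    static int totalLength(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Scott", "David", "Josh");

        System.out.println(totalLength(names, false)); // 14
        System.out.println(totalLength(names, true));  // 14

        // forEach on a parallel stream may visit elements in any order;
        // forEachOrdered keeps the encounter order of the source.
        names.parallelStream().forEachOrdered(System.out::println);
    }
}
```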

Stream.of()

The static of() method can be used to create a Stream from an array of objects or individual objects:

Stream.of(new Employee("David"), new Employee("Scott"), new Employee("Josh"));

Stream.builder()

Lastly, you can use the static builder() method to build a Stream of objects element by element:

Stream.Builder<String> streamBuilder = Stream.builder();

streamBuilder.accept("David");  
streamBuilder.accept("Scott");  
streamBuilder.accept("Josh");

Stream<String> stream = streamBuilder.build();  

By calling the .build() method, we pack the accepted objects into a regular Stream.
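For completeness, here is a short sketch of two variations on the methods above: passing an existing array to Stream.of(), and chaining the builder's add() method instead of calling accept() repeatedly. The class and method names are just for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamCreationExample {

    static List<String> fromArray(String[] names) {
        // Arrays.stream(names) would work the same way here.
        return Stream.of(names).collect(Collectors.toList());
    }

    static List<String> fromBuilder() {
        // add() is like accept() but returns the builder, so calls can be chained.
        return Stream.<String>builder()
                .add("David")
                .add("Scott")
                .add("Josh")
                .build()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromArray(new String[] {"Scott", "David", "Josh"}));
        System.out.println(fromBuilder());
    }
}
```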

Filtering with a Stream

import java.util.Arrays;
import java.util.List;

public class FilterExample {
    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Cherry", "Orange");

        // Traditional approach: loop and skip the unwanted element
        for (String fruit : fruits) {
            if (!fruit.equals("Orange")) {
                System.out.println(fruit);
            }
        }

        // Stream approach: keep only the elements that pass the predicate
        fruits.stream()
                .filter(fruit -> !fruit.equals("Orange"))
                .forEach(fruit -> System.out.println(fruit));
    }
}

A traditional approach to filtering out a single fruit would be with a classic for-each loop.

The second approach uses a Stream: filter() returns a new Stream containing only the elements that match the given predicate, which here means every fruit except "Orange".

Additionally, this approach uses the forEach() method, which performs an action for each element of the returned stream. The lambda passed to it can be replaced with something called a method reference. In Java 8, a method reference is the shorthand syntax for a lambda expression that executes just one method.

The method reference syntax is simple, and you can even replace the previous lambda expression .filter(fruit -> !fruit.equals("Orange")) with it:

Object::method;  

Let's update the example to use method references and see how it looks:

import java.util.Arrays;
import java.util.List;

public class FilterExample {
    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Cherry", "Orange");

        fruits.stream()
                .filter(FilterExample::isNotOrange)
                .forEach(System.out::println);
    }

    // Helper so the predicate can be passed as a method reference
    private static boolean isNotOrange(String fruit) {
        return !fruit.equals("Orange");
    }
}

Streams work best in combination with lambda expressions and method references, and this example highlights how simple and clean the syntax looks compared to the traditional approach.
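If you'd rather not write a helper method just for the negation, one possible alternative is a method reference bound to the value itself, combined with Predicate.negate(). This is a sketch with made-up names, not the only way to do it:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class MethodReferenceExample {

    static List<String> withoutOrange(List<String> fruits) {
        // "Orange"::equals is a method reference bound to the literal "Orange";
        // negate() flips it so that everything except "Orange" passes.
        Predicate<String> isOrange = "Orange"::equals;
        return fruits.stream()
                .filter(isOrange.negate())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Cherry", "Orange");
        System.out.println(withoutOrange(fruits)); // [Apple, Banana, Cherry]
    }
}
```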

Mapping with a Stream

A traditional approach would be to iterate through a list with an enhanced for loop:

List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");

System.out.println("Imperative style: ");

for (String car : models) {  
    if (!car.equals("Fiat")) {
        Car model = new Car(car);
        System.out.println(model);
    }
}

On the other hand, a more modern approach is to use a Stream to map:

List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");

System.out.println("Functional style: ");

models.stream()
        .filter(model -> !model.equals("Fiat"))
        .map(Car::new)                    // method reference approach
//      .map(model -> new Car(model))     // equivalent lambda approach
        .forEach(System.out::println);

To illustrate mapping, consider this class:

public class Car {
    private String name;

    public Car(String model) {
        this.name = model;
    }

    // getters and setters

    @Override
    public String toString() {
        return "name='" + name + "'";
    }
}

It's important to note that the models list is a list of Strings, not a list of Car. The function passed to .map() takes an object of type T and returns an object of type R, turning a Stream<T> into a Stream<R>.

Here, we're essentially converting each String into a Car.

If you run this code, the imperative style and the functional style should print the same output.
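To check that claim, here is a self-contained sketch that runs both styles and compares the results. The nested Car class is a minimal stand-in for the class shown above, and the method names are invented for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MappingExample {

    // Minimal stand-in for the article's Car class.
    static class Car {
        private final String name;

        Car(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return "name='" + name + "'";
        }
    }

    static List<String> imperative(List<String> models) {
        List<String> result = new ArrayList<>();
        for (String model : models) {
            if (!model.equals("Fiat")) {
                result.add(new Car(model).toString());
            }
        }
        return result;
    }

    static List<String> functional(List<String> models) {
        return models.stream()
                .filter(model -> !model.equals("Fiat"))
                .map(Car::new)       // String -> Car
                .map(Car::toString)  // Car -> String
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");
        System.out.println(imperative(models));
        System.out.println(functional(models));
        System.out.println(imperative(models).equals(functional(models))); // true
    }
}
```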

Collecting with a Stream

Sometimes, you'll want to convert a Stream into a Collection or a Map. You can do this with the Collectors utility class and the collectors it offers:

List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");

List<Car> carList = models.stream()  
        .filter(model -> !model.equals("Fiat"))
        .map(Car::new)
        .collect(Collectors.toList());
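Collectors.toList() is only one of the available collectors. The sketch below shows three others from the same class (toSet, toMap, and joining); the surrounding class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsExample {

    static Set<String> toSet(List<String> models) {
        return models.stream().collect(Collectors.toSet());
    }

    static Map<String, Integer> nameLengths(List<String> models) {
        // Keys must be unique: this toMap variant throws
        // IllegalStateException if two elements map to the same key.
        return models.stream()
                .collect(Collectors.toMap(model -> model, String::length));
    }

    static String joined(List<String> models) {
        return models.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<String> models = Arrays.asList("BMW", "Audi", "Peugeot", "Fiat");
        System.out.println(toSet(models));
        System.out.println(nameLengths(models));
        System.out.println(joined(models)); // BMW, Audi, Peugeot, Fiat
    }
}
```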

Matching with a Stream

A classic task is to check whether objects meet certain criteria. We can do this by testing each element against a predicate. The example below assumes a Car class that also stores a year, with a matching two-argument constructor and a getYear() method:

List<Car> models = Arrays.asList(new Car("BMW", 2011), new Car("Audi", 2018), new Car("Peugeot", 2015));

boolean all = models.stream().allMatch(model -> model.getYear() > 2010);  
System.out.println("Are all of the models newer than 2010: " + all);

boolean any = models.stream().anyMatch(model -> model.getYear() > 2016);  
System.out.println("Are there any models newer than 2016: " + any);

boolean none = models.stream().noneMatch(model -> model.getYear() < 2010);  
System.out.println("Are none of the models older than 2010: " + none);

  • allMatch() - Returns true if all elements of this stream match the provided predicate.
  • anyMatch() - Returns true if any element of this stream matches the provided predicate.
  • noneMatch() - Returns true if no element of this stream matches the provided predicate.

In the preceding code example, every predicate is satisfied, so all three calls return true.
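A self-contained version of the same checks might look like the sketch below. The nested Car class with a year field is a hypothetical variant written for this example, as are the helper method names. The last lines show how the three methods behave on an empty stream, which is an easy case to get wrong:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchingExample {

    // Hypothetical variant of Car that also stores a year.
    static class Car {
        private final String name;
        private final int year;

        Car(String name, int year) {
            this.name = name;
            this.year = year;
        }

        int getYear() {
            return year;
        }
    }

    static List<Car> sampleCars() {
        return Arrays.asList(new Car("BMW", 2011), new Car("Audi", 2018), new Car("Peugeot", 2015));
    }

    static boolean allNewerThan(List<Car> cars, int year) {
        return cars.stream().allMatch(car -> car.getYear() > year);
    }

    static boolean anyNewerThan(List<Car> cars, int year) {
        return cars.stream().anyMatch(car -> car.getYear() > year);
    }

    static boolean noneOlderThan(List<Car> cars, int year) {
        return cars.stream().noneMatch(car -> car.getYear() < year);
    }

    public static void main(String[] args) {
        List<Car> cars = sampleCars();
        System.out.println(allNewerThan(cars, 2010));  // true
        System.out.println(anyNewerThan(cars, 2016));  // true
        System.out.println(noneOlderThan(cars, 2010)); // true

        // On an empty stream, allMatch and noneMatch return true, anyMatch returns false.
        List<Car> empty = Collections.emptyList();
        System.out.println(allNewerThan(empty, 2010));  // true
        System.out.println(anyNewerThan(empty, 2010));  // false
        System.out.println(noneOlderThan(empty, 2010)); // true
    }
}
```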

Conclusion

Many developers today are using Java 8, though not everybody is using Streams. The fact that Streams represent a newer approach and bring a touch of functional-style programming to Java, along with lambda expressions, doesn't necessarily make them a better approach. They simply offer another way of doing things. It's up to developers themselves to decide whether to rely on functional or imperative style programming. With enough practice, combining both styles can help you improve your software.

As always, we encourage you to check out the official documentation for additional information.

About Vuk Skobalj