Java 8 Streams: Guide to reduce()

Java 8 Streams: Guide to reduce()

Introduction

The reduce() method is Java 8's answer to the need for a fold implementation in the Stream API.

Folding is a very useful and common functional programming feature. It operates on a collection of elements to return a single result using some sort of operation.

Note: Folding is also known as reducing, aggregating, accumulating and compressing, and these terms all apply to the same concept.

That being said - it's one of the most malleable, flexible and applicable operation - and it's very commonly used to calculate aggregate results of collections and widely employed in one form or another in analytical and data-driven applications. The reduce() operation equips the Stream API with similar fold capabilities.

Thus, if you have some int values such as, say, [11, 22, 33, 44, 55], you could use reduce() to find their sum, amongst other results.

In functional programming, finding the sum of those numbers would apply steps such as these:

0 + 11 = 11
11 + 22 = 33
33 + 33 = 66
66 + 44 = 110
110 + 55 = 165

Using the reduce() method, this is achieved as:

int[] values = new int[]{11, 22, 33, 44, 55};

IntStream stream = Arrays.stream(values);
int sum = stream.reduce(0, (left, right) -> left + right);

The sum is:

165

The reduce() is straightforward enough. If you look at the functional routine, for example, you could call all the values on the left side of the + operator left; and those on the right, right. Then, after every sum operation, the result becomes the new left of the next summing.

Likewise, the Java's reduce() method does exactly what the functional routine does. It even includes a starting value, 0, which the functional routine has too.

Operation-wise, the reduce() method adds a left value to the next right value. Then it adds that sum to the next right value...and so on.

You could even visualize how reduce() implements folding on those values as:

((((0 + 11) + 22) + 33) + 44) + 55 = 165

The Stream API does not offer the folding capabilities of reduce() like in the example above only, though.

It goes all out to include its functional interfaces in three reduce() method implementations. As you will see in more detail in subsequent sections, the API offers reduce() in flavors such as:

T reduce(T identity, BinaryOperator<T> accumulator)

This version is what we used earlier. Where, 0 was the identity; and, (left, right) -> left + right) was the accumulator that implemented the BinaryOperator functional interface.

And:

Optional<T> reduce(BinaryOperator<T> accumulator)

And:

<U> U reduce(U identity,
             BiFunction<U,? super T,U> accumulator,
             BinaryOperator<U> combiner)

Note: The sum(), average(), max() and min() operations of the Stream API are reduction variations.

The sum(), max() and min() methods are essentially wrappers for the reduce() operation:

// Equivalent to stream.sum()
stream.reduce(0, Integer::sum);
// Equivalent to stream.max()
stream.reduce(0, Integer::max);
// Equivalent to stream.min()
stream.reduce(0, Integer::min);

In the proceeding sections, we'll dive deep into the reduce() method, its variants, use-cases and good practices, leaving you with a deeper understanding and appreciation for the underlying mechanism.

reduce() Flavors and Examples

The Stream API offers three reduce() operation variants. Let's go through each of them, their definitions and practical usage.

1. reduce() whose Result is the Same Type as the Stream's Elements

Method signature:

T reduce(T identity, BinaryOperator<T> accumulator)

Official documentation's definition:

Performs a reduction on the elements of this stream, using the provided identity value and an associative accumulation function, and returns the reduced value.

By now, we know how this type of reduce() operates. But, there is a little matter that you should be careful with when using this reduce() type. (Actually, with any reduction operation):

The associative nature of your reduce() implementation.

When you use reduce(), you should provide the possibility for your routines to run in a parallel settings as well. Reduction operations are not constrained to perform sequentially.

To this end, associativity is crucial because it will enable your accumulator to produce correct results regardless of the stream elements' encounter order. If associativity didn't hold here, the accumulator would be unreliable.

Case in point: say, you have three int values, [8, 5, 4].

Associativity demands operating on these values in any order should always produce matching results. For example:

(8 + 5) + 6 == 8 + (5 + 6)

Also, when parallelization occurs the accumulation may handle these values in even smaller units. For instance, take a stream which contains the values [7, 3, 5, 1]. A parallel stream may make the accumulation work in a fashion like:

7 + 3 + 5 + 1 == (7 + 3) + (5 + 1)

But, these demands effectively bar you from using some types of operations with the reduce() method. You cannot, for example, do subtraction operations with reduce(). That is because it would violate the associativity principle.

See, say you use the values from one of the previous examples: [8, 5, 4]. And then try to use reduce() to find their cumulative difference.

It would look something like this:

(8 - 5) - 6 != 8 - (5 - 6)

Otherwise, the identity parameter is another factor to be careful of. Choose an identity value, i, such that: for each element e in a stream, applying an operation op on it should always return e.

What this means is that:

e op identity = e

In case of addition, the identity is 0. In case of multiplication, the identity is 1 (as multiplication with 0 will always be 0, not e). In case of strings, the identity is a String, etc.

This operation can be functionally used in Java as:

IntStream intStream = IntStream.of(11, 22, 33, 44, 55);
Stream stringStream = Stream.of("Java", "Python", "JavaScript");

int sum = intStream.reduce(0, (left, right) -> left + right);
int max = intStream.reduce(0, Integer::max);
int min = intStream.reduce(0, Integer::min);

// Mapping elements to a stream of integers, thus the return type is the same type as the stream itself
int sumOfLengths = stringStream.mapToInt(String::length)
        .reduce(0, Integer::sum);

These reduce() calls were so common, that they were replaced with a higher-level call - sum(), min(), max(), and you could by all means use those instead of the reduce() calls, though keep in mind that they were modified to return Optional variants:

int sum = intStream.sum();
OptionalInt max = intStream.max();
OptionalInt min = intStream.min();

Where reduce() shines is in cases where you want any scalar result from any sequence - such as reducing a collection to an element that has the greatest length, which results in an Optional. We'll take a look at that now.

2. reduce() whose Result is an Optional

Method signature:

Optional<T> reduce(BinaryOperator<T> accumulator)

Official documentation's definition:

Performs a reduction on the elements of this stream, using an associative accumulation function, and returns an Optional describing the reduced value, if any.

Operationally, this is the simplest way of using the reduce() method. It asks for only one parameter. A BinaryOperator implementation, which would serve as an accumulator.

So, instead of this:

int sum = stream
        .reduce(0, (left, right) -> left + right);

You would only have to do this (i.e., leave out the identity value):

Optional<Integer> sum = stream
        .reduce((left, right) -> left + right);

The difference between the former and the latter is that in the latter the result may not contain any value.

That would occur when you pass an empty stream for evaluation, for example. Yet, that does not happen when you use an identity as one of the parameters because reduce() returns the identity itself as result when you offer it an empty stream.

Another example would be reducing collections to certain elements, such as reducing the stream created by several Strings to a single one:

List<String> langs = List.of("Java", "Python", "JavaScript");

Optional longest = langs.stream().reduce(
        (s1, s2) -> (s1.length() > s2.length()) ? s1 : s2);

What's going on here? We're streaming a list and reducing it. For each two elements (s1, s2), their lengths are compared, and based on the results, either s1 or s2 are returned, using the ternary operator.

The element with the greatest length will be propagated through these calls and the reduction will result in it being returned and packed into an Optional, if such an element exists:

longest.ifPresent(System.out::println);  

This results in:

JavaScript

3. reduce() which Uses a Combining Function

Method signature:

<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

Official documentation's definition:

Performs a reduction on the elements of this stream, using the provided identity, accumulation and combining functions.

While this definition seems straightforward enough, it hides a powerful capability.

This reduce() variant can allow you to process a result whose type does not match that of a stream's elements.

Haven't we done this before? Not really.

int sumOfLengths = stringStream
    .mapToInt(String::length)
    .reduce(0, Integer::sum);

The mapToInt() method returns an IntStream, so even though we start out with a stream of Strings - the reduce() method is called on an IntStream, and returns an integer, which is the type of the elements in the stream.

The mapToInt() is a quick hack that allowed us to "return a different type", though, it didn't really return a different type.

Take the case where you want to calculate the cumulative length of a paragraph of words, or the length of the words like we've had before.

That suggests that you may have a stream of String elements. Yet, you need the return type of the reduce() operation to have an int value to denote the length of the paragraph.

This is where the combiner comes into play:

String string = "Our Mathematical Universe: My Quest for the Ultimate Nature of Reality";
List<String> wordList = List.of(string.split(" "));

  int length = wordList
        .stream()
        .reduce(
                0,
                (parLength, word) -> parLength + word.length(),
                (parLength, otherParLength) -> parLength + otherParLength
        );

System.out.println(String.format("The sum length of all the words in the paragraph is %d", length));

This code sums the length of all strings in the paragraphs, broken down on each space (so whitespaces aren't included in the calculation) and results in:

The sum length of all the words in the paragraph is 60

The feature that is worth noting with this reduce() variant is that it serves parallelization pretty well.

Take the accumulator in the example:

(parLength, word) -> parLength + word.length()

The reduce() operation will call it multiple times, no doubt. Yet, in a parallelized stream there may end up being quite a few accumulators in the pipeline. And that is where the combiner function steps in.

The combiner function in the example is:

(parLength, otherParLength) -> parLength + otherParLength

It sums the results of the available accumulators to produce the final result.

And that allows the reduce() operation to break down a chunky process into many, smaller, and probably faster operations. This also segways us into the next significantly important topic - parallelization.

Using reduce() with Parallel Streams

You can turn any sequential stream into a parallel one by calling the parallel() method on it.

Likewise, let us consider a use case where you want to sum all the int values in a given range to test how reduce() works in parallel.

There are several ways of generating a sequence of int values within a given range using the Stream API:

  1. Using Stream.iterate
  2. Using IntStream.rangeClosed

Using Stream.iterate()

private final int max = 1_000_000;
Stream<Integer> iterateStream = Stream.iterate(1, number -> number + 1).limit(max);

Using IntStream.rangeClosed()

IntStream rangeClosedStream = IntStream.rangeClosed(1, max);

So, if we have these two ways of producing a stream of int values, is one more efficient than the other for our use case?

The answer is a resounding yes.

The Stream.iterate() is not as efficient as the IntStream.rangeClosed() when you apply the reduce() operation to them. We'll see why shortly.

When you use the two tactics to find the sum of numbers you would write code such as this:

Free eBook: Git Essentials

Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!

Integer iterateSum = iterateStream
            .parallel()
            .reduce(0, (number1, number2) -> number1 + number2);
int rangeClosedSum = rangeClosedStream
            .parallel()
            .reduce(0, (number1, number2) -> number1 + number2);

True, both ways will always produce matching and correct results.

If you set the variable max to 1,000,000, for example, you will get 1,784,293,664 from both reduce() methods.

Yet, calculating iterateSum is slower than rangeClosedSum.

The cause for this is the fact that Stream.iterate() applies unboxing and boxing to all the number values it encounters in its pipeline. For instance, notice that we supplied int values to it and it returned an Integer object as the result.

IntStream.rangeClosed() does not suffer from this shortcoming because it deals with int values directly and even returns an int value as a result, for example.

Here are some more tests on GitHub that illustrate this phenomenon. Clone that repo and run the tests to explore further for yourself how reduce() performs when running in Stream.iterate() and IntStream.rangeClosed().

When to not Use reduce()

The reduce() operation requires the use of a stateless and non-interfering accumulator.

That means that the accumulator should ideally be immutable. And, to achieve this, most accumulators create new objects to hold the value the next accumulation.

Take a case where you want to join several elements of String objects into one String object. Where you want to make a sentence out of several words, for example. Or, even a word by chaining several char values.

The official documentation offers one such example:

String concatenated = strings.reduce("", String::concat);

Here, the reduce() operation will create very many string objects if the strings stream has a large number of elements.

And, depending on how large the strings stream is, the performance will take a dip fast because of all the object allocation that is going on.

To get a clearer picture of how this operation works consider its for loop equivalent. Then, note how new String objects materialize with every loop pass:

String concatenated = "";
for (String string : strings) {    
    concatenated += string;
}

Yet, you could attempt to remedy the creation of new objects in reduce() operations by using mutable objects in the first place.

However, keep in mind that if you try to remedy that shortcoming by using a mutable identity container like a List we expose that container to ConcurrentModification exceptions.

Take a case where you want to reduce() a stream of int values into a List of Integer objects. You could do something like this:

Stream<Integer> numbersStream = Arrays.asList(12, 13, 14, 15, 16, 17).stream();
List<Integer> numbersList = numbersStream.reduce(
        // Identity
        new ArrayList<>(),
        // Accumulator
        (list, number) -> {
            list.add(number);
            return list;
       },
        // Combiner
        (list1, list2) -> {
            list1.addAll(list2);
            return list1;
        }
);

This code will give you a correct result:

[12, 13, 14, 15, 16, 17]

But, it will come at a cost.

First, the accumulator in this case is interfering with the identity. It is introducing a side effect by adding a value to the list acting as identity.

Then, if you happen to turn the stream, numbersStream, to a parallel one, you will expose the list accumulation to concurrent modification. And, this is bound to make the operation throw a ConcurrentModification at some point.

Thus, your entire reduce() operation may fail altogether.

Putting reduce() into Practice

Because of its functional nature, the Stream API demands a total rethinking of how we design Java code. It requires the use of methods that can fit in the patterns of functional interfaces that operations such as reduce() use.

As a result, we will design our code such that when we call the reduce() operation on it, it will result in terse code. One that can you rewrite with members references, for example.

But, first, let us explore the use case that we will use to test reduce() operations with.

  • We have a grocery store that sells various products. Examples include, cheese, tomatoes, and cucumbers.
  • Now, every product has attributes such as a name, price, and unit weight
  • Customers get products from the store through transactions.

As a manager of such a grocery store, you come in one day and ask the clerk a few questions:

  • How much money did you make from all your transactions?
  • How heavy were the sold items? That is, what was the cumulative weight of the products that you sold?
  • What was the value of the transaction that a customer paid the most for?
  • Which transaction had the lowest value (in terms of its total price value)?

Designing the Domain

We will create a class Product to represent the items that will that the grocery store will stock:

public class Product {

    private final String name;
    private final Price price;
    private final Weight weight;

    public Product(String name, Price price, Weight weight) {
        this.name = name;
        this.price = price;
        this.weight = weight;
    }

    // Getters
}

Notice that we have included two value classes as fields of Product named Weight and Price.

Yet, if we had wanted to do it naively, we would have made these two fields have double values.

Like this:

public Product(String name, double price, double weight) {    
    this.name = name;
    this.price = price;
    this.weight = weight;
}

There is an absolutely good reason for doing this, and you shall find out why soon. Otherwise, both Price and Weight are simple wrappers for double values:

public class Price {
    private final double value;
    
    public Price(double value) {
        this.value = value;
    }  
          
    //Getters
 }
 
public class Weight {
    private final double value;
    
    public Weight(double value) {
        this.value = value;
    }
    
    // Getters
}

Then, we have the Transaction class. This class will contain a Product and the int value which represents the quantity of the product that a customer will buy.

Thus, Transaction should be able to inform us the total Price and Weight of Product which a customer bought. It should thus include methods such as:

public class Transaction {
    private final Product product;
    private final int quantity;
    
    public Transaction(Product product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }    
    
    //Getters ommited 
    
    public Price getTotalPrice() {
        return this.product.getPrice().getTotal(quantity);
    }    
    
    public Weight getTotalWeight() { 
        return this.product.getWeight().getTotal(quantity);
    }
}

Note how the methods getTotalPrice() and getTotalWeight() delegate their calculations to Price and Weight.

These delegations are quite important, and the reason we used classes instead of simple double fields.

They suggest that Price and Weight should be able to do accumulations of their types.

And remember, the reduce() operation always takes a BinaryOperator as its accumulator. So, this is the juncture where we start to pre-build accumulators for our classes.

Thus, add the following methods to serve as accumulators for Price and Weight:

public class Price {
    // Fields, constructor, getters
    
    public Price add(Price otherPrice) {
        return new Price(value + otherPrice.getValue());
    }    
    
    public Price getTotal(int quantity) {
        return new Price(value * quantity);
    }
}

public class Weight {
    // Fields, constructor, getters

    public Weight add(Weight otherWeight) {
        return new Weight(value + otherWeight.getValue());
    }    
    
    public Weight getTotal(int quantity) { 
        return new Weight(value * quantity);
    }
}

There are variants of the reduce() operation which also require identity parameters. And since an identity is a starting point of a calculation (which may be the object with the lowest value), we should go ahead and create the identity versions of Price and Weight.

You could do this by simply including the identity versions of these classes as global variables. So, let's add the fields named NIL to Price and Weight:

public class Price {
    // Adding NIL
    public static final Price NIL = new Price(0.0);
    
    private final double value;
    public Price(double value) {
        this.value = value;
     }
}

public class Weight {
    // Adding NIL
    public static final Weight NIL = new Weight(0.0);  
     
    private final double value;
    public Weight(double value) {
        this.value = value;
    }
}

As the name NIL suggests, these fields represent Price or Weight which has the minimum value. With that done, it is time to create the Grocery object which will conduct the transactions:

public class Grocery {
    public static void main(String[] args) {
        //Inventory
        Product orange = new Product("Orange", new Price(2.99), new Weight(2.0));
        Product apple = new Product("Apple", new Price(1.99), new Weight(3.0));
        Product tomato = new Product("Tomato", new Price(3.49), new Weight(4.0));
        Product cucumber = new Product("Cucumber", new Price(2.29), new Weight(1.0));
        Product cheese = new Product("Cheese", new Price(9.99), new Weight(1.0));
        Product beef = new Product("Beef", new Price(7.99), new Weight(10.0));
        
        //Transactions
        List<Transaction> transactions = Arrays.asList(
                new Transaction(orange, 14),
                new Transaction(apple, 12),
                new Transaction(tomato, 5),
                new Transaction(cucumber, 15),
                new Transaction(cheese, 8),
                new Transaction(beef, 6)
        );
    }
}

As the code shows, the Grocery has few Product objects in its inventory. And, a few Transaction events occurred.

Still, the store's manager had asked for some data regarding the transactions. We should thus proceed to put reduce() to work to help us answer those queries.

Money Made from All Transactions

The total price of all the transactions is a result of summing the total price of all transactions.

Thus, we map() all the Transaction elements to their Price values first.

Then, we reduce the Price elements to a sum of their values.

Here, the abstraction of the accumulator into the Price object itself has made the code highly readable. Also, the inclusion of the Price.NIL identity has made the reduce() operation read as functionally as possible:

Price totalPrice = transactions.stream()
                .map(Transaction::getTotalPrice)
                .reduce(Price.NIL, Price::add);
                
System.out.printf("Total price of all transactions: %s\n", totalPrice);

After running that code snippet, the output you should expect is:

Total price of all transactions: $245.40

Note too that we delegate the printing of the price value to the Print object's toString() method to simplify debugging further:

Using the toString() method to provide human friendly description of an object's value is always good practice.

@Override
public String toString() {
    return String.format("$%.2f", value);
}
Total Weight of All Sold Products

Similar to what we did with Price, here we task Weight with summing the values of several elements.

Of course we need map() each Transaction element in the pipeline to a Weight object first.

Then we task the Weight elements with doing the accumulation of their values themselves:

Weight totalWeight = transactions.stream()
                .map(Transaction::getTotalWeight)
                .reduce(Weight.NIL, Weight::add);

System.out.printf("Total weight of all sold products: %s\n", totalWeight);

On running this snippet, you should an output such as:

Total weight of all sold products: 167.00 lbs
Price of Highest Value Transaction

This query demands a bit of a redesign of how a Price finds a minimum or maximum value between two Price elements.

Remember, in the preceding tasks, all we did was accumulate the values when executing reduce(). However, finding a minimum or maximum value is another matter altogether.

Whereas we did summing with previous accumulations, here we have to start with the value of the first Price element. Then we will replace it with another value if that value is greater than what we have. Thus, in the end, we end up with the highest value. This logic applies to when you are seeking the minimum value too.

Hence, include this code to calculate your max and min values for Price elements:

public class Price {
    // Fields, getters, constructors, other methods
    
    public Price getMin(Price otherPrice){
        return new Price(Double.min(value, otherPrice.getValue()));
    }
    
    public Price getMax(Price otherPrice){
          return new Price(Double.max(value, otherPrice.getValue()));
    }
}

And when you include these capabilities in your Grocery objects calculations, you will get a reduce() operation that looks like this:

transactions.stream()
        .map(Transaction::getTotalPrice)
        .reduce(Price::getMax)
        .ifPresent(price -> System.out.printf("Highest transaction price: %s\n", price));

With an output of:

Highest transaction price: $79.92

Note too, that we have used the reduce() variant that takes only one parameter: a BinaryOperator. The thinking is: we do not need an identity parameter because we will not need a default starting point for this operation.

When you are seeking the maximum value from a collection of elements, you start testing those elements directly without involving any external default value.

Lowest Value Transaction

Continuing with the trend we started with the preceding tasks, we delegate the query on which is the lowest value transaction to the Transaction elements themselves.

Further, because we need a result that contains an entire Transaction element's details, we direct all the interrogation to a stream of Transaction elements without mapping them into any other type.

Still, there is a bit of work you should do to make a Transaction element gauge its value in terms of Price.

First, you will need to find the minimum Price of two Transaction objects.

Then, check which Transaction had that minimum Price and return it.

Otherwise, you will accomplish that by using a routine such as this getMin method:

public class Transaction {
    // Fields, getters, constructors, other methods
    
    public Transaction getMin(Transaction otherTransaction) {
        Price min = this.getTotalPrice().getMin(otherTransaction.getTotalPrice());
        return min.equals(this.getTotalPrice()) ? this : otherTransaction;
    }
}

With that done, it becomes fairly simple to incorporate the routine into a reduce() operation such as this one:

transactions.stream()
        .reduce(Transaction::getMin)
        .ifPresent(transaction -> {
                System.out.printf("Transaction with lowest value: %s\n", transaction);
        });

To get an output of:

Transaction with lowest value { Product: Tomato; price: $3.49 Qty: 5 lbs Total price: $17.45}

Again, an output such as this one is attainable when you exploit the toString() fully. Use it to generate as much information as possible to make an object's value human friendly when you print it out.

Conclusion

As Java's implementation of the commonplace fold routine, reduce() is fairly effective. Yet, as we have seen, it demands a total rethink of how you design your classes to be able to exploit it fully.

Keep in mind, though, that reduce() can lower your code's performance if you use it wrongly. The operation works in both sequential and parallel streams. It can however get tricky in when you use it with huge streams because reduce() is not efficient in mutable reduction operations.

We saw a case, for example, where you could use reduce() to concatenate String elements. Remember String objects are immutable. Thus, when we used reduce() for accumulation, we actually created very many String objects in every accumulation pass.

Yet, if you try to remedy that shortcoming by using a mutable identity container like a List we exposed that container to ConcurrentModification exceptions.

Otherwise, we have explored a use case of a grocery store's transactions. We designed the code for this scenario in such a way that every accumulation carries out small and fast calculations.

Yes, new object allocations are still there for every accumulation we call with reduce(). But, we have made them as simple as possible. As a result, our implementation can work just as well when you parallelize the Transaction streams.

The code used for this article comes complete with unit tests. So, feel free to explore the code and its inner workings on GitHub.

Last Updated: October 21st, 2021
Was this article helpful?

Improve your dev skills!

Get tutorials, guides, and dev jobs in your inbox.

No spam ever. Unsubscribe at any time. Read our Privacy Policy.

Hiram KamauAuthor

In addition to catching code errors and going through debugging hell, I also obsess over whether writing in an active voice is truly better than doing it in passive.

Want a remote job?

    Prepping for an interview?

    • Improve your skills by solving one coding problem every day
    • Get the solutions the next morning via email
    • Practice on actual problems asked by top companies, like:
     
     
     

    Better understand your data with visualizations

    With over 330+ pages, you'll learn the ins and outs of visualizing data in Python with popular libraries like Matplotlib, Seaborn, Bokeh, and more.

    © 2013-2021 Stack Abuse. All rights reserved.