Guide to Java 8 Collectors: reducing()

Introduction

A stream represents a sequence of elements and supports different kinds of operations that lead to the desired result. The source of a stream is usually a Collection or an Array, from which data is streamed.

Streams differ from collections in several ways; most notably in that the streams are not a data structure that stores elements. They're functional in nature, and it's worth noting that operations on a stream produce a result and typically return another stream, but do not modify its source.

To "solidify" the changes, you collect the elements of a stream back into a Collection.

In this guide, we'll take a look at how to reduce elements through a downstream collector, with the help of Collectors.reducing().

Reduction operations are among the most common and powerful operations in Functional Programming. You can also reduce elements via the reduce() method - however, it's typically associated with reducing a whole stream to a single value. reducing(), on the other hand, is a collector, and is typically used to produce one reduced value per group as part of a larger collecting operation.

Note: Both approaches can be used to produce lists of reduced values, as well. In general - you'll use map() and reduce() if you're reducing a stream from the get-go into a result, and you'll use reducing() as a downstream collector within an operation pipeline with other collectors and operations.

If you'd like to read more about reduce() - read our Java 8 Streams: Definitive Guide to reduce()!

Collectors and Stream.collect()

Collectors represent implementations of the Collector interface, which implements various useful reduction operations, such as accumulating elements into collections, summarizing elements based on a specific parameter, etc.

All predefined implementations can be found within the Collectors class.

You can also implement your own collector and use it instead of the predefined ones. That said, you can get pretty far with the built-in collectors, as they cover the vast majority of cases in which you might want to use them.

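The examples in this guide refer to the class by name (for instance, Collectors.groupingBy()), so we need to import it:

import java.util.stream.Collectors;

Alternatively, a static import (import static java.util.stream.Collectors.*;) lets you call its methods without the Collectors. prefix.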

Stream.collect() performs a mutable reduction operation on the elements of the stream.

A mutable reduction operation collects input elements into a mutable container, such as a Collection, as it processes the elements of the stream.
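
To make the idea of a mutable reduction more concrete, here's a minimal sketch (the class and method names are ours, purely for illustration). It collects the same stream twice: once with the predefined toList() collector, and once by spelling out the container supplier, the accumulator and the combiner by hand:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MutableReductionSketch {

    // Collects with the predefined toList() collector.
    public static List<String> withCollector() {
        return Stream.of("John", "Mike", "Kyle")
                .collect(Collectors.toList());
    }

    // The same mutable reduction spelled out: the supplier creates the container,
    // the accumulator adds one element to it, and the combiner merges two
    // partial containers (only needed for parallel streams).
    public static List<String> byHand() {
        return Stream.of("John", "Mike", "Kyle")
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        System.out.println(withCollector()); // [John, Mike, Kyle]
        System.out.println(byHand());        // [John, Mike, Kyle]
    }
}
```

Both calls fill a mutable list as the elements flow through, which is exactly what sets collect() apart from reduce(), where each step produces a new value.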

Guide to Collectors.reducing()

Within the Collectors class is a vast number of methods, allowing us to collect streams in a myriad of ways. As reduction is a very common operation - the class offers a reducing() method that combines the elements of a stream into a single reduced result.

There are three different overloaded variants of this method. They differ from each other by the number of arguments they take in, what those arguments do, as well as the return value. We'll be discussing them all separately in detail as we go along in this guide.

The arguments are the ones you'd expect from a reduction operation, and closely mirror the ones reduce() uses:

public static <T> Collector<T,?,Optional<T>> reducing(BinaryOperator<T> op)
    
public static <T> Collector<T,?,T> reducing(T identity, BinaryOperator<T> op)
    
public static <T,U> Collector<T,?,U> reducing(U identity,
                                              Function<? super T,? extends U> mapper,
                                              BinaryOperator<U> op)

Note: The generic T in the method signatures represents the type of the input elements we're working with. The generic U in the third method signature represents the type of the mapped values.

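In essence - you're dealing with the identity, mapper and combiner. The identity is the value that, when combined with any other value through the operator, leaves that value unchanged (such as 0 for addition). It's also the result when there are no elements to reduce. The mapper maps the objects we're reducing to another value - commonly one of the fields of the object. The combiner (the BinaryOperator named op in the signatures above) merges two values into one, and is applied repeatedly until a single result remains.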
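
As a quick illustration of the three parts working together, here's a small sketch that sums the lengths of a few strings. The class name, method name and sample words are our own assumptions, not part of the Collectors API:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class ReducingPartsSketch {

    // Total number of characters across all words.
    public static int totalLength(List<String> words) {
        return words.stream().collect(Collectors.reducing(
                0,               // identity: 0 + x == x for any x
                String::length,  // mapper: turns each String into an Integer
                Integer::sum));  // op: combines two mapped values into one
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("Miami", "New York"))); // 13
        System.out.println(totalLength(Collections.emptyList()));           // 0
    }
}
```

With no input elements, the identity itself comes back, which is why it has to be a neutral value for the chosen operator.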

The reducing() collector is most useful when used in a multi-level reduction operation, downstream of groupingBy() or partitioningBy(). Otherwise, we could reasonably substitute it with Stream.map() and Stream.reduce() to perform a simple map-reduce on a stream instead.

If you're unfamiliar with these two collectors, read our Guide to Java 8 Collectors: groupingBy() and Guide to Java 8 Collectors: partitioningBy()!

Before we jump in and cover the different overloads of reducing(), let's go ahead and define a Student class that we'll be reducing in the upcoming examples:

public class Student {
    private String name;
    private String city;
    private double avgGrade;
    private int age;
    
    // Constructor, getters, setters and toString()
}

Let's also instantiate our students in a List:

List<Student> students = Arrays.asList(
    new Student("John Smith", "Miami", 7.38, 19),
    new Student("Mike Miles", "New York", 8.4, 21),
    new Student("Michael Peterson", "New York", 7.5, 20),
    new Student("James Robertson", "Miami", 9.1, 20),
    new Student("Joe Murray", "New York", 7.9, 19),
    new Student("Kyle Miller", "Miami", 9.83, 20)
);

Collectors.reducing() with a BinaryOperator

The first overload of the reducing() method takes in only one parameter - BinaryOperator<T> op. This parameter, as the name implies, represents an operation used to reduce the input elements.

A BinaryOperator is a functional interface so it can be used as the assignment target for a lambda expression or a method reference. Besides the apply() method it inherits from BiFunction, BinaryOperator declares two static methods - maxBy() and minBy(), both of which take a Comparator. The return value of these two methods is a BinaryOperator that returns the greater/lesser of the two elements.

In simpler terms - it accepts two inputs, and returns one output, based on some criteria.
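
Here's a minimal sketch of what that looks like in code, assuming a few constants of our own naming. One operator is written as a lambda, and the other two are built with maxBy() and minBy():

```java
import java.util.Comparator;
import java.util.function.BinaryOperator;

public class BinaryOperatorSketch {

    // A BinaryOperator written as a lambda: two inputs, one output of the same type.
    public static final BinaryOperator<Integer> SUM = (a, b) -> a + b;

    // Operators built from a Comparator via the static factory methods.
    public static final BinaryOperator<String> LONGER =
            BinaryOperator.maxBy(Comparator.comparingInt(String::length));
    public static final BinaryOperator<String> SHORTER =
            BinaryOperator.minBy(Comparator.comparingInt(String::length));

    public static void main(String[] args) {
        System.out.println(SUM.apply(2, 3));                     // 5
        System.out.println(LONGER.apply("Miami", "New York"));   // New York
        System.out.println(SHORTER.apply("Miami", "New York"));  // Miami
    }
}
```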

If you'd like to read more about Functional Interfaces and Lambda Expressions - read our Guide to Functional Interfaces and Lambda Expressions in Java!

Let's assume that from within our List of students we want to find the student with the best and worst grades in their respective city. We'll first need to use a collector that accepts another downstream collector, such as the partitioningBy() or groupingBy() collectors, after which we'll use the reducing() method to perform the required reduction.

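Of course, we could also reduce the whole stream from the get-go via Stream.reduce() without grouping first. Here, though, we group the students by city and reduce each group to its highest-graded student: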

Map<String, Optional<Student>> reduceByCityAvgGrade = students.stream()
    .collect(Collectors
             .groupingBy(Student::getCity,
             Collectors.reducing(BinaryOperator
                                 .maxBy(Comparator
                                          .comparing(Student::getAvgGrade)))));

The student List is transformed into a Stream using the stream() method, after which we collect the elements into groups by city, reducing() the students in each city to the single student with the highest grade. Used on its own, this variant of reducing() produces an Optional<T>; as a downstream collector of groupingBy(), it gives us a Map<K, Optional<T>> - here, a Map<String, Optional<Student>>.

After running this code, we get the following output:

{
New York=Optional[Student{name='Mike Miles', city='New York', avgGrade=8.4, age=21}], Miami=Optional[Student{name='Kyle Miller', city='Miami', avgGrade=9.83, age=20}]
}
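
Finding the student with the worst grade in each city only requires swapping maxBy() for minBy(). The sketch below is self-contained, so it uses a trimmed-down stand-in for the Student class (without the age field) and helper names of our own choosing:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class WorstGradePerCitySketch {

    // Minimal stand-in for the article's Student class (age omitted for brevity).
    public static class Student {
        private final String name;
        private final String city;
        private final double avgGrade;

        public Student(String name, String city, double avgGrade) {
            this.name = name;
            this.city = city;
            this.avgGrade = avgGrade;
        }

        public String getName() { return name; }
        public String getCity() { return city; }
        public double getAvgGrade() { return avgGrade; }
    }

    // Same students as in the list above.
    public static List<Student> sampleStudents() {
        return Arrays.asList(
                new Student("John Smith", "Miami", 7.38),
                new Student("Mike Miles", "New York", 8.4),
                new Student("Michael Peterson", "New York", 7.5),
                new Student("James Robertson", "Miami", 9.1),
                new Student("Joe Murray", "New York", 7.9),
                new Student("Kyle Miller", "Miami", 9.83));
    }

    // Groups by city, then keeps the lowest-graded student of each group.
    public static Map<String, Optional<Student>> worstPerCity(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.reducing(BinaryOperator.minBy(
                                Comparator.comparingDouble(Student::getAvgGrade)))));
    }

    public static void main(String[] args) {
        worstPerCity(sampleStudents()).forEach((city, student) ->
                System.out.println(city + " -> "
                        + student.map(Student::getName).orElse("none")));
    }
}
```

For our sample data, Miami maps to John Smith (7.38) and New York maps to Michael Peterson (7.5).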

Collectors.reducing() with a BinaryOperator and Identity

In the previous code sample - the result is wrapped in an Optional. If a reduction receives no elements, Optional.empty() is returned, because there is no default value that could be returned instead.

To deal with this, and remove the Optional wrapping, we can use the second variant of the reducing() overload, the one that takes in two arguments - a BinaryOperator and an Identity. The Identity represents the starting value for the reduction, and also the value that is returned when there are no input elements!
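
The difference between the two variants is easiest to see on an empty input. Here's a minimal sketch with plain integers, where the class and method names are just illustrative assumptions:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class EmptyReductionSketch {

    // One-argument variant: the result is wrapped in an Optional.
    public static Optional<Integer> maxWithoutIdentity(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.reducing(Integer::max));
    }

    // Two-argument variant: the identity doubles as the result for empty input.
    public static Integer maxWithIdentity(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.reducing(Integer.MIN_VALUE, Integer::max));
    }

    public static void main(String[] args) {
        List<Integer> none = Collections.emptyList();
        List<Integer> some = Arrays.asList(3, 9, 4);

        System.out.println(maxWithoutIdentity(none)); // Optional.empty
        System.out.println(maxWithoutIdentity(some)); // Optional[9]
        System.out.println(maxWithIdentity(none));    // -2147483648
        System.out.println(maxWithIdentity(some));    // 9
    }
}
```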

This time around, we pass in a 'default' value that kicks in if a value isn't present, and is used as the identity of the result:

Map<String, Student> reduceByCityAvgGrade = students.stream()
    .collect(Collectors
             .groupingBy(Student::getCity,
                         Collectors.reducing(new Student("x", "x", 0.0, 0),
                                 BinaryOperator.maxBy(Comparator
                                          .comparing(Student::getAvgGrade)))));

In our case, for Identity we use a new Student object. The name, city and age fields have no impact on our result while using the reducing() method, so it doesn't really matter what we put as these three values. However, as we are reducing our input data by the avgGrade field, that one matters: since we keep the maximum, the identity's grade must not be higher than any real grade, or the placeholder would win the comparison.

We've put a 0.0 grade as the default one, with "x" for the name and city, denoting an empty result. The lowest grade can be 6.0, so a 0.0 grade and the placeholder name signal an empty value - but we can actually expect Student objects instead of Optionals now:

{
New York=Student{name='Mike Miles', city='New York', avgGrade=8.4, age=21},
Miami=Student{name='Kyle Miller', city='Miami', avgGrade=9.83, age=20}
}
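
With groupingBy(), a group only exists if it has at least one element, so the default value never actually shows up in the result above. It does show up with partitioningBy(), which always returns both the true and the false partition. The sketch below works on plain grades rather than Student objects to stay short; the method name and the threshold parameter are our own assumptions:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class IdentityFallbackSketch {

    // Splits the grades at the given threshold and keeps the highest grade
    // on each side; an empty side falls back to the identity, 0.0.
    public static Map<Boolean, Double> bestPerPartition(List<Double> grades,
                                                        double threshold) {
        return grades.stream()
                .collect(Collectors.partitioningBy(
                        grade -> grade >= threshold,
                        Collectors.reducing(0.0, Double::max)));
    }

    public static void main(String[] args) {
        List<Double> grades = Arrays.asList(7.38, 8.4, 7.5, 9.1, 7.9, 9.83);

        System.out.println(bestPerPartition(grades, 9.0));  // {false=8.4, true=9.83}
        System.out.println(bestPerPartition(grades, 10.0)); // {false=9.83, true=0.0}
    }
}
```

In the second call no grade reaches the threshold, so the true partition is empty and simply reports the identity.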

Collectors.reducing() with a BinaryOperator, Identity and Mapper

The last of the three overloaded variants takes in one extra argument in addition to the previous two - a mapper. This argument represents a mapping function to apply to each element.

You don't have to group by a city to perform the reducing() operation:

double largestAverageGrade = students.stream()
    .collect(Collectors.reducing(0.0, Student::getAvgGrade,
                                 BinaryOperator.maxBy(Comparator.comparingDouble(value -> value))));

This would return 9.83, which is indeed the largest avgGrade among the students in the List. However, if you're using an IDE or a tool that detects code smells, you'll quickly get a recommendation to change the above line into the following:

double largestAverageGrade = students.stream()
    .map(Student::getAvgGrade)
    .reduce(0.0, BinaryOperator.maxBy(Comparator.comparingDouble(value -> value)));

map() and reduce() are preferred if you're not really doing anything else. reducing() is preferred as a downstream collector.

With a mapper - you can transform each element before it's reduced. Commonly, you'll map objects to one of their fields. We can map Student objects to their names, cities or grades, for instance. In the following code snippet, we'll group students by their city, map each student to their grade, and then reduce the grades in each city to the highest one, resulting in a single value per city:

Map<String, Double> reduceByCityAvgGrade1 = students.stream()
    .collect(Collectors
             .groupingBy(Student::getCity,
                         Collectors.reducing(6.0, Student::getAvgGrade,
                                 BinaryOperator.maxBy(Comparator
                                          .comparingDouble(i->i)))));

This gives us a slightly different output than we had earlier:

{New York=8.4, Miami=9.83}
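
The three-argument variant can also be thought of as a mapping() collector wrapped around the two-argument reducing(). The sketch below compares both forms, again using a trimmed-down stand-in for Student (only city and avgGrade) and method names of our own choosing, with Double::max swapped in for the maxBy() comparator to keep it short:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MapperVsMappingSketch {

    // Minimal stand-in for the article's Student class.
    public static class Student {
        private final String city;
        private final double avgGrade;

        public Student(String city, double avgGrade) {
            this.city = city;
            this.avgGrade = avgGrade;
        }

        public String getCity() { return city; }
        public double getAvgGrade() { return avgGrade; }
    }

    // Same cities and grades as the students in the list above.
    public static List<Student> sampleStudents() {
        return Arrays.asList(
                new Student("Miami", 7.38), new Student("New York", 8.4),
                new Student("New York", 7.5), new Student("Miami", 9.1),
                new Student("New York", 7.9), new Student("Miami", 9.83));
    }

    // Mapper passed directly to reducing().
    public static Map<String, Double> withMapper(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.reducing(6.0, Student::getAvgGrade, Double::max)));
    }

    // Equivalent form: map first with mapping(), then reduce without a mapper.
    public static Map<String, Double> withMapping(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.mapping(Student::getAvgGrade,
                                Collectors.reducing(6.0, Double::max))));
    }

    public static void main(String[] args) {
        System.out.println(withMapper(sampleStudents()));
        System.out.println(withMapping(sampleStudents()));
    }
}
```

Both methods produce the same map, with 8.4 for New York and 9.83 for Miami.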

Considering the number of collectors you can chain this way - you can do a lot of work using just the built-in collectors and stream operations.

Conclusion

In this guide we've covered the usage of the reducing() method from the Collectors class. We covered all three of its overloads and discussed their usages through practical examples.

Last Updated: March 29th, 2023

© 2013-2024 Stack Abuse. All rights reserved.
