Introduction
A stream represents a sequence of elements and supports different kinds of operations that lead to the desired result. The source of a stream is usually a Collection or an Array, from which the data is streamed.
Streams differ from collections in several ways; most notably, streams are not a data structure that stores elements. They're functional in nature: operations on a stream produce a result and typically return another stream, but do not modify the source.
To "solidify" the changes, you collect the elements of a stream back into a Collection.
In this guide, we'll take a look at how to reduce elements through a downstream collector, with the help of Collectors.reducing().
Reduction operations are among the most common and powerful operations in Functional Programming. You can also reduce elements via the reduce() method, which is typically associated with reducing a stream to a single value. reducing(), on the other hand, is typically associated with collecting a stream into a collection of reduced values, such as one value per group.
Note: Both approaches can be used to produce lists of reduced values as well. In general, you'll use map() and reduce() if you're reducing a stream from the get-go into a result, and you'll use reducing() as a downstream collector within an operation pipeline with other collectors and operations.
If you'd like to read more about reduce(), read our Java 8 Streams: Definitive Guide to reduce()!
Collectors and Stream.collect()
Collectors are implementations of the Collector interface that perform various useful reduction operations, such as accumulating elements into collections, summarizing elements based on a specific parameter, etc.
All predefined implementations can be found within the Collectors class.
You can also implement your own collector and use it instead of the predefined ones, though you can get pretty far with the built-in collectors, as they cover the vast majority of use cases.
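To be able to use the class in our code, we need to import it. The examples in this guide call its methods through the class name, so a regular import is what they need:
import java.util.stream.Collectors;
With a static import (import static java.util.stream.Collectors.*;) you could instead call the methods without the Collectors prefix.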
Stream.collect() performs a mutable reduction operation on the elements of the stream.
A mutable reduction operation collects input elements into a mutable container, such as a Collection, as it processes the elements of the stream.
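As a rough illustration of mutable reduction, the sketch below collects the same three strings in three ways: with the explicit supplier/accumulator/combiner form of collect(), with the predefined Collectors.toList(), and with a small hand-written collector built through Collector.of(). The class name, method names and the pipe-joining collector are made up for this example and are not part of the original guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, used only for illustration.
class MutableReductionDemo {

    // Explicit form: the supplier creates the container, the accumulator adds
    // one element, and the combiner merges two partial containers.
    static List<String> collectExplicitly() {
        return Stream.of("a", "b", "c")
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // The same reduction expressed with a predefined collector.
    static List<String> collectWithBuiltIn() {
        return Stream.of("a", "b", "c").collect(Collectors.toList());
    }

    // A custom collector that joins strings with '|' in a StringBuilder.
    static String collectWithCustomCollector() {
        Collector<String, StringBuilder, String> pipeJoiner = Collector.of(
                StringBuilder::new,
                (sb, s) -> {
                    if (sb.length() > 0) {
                        sb.append('|');
                    }
                    sb.append(s);
                },
                (left, right) -> {
                    if (left.length() > 0 && right.length() > 0) {
                        left.append('|');
                    }
                    return left.append(right);
                },
                StringBuilder::toString);
        return Stream.of("a", "b", "c").collect(pipeJoiner);
    }

    public static void main(String[] args) {
        System.out.println(collectExplicitly());
        System.out.println(collectWithBuiltIn());
        System.out.println(collectWithCustomCollector()); // a|b|c
    }
}
```

In practice the built-in Collectors.joining("|") already covers the last case; the custom version is only there to show the moving parts of a collector.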
Guide to Collectors.reducing()
The Collectors class contains a large number of methods, allowing us to collect streams in a myriad of ways. As reduction is a very common operation, it offers a reducing() method that performs a reduction over all the elements it receives and returns the reduced result.
There are three different overloaded variants of this method. They differ from each other by the number of arguments they take in, what those arguments do, as well as the return value. We'll be discussing them all separately in detail as we go along in this guide.
The arguments are the ones you'd expect from a reduction operation, and closely mirror the ones reduce() uses:
public static <T> Collector<T,?,Optional<T>> reducing(BinaryOperator<T> op)
public static <T> Collector<T,?,T> reducing(T identity, BinaryOperator<T> op)
public static <T,U> Collector<T,?,U> reducing(U identity,
                                              Function<? super T,? extends U> mapper,
                                              BinaryOperator<U> op)
Note: The generic T in the method signatures represents the type of the input elements we're working with. The generic U in the third method signature represents the type of the mapped values.
In essence, you're dealing with the identity, the mapper and the combining operator. The identity is the value that leaves any other value unchanged when combined with it through the operator, such as 0 for addition; it's also the result when there are no input elements. The mapper maps the objects we're reducing to another value, commonly one of the fields of the object. The operator (op) combines two values into one, and is applied repeatedly until a single result remains.
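To make the three roles concrete, here is a minimal sketch that applies each overload to a small list of words. The list, class name and method names are assumptions chosen for this example only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class ReducingOverloadsDemo {

    static final List<String> WORDS = Arrays.asList("stream", "map", "collector");

    // 1) Operator only: there is no identity, so the result is an Optional.
    static Optional<String> longestWord() {
        return WORDS.stream()
                .collect(Collectors.reducing((a, b) -> a.length() >= b.length() ? a : b));
    }

    // 2) Identity + operator: "" is the identity for string concatenation.
    static String concatenated() {
        return WORDS.stream().collect(Collectors.reducing("", String::concat));
    }

    // 3) Identity + mapper + operator: map each word to its length, then sum.
    static int totalLength() {
        return WORDS.stream()
                .collect(Collectors.reducing(0, String::length, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(longestWord());  // Optional[collector]
        System.out.println(concatenated()); // streammapcollector
        System.out.println(totalLength());  // 18
    }
}
```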
The reducing() collector is most useful when used in a multi-level reduction operation, downstream of groupingBy() or partitioningBy(). Otherwise, we could reasonably substitute it with Stream.map() and Stream.reduce() to perform a simple map-reduce on a stream instead.
If you're unfamiliar with these two collectors, read our Guide to Java 8 Collectors: groupingBy() and Guide to Java 8 Collectors: partitioningBy()!
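As a small sketch of the downstream use case, the following example partitions integers into even and odd and sums each partition with reducing(), next to the plain Stream.reduce() that would do when no partitioning is needed. The class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class PartitionedReductionDemo {

    // Multi-level reduction: partition first, then reduce each partition.
    static Map<Boolean, Integer> sumByParity(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(n -> n % 2 == 0,
                        Collectors.reducing(0, Integer::sum)));
    }

    // Single-level reduction: Stream.reduce() is enough on its own.
    static int sumAll(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        Map<Boolean, Integer> sums = sumByParity(numbers);
        System.out.println("even: " + sums.get(true));  // even: 12
        System.out.println("odd: " + sums.get(false));  // odd: 9
        System.out.println("all: " + sumAll(numbers));  // all: 21
    }
}
```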
Before we jump in and cover the different overloads of reducing(), let's go ahead and define a Student class that we'll be reducing in the upcoming examples:
public class Student {
    private String name;
    private String city;
    private double avgGrade;
    private int age;

    // Constructor, getters, setters and toString()
}
Let's also instantiate our students in a List:
List<Student> students = Arrays.asList(
    new Student("John Smith", "Miami", 7.38, 19),
    new Student("Mike Miles", "New York", 8.4, 21),
    new Student("Michael Peterson", "New York", 7.5, 20),
    new Student("James Robertson", "Miami", 9.1, 20),
    new Student("Joe Murray", "New York", 7.9, 19),
    new Student("Kyle Miller", "Miami", 9.83, 20)
);
Collectors.reducing() with a BinaryOperator
The first overload of the reducing() method takes in only one parameter - BinaryOperator<T> op. This parameter, as the name implies, represents an operation used to reduce the input elements.
A BinaryOperator is a functional interface, so it can be used as the assignment target for a lambda expression or a method reference. It also provides two static factory methods, maxBy() and minBy(), both of which take a Comparator. Each returns a BinaryOperator that returns the greater or the lesser of its two arguments, according to that Comparator.
In simpler terms, it accepts two inputs of the same type and returns one output of that type, based on some criteria.
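A brief sketch of the different ways to obtain a BinaryOperator: a lambda, a method reference, and the maxBy() and minBy() factories. The constant names and the length-based comparator are assumptions made for this example.

```java
import java.util.Comparator;
import java.util.function.BinaryOperator;

// Hypothetical demo class, used only for illustration.
class BinaryOperatorDemo {

    // A lambda: two Integers in, one Integer out.
    static final BinaryOperator<Integer> SUM = (a, b) -> a + b;

    // A method reference to an existing two-argument operation.
    static final BinaryOperator<String> CONCAT = String::concat;

    // maxBy()/minBy() derive the operator from a Comparator.
    static final BinaryOperator<String> LONGER =
            BinaryOperator.maxBy(Comparator.comparingInt(String::length));
    static final BinaryOperator<String> SHORTER =
            BinaryOperator.minBy(Comparator.comparingInt(String::length));

    public static void main(String[] args) {
        System.out.println(SUM.apply(2, 3));                // 5
        System.out.println(CONCAT.apply("re", "duce"));     // reduce
        System.out.println(LONGER.apply("map", "stream"));  // stream
        System.out.println(SHORTER.apply("map", "stream")); // map
    }
}
```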
If you'd like to read more about Functional Interfaces and Lambda Expressions - read our Guide to Functional Interfaces and Lambda Expressions in Java!
Let's assume that from within our List of students we want to find the student with the best grade in each city (the worst one can be found the same way with minBy()). We'll first need to use a collector that accepts another downstream collector, such as the partitioningBy() or groupingBy() collectors, after which we'll use the reducing() method to perform the required reduction.
If we didn't need the grouping, we could reduce the students from the get-go via Stream.reduce(). Since we want one result per city, we group them first and reduce each group:
Map<String, Optional<Student>> reduceByCityAvgGrade = students.stream()
    .collect(Collectors
        .groupingBy(Student::getCity,
            Collectors.reducing(BinaryOperator
                .maxBy(Comparator
                    .comparing(Student::getAvgGrade)))));
The student List is transformed into a Stream using the stream() method, after which we group the elements by city, reducing() the students in each city to the single student with the highest grade. This overload of reducing() always produces an Optional<T>, so as a downstream collector of groupingBy() it yields a Map<String, Optional<Student>> here.
After running this code, we get the following output:
{
New York=Optional[Student{name='Mike Miles', city='New York', avgGrade=8.4, age=21}], Miami=Optional[Student{name='Kyle Miller', city='Miami', avgGrade=9.83, age=20}]
}
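For the opposite case, a sketch along the same lines can swap maxBy() for minBy() to find the lowest-graded student per city, and then unwrap the Optional values. The nested Student class below is a minimal stand-in whose constructor and getters are assumed from the usage in this guide, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class WorstGradePerCityDemo {

    // Minimal stand-in for the guide's Student class (assumed shape).
    static final class Student {
        private final String name;
        private final String city;
        private final double avgGrade;
        private final int age;

        Student(String name, String city, double avgGrade, int age) {
            this.name = name;
            this.city = city;
            this.avgGrade = avgGrade;
            this.age = age;
        }

        String getName() { return name; }
        String getCity() { return city; }
        double getAvgGrade() { return avgGrade; }
        int getAge() { return age; }
    }

    static final List<Student> STUDENTS = Arrays.asList(
            new Student("John Smith", "Miami", 7.38, 19),
            new Student("Mike Miles", "New York", 8.4, 21),
            new Student("Michael Peterson", "New York", 7.5, 20),
            new Student("James Robertson", "Miami", 9.1, 20),
            new Student("Joe Murray", "New York", 7.9, 19),
            new Student("Kyle Miller", "Miami", 9.83, 20));

    // Same pipeline as the maxBy() example, but keeping the lowest grade.
    static Map<String, Optional<Student>> worstByCity() {
        return STUDENTS.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.reducing(BinaryOperator.minBy(
                                Comparator.comparingDouble(Student::getAvgGrade)))));
    }

    // Unwraps the Optional, falling back to a placeholder for unknown cities.
    static String worstNameIn(String city) {
        return worstByCity()
                .getOrDefault(city, Optional.empty())
                .map(Student::getName)
                .orElse("n/a");
    }

    public static void main(String[] args) {
        System.out.println(worstNameIn("Miami"));    // John Smith
        System.out.println(worstNameIn("New York")); // Michael Peterson
        System.out.println(worstNameIn("Boston"));   // n/a
    }
}
```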
Collectors.reducing() with a BinaryOperator and Identity
In the previous code sample, the result is wrapped in an Optional. If there are no elements to reduce, Optional.empty() is returned instead, because there is no default value that could be used.
To deal with this, and remove the Optional wrapping, we can use the second overload of reducing(), the one that takes in two arguments - an identity and a BinaryOperator. The identity is the starting value for the reduction, and also the value that is returned when there are no input elements!
This time around, we pass in a 'default' value that kicks in if a value isn't present, and is used as the identity of the result:
Map<String, Student> reduceByCityAvgGrade = students.stream()
    .collect(Collectors
        .groupingBy(Student::getCity,
            Collectors.reducing(new Student("x", "x", 0.0, 0),
                BinaryOperator.maxBy(Comparator
                    .comparing(Student::getAvgGrade)))));
In our case, we use a new Student object as the identity. The name, city and age fields have no impact on our result while using the reducing() method, so it doesn't really matter what we put as these three values. However, as we are reducing our input data by the avgGrade field, that one matters: it must not be greater than any real grade, so that the placeholder never wins the maxBy() comparison.
We've put a 0.0 grade as the default one, with "x" for the name and city, denoting an empty result. The lowest grade can be 6.0, so a 0.0 grade and the placeholder name signal an empty value - but we can now expect Student objects instead of Optionals:
{
New York=Student{name='Mike Miles', city='New York', avgGrade=8.4, age=21},
Miami=Student{name='Kyle Miller', city='Miami', avgGrade=9.83, age=20}
}
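The difference between the two overloads is easiest to see on an empty input. The sketch below uses plain integers with Integer::max, and also shows what can go wrong when the supplied value is not a true identity for the operator. Class and method names are assumptions for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class EmptyInputDemo {

    // Without an identity, an empty input yields Optional.empty().
    static Optional<Integer> maxWithoutIdentity(List<Integer> values) {
        return values.stream().collect(Collectors.reducing(Integer::max));
    }

    // With an identity, an empty input yields the identity itself.
    static Integer maxWithIdentity(List<Integer> values) {
        return values.stream()
                .collect(Collectors.reducing(Integer.MIN_VALUE, Integer::max));
    }

    // 10 is NOT an identity for max over these values, so it skews the result.
    static Integer maxWithBadIdentity(List<Integer> values) {
        return values.stream().collect(Collectors.reducing(10, Integer::max));
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        List<Integer> values = Arrays.asList(3, 9, 4);

        System.out.println(maxWithoutIdentity(empty));  // Optional.empty
        System.out.println(maxWithIdentity(empty));     // -2147483648
        System.out.println(maxWithoutIdentity(values)); // Optional[9]
        System.out.println(maxWithIdentity(values));    // 9
        System.out.println(maxWithBadIdentity(values)); // 10
    }
}
```

This is why the 0.0 grade in the placeholder Student is a safe choice only as long as no real grade can fall below it.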
Collectors.reducing() with a BinaryOperator, Identity and Mapper
The last of the three overloaded variants takes in one extra argument in addition to the previous two - a mapper. This argument represents a mapping function to apply to each element.
You don't have to group by a city to perform the reducing() operation:
double largestAverageGrade = students.stream()
    .collect(Collectors.reducing(0.0, Student::getAvgGrade,
        BinaryOperator.maxBy(Comparator.comparingDouble(value -> value))));
This would return 9.83, which is in fact the largest of the avgGrade values across all of the student objects within the List. However, if you're using an IDE or a tool that detects code smells, you'll quickly get a recommendation to change the above line into the following:
double largestAverageGrade = students.stream()
    .map(Student::getAvgGrade)
    .reduce(0.0, BinaryOperator.maxBy(Comparator.comparingDouble(value -> value)));
map() and reduce() are preferred if you're not really doing anything else. reducing() is preferred as a downstream collector.
With a mapper, you can map the elements to something else before they're reduced. Commonly, you'll map objects to one of their fields. We can map Student objects to their names, cities or grades, for instance. In the following code snippet, we'll group students by their city, map each student to their grade, and reduce the grades in each city to the highest one, resulting in a single value per city:
Map<String, Double> reduceByCityAvgGrade1 = students.stream()
    .collect(Collectors
        .groupingBy(Student::getCity,
            Collectors.reducing(6.0, Student::getAvgGrade,
                BinaryOperator.maxBy(Comparator
                    .comparingDouble(i->i)))));
This gives us a slightly different output than we had earlier:
{New York=8.4, Miami=9.83}
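The mapper overload isn't limited to picking a maximum. As a further sketch, the example below maps each student to their age and sums the ages per city, and repeats the top-grade reduction with Double::max as a shorter operator. The nested Student class is again a minimal stand-in with an assumed constructor and getters, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, used only for illustration.
class MapperReductionDemo {

    // Minimal stand-in for the guide's Student class (assumed shape).
    static final class Student {
        private final String name;
        private final String city;
        private final double avgGrade;
        private final int age;

        Student(String name, String city, double avgGrade, int age) {
            this.name = name;
            this.city = city;
            this.avgGrade = avgGrade;
            this.age = age;
        }

        String getName() { return name; }
        String getCity() { return city; }
        double getAvgGrade() { return avgGrade; }
        int getAge() { return age; }
    }

    static final List<Student> STUDENTS = Arrays.asList(
            new Student("John Smith", "Miami", 7.38, 19),
            new Student("Mike Miles", "New York", 8.4, 21),
            new Student("Michael Peterson", "New York", 7.5, 20),
            new Student("James Robertson", "Miami", 9.1, 20),
            new Student("Joe Murray", "New York", 7.9, 19),
            new Student("Kyle Miller", "Miami", 9.83, 20));

    // Map each student to their age, then sum the ages within each city.
    static Map<String, Integer> totalAgeByCity() {
        return STUDENTS.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.reducing(0, Student::getAge, Integer::sum)));
    }

    // Map each student to their grade, then keep the highest one per city.
    static Map<String, Double> topGradeByCity() {
        return STUDENTS.stream()
                .collect(Collectors.groupingBy(Student::getCity,
                        Collectors.reducing(0.0, Student::getAvgGrade, Double::max)));
    }

    public static void main(String[] args) {
        System.out.println(totalAgeByCity().get("Miami"));    // 59
        System.out.println(totalAgeByCity().get("New York")); // 60
        System.out.println(topGradeByCity().get("Miami"));    // 9.83
        System.out.println(topGradeByCity().get("New York")); // 8.4
    }
}
```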
Considering the number of collectors you can chain this way, you can do a lot of work using just the built-in collectors and stream operations.
Conclusion
In this guide, we've covered the usage of the reducing() method from the Collectors class. We covered all three of its overloads and discussed their usages through practical examples.