Java 8 - Difference Between map() and flatMap()

Introduction

While Java is primarily an Object Oriented Language, many concepts of Functional Programming have been incorporated into the language. Functional programming uses functions to create and compose programming logic, typically in a declarative manner (i.e. telling the program what's wanted and not how to do it).

If you'd like to read more about Functional Interfaces and a holistic view into Functional Programming in Java - read our Guide to Functional Interfaces and Lambda Expressions in Java!

With the introduction of JDK 8, Java added a number of key Functional Programming constructs - including map() and flatMap().

Note: This guide covers these two functions in the context of their differences.

The map() function transforms each element of a stream (or the value inside an Optional) into another form, while the flatMap() function combines a mapping step with a flattening step.

If you'd like to read more about these functions individually with in-depth details, efficiency benchmarks, use-cases and best-practices - read our Java 8 Streams: Definitive Guide to flatMap() and Java 8 - Stream.map() Examples!

Let's begin by first highlighting their differences in Optionals!

Difference Between map() and flatMap() in Optionals

To understand the difference between map() and flatMap() in Optionals, we need to briefly understand the concept of Optionals first. The Optional class was introduced in Java 8 to make the possible absence of a value explicit, which helps avoid NullPointerExceptions.

As per the official documentation:

Optional is a container object which may or may not contain a non-null value.

The Optional class serves the purpose of representing whether a value is present or not. The Optional class has a wide range of methods that are grouped into two categories:

Creation Methods: These methods are in charge of creating Optional objects according to the use case.
Instance Methods: These methods operate on an existing Optional object, determining whether the value is present or not, retrieving the wrapped object, manipulating it, and finally returning the updated Optional object.
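
As a minimal sketch of these two categories (the class name OptionalBasics, the wrap() helper and the sample values are assumptions made for illustration), the snippet below creates Optionals with of(), ofNullable() and empty(), then queries them with a few instance methods:

```java
import java.util.Optional;

public class OptionalBasics {

    // Creation method: ofNullable() accepts a value that may be null
    static Optional<String> wrap(String value) {
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        Optional<String> present = Optional.of("Stack Abuse"); // value must be non-null
        Optional<String> absent = wrap(null);                   // becomes an empty Optional
        Optional<String> empty = Optional.empty();

        // Instance methods: inspect or retrieve the wrapped value
        System.out.println(present.isPresent());      // true
        System.out.println(absent.isPresent());       // false
        System.out.println(present.get());            // Stack Abuse
        System.out.println(empty.orElse("fallback")); // fallback
    }
}
```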

map() and flatMap() can both be used with the Optional class. Because wrapping and unwrapping nested Optionals is such a common need, both were added as methods of the class itself.

The signature of the map() function in Optional is:

public<U> Optional<U> map(Function<? super T, ? extends U> mapper)

The signature of the flatMap() in Optional is:

public<U> Optional<U> flatMap(Function<? super T, Optional<U>> mapper)

Both the map() and flatMap() functions take mapper functions as arguments and output an Optional<U>. The distinction between the two becomes visible when the mapper function itself returns an Optional. map() wraps that returned Optional in another Optional, whereas flatMap() flattens the result so that the value keeps just one Optional wrapping.

Let us try to understand the problem with the following code:

Optional<Optional<String>> optionalObj1 = Optional.of("STACK ABUSE")
  .map(s -> Optional.of("STACK ABUSE"));
System.out.println(optionalObj1);

The following is the output of the above:

Optional[Optional[STACK ABUSE]]

As we can see, the output of map() has been wrapped in an additional Optional. On the other hand, when using a flatMap() instead of a map():

Optional<String> optionalObj2 = Optional.of("STACK ABUSE")
  .flatMap(s -> Optional.of("STACK ABUSE"));
System.out.println(optionalObj2);

We end up with:

Optional[STACK ABUSE]

flatMap() doesn't re-wrap the result in another Optional, so we're left with a single layer. This same behavior can be used to unwrap nested Optionals.
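
To illustrate the unwrapping use, here is a small sketch in which an already nested Optional is collapsed by passing the identity function to flatMap(). The class name UnwrapNestedOptional and the flatten() helper are hypothetical names chosen for this example, not part of the Optional API:

```java
import java.util.Optional;
import java.util.function.Function;

public class UnwrapNestedOptional {

    // Collapses Optional<Optional<T>> into Optional<T>
    static <T> Optional<T> flatten(Optional<Optional<T>> nested) {
        return nested.flatMap(Function.identity());
    }

    public static void main(String[] args) {
        Optional<Optional<String>> nested = Optional.of(Optional.of("STACK ABUSE"));
        Optional<Optional<String>> nestedEmpty = Optional.of(Optional.<String>empty());

        System.out.println(flatten(nested));      // Optional[STACK ABUSE]
        System.out.println(flatten(nestedEmpty)); // Optional.empty
    }
}
```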

Since simple examples like the one we've just covered don't convey when this mechanism really makes or breaks a feature, let's create a small environment in which it does. The following example depicts a Research Management System, which keeps track of researchers in an institute.

Given a mock service that fetches a researcher based on some researcherId - we're not guaranteed to have a result back, so each Researcher is wrapped as an optional. Additionally, their StudyArea might not be present for some reason (such as an area not being assigned yet if a researcher is new to the institute), so it's an optional value as well.

That being said, if you were to fetch a researcher and get their area of study, you'd do something along these lines:

Optional<Researcher> optionalResearcher = researcherService.findById(researcherId);

// getResearchersStudyArea() returns an Optional<StudyArea>, so map() nests it
Optional<Optional<StudyArea>> studyAreaOptional = optionalResearcher
    .map(res -> Researcher.getResearchersStudyArea(res.getId()));

System.out.println(studyAreaOptional.isPresent());
System.out.println(studyAreaOptional);
// Two get() calls are needed to reach the StudyArea
System.out.println(studyAreaOptional.get().get().getTopic());

Let's check the result of this code:

true
Optional[Optional[StudyArea@13969fbe]]
Machine Learning

Because the StudyArea, which is an optional value, depends on another optional value, it's wrapped as a double Optional in the result. This doesn't work well for us, since we'd have to call get() twice to reach the value. Additionally, even if the StudyArea is absent (an empty Optional), the isPresent() check on the outer Optional would still return true.

An Optional of an empty Optional isn't empty itself.

Optional<String> optional1 = Optional.empty();
Optional<Optional<String>> optional2 = Optional.of(optional1);

System.out.println(optional2.isPresent());
// true

In this scenario, isPresent() checks something we didn't intend to check, the second line doesn't print the StudyArea we want to see, and the final line throws a NoSuchElementException if the StudyArea isn't actually present. map() falls short here because it only produces an empty Optional in two cases:

map() returns an empty Optional if the Researcher object is absent from the optionalResearcher object.
map() returns an empty Optional if getResearchersStudyArea() returns null instead of a StudyArea object.

When getResearchersStudyArea() returns an empty Optional rather than null, map() wraps that empty Optional in a present one.
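
The two cases in which map() yields an empty Optional can be sketched as follows. The class MapEmptyCases and the null-returning topicFor() lookup are hypothetical stand-ins, not the article's service:

```java
import java.util.Optional;

public class MapEmptyCases {

    // Hypothetical lookup that yields null when no topic is assigned
    static String topicFor(String researcherName) {
        return "Reham".equals(researcherName) ? "Machine Learning" : null;
    }

    static Optional<String> topicOf(Optional<String> researcherName) {
        return researcherName.map(MapEmptyCases::topicFor);
    }

    public static void main(String[] args) {
        System.out.println(topicOf(Optional.of("Reham")));          // Optional[Machine Learning]
        System.out.println(topicOf(Optional.<String>empty()));      // Optional.empty (no input value)
        System.out.println(topicOf(Optional.of("John")));           // Optional.empty (mapper returned null)
    }
}
```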

Alternatively, you could visualize the pipeline:

The statement optionalResearcher.map(res -> Researcher.getResearchersStudyArea(res.getId())) produces an Optional<Optional<StudyArea>> object. We can solve this problem by using flatMap(), which doesn't wrap the result in another Optional:

Optional<StudyArea> studyAreaOptional = optionalResearcher
        .flatMap(res -> Researcher.getResearchersStudyArea(res.getId()))
        .filter(studyArea -> studyArea.getTopic().equalsIgnoreCase("Machine Learning"));

This way, studyAreaOptional holds a single layer of Optional, so the isPresent() check, the printed value and a single get() call to reach the topic all work as intended!
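
For readers who want to run the scenario end to end, here is a self-contained sketch. Everything in it is an assumption made for illustration: the nested Researcher and StudyArea classes, the findById() mock, and the rule that only researcher 1 has a study area are stand-ins for the article's service.

```java
import java.util.Optional;

public class ResearcherLookup {

    // Hypothetical stand-ins for the domain classes
    static class StudyArea {
        private final String topic;
        StudyArea(String topic) { this.topic = topic; }
        String getTopic() { return topic; }
    }

    static class Researcher {
        private final int id;
        Researcher(int id) { this.id = id; }
        int getId() { return id; }

        // Assumed behavior: only researcher 1 has an assigned study area
        static Optional<StudyArea> getResearchersStudyArea(int id) {
            return id == 1 ? Optional.of(new StudyArea("Machine Learning")) : Optional.empty();
        }
    }

    // Assumed mock service: researchers 1 and 2 exist, nobody else does
    static Optional<Researcher> findById(int id) {
        return (id == 1 || id == 2) ? Optional.of(new Researcher(id)) : Optional.empty();
    }

    static Optional<String> topicOf(int researcherId) {
        return findById(researcherId)
                .flatMap(res -> Researcher.getResearchersStudyArea(res.getId()))
                .filter(area -> area.getTopic().equalsIgnoreCase("Machine Learning"))
                .map(StudyArea::getTopic);
    }

    public static void main(String[] args) {
        System.out.println(topicOf(1)); // Optional[Machine Learning]
        System.out.println(topicOf(2)); // Optional.empty (researcher without a study area)
        System.out.println(topicOf(3)); // Optional.empty (no such researcher)
    }
}
```

Note that the final map() is appropriate here because StudyArea::getTopic returns a plain String rather than an Optional.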

Difference Between map() and flatMap() in Streams

To understand the difference between map() and flatMap() in Streams, it's worth reminding ourselves how Streams work. The Streams API was introduced in Java 8 and has proven to be an extremely powerful tool for working with collections of objects. A stream can be characterized as a sequence of data, stemming from a source, in which numerous different procedures/transformations can be piped together to produce the desired outcome.

There are three stages to the stream pipeline:

Source: It denotes the origin of a stream.
Intermediate Operations: These are the intermediary operations that change streams from one form to another, as the name implies. A stream pipeline can have zero or more intermediate operations.
Terminal Operations: This is the last step in the pipeline, which produces its end result. A common terminal operation is collecting the stream back into a tangible Collection. Without this stage, the pipeline never runs, since intermediate operations are lazy and only execute once a terminal operation is invoked.
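
The three stages can be sketched in a single pipeline. The class name PipelineStages, the helper method and the sample names are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PipelineStages {

    static List<String> shortNamesUpper(List<String> names) {
        return names.stream()                        // source
                .filter(name -> name.length() <= 5)  // intermediate operation
                .map(String::toUpperCase)            // intermediate operation
                .collect(Collectors.toList());       // terminal operation
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Stack", "Abuse", "Python");
        System.out.println(shortNamesUpper(names)); // [STACK, ABUSE]
    }
}
```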

map() and flatMap() are both intermediate operations offered by the Stream interface in the java.util.stream package.

The signature of the map() is:

<R> Stream<R> map(Function<? super T, ? extends R> mapper)

The signature of the flatMap() is:

<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper)

As can be seen from the method signatures, both map() and flatMap() take a mapping function as an argument and return a Stream<R>. Both are called on a Stream<T>; the difference lies in what the mapping function returns. For map(), it returns a single value of type R for each element, while for flatMap(), it returns a Stream of R values for each element.

In short, map() maps each element of a Stream<T> onto exactly one element of the resulting Stream<R>, while flatMap() maps each element onto a stream of its own and then merges those sub-streams into a single, flattened Stream<R>.

Put differently, map() generates exactly one value for each input, while flatMap() generates zero or more values for each input. In other words, map() is used to transform the data, while flatMap() is used to transform and flatten the stream.
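
The difference in mapper shape can be made explicit by writing the mapping functions out as typed variables. This is only a sketch: the class MapperShapes, the constants LENGTH and WORDS, and the sample lines are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MapperShapes {

    // map() mapper: one input, exactly one output
    static final Function<String, Integer> LENGTH = String::length;

    // flatMap() mapper: one input, a stream of zero or more outputs
    static final Function<String, Stream<String>> WORDS =
            line -> line.isEmpty() ? Stream.<String>empty() : Arrays.stream(line.split(" "));

    static List<Integer> lengths(List<String> lines) {
        return lines.stream().map(LENGTH).collect(Collectors.toList());
    }

    static List<String> words(List<String> lines) {
        return lines.stream().flatMap(WORDS).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Stack Abuse", "", "Java");
        System.out.println(lengths(lines)); // [11, 0, 4]
        System.out.println(words(lines));   // [Stack, Abuse, Java]
    }
}
```

Three inputs produce three lengths with map(), while with flatMap() the same three inputs contribute two, zero and one words respectively.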

Following is the example of one-to-one mapping in map():

List<String> websiteNamesList = Stream.of("Stack", "Abuse")
            .map(String::toUpperCase)
            .collect(Collectors.toList());

System.out.println(websiteNamesList);

This results in:

[STACK, ABUSE]

We've mapped the original values to their uppercase counterparts - it was a transformative process where a Stream<T> was mapped onto Stream<R>.

On the other hand, if we were working with more complex Streams:

Stream<String> stream1 = Stream.of("Stack", "Abuse");
Stream<String> stream2 = Stream.of("Real", "Python");
Stream<Stream<String>> stream = Stream.of(stream1, stream2);

List<String> namesFlattened = stream
        .flatMap(s -> s)
        .collect(Collectors.toList());

System.out.println(namesFlattened);

Here, we've got a stream of streams, where each stream contains a couple of elements. The function passed to flatMap() must return a stream for each element, and since each element is already a stream, we return it unchanged via s -> s. flatMap() then merges the elements of the sub-streams into a single stream, which we collect into a list, so we end up with:

[Stack, Abuse, Real, Python]

A more illustrative example could build upon the Research Management System. Say we want to group data from researchers into categories based on their areas of study in a Map<String, List<Researcher>> map where the key is an area of study and the list corresponds to the people working in it. We would have a list of researchers to work with before grouping them, naturally.

In this entry set, we might want to filter or perform other operations on the researchers themselves. map() alone won't get us there: streaming the entry set and mapping each entry to its list leaves us with a stream of lists (or a stream of streams), so filter() would receive whole lists rather than individual Researcher objects. This leads us to flatMap(), where we stream() each list and then perform operations on the individual elements.

With the preceding scenario in mind, consider the following example, which demonstrates flatMap()'s one-to-many mapping:

ResearchService researchService = new ResearchService();
Map<String, List<Researcher>> researchMap = new HashMap<>();
List<Researcher> researcherList = researchService.findAll();

researchMap.put("Machine Learning", researcherList);

List<Researcher> researcherNamesList = researchMap.entrySet().stream()
        // Stream each value in the map's entryset (list of researchers)
        .flatMap(researchers -> researchers.getValue().stream())
        // Arbitrary filter for names starting with "R"
        .filter(researcher -> researcher.getName().startsWith("R"))
        // Collect Researcher objects to list
        .collect(Collectors.toList());

researcherNamesList.forEach(researcher -> {
    System.out.println(researcher.getName());
});

The Researcher class only has an id, name and emailAddress:

public class Researcher {
    private int id;
    private String name;
    private String emailAddress;

    // Constructor, getters and setters 
}

And the ResearchService is a mock service that pretends to call a database, returning a list of objects. We can easily mock the service by returning a hard-coded (or generated) list instead:

public class ResearchService {

    public List<Researcher> findAll() {
        Researcher researcher1 = new Researcher();
        researcher1.setId(1);
        researcher1.setEmailAddress("[email protected]");
        researcher1.setName("Reham Muzzamil");

        Researcher researcher2 = new Researcher();
        researcher2.setId(2);
        researcher2.setEmailAddress("[email protected]");
        researcher2.setName("John Doe");
        
        // Researcher researcherN = new Researcher();
        // ...
        
        return Arrays.asList(researcher1, researcher2);
    }
}

If we run the code snippet, the lists in the map (just one in this case) are flattened into a single stream of researchers, the filter is applied, and the one researcher left is:

Reham Muzzamil
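
The flattening becomes more visible once the map holds more than one list. The sketch below uses plain names instead of Researcher objects; the class FlattenMapValues, the "Databases" key and the extra names are assumptions added for illustration:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlattenMapValues {

    static List<String> namesStartingWith(Map<String, List<String>> byArea, String prefix) {
        return byArea.values().stream()
                .flatMap(List::stream)                   // merge every list into one stream
                .filter(name -> name.startsWith(prefix)) // filter individual names
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable
        Map<String, List<String>> byArea = new LinkedHashMap<>();
        byArea.put("Machine Learning", Arrays.asList("Reham Muzzamil", "John Doe"));
        byArea.put("Databases", Arrays.asList("Rita Roe", "Jane Doe"));

        System.out.println(namesStartingWith(byArea, "R")); // [Reham Muzzamil, Rita Roe]
    }
}
```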

If we visualize the pipeline, it would look something like this:

If we were to replace flatMap() with map():

.map(researchers -> researchers.getValue().stream()) // Stream<Stream<Researcher>>

We wouldn't be able to proceed with the filter(), since we'd be working with a nested stream. Instead, we flatten the stream of streams into a single one, and then run operations on these elements.
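
To make the contrast concrete, the following sketch runs the same filter through both operations on a list of lists. The class NestedVersusFlat and the sample names are hypothetical. With map(), the filter has to be applied inside each group and the result stays nested; with flatMap(), the filter runs once on the flattened elements:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NestedVersusFlat {

    // map(): each group is processed on its own, so the result keeps its nesting
    static List<List<String>> viaMap(List<List<String>> groups) {
        return groups.stream()
                .map(group -> group.stream()
                        .filter(name -> name.startsWith("R"))
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    // flatMap(): groups are merged first, then filtered as individual names
    static List<String> viaFlatMap(List<List<String>> groups) {
        return groups.stream()
                .flatMap(List::stream)
                .filter(name -> name.startsWith("R"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("Reham", "John"),
                Arrays.asList("Rita"));

        System.out.println(viaMap(groups));     // [[Reham], [Rita]]
        System.out.println(viaFlatMap(groups)); // [Reham, Rita]
    }
}
```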

Conclusion

In this guide, we have seen the difference between map() and flatMap() in Optional and Stream along with their use-cases and code examples.

To sum up, in the context of the Optional class, both map() and flatMap() are used to transform Optional<T> to Optional<U> but if the mapping function generates an optional value, map() adds an additional layer while flatMap() works smoothly with nested optionals and returns the result in a single layer of optional values.

Similarly, map() and flatMap() can also be applied to Streams, where map() turns a Stream<T> into a Stream<R> by mapping each T value to exactly one R, while flatMap() maps each T value to a stream of R values and flattens those streams into a single Stream<R>.