Transforming Data with Java 8 Streams API

From NovaOrdis Knowledge Base

Overview

The Stream API offers the possibility to intercept a stream and convert its elements into elements of another type, also offered as a stream. This operation is conventionally named mapping. The word mapping is used because it has a meaning similar to transforming, but with the nuance of "creating a new version" rather than "modifying".

Mapping is an intermediate operation: its result is also a stream.

Mapping Data

https://docs.oracle.com/javase/10/docs/api/java/util/stream/Stream.html#map(java.util.function.Function)

The Stream API exposes the map() method, which converts the stream's elements into elements of another type, also offered as a stream. The conversion is performed by a Function<T, R> passed as the argument of the map() method.

public interface Stream<T> {

    ...

    <R> Stream<R> map(Function<? super T, ? extends R> mappingFunction);

    ...

}
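
As an illustration of the signature above, the following minimal sketch maps a stream of strings into a stream of their lengths. The class name MapExample and the method lengths() are hypothetical names chosen for this example; only stream(), map() and collect() come from the Stream API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapExample {

    // Hypothetical helper: converts each word into its length.
    // The element type changes from String to Integer, so the result is a new stream,
    // and the original list is left unmodified.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)            // acts as a Function<String, Integer>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bb", "ccc"))); // prints [1, 2, 3]
    }
}
```

Note that String::length returns an int, so each length is boxed into an Integer in this sketch.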

Because map() produces a Stream<R> of objects, the above call performs autoboxing when the result of the lambda expression is a primitive type. In these cases, it is advisable to use a specialized mapping function and specialized primitive streams, to avoid autoboxing:

public interface Stream<T> {

    ...

    IntStream mapToInt(ToIntFunction<? super T> mapper);
    LongStream mapToLong(ToLongFunction<? super T> mapper);
    DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper);
   
    ...

}

Note that applying map() to a primitive stream automatically yields a stream of the same primitive type. To "generalize" again to a stream of objects, use mapToObj().
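
The specialized methods can be sketched as follows. This is a minimal example, not a definitive implementation; the class name PrimitiveMapExample and the methods totalLength() and doubledLabels() are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveMapExample {

    // mapToInt() yields an IntStream, so the lengths are summed as plain ints, without boxing.
    static int totalLength(List<String> words) {
        return words.stream()
                .mapToInt(String::length)
                .sum();
    }

    // map() on an IntStream stays an IntStream; mapToObj() converts back to a stream of objects.
    static List<String> doubledLabels(int... values) {
        return IntStream.of(values)
                .map(v -> v * 2)                // IntStream -> IntStream
                .mapToObj(v -> "#" + v)         // IntStream -> Stream<String>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("a", "bb", "ccc"))); // prints 6
        System.out.println(doubledLabels(1, 2, 3));                        // prints [#2, #4, #6]
    }
}
```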

Flat-Mapping Data

https://docs.oracle.com/javase/10/docs/api/java/util/stream/Stream.html#flatMap(java.util.function.Function)

There are situations when it is convenient to use a mapping function that produces a stream: it breaks down the elements of the original stream into sub-streams. If we used the map() function directly, the result would be a Stream<Stream<R>>, which in most cases has no practical use. It would be a lot more useful to produce a Stream<R> containing the merged content of the sub-streams. This functionality is provided by flatMap(). According to the documentation, flatMap() returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream. If a mapped stream is null, an empty stream is used instead.

public interface Stream<T> {

    ...

    <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper);

    ...

}
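
The difference between map() and flatMap() can be sketched with a list of lists. This is a minimal illustration under assumed names: the class FlatMapExample and the methods countWithMap() and flatten() are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapExample {

    // With map(), each inner list becomes its own Stream<Integer>,
    // so the outer type is Stream<Stream<Integer>> and count() counts sub-streams.
    static long countWithMap(List<List<Integer>> nested) {
        Stream<Stream<Integer>> streamOfStreams = nested.stream().map(List::stream);
        return streamOfStreams.count();
    }

    // With flatMap(), the contents of the sub-streams are merged into a single Stream<Integer>.
    static List<Integer> flatten(List<List<Integer>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));
        System.out.println(countWithMap(nested)); // prints 3
        System.out.println(flatten(nested));      // prints [1, 2, 3, 4, 5, 6]
    }
}
```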

Example: if we have functionality that breaks down the content of a file into a stream of lines, and a function that breaks down a line into words, we can get a stream of words by flat-mapping the stream of lines into a stream of words. Using map() would result in a stream of streams of words:

String content = "a\nb c something d\nsomething else f\n";

Stream<String> streamOfLines = Stream.of(content.split("\n"));

//
// 's -> Stream.of(s.split(" +"))' lambda produces a stream of words
//
Stream<String> streamOfWords = streamOfLines.flatMap(s -> Stream.of(s.split(" +")));

Like map(), flatMap() produces a stream of objects, so primitive values carried by the sub-streams end up autoboxed. When the mapping function produces primitive values, it is advisable to use a specialized flat-mapping function that returns a specialized primitive stream, to avoid unnecessary autoboxing:

public interface Stream<T> {

    ...

    IntStream flatMapToInt(Function<? super T,? extends IntStream> mapper);
    LongStream flatMapToLong(Function<? super T,? extends LongStream> mapper);
    DoubleStream flatMapToDouble(Function<? super T,? extends DoubleStream> mapper);
   
    ...

}
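
As a minimal sketch of the specialized variants, the example below flat-maps lines of space-separated integers into a single IntStream and sums it. The class name FlatMapToIntExample and the method sumAll() are hypothetical, and the code assumes that every line contains at least one valid integer and nothing else.

```java
import java.util.Arrays;
import java.util.List;

class FlatMapToIntExample {

    // Each line is split into tokens and parsed, producing one IntStream per line.
    // flatMapToInt() merges those sub-streams into a single IntStream, without boxing.
    // Assumption: lines hold only space-separated integers; anything else makes
    // Integer.parseInt() throw NumberFormatException.
    static int sumAll(List<String> lines) {
        return lines.stream()
                .flatMapToInt(line -> Arrays.stream(line.trim().split(" +"))
                        .mapToInt(Integer::parseInt))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumAll(Arrays.asList("1 2", "3", "4 5 6"))); // prints 21
    }
}
```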