Java 8 Streams API - Parallel Streams

From NovaOrdis Knowledge Base
Jump to navigation Jump to search

Internal

Parallel Streams

TODO

Explain how data is partitioned.

Overview

A parallel stream is a stream that splits its elements into multiple chunks, processing each chunk with a different thread. This way the workload is automatically partitioned on all cores of a multicore processor.

A parallel stream is produced by a collection by invoking its parallelStream() method. An existing stream can be parallelized by invoking parallel() on the stream itself. Calling parallel() on a sequential stream does not imply any concrete transformation on the stream itself. Internally, a boolean flag is set to signal that the subsequent operations should be run in parallel. A parallel stream can be turned into a sequential stream by invoking sequential(). The last call on parallel() or sequential() affects the pipeline globally, it is not possible to have parts of the pipeline executed in parallel and parts of it executed sequentially.

Parallel stream implementations uses fork/join framework introduced in Java 7, specifically the ForkJoinPool.

Configuring the Thread Pool Used by Parallel Streams

Parallel streams use internally the default ForkJoinPool, which by default has as many threads as the machine has processors, as returned by Runtime.getRuntime().availableProcessors(). The size of the pool can be changed globally per VM setting the "java.util.concurrent.ForkJoinPool.common.parallelism" system property.