Java 8 Streams API - Parallel Streams: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
=Internal=
=Internal=


{{Internal|Java 8 Streams API#Parallel_Streams|Parallel Streams}}
{{Internal|Java 8 Streams API#Parallel_Streams|Java 8 Streams API}}


=TODO=
=TODO=
Line 18: Line 18:


Parallel stream implementations uses [[Java_7_Fork/Join_Framework#Overview|fork/join framework]] introduced in Java 7, specifically the ForkJoinPool.
Parallel stream implementations uses [[Java_7_Fork/Join_Framework#Overview|fork/join framework]] introduced in Java 7, specifically the ForkJoinPool.
Some stream operations are more parallelizable than others. For example, a Stream.iterate() operation cannot be split in chunks that can be executed independently because the input of a function invocation is dependent on the result of the previous invocation. [[Java_8_Streams_API_Stream_Creation#Numeric_Ranges|LongStream.rangeClosed()]], on the other hand, produces ranges of numbers that can be split into independent chunks.


=Configuring the Thread Pool Used by Parallel Streams=
=Configuring the Thread Pool Used by Parallel Streams=


Parallel streams use internally the default <tt>ForkJoinPool</tt>, which by default has as many threads as the machine has processors, as returned by <tt>Runtime.getRuntime().availableProcessors()</tt>. The size of the pool can be changed globally per VM setting the "java.util.concurrent.ForkJoinPool.common.parallelism" system property.
Parallel streams use internally the default <tt>ForkJoinPool</tt>, which by default has as many threads as the machine has processors, as returned by <tt>Runtime.getRuntime().availableProcessors()</tt>. The size of the pool can be changed globally per VM setting
 
System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "100");
 
This is a global setting that will affect all parallel streams in the code.

Latest revision as of 20:47, 6 April 2018

Internal

Java 8 Streams API

TODO

Explain how data is partitioned.

Overview

A parallel stream is a stream that splits its elements into multiple chunks, processing each chunk with a different thread. This way the workload is automatically partitioned on all cores of a multicore processor.

A parallel stream is produced by a collection by invoking its parallelStream() method. An existing stream can be parallelized by invoking parallel() on the stream itself. Calling parallel() on a sequential stream does not imply any concrete transformation on the stream itself. Internally, a boolean flag is set to signal that the subsequent operations should be run in parallel. A parallel stream can be turned into a sequential stream by invoking sequential(). The last call on parallel() or sequential() affects the pipeline globally, it is not possible to have parts of the pipeline executed in parallel and parts of it executed sequentially.

Parallel stream implementations uses fork/join framework introduced in Java 7, specifically the ForkJoinPool.

Some stream operations are more parallelizable than others. For example, a Stream.iterate() operation cannot be split in chunks that can be executed independently because the input of a function invocation is dependent on the result of the previous invocation. LongStream.rangeClosed(), on the other hand, produces ranges of numbers that can be split into independent chunks.

Configuring the Thread Pool Used by Parallel Streams

Parallel streams use internally the default ForkJoinPool, which by default has as many threads as the machine has processors, as returned by Runtime.getRuntime().availableProcessors(). The size of the pool can be changed globally per VM setting

System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "100");

This is a global setting that will affect all parallel streams in the code.