Java 8 Streams API
Latest revision as of 19:45, 6 April 2018
External
Internal
TODO
Overview
The Streams API provides a method to represent and process sequenced data in parallel, transparently taking advantage of multi-core architectures. The Streams API offers a higher level of abstraction than Java Collections, based on the concept of transforming a stream of objects into a stream of different objects, while delegating parallel processing concerns to the runtime. The Streams API obviates the need for explicitly programming Threads and the synchronized mechanism. This represents a shift to focusing on partitioning the data rather than coordinating access to it.
Stream
A stream is a sequence of data items of a specific type, which are conceptually produced one at a time by a source, and that supports sequential and parallel aggregate operations. There may be several intermediate operations, connected into a stream pipeline, and one terminal operation that executes the stream pipeline and produces a non-stream result.
All streams implement java.util.stream.Stream<T>.
Unlike collections, which are data structures for storing and accessing elements with specific time/space complexities, streams are about expressing computations on data. A collection stores all of its values in memory, and every element has to be computed before it can be added to the collection; the elements of a stream are computed on demand. The Streams API uses behavior parameterization: it expects the code that parameterizes the behavior of its operations to be passed to the API.
A stream can be traversed only once. After it has been traversed, a stream is said to be consumed. An attempt to consume it again throws:
java.lang.IllegalStateException: stream has already been operated upon or closed
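The single-use rule can be observed with a short sketch. The class name, method name and sample data below are hypothetical and chosen only for illustration; the second traversal is expected to fail with the exception shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SingleUseStreamExample {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTraversalFails() {
        List<String> names = Arrays.asList("a", "b", "c"); // hypothetical sample data
        Stream<String> stream = names.stream();
        stream.forEach(s -> { });          // first traversal consumes the stream
        try {
            stream.forEach(s -> { });      // second traversal of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;                   // stream has already been operated upon or closed
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails()); // prints "true"
    }
}
```

To traverse the data again, request a new stream from the source, for example by calling names.stream() a second time.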
The Streams API is meant as an alternative way of processing collections, in a declarative manner, by using internal iteration. Unlike a collection's external iteration, where the caller writes the loop, the loop over elements is managed inside the library. The API user provides a function specifying the computation that needs to be done.
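The difference between external and internal iteration may be easier to see side by side. The following is a minimal sketch with hypothetical method names; both methods are intended to produce the same result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class IterationStyleExample {

    // External iteration: the caller writes and controls the loop.
    static List<String> upperExternal(List<String> words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            result.add(w.toUpperCase());
        }
        return result;
    }

    // Internal iteration: the caller only passes the function to apply;
    // the library decides how to walk over the elements.
    static List<String> upperInternal(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b");
        System.out.println(upperExternal(words)); // prints "[A, B]"
        System.out.println(upperInternal(words)); // prints "[A, B]"
    }
}
```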
Encounter Order
The encounter order specifies the order in which items logically appear in the stream. For example, a stream generated from a List will expose elements in the same order in which they appear in the List.
Whether a stream has an encounter order depends on the source (a List is ordered, for example) and on the intermediate operations (sorted(), for example, imposes an order). Some operations may render an ordered stream unordered (BaseStream.unordered()). unordered() is useful when the stream has an encounter order that the user does not care about: explicit de-ordering may improve parallel performance for some stateful or terminal operations. For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results. For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. Most stream pipelines, however, still parallelize efficiently even under ordering constraints.
Ordered Stream
An ordered stream is a stream that has a defined encounter order.
If a stream is ordered, most operations are constrained to operate on the elements in their encounter order.
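As an illustration of encounter order, the sketch below processes a List in parallel. The class and method names are hypothetical. The first method relies on the ordered source, so the collected list keeps the order of the input; the second explicitly drops the ordering constraint and collects into a Set, because the caller only cares about which elements are present.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class EncounterOrderExample {

    // The List source is ordered, so the result preserves the encounter order
    // even though the pipeline runs in parallel.
    static List<Integer> doubledInOrder(List<Integer> source) {
        return source.parallelStream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    // unordered() tells the runtime that the order does not matter.
    // The result is a Set, so no ordering is assumed by the caller.
    static Set<Integer> doubledUnordered(List<Integer> source) {
        return source.parallelStream()
                .unordered()
                .map(n -> n * 2)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(doubledInOrder(Arrays.asList(1, 2, 3, 4, 5))); // prints "[2, 4, 6, 8, 10]"
        System.out.println(doubledUnordered(Arrays.asList(1, 2, 3)).size()); // prints "3"
    }
}
```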
Source
Stream data sources are collections, arrays and I/O resources. For more details on how streams are created see Stream Creation.
Data elements generated by an ordered collection will have the same order in the stream.
Stream Pipeline
Stream operations are composed into a stream pipeline to perform a computation.
Stream Creation
Stream Operation
Stream operations have two important characteristics:
- Pipelining. Most stream operations return a stream, allowing operations to be chained to form a larger pipeline and enabling optimizations such as laziness and short-circuiting.
- Internal Iteration. The iteration over the elements is managed by the library rather than by the caller.
The stream operations that return a stream, and thus can be connected, are called intermediate operations. The stream operations that close the stream and return a non-stream result are called terminal operations.
Ideally, stream operations must be based on functions that don't interact - the encapsulated functionality must not access shared state. For more details see Functional Programming.
Intermediate Operations
A stream operation that returns another stream, and thus can be connected to other stream operations to form a pipeline, is called an intermediate operation. Intermediate operations do not consume the stream; their purpose is to serve as processing elements in a pipeline. They do not perform any processing until a terminal operation is invoked on the stream pipeline, which is why intermediate operations are said to be lazy.
The idea behind a stream pipeline is similar to the builder pattern.
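Laziness can be made visible by counting how often a mapping function is called. This is a minimal sketch with hypothetical names; the counter is shared state used only to observe the behavior, which is something production stream functions should normally avoid.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineExample {

    // Returns { mapper calls before the terminal operation, mapper calls after it }.
    static int[] countMapperCalls() {
        AtomicInteger calls = new AtomicInteger(); // observation only

        // Building the pipeline does not run the mapper.
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .map(n -> {
                    calls.incrementAndGet();
                    return n * 10;
                });
        int before = calls.get();

        // The terminal operation triggers the actual processing.
        pipeline.collect(Collectors.toList());
        int after = calls.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = countMapperCalls();
        System.out.println(counts[0] + " " + counts[1]); // prints "0 3"
    }
}
```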
Filtering Data
Filtering in this context means dropping certain elements based on a criterion.
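A short sketch of filtering, assuming filter() with a predicate and distinct() for dropping duplicates are the operations of interest here. The class name, method names and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilteringExample {

    // Keeps only the elements that satisfy the predicate.
    static List<Integer> evens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    // Same criterion, additionally dropping duplicate elements.
    static List<Integer> distinctEvens(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 4, 6);
        System.out.println(evens(numbers));         // prints "[2, 4, 4, 6]"
        System.out.println(distinctEvens(numbers)); // prints "[2, 4, 6]"
    }
}
```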
Transforming Data
Sorting Data
Stream<T> sorted();
This form applies to streams whose elements have a natural order (they implement Comparable). If the elements of this stream are not Comparable, a java.lang.ClassCastException may be thrown when the terminal operation is executed.
Stream<T> sorted(Comparator<? super T> comparator);
Sorting operations are stateful and unbounded: all elements must be buffered before any result can be emitted.
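Both forms of sorted() can be sketched as follows. The class and method names are hypothetical; the comparator used in the second method is just one possible choice.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortingExample {

    // sorted() without arguments: String implements Comparable, so the natural order is used.
    static List<String> naturalOrder(List<String> words) {
        return words.stream()
                .sorted()
                .collect(Collectors.toList());
    }

    // sorted(Comparator): the order is defined by the supplied comparator.
    static List<String> byLength(List<String> words) {
        return words.stream()
                .sorted(Comparator.comparingInt(String::length))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple");
        System.out.println(naturalOrder(words)); // prints "[apple, fig, pear]"
        System.out.println(byLength(words));     // prints "[fig, pear, apple]"
    }
}
```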
Terminal Operations
A stream operation that consumes and closes the stream and returns a non-stream result is called a terminal operation.
Reduction
A reduction operation is an operation through which a stream is reduced to a value.
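A minimal sketch of a reduction using Stream.reduce(), with hypothetical class and method names. The first form takes an identity value and always returns a result; the second has no identity and returns an Optional, which is empty for an empty stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ReductionExample {

    // Reduces the stream to the sum of its elements, starting from the identity 0.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Reduces the stream to its largest element; empty if the stream has no elements.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::max);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4))); // prints "10"
        System.out.println(max(Arrays.asList(3, 9, 4)));    // prints "Optional[9]"
    }
}
```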
Stream-Level Predicates
These "match" operations apply a predicate to all elements of the stream, subject to short-circuiting, and return a boolean result.
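The match operations referred to here are assumed to be anyMatch(), allMatch() and noneMatch(). A short sketch with hypothetical names and sample data:

```java
import java.util.Arrays;
import java.util.List;

class MatchExample {

    // True if at least one element satisfies the predicate.
    static boolean anyNegative(List<Integer> numbers) {
        return numbers.stream().anyMatch(n -> n < 0);
    }

    // True only if every element satisfies the predicate.
    static boolean allPositive(List<Integer> numbers) {
        return numbers.stream().allMatch(n -> n > 0);
    }

    // True only if no element satisfies the predicate.
    static boolean noneZero(List<Integer> numbers) {
        return numbers.stream().noneMatch(n -> n == 0);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, -2, 3);
        System.out.println(anyNegative(numbers)); // prints "true"
        System.out.println(allPositive(numbers)); // prints "false"
        System.out.println(noneZero(numbers));    // prints "true"
    }
}
```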
Find Methods
Other Terminal Operations
forEach() consumes each element from a stream and applies a lambda to each of them. The method returns void.
void forEach(Consumer<? super T> action);
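A small sketch of forEach() on a sequential stream, with hypothetical names. The action has a side effect (it appends to a list), which is the usual reason to call forEach(); on a parallel stream this particular action would not be safe, because ArrayList is not thread-safe and forEach() does not guarantee the encounter order there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachExample {

    // Applies the action to every element; forEach itself returns void.
    static List<String> collectWithForEach(List<String> items) {
        List<String> seen = new ArrayList<>();
        items.stream().forEach(seen::add); // sequential stream only, see note above
        return seen;
    }

    public static void main(String[] args) {
        List<String> seen = collectWithForEach(Arrays.asList("x", "y", "z"));
        System.out.println(seen.size()); // prints "3"
    }
}
```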
Stateful Operations
Short-Circuiting
Some operations do not need to process the whole stream to produce a result. For example, the evaluation of anyMatch may stop at the first element that matches.
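Short-circuiting can be observed by counting predicate evaluations on a sequential stream. The sketch below uses hypothetical names; the counter exists only to make the behavior visible. The second method assumes limit() as a short-circuiting intermediate operation, which is what allows a pipeline over an infinite stream to finish.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ShortCircuitExample {

    // Counts how many elements anyMatch() tests before it can answer.
    static int elementsTestedByAnyMatch() {
        AtomicInteger tested = new AtomicInteger(); // observation only
        Stream.of(1, 2, 3, 4, 5)
                .anyMatch(n -> {
                    tested.incrementAndGet();
                    return n == 2; // matches at the second element
                });
        return tested.get();
    }

    // limit() truncates an otherwise infinite stream.
    static List<Integer> firstThreeNaturals() {
        return Stream.iterate(1, n -> n + 1)
                .limit(3)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(elementsTestedByAnyMatch()); // prints "2"
        System.out.println(firstThreeNaturals());       // prints "[1, 2, 3]"
    }
}
```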
Autoboxing and Specialized Interfaces
The Streams API supplies primitive stream specializations (IntStream, LongStream and DoubleStream) that provide specialized methods for working with primitive types. These interfaces eliminate the need for autoboxing.
These interfaces bring new methods to perform common numeric reductions such as sum() and max(). In addition they have methods to convert back to a stream of objects when necessary. Example:
IntStream intStream = ...;
Stream<Integer> s = intStream.boxed();
Examples of specialized API: mapping (mapToInt(), mapToLong(), mapToDouble()) and flat-mapping (flatMapToInt(), flatMapToLong(), flatMapToDouble()).
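The following sketch shows a stream of objects being mapped to an IntStream, reduced with the numeric methods, and boxed back into a stream of objects. The class name, method names and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;

class PrimitiveStreamExample {

    // mapToInt() yields an IntStream, so sum() works on int values without boxing.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // max() on an IntStream returns an OptionalInt, empty for an empty stream.
    static OptionalInt longest(List<String> words) {
        return words.stream().mapToInt(String::length).max();
    }

    // boxed() converts the IntStream back into a Stream<Integer>.
    static List<Integer> boxedLengths(List<String> words) {
        return words.stream()
                .mapToInt(String::length)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println(totalLength(words));          // prints "6"
        System.out.println(longest(words).getAsInt());   // prints "3"
        System.out.println(boxedLengths(words));         // prints "[1, 2, 3]"
    }
}
```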
Specialized Interface Numeric Ranges
IntStream and LongStream expose static methods that generate numeric ranges: range() and rangeClosed().
Also see Numeric Ranges above.
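The difference between range() and rangeClosed() is whether the upper bound is included. A minimal sketch with hypothetical class and method names:

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

class NumericRangeExample {

    // range(1, 5) excludes the upper bound.
    static int[] halfOpen() {
        return IntStream.range(1, 5).toArray();
    }

    // rangeClosed(1, 5) includes the upper bound.
    static int[] closed() {
        return IntStream.rangeClosed(1, 5).toArray();
    }

    // Sums the numbers from 1 to n inclusive using a LongStream range.
    static long sumTo(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(halfOpen())); // prints "[1, 2, 3, 4]"
        System.out.println(Arrays.toString(closed()));   // prints "[1, 2, 3, 4, 5]"
        System.out.println(sumTo(100));                  // prints "5050"
    }
}
```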