GroupedValues takes a PCollection<KV<K, Iterable<InputT>>>, such as the result of GroupByKey,
applies a specified CombineFn to each of the input KV<K, Iterable<InputT>> elements to produce a
combined output KV<K, OutputT> element, and returns a PCollection<KV<K, OutputT>> containing all
the combined output elements. It is common for InputT == OutputT, but not required. Common
combining functions include sums, mins, maxes, and averages of numbers, conjunctions and
disjunctions of booleans, and statistical aggregations.
Example of use:
PCollection<KV<K, InputT>> pc = ...;
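As a library-free illustration of what the transform computes, the sketch below models the grouped input as a Map from key to values and applies a combiner to each entry. It does not use the Beam SDK: the class GroupedValuesModel and the methods combineGroupedValues and sum are hypothetical names chosen for this example, and the in-memory Map only stands in for a PCollection.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of the semantics only; none of these names are Beam API.
final class GroupedValuesModel {

    // Applies `combiner` to the grouped values of each key, producing one output per key.
    static <K, InputT, OutputT> Map<K, OutputT> combineGroupedValues(
            Map<K, List<InputT>> grouped,
            Function<Iterable<InputT>, OutputT> combiner) {
        Map<K, OutputT> combined = new LinkedHashMap<>();
        for (Map.Entry<K, List<InputT>> entry : grouped.entrySet()) {
            combined.put(entry.getKey(), combiner.apply(entry.getValue()));
        }
        return combined;
    }

    // A simple combining function: the sum of the values (here InputT == OutputT).
    static Integer sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the grouped-by-key input.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1, 2, 3));
        grouped.put("b", Arrays.asList(10));

        // prints {a=6, b=10}
        System.out.println(combineGroupedValues(grouped, GroupedValuesModel::sum));
    }
}
```

Each key yields exactly one output element, which mirrors the KV<K, Iterable<InputT>> to KV<K, OutputT> mapping described above.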
See also Combine.perKey/Combine.PerKey, which captures the common pattern of "combining by key"
in a single easy-to-use PTransform.
Combining for different keys can happen in parallel. Moreover, combining of the
Iterable<InputT> values associated with a single key can happen in parallel, with different subsets
of the values being combined separately, and their intermediate results combined further, in an
arbitrary tree reduction pattern, until a single result value is produced for each key.
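The tree reduction described above only works when partial results can be merged without changing the answer. The following plain-Java sketch illustrates that property for a mean, using an accumulator that holds a sum and a count. It is a model under stated assumptions, not Beam code: TreeReductionModel, MeanAccum, accumulate, and treeMerge are hypothetical names, and the split into three subsets is arbitrary. The add, merge, and extract steps loosely correspond to CombineFn's addInput, mergeAccumulators, and extractOutput.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of partial combining; the names here are illustrative, not Beam API.
final class TreeReductionModel {

    // Accumulator for a mean: a running sum and count, which merge associatively.
    static final class MeanAccum {
        final long sum;
        final long count;

        MeanAccum(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        MeanAccum add(int value) {
            return new MeanAccum(sum + value, count + 1);
        }

        MeanAccum merge(MeanAccum other) {
            return new MeanAccum(sum + other.sum, count + other.count);
        }

        double extract() {
            return count == 0 ? Double.NaN : (double) sum / count;
        }
    }

    // Folds one subset of a key's values into a partial accumulator.
    static MeanAccum accumulate(List<Integer> subset) {
        MeanAccum acc = new MeanAccum(0, 0);
        for (int v : subset) {
            acc = acc.add(v);
        }
        return acc;
    }

    // Merges partial accumulators pairwise, level by level, until one remains.
    static MeanAccum treeMerge(List<MeanAccum> partials) {
        if (partials.isEmpty()) {
            return new MeanAccum(0, 0);
        }
        List<MeanAccum> level = new ArrayList<>(partials);
        while (level.size() > 1) {
            List<MeanAccum> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                next.add(i + 1 < level.size()
                        ? level.get(i).merge(level.get(i + 1))
                        : level.get(i));
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5, 6);

        // Combine all values for the key in one pass.
        double sequential = accumulate(values).extract();

        // Combine three arbitrary subsets separately, then merge the partial results.
        double viaTree = treeMerge(Arrays.asList(
                accumulate(values.subList(0, 1)),
                accumulate(values.subList(1, 4)),
                accumulate(values.subList(4, 6)))).extract();

        // prints 3.5 3.5
        System.out.println(sequential + " " + viaTree);
    }
}
```

Averaging the subset means directly would give a different answer for unequal subsets, which is why the accumulator carries the sum and count rather than the mean itself.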
By default, the Coder of the keys of the output PCollection<KV<K, OutputT>> is that of the keys
of the input PCollection<KV<K, Iterable<InputT>>>, and the Coder of the values of the output
PCollection<KV<K, OutputT>> is inferred from the concrete type of the CombineFn's output type
OutputT.
Each output element has the same timestamp and is in the same window as its corresponding
input element, and the output
PCollection has the same
org.apache.beam.sdk.transforms.windowing.WindowFn associated with it as the input.
See also Combine.globally/Combine.Globally, which combines all the values in a PCollection into
a single value in a PCollection.