Map/Reduce-side Aggregator

Aggregator is a set of functions used to aggregate distributed data sets, where V is the type of the input values and C is the type of the combined result:

createCombiner: V => C
mergeValue: (C, V) => C
mergeCombiners: (C, C) => C
Note
Aggregator is created in the combineByKeyWithClassTag transformation to create a ShuffledRDD and is eventually passed on to ShuffleDependency. It is also used in ExternalSorter.
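
To make the roles of the three functions above concrete, here is a minimal plain-Scala sketch that does not depend on Spark. SimpleAggregator, combineValues, combineCombiners and AggregatorSketch are hypothetical names introduced only for illustration; they are not Spark's own class or methods, and the two in-memory iterators merely stand in for partitions. The sketch assumes a (sum, count) combiner per key.

```scala
// Hypothetical stand-in for illustration only; this is NOT Spark's Aggregator.
final case class SimpleAggregator[K, V, C](
    createCombiner: V => C,        // first value seen for a key
    mergeValue: (C, V) => C,       // fold another value into a combiner
    mergeCombiners: (C, C) => C) { // merge two combiners for the same key

  // Map side (assumed): fold the raw values of one partition into per-key combiners.
  def combineValues(records: Iterator[(K, V)]): Map[K, C] =
    records.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
      val combined = acc.get(k) match {
        case Some(existing) => mergeValue(existing, v)
        case None           => createCombiner(v)
      }
      acc.updated(k, combined)
    }

  // Reduce side (assumed): merge partial combiners coming from several partitions.
  def combineCombiners(partials: Iterator[(K, C)]): Map[K, C] =
    partials.foldLeft(Map.empty[K, C]) { case (acc, (k, c)) =>
      acc.updated(k, acc.get(k).fold(c)(mergeCombiners(_, c)))
    }
}

object AggregatorSketch {
  // Combiner is a (sum, count) pair per key.
  val sumCount = SimpleAggregator[String, Int, (Int, Int)](
    createCombiner = v => (v, 1),
    mergeValue = (c, v) => (c._1 + v, c._2 + 1),
    mergeCombiners = (a, b) => (a._1 + b._1, a._2 + b._2))

  def run(): Map[String, (Int, Int)] = {
    // Two in-memory iterators stand in for two partitions.
    val partition1 = Iterator("a" -> 1, "b" -> 2, "a" -> 3)
    val partition2 = Iterator("a" -> 5, "b" -> 4)
    val partials =
      sumCount.combineValues(partition1).iterator ++
        sumCount.combineValues(partition2).iterator
    sumCount.combineCombiners(partials)
  }
}
```

Calling AggregatorSketch.run() combines values within each stand-in partition first and then merges the partial results, which is the reason C is kept separate from V: the combiner can carry more state (here a count) than a single value does.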

updateMetrics Internal Method

Caution
FIXME

combineValuesByKey Method

Caution
FIXME

combineCombinersByKey Method

Caution
FIXME