Friday, March 14, 2014

Departition Components

Departition components combine multiple flow partitions of data records into a single flow, as follows:

Concatenate:

Concatenate appends multiple flows of data records one after the other.

1) It reads all the records from the first flow connected to the in port and copies them to the out port.
2) It then reads all the records from the second flow connected to the in port and appends them after the records of the first flow.
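
The two steps above can be sketched in Python. This is only an illustration of the concatenate behaviour, not Ab Initio code; the function name `concatenate` and the sample records are made up for the example.

```python
from itertools import chain


def concatenate(*flows):
    """Yield every record of the first flow, then every record of the
    second flow, and so on. Records inside each flow keep their order."""
    return chain.from_iterable(flows)


# Two hypothetical partitions of the same data set.
partition_0 = [{"id": 3}, {"id": 1}]
partition_1 = [{"id": 2}, {"id": 4}]

combined = list(concatenate(partition_0, partition_1))
```

The second partition is not read until the first one is exhausted, so the output order is fully determined by the order of the input flows.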

Gather:

Gather combines the data records from multiple flow partitions arbitrarily.
Not key-based.
Result ordering is unpredictable.
Has no effect on upstream processing.
The most efficient method for collecting data from multiple flows.
Used for collecting multiple partitions and for repartitioning.
The most frequently used departitioner.
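
A rough Python sketch of the gather behaviour, under the assumption that each partition is produced by its own worker: records are passed on as soon as they arrive, so the output order depends on timing. The names `gather` and `pump` are invented for this illustration and are not part of Ab Initio.

```python
import queue
import threading


def gather(*flows):
    """Yield records from all flows in whatever order they arrive."""
    out = queue.Queue()
    done = object()  # sentinel marking the end of one flow

    def pump(flow):
        # Each flow is drained by its own thread, so no flow waits for another.
        for record in flow:
            out.put(record)
        out.put(done)

    threads = [threading.Thread(target=pump, args=(flow,)) for flow in flows]
    for t in threads:
        t.start()

    remaining = len(flows)
    while remaining:
        item = out.get()
        if item is done:
            remaining -= 1
        else:
            yield item

    for t in threads:
        t.join()


gathered = list(gather([1, 3, 5], [2, 4, 6]))
```

Every input record appears exactly once in `gathered`, but the interleaving can differ from run to run, which is why a gather should not be used when downstream processing needs a particular order.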


Merge:
Key-based.
Result ordering is sorted if each input is sorted on the key.
May synchronize pipelined computation.
May even serialize it.
Useful for creating ordered data flows.
Along with Gather, Merge is the other departitioner of choice.
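
A small Python sketch of the merge behaviour, using the standard library's `heapq.merge` as a stand-in: it assumes every input partition is already sorted on the key and interleaves them so the output stays sorted. The wrapper name `merge` and the sample records are hypothetical.

```python
import heapq
from operator import itemgetter


def merge(*flows, key):
    """Combine flows that are each sorted on `key` into one sorted flow."""
    return heapq.merge(*flows, key=key)


# Each hypothetical partition is already sorted on "id".
partition_0 = [{"id": 1}, {"id": 4}, {"id": 7}]
partition_1 = [{"id": 2}, {"id": 3}, {"id": 9}]

merged = list(merge(partition_0, partition_1, key=itemgetter("id")))
```

Because the next output record can only be chosen after the head record of every partition has been seen, a fast partition may have to wait for a slow one. That waiting is the synchronization noted in the list above.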
