Name the data flow partitioning schemes in Azure

1 Answer

The partitioning scheme is a way to optimise the performance of a Data Flow by controlling how data is distributed across the Spark cluster. The setting can be accessed on the Optimize tab of the configuration panel for each transformation in a mapping data flow.

‘Use current partitioning’ is the default setting, and Microsoft recommends it in most cases. It keeps the native partitioning the data already has.

The ‘Single Partition’ option combines all the data into one partition. It is used when users want to output to a single destination, for example, a single file in ADLS Gen2.

Some partitioning schemes are:

Round Robin: Simple partitioning scheme that spreads data evenly across partitions

Hash: Uses a hash of the chosen columns to create uniform partitions, so that rows with similar values fall in the same partition

Dynamic Range: Uses Spark dynamic ranges based on the given columns or expressions

Fixed Range: Partitions by fixed ranges of values based on user-provided expressions

Key: Creates a partition for each unique value in the chosen column
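
To make the difference between these schemes concrete, here is a minimal conceptual sketch in plain Python. It is not how Azure Data Factory or Spark implements partitioning; the sample rows, column names and function names are all made up for illustration. Dynamic Range is left out, since there the range boundaries are worked out at run time rather than supplied by the user.

```python
from collections import defaultdict
from zlib import crc32

# Hypothetical sample rows; the column names are assumptions for illustration.
rows = [
    {"id": 1, "country": "US", "amount": 5},
    {"id": 2, "country": "DE", "amount": 40},
    {"id": 3, "country": "US", "amount": 75},
    {"id": 4, "country": "FR", "amount": 120},
    {"id": 5, "country": "DE", "amount": 15},
    {"id": 6, "country": "US", "amount": 60},
]


def round_robin(rows, n):
    """Deal rows out in turn, so partition sizes differ by at most one."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, column):
    """Rows with equal values in `column` always land in the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # crc32 is stable across runs, unlike the built-in hash() on strings.
        parts[crc32(str(row[column]).encode()) % n].append(row)
    return parts


def fixed_range(rows, column, upper_bounds):
    """Split on user-supplied ascending bounds; the last partition takes the rest."""
    parts = [[] for _ in range(len(upper_bounds) + 1)]
    for row in rows:
        # The partition index is the number of bounds the value has reached.
        index = sum(row[column] >= bound for bound in upper_bounds)
        parts[index].append(row)
    return parts


def key_partition(rows, column):
    """One partition per distinct value in `column`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[column]].append(row)
    return dict(parts)


print([len(p) for p in round_robin(rows, 3)])                       # [2, 2, 2]
print([len(p) for p in fixed_range(rows, "amount", [50, 100])])     # [3, 2, 1]
print(sorted(key_partition(rows, "country")))                       # ['DE', 'FR', 'US']
```

Round robin ignores the row contents, which is why it gives the most even sizes. Hash and Key keep related rows together, at the cost of uneven partitions when some values are much more common than others.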
...