Apr 24, 2020 in Big Data | Hadoop
  • Partitioning is used to organize the data and get good query performance.

What if latency is still high even after partitioning?

  • In that case, you perform bucketing, which divides the data into a fixed number of buckets for further optimization.
  • Which bucket a record goes into is decided by hashing.

For example, F(x) % (number of buckets) = (the bucket the data will go to), where F(x) is the hash of the value x.

(The data goes to the bucket whose number matches this result.)
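
The hash-and-modulo rule above can be sketched in a few lines of Python. This is only an illustration, not the actual Hadoop/Hive implementation: the function names `bucket_for` and `assign_buckets` are made up for this example, and it assumes integer keys whose hash F(x) is simply the key itself.

```python
def bucket_for(key: int, num_buckets: int) -> int:
    """Return the bucket index for an integer key.

    Assumption for this sketch: F(x) is the key itself, so the
    bucket is just key % num_buckets.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    return key % num_buckets


def assign_buckets(keys, num_buckets):
    """Group keys into num_buckets lists using bucket_for."""
    buckets = {i: [] for i in range(num_buckets)}
    for key in keys:
        buckets[bucket_for(key, num_buckets)].append(key)
    return buckets


# Hypothetical sample ids spread over 4 buckets.
result = assign_buckets([101, 102, 103, 104, 105, 106], 4)
for bucket_id, members in result.items():
    print(bucket_id, members)
```

With these sample ids, 101 and 105 land in the same bucket because both leave remainder 1 when divided by 4. Rows with the same key always land in the same bucket, which is why a lookup or join on that key only needs to read one bucket instead of the whole partition.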
