
Explain the SMB join in Hive.

1 Answer


In a Sort Merge Bucket (SMB) join in Hive, each mapper reads one bucket from the first table and the corresponding bucket from the second table, then performs a merge-sort join on that pair. SMB join is mainly used because it places no limit on file, partition, or table size: no table has to fit in memory, so it works best when the tables being joined are large. For an SMB join, every table must be bucketed and sorted on the join columns, and all tables must have the same number of buckets.
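
To make the idea concrete, here is a minimal in-memory sketch in Python of what the answer describes: both inputs are bucketed and sorted on the join key, and matching buckets are then merge-joined pair by pair. This is only an illustration, not Hive's implementation. The function names (`bucketize`, `merge_join`, `smb_join`), the tuple row layout with the join key in position 0, and the use of Python's built-in `hash` are all assumptions made for the example; Hive uses its own hash function and reads bucket files from storage.

```python
def bucketize(rows, num_buckets, key=lambda r: r[0]):
    """Hash rows into num_buckets buckets, then sort each bucket by the join key."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[hash(key(row)) % num_buckets].append(row)
    for bucket in buckets:
        bucket.sort(key=key)
    return buckets


def merge_join(left, right, key=lambda r: r[0]):
    """Inner-join two lists that are already sorted by the join key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and key(left[i]) == lk:
                for r in right[j:j_end]:
                    out.append(left[i] + r[1:])
                i += 1
            j = j_end
    return out


def smb_join(left_rows, right_rows, num_buckets=4):
    """Join bucket k of the left input only with bucket k of the right input."""
    left_buckets = bucketize(left_rows, num_buckets)
    right_buckets = bucketize(right_rows, num_buckets)
    result = []
    # In Hive, each pair of corresponding buckets would be handled by one mapper.
    for lb, rb in zip(left_buckets, right_buckets):
        result.extend(merge_join(lb, rb))
    return result


orders = [(1, "a"), (2, "b"), (2, "c"), (5, "d")]
users = [(2, "x"), (1, "y"), (3, "z"), (2, "w")]
joined = smb_join(orders, users)
```

Because both sides use the same bucket count and the same hash, rows with equal keys always land in buckets with the same index, which is why each bucket pair can be joined independently and why only one bucket pair needs to be read at a time.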
