Jan 12, 2020 in Big Data | Hadoop
Q: What is the use of the CLUSTERED BY clause during table creation in Hive?

1 Answer

Jan 12, 2020

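CLUSTERED BY in a CREATE TABLE statement defines bucketing. Hive hashes each row's clustering column(s) and uses the result to assign the row to one of a fixed number of buckets, set with INTO n BUCKETS. Rows within a bucket are sorted only if the table also declares SORTED BY.

This is often confused with the CLUSTER BY clause in a query, which is shorthand for DISTRIBUTE BY plus SORT BY on the same columns: it distributes rows to reducers by hash and then sorts the rows within each reducer.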
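As a rough sketch of the two forms side by side, here is a bucketed table definition followed by the query-level shorthand. The table name user_events, its columns, the bucket count, and the ORC storage format are all assumptions made up for illustration.

```sql
-- Hypothetical table; names, bucket count, and file format are illustrative.
CREATE TABLE user_events (
  user_id    BIGINT,
  event_type STRING,
  event_time TIMESTAMP
)
CLUSTERED BY (user_id)        -- hash user_id to choose the bucket
SORTED BY (event_time ASC)    -- optional: order rows within each bucket
INTO 8 BUCKETS
STORED AS ORC;

-- Query-level shorthand: these two statements distribute and sort the same way.
SELECT user_id, event_type FROM user_events CLUSTER BY user_id;
SELECT user_id, event_type FROM user_events DISTRIBUTE BY user_id SORT BY user_id;
```

Note that SORT BY orders rows within each reducer only, so neither query guarantees a global ordering of the result.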

 

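The clause has to be specified at table creation, but its benefit shows up at query time: a bucketed table supports efficient sampling with TABLESAMPLE, and it can let Hive use bucket map joins when both tables are bucketed on the join key.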
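A minimal sketch of the query-time benefit, assuming a hypothetical table user_events that was created with CLUSTERED BY (user_id) INTO 8 BUCKETS:

```sql
-- Sample one bucket out of eight. Because the table is bucketed on user_id,
-- Hive can satisfy this from the matching bucket rather than the whole table.
SELECT user_id, event_type
FROM user_events TABLESAMPLE(BUCKET 1 OUT OF 8 ON user_id) s;
```

The sampling column here matches the clustering column. If it did not, Hive would have to scan the full table to compute the sample.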
