Q:
What is the difference between the SORT BY and ORDER BY clauses in Hive? Which one is faster?

1 Answer

ORDER BY: sorts all of the data in a single reducer, which guarantees a total order over the whole result. SORT BY is usually much faster than ORDER BY on large data sets.

SORT BY: sorts the data within each reducer. You can use any number of reducers for the sort, but the output is ordered only within each reducer, not across the whole result.

In the first case (ORDER BY), the mappers send every row to a single reducer, which sorts them all. That one reducer becomes the bottleneck.

In the second case (SORT BY), the mappers split the rows across many reducers, and each reducer sorts only its own share. Because the reducers work in parallel, the sort finishes more quickly, at the cost of a global order.

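Example (the second query uses DISTRIBUTE BY to send all rows with the same id to the same reducer, and SORT BY then orders each reducer's rows by name):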

SELECT name, id, cell FROM user_table ORDER BY id, name;

SELECT name, id, cell FROM user_table DISTRIBUTE BY id SORT BY name;
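To make the difference concrete, here is a rough Python sketch that imitates what the two queries above do to the rows. It is only a model, not how Hive is implemented: the sample rows, the function names, and the "id modulo reducer count" partitioner are all assumptions made for illustration (Hive uses its own hash partitioning).

```python
from collections import defaultdict

# Hypothetical sample rows standing in for user_table: (name, id, cell)
ROWS = [
    ("dana", 3, "555-0103"),
    ("alex", 1, "555-0101"),
    ("chris", 2, "555-0102"),
    ("blake", 1, "555-0104"),
    ("erin", 2, "555-0105"),
]


def order_by(rows):
    """Model of ORDER BY id, name: one reducer receives and sorts every row."""
    return sorted(rows, key=lambda row: (row[1], row[0]))


def distribute_sort_by(rows, num_reducers):
    """Model of DISTRIBUTE BY id SORT BY name across several reducers."""
    reducers = defaultdict(list)
    for row in rows:
        # Simplified partitioner (an assumption): id modulo the reducer count.
        reducers[row[1] % num_reducers].append(row)
    # Each reducer sorts only its own rows, by name.
    return {
        reducer: sorted(part, key=lambda row: row[0])
        for reducer, part in sorted(reducers.items())
    }


total = order_by(ROWS)
partial = distribute_sort_by(ROWS, num_reducers=2)

print("ORDER BY (one globally sorted list):")
for row in total:
    print("  ", row)

print("DISTRIBUTE BY / SORT BY (one sorted list per reducer):")
for reducer, part in partial.items():
    print("  reducer", reducer, part)
```

In this model each reducer's list is sorted by name, and rows with the same id always land in the same reducer, but reading the reducers' outputs one after another does not give a globally sorted result. That is the trade-off SORT BY makes for speed.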
