What is the difference between the SORT BY and ORDER BY clauses in Hive? Which is faster?


ORDER BY – sorts all the data in a single reducer, which guarantees a total order across the whole output. On large tables this is usually much slower than SORT BY.

SORT BY – sorts the data within each reducer. You can use any number of reducers, so the output is ordered within each reducer but not across the whole result.

In the first case (ORDER BY), the mappers send every row to a single reducer, which sorts them all.

In the second case (SORT BY), the mappers split the rows across many reducers, and each reducer sorts only its own share. The reducers work in parallel, so the sort finishes sooner.
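To make the difference concrete, here is a rough toy model in Python. It is not Hive code: the function names are made up for illustration, and round-robin assignment is only an assumption standing in for however Hive actually spreads rows across reducers.

```python
# Toy model of the two clauses; not Hive code. All names are illustrative.
from typing import List


def order_by(rows: List[int]) -> List[int]:
    """ORDER BY: every row goes to one reducer, which sorts them all."""
    return sorted(rows)


def sort_by(rows: List[int], num_reducers: int) -> List[List[int]]:
    """SORT BY: rows are spread over reducers; each sorts only its own share."""
    buckets: List[List[int]] = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        # Assumption: round-robin stands in for Hive's real row distribution.
        buckets[i % num_reducers].append(row)
    return [sorted(bucket) for bucket in buckets]


rows = [7, 2, 9, 4, 1, 8]
total = order_by(rows)                    # one globally sorted list
partial = sort_by(rows, num_reducers=2)   # one sorted list per reducer

print(total)
print(partial)
```

Each inner list is sorted, but reading them one after another does not give a globally sorted result. That is the price SORT BY pays for running in parallel.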


SELECT name, id, cell FROM user_table ORDER BY id, name;

SELECT name, id, cell FROM user_table DISTRIBUTE BY id SORT BY name;
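Note that the two queries above do not return the same ordering. A small Python sketch of what the DISTRIBUTE BY id SORT BY name query does may help. It is a toy model, not Hive code: the sample rows are invented, the function name is hypothetical, and id modulo the reducer count is an assumption standing in for Hive's hash partitioning.

```python
# Toy model of DISTRIBUTE BY id SORT BY name; not Hive code.
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int, str]  # (name, id, cell), mirroring the query's columns


def distribute_by_id_sort_by_name(rows: List[Row], num_reducers: int) -> Dict[int, List[Row]]:
    reducers: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        # Assumption: id modulo reducer count stands in for Hive's hash partitioning.
        # Rows with the same id always land on the same reducer.
        reducers[row[1] % num_reducers].append(row)
    # Each reducer sorts only its own rows by name.
    return {r: sorted(part, key=lambda row: row[0]) for r, part in reducers.items()}


# Hypothetical sample rows, invented for this example.
rows: List[Row] = [
    ("dev", 2, "555-0102"),
    ("ana", 1, "555-0101"),
    ("bo", 2, "555-0103"),
    ("cy", 3, "555-0104"),
]
result = distribute_by_id_sort_by_name(rows, num_reducers=2)

for reducer, part in sorted(result.items()):
    print(reducer, part)
```

Within each reducer the rows come out sorted by name, and all rows sharing an id sit together on one reducer, but there is no single ordering by id and name across the whole output as there is with ORDER BY id, name.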
