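ORDER BY – sorts the whole result set in a single reducer, so the output is totally ordered. On large data, SORT BY is much faster than ORDER BY.
SORT BY – sorts the data within each reducer, so each reducer's output is ordered but the combined result is not. You can use any number of reducers for the sort.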
In the first case (ORDER BY), the maps send every value to a single reducer, which sorts them all.
In the second case (SORT BY), the maps split the values across many reducers, and each reducer sorts its own share into its own list. Because the reducers work in parallel, the sort finishes quickly.
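To make the difference concrete, here is a rough Python sketch of the two strategies. It is only a toy model, not Hive code: the sample rows, the function names, and the "id modulo number of reducers" split are all assumptions made for illustration.

```python
# Toy model of ORDER BY vs SORT BY; none of this is Hive's actual code.
rows = [(3, "carol"), (1, "alice"), (4, "dave"),
        (2, "bob"), (6, "frank"), (5, "erin")]

def order_by(rows):
    # ORDER BY: every row goes to one reducer, which sorts them all.
    return sorted(rows)

def sort_by(rows, num_reducers):
    # SORT BY: rows are spread over several reducers and each reducer
    # sorts only its own share. The split rule (id modulo num_reducers)
    # is an arbitrary choice for this sketch.
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        reducers[row[0] % num_reducers].append(row)
    return [sorted(part) for part in reducers]

total = order_by(rows)   # one globally sorted list
parts = sort_by(rows, 2) # one sorted list per reducer

print(total)
print(parts)
```

Each inner list in parts is sorted, but reading them one after another gives ids 2, 4, 6, 1, 3, 5, which is not a global order. That is the trade-off: SORT BY gets its speed from parallel reducers and gives up the total ordering.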
Example:
SELECT name, id, cell FROM user_table ORDER BY id, name;
SELECT name, id, cell FROM user_table DISTRIBUTE BY id SORT BY name;