If you run a SELECT * query in Hive, why doesn’t it run MapReduce?

1 Answer


It’s an optimization. The hive.fetch.task.conversion property lets Hive convert certain queries into a single FETCH task, which avoids the latency of launching a MapReduce job. For simple SELECT, FILTER, and LIMIT queries, Hive skips MapReduce and reads the data directly with the FETCH task, so the query runs without any MapReduce job.

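In Hive 0.10.0 through 0.13.1 its default value is “minimal”, which optimizes only SELECT *, FILTER on partition columns, and LIMIT queries. The other value, “more”, optimizes SELECT, FILTER, and LIMIT queries in general (plus TABLESAMPLE and virtual columns); it is the default from Hive 0.14.0 onward. A third value, “none”, disables the conversion.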
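
As a rough illustration, the property can be changed per session and the effect checked with EXPLAIN. This is only a sketch: the table name `sales` and its partition column `dt` are hypothetical, and the exact plan output depends on the Hive version and execution engine. The value `none`, which turns the conversion off, is used here for contrast.

```sql
-- Hypothetical table `sales`, partitioned by `dt`; for illustration only.

-- Show the current value of the property.
SET hive.fetch.task.conversion;

-- Turn the conversion off: the plan should now include a job stage
-- (for example a Map Reduce stage) in addition to the fetch stage.
SET hive.fetch.task.conversion=none;
EXPLAIN SELECT * FROM sales LIMIT 10;

-- Allow the conversion for plain SELECT, FILTER, and LIMIT queries:
-- the plan should typically contain only a Fetch Operator stage.
SET hive.fetch.task.conversion=more;
EXPLAIN SELECT * FROM sales WHERE dt = '2022-09-18' LIMIT 10;
```

Comparing the two EXPLAIN outputs is a quick way to see whether a given query is served by a FETCH task alone on your cluster.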
