There are following components in Hive data model:
1. Tables: These are tables similar to Relation Database (RDBMS). We can create filters on tables. We can run joins and union on tables. Table data is stored in HDFS. Rows in a table are organized like columns in a RDBMS. Each row can have
2. Partitions: In each table in Hive, we can specify partition keys that determine how the data is stored. With partition we can optimize the queries to only looks specific dataset instead of scanning whole table.
3. Buckets: In each partition, we can divide the data into buckets. Each bucket is based on the hash of a column in a table. There is a separate file in which each bucket of a partition is
stored. With buckets it is efficient to evaluate queries based on a sample of data.