HBase MemStore

The MemStore is a write buffer where HBase accumulates data in memory before a permanent write.

Its contents are flushed to disk to form an HFile when the MemStore fills up.

It doesn't write to an existing HFile but instead forms a new file on every flush.
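The buffer-then-flush behaviour can be sketched with a toy model. This is not HBase code: ToyMemStore, its fields, and the byte counting are all made up for illustration, and a real MemStore keeps multiple cell versions instead of overwriting them. It only shows that each flush produces a new, sorted, immutable file and never appends to an old one.

```python
class ToyMemStore:
    """Hypothetical stand-in for a MemStore; not the HBase implementation."""

    def __init__(self, flush_size_bytes):
        self.flush_size_bytes = flush_size_bytes  # flush threshold (assumed, in "bytes")
        self.buffer = {}         # row key -> value, held in memory only
        self.buffered_bytes = 0  # rough size estimate: len(key) + len(value)
        self.hfiles = []         # one new immutable entry per flush

    def put(self, row, value):
        old = self.buffer.get(row)
        if old is not None:
            # Simplification: an overwrite replaces the old value in the buffer.
            self.buffered_bytes -= len(row) + len(old)
        self.buffer[row] = value
        self.buffered_bytes += len(row) + len(value)
        if self.buffered_bytes >= self.flush_size_bytes:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # A flush writes a NEW sorted file; existing files are never modified.
        self.hfiles.append(tuple(sorted(self.buffer.items())))
        self.buffer = {}
        self.buffered_bytes = 0


store = ToyMemStore(flush_size_bytes=16)
store.put("row1", "aaaa")  # 8 "bytes" buffered, below the threshold
store.put("row2", "bbbb")  # 16 "bytes" buffered, triggers a flush
store.put("row3", "cccc")  # stays in memory until the next flush
print(len(store.hfiles), store.buffer)
```

After the three puts there is one flushed file holding row1 and row2, while row3 is still only in memory.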

The HFile is the underlying storage format for HBase.

HFiles belong to a column family (one MemStore per column family). A column family can have multiple HFiles, but an HFile never holds data from more than one column family.

The MemStore size at which a flush is triggered is set in hbase-site.xml through the property hbase.hregion.memstore.flush.size.
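As a rough example, the property goes inside the configuration element of hbase-site.xml like this. The value is in bytes; 134217728 (128 MB) is, as far as I know, the default in recent HBase releases, so treat the number as illustrative and check the documentation for your version.

```xml
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <!-- Flush threshold in bytes; 134217728 = 128 MB (assumed default, verify for your version) -->
  <value>134217728</value>
</property>
```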

What happens when the server hosting a MemStore that has not yet been flushed crashes?

Every server in an HBase cluster keeps a WAL (write-ahead log) to record changes as they happen. The WAL is a file on the underlying file system. A write isn't considered successful until the new WAL entry has been written, which guarantees durability.

If HBase goes down, the data that was not yet flushed from the MemStore to an HFile can be recovered by replaying the WAL. HBase takes care of this itself.
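The write path and the recovery can be sketched the same way. Again this is only a toy model under stated assumptions: ToyRegionServer and replay_wal are hypothetical names, a Python list stands in for the durable WAL file, and real HBase recovery involves more steps than a simple loop. The point is the ordering: log first, then buffer, and rebuild the buffer from the log after a crash.

```python
class ToyRegionServer:
    """Hypothetical stand-in for a server with a WAL; not the HBase implementation."""

    def __init__(self, wal):
        self.wal = wal      # stands in for the durable WAL file; survives a crash
        self.memstore = {}  # in-memory buffer; lost on a crash

    def put(self, row, value):
        # 1. Record the change in the WAL first.
        self.wal.append((row, value))
        # 2. Only then apply it to the in-memory MemStore.
        self.memstore[row] = value

    def replay_wal(self):
        # Re-apply logged edits in order to rebuild the unflushed state.
        for row, value in self.wal:
            self.memstore[row] = value


wal = []  # pretend this list lives on the file system
server = ToyRegionServer(wal)
server.put("row1", "a")
server.put("row2", "b")
server.put("row1", "c")

# Simulated crash: the old server object and its MemStore are gone, the WAL is not.
recovered = ToyRegionServer(wal)
recovered.replay_wal()
print(recovered.memstore)
```

Because the edits are replayed in the order they were logged, the recovered MemStore ends up with the latest value for each row.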
