HBase is a distributed database, meaning it is designed to run on a cluster of a few to possibly thousands of servers. As a result, it is more complicated to install. All the typical problems of distributed computing come into play: coordination and management of remote processes, locking, data distribution, network latency, and the number of round trips between servers. Fortunately, HBase builds on several other mature technologies, such as Apache Hadoop and Apache ZooKeeper, to solve many of these issues.
HBase is a column-oriented data store, meaning it stores data by columns rather than by rows. In HBase if there is no data for a given column family, it simply does not store anything at all; contrast this with a relational database which must store null values explicitly. In addition, when retrieving data in HBase, you should only ask for the specific column families you need; because there can literally be millions of columns in a given row, you need to make sure you ask only for the data you actually need.
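To make the sparse, column-family-oriented layout concrete, here is a toy sketch in plain Java (not the HBase client API): a row stores only the column families that actually hold data, so an absent family costs nothing, unlike a relational row that must materialize NULLs. The class and method names are illustrative only.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a sparse HBase-style row: family -> (qualifier -> value).
// A family that was never written simply does not exist in the map.
public class SparseRow {
    private final Map<String, Map<String, String>> families = new TreeMap<>();

    public void put(String family, String qualifier, String value) {
        families.computeIfAbsent(family, f -> new TreeMap<>()).put(qualifier, value);
    }

    // Reading asks for one specific family, mirroring the advice to
    // request only the column families you actually need.
    public Map<String, String> getFamily(String family) {
        return families.getOrDefault(family, Map.of());
    }

    public int storedFamilies() {
        return families.size();
    }

    public static void main(String[] args) {
        SparseRow row = new SparseRow();
        row.put("info", "name", "alice");
        // The "stats" family was never written, so nothing is stored for it.
        System.out.println(row.storedFamilies());   // 1
        System.out.println(row.getFamily("stats")); // {}
    }
}
```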
Like HDFS, the HBase architecture follows the traditional master-slave model: a master that makes decisions and one or more slaves that do the real work. In HBase, the master is called HMaster and the slaves are called HRegionServers.
As you can see from the above diagram, an HBase cluster typically has one Master node, called HMaster, and multiple Region Servers, called HRegionServers. Each Region Server hosts multiple Regions, called HRegions. HBase also uses ZooKeeper as a distributed coordination service to maintain server state in the cluster.
Data in HBase is stored in Tables and these Tables are stored in Regions. When a Table becomes too big, the Table is partitioned into multiple Regions. These Regions are assigned to Region Servers across the cluster. Each Region Server hosts roughly the same number of Regions.
Let's look at each component in more detail.
HMaster is the implementation of the Master Server. Region assignment and DDL operations (creating and deleting tables) are handled by the HBase Master. The Master server is responsible for monitoring all RegionServer instances in the cluster, and is the interface for all metadata changes. In a distributed cluster, the Master typically runs on the NameNode.
In summary, the HMaster is responsible for:
- coordinating and monitoring the Region Servers in the cluster
- assigning Regions to Region Servers
- handling DDL operations (creating and deleting tables)
- acting as the interface for all metadata changes
HRegionServer is the RegionServer implementation. It is responsible for serving and managing regions. In a distributed cluster, a RegionServer runs on a DataNode.
HBase Tables are divided horizontally by row key range into Regions. A region contains all rows in the table between the region’s start key and end key. Regions are assigned to the nodes in the cluster, called Region Servers, and these serve data for reads and writes. A region server can serve about 1,000 regions.
The HRegionServer performs the following tasks:
- serving read and write requests for the Regions it hosts
- communicating directly with clients once they have located the right Region
- recording updates in its write-ahead log and MemStore, and flushing MemStores to StoreFiles
- splitting Regions that have grown too large and reporting the split to the HMaster
A Region Server runs on an HDFS data node and has the following components:
When a Region Server (RS) receives a write request, it directs the request to a specific Region. Each Region stores a set of rows. Row data can be separated into multiple column families (CFs). The data for a particular CF is stored in an HStore, which consists of a MemStore and a set of HFiles.
Each Region Server contains a Write-Ahead Log (called HLog) and multiple Regions. Each Region in turn is made up of a MemStore and multiple StoreFiles (HFile). The data lives in these StoreFiles in the form of Column Families (explained below). The MemStore holds in-memory modifications to the Store (data).
The mapping of Regions to Region Server is kept in a system table called .META. When trying to read or write data from HBase, the clients read the required Region information from the .META table and directly communicate with the appropriate Region Server.
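The .META. lookup boils down to a floor search over sorted region start keys. Here is a toy sketch of that lookup in plain Java (not HBase's actual client code); the class and server names are illustrative.

```java
import java.util.TreeMap;

// Toy model of the .META. table: region start key -> hosting Region Server.
// The region responsible for a row is the one with the greatest start
// key that is <= the row key, which TreeMap.floorEntry finds directly.
public class MetaTable {
    private final TreeMap<String, String> regions = new TreeMap<>();

    public void addRegion(String startKey, String regionServer) {
        regions.put(startKey, regionServer);
    }

    public String locate(String rowKey) {
        return regions.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        MetaTable meta = new MetaTable();
        meta.addRegion("", "rs1");  // first region starts at the empty key
        meta.addRegion("m", "rs2"); // second region covers keys >= "m"
        System.out.println(meta.locate("apple")); // rs1
        System.out.println(meta.locate("zebra")); // rs2
    }
}
```

Once the client has this mapping, it talks to the Region Server directly, which is why the HMaster is not on the read/write path.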
Each RegionServer adds updates (Puts, Deletes) to its write-ahead log (WAL) first, and then to the MemStore for the affected Store. This ensures that HBase has durable writes. Without WAL, there is the possibility of data loss in the case of a RegionServer failure before each MemStore is flushed and new StoreFiles are written. HLog is the HBase WAL implementation, and there is one HLog instance per RegionServer.
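The "WAL first, MemStore second" ordering is what makes writes durable. The following toy sketch (not HBase's implementation) models it with a list as the durable log and a TreeMap as the volatile MemStore, and shows how replaying the log recovers the in-memory state after a simulated crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the HBase write path: every edit is appended to the
// WAL before it is applied to the in-memory MemStore.
public class WriteAheadSketch {
    final List<String[]> wal = new ArrayList<>();       // durable log of [key, value]
    TreeMap<String, String> memStore = new TreeMap<>(); // volatile, sorted

    public void put(String key, String value) {
        wal.add(new String[]{key, value}); // 1. durable append to the WAL
        memStore.put(key, value);          // 2. in-memory update
    }

    // Simulate a RegionServer crash that loses the MemStore,
    // followed by recovery that replays the WAL in order.
    public void crashAndRecover() {
        memStore = new TreeMap<>();
        for (String[] edit : wal) {
            memStore.put(edit[0], edit[1]);
        }
    }
}
```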
Regions are the basic element of availability and distribution for tables, and are comprised of a Store per Column Family. Regions are nothing but tables that are split up and spread across the Region Servers. The hierarchy of objects is as follows:
Table (HBase table)
→ Region (Regions for the table)
→→ Store (Store per ColumnFamily for each Region)
→→→ MemStore (one MemStore per Store)
→→→ StoreFile (StoreFiles for each Store)
→→→→ Block (Blocks within a StoreFile)
Initially there is one Region per table. When a Region grows too large (larger than hbase.hregion.max.filesize), it splits into two child Regions. Both child Regions, each representing one half of the original Region, are opened in parallel on the same Region Server, and then the split is reported to the HMaster.
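The split mechanism can be sketched as cutting a sorted key range at a middle row key. This is a toy model in plain Java, not HBase's split logic; the size threshold and midpoint choice stand in for hbase.hregion.max.filesize and HBase's real split-point selection.

```java
import java.util.List;
import java.util.TreeMap;

// Toy model of a region split: the sorted row map is cut at an
// approximate middle key into two child regions that together cover
// the original key range.
public class RegionSplit {
    public static List<TreeMap<String, String>> split(TreeMap<String, String> region) {
        String splitKey = region.keySet().toArray(new String[0])[region.size() / 2];
        TreeMap<String, String> left = new TreeMap<>(region.headMap(splitKey));   // keys < splitKey
        TreeMap<String, String> right = new TreeMap<>(region.tailMap(splitKey));  // keys >= splitKey
        return List.of(left, right);
    }
}
```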
For load balancing reasons, the HMaster may schedule new Regions to be moved off to other servers. This results in the new Region Server serving data from a remote HDFS node until a major compaction moves the data files to the Region Server's local node. HBase data is local when it is written, but when a Region is moved (for load balancing or recovery), it is not local until the next major compaction.
When something is written to HBase, it is first written to an in-memory store, called the MemStore. When the MemStore reaches a certain size, it is flushed to disk into a StoreFile. The store files that are created on disk are immutable. Compaction is a process where store files are merged together.
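The flush cycle described above can be modeled in a few lines of plain Java (a toy sketch, not HBase's implementation): a sorted MemStore accumulates edits until a size threshold, then its contents are written out as an immutable, sorted StoreFile and the MemStore starts empty again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of MemStore flushes producing immutable StoreFiles.
public class MemStoreFlush {
    static final int FLUSH_THRESHOLD = 3; // stand-in for the real size limit
    TreeMap<String, String> memStore = new TreeMap<>();
    final List<List<Map.Entry<String, String>>> storeFiles = new ArrayList<>();

    public void put(String key, String value) {
        memStore.put(key, value);
        if (memStore.size() >= FLUSH_THRESHOLD) {
            flush();
        }
    }

    void flush() {
        // The flush is a sequential write of already-sorted key/values;
        // the resulting file is immutable (List.copyOf).
        storeFiles.add(List.copyOf(memStore.entrySet()));
        memStore = new TreeMap<>();
    }
}
```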
Compaction comes in two flavors: major and minor. A minor compaction will usually pick up a couple of the smaller adjacent StoreFiles and rewrite them as one. Minor compactions do not drop deletes or expired cells; only major compactions do. You can tune the number of HFiles to compact and the frequency of minor compactions. Minor compactions are important because without them, reading a particular row could require many disk reads and hurt overall performance.
Sometimes a minor compaction picks up all the StoreFiles in a Store and writes them to a single StoreFile; in that case it effectively promotes itself to a major compaction. On large systems, major compactions usually have to be triggered manually.
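The major/minor distinction can be sketched as follows (a toy model, not HBase's compaction algorithm): sorted files are merged with newer files winning for duplicate keys, and only a major compaction additionally drops delete markers, which are modeled here as null values.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of compaction: merge immutable sorted StoreFiles into one.
public class Compaction {
    public static TreeMap<String, String> compact(List<Map<String, String>> filesOldestFirst,
                                                  boolean major) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> file : filesOldestFirst) {
            merged.putAll(file); // later (newer) files overwrite older entries
        }
        if (major) {
            // Only a major compaction drops delete markers (null values here).
            merged.values().removeIf(v -> v == null);
        }
        return merged;
    }
}
```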
The Write Ahead Log (WAL) records all changes to data in HBase to file-based storage. Under normal operation, the WAL is not needed, because data changes move from the MemStore to StoreFiles. However, if a RegionServer crashes or becomes unavailable before the MemStore is flushed, the WAL ensures that the changes to the data can be replayed. If writing to the WAL fails, the entire operation to modify the data fails. Usually, there is only one instance of a WAL per RegionServer. The RegionServer records Puts and Deletes to it before recording them to the MemStore for the affected Store. The WAL resides in HDFS in the /hbase/WALs/ directory (prior to HBase 0.94, logs were stored in /hbase/.logs/), with subdirectories per region.
With a single WAL per RegionServer, the RegionServer must write to the WAL serially, because HDFS files must be written sequentially. This can make the WAL a performance bottleneck. HBase 1.0 introduces support for MultiWAL. MultiWAL allows a RegionServer to write multiple WAL streams in parallel, using multiple pipelines in the underlying HDFS instance, which increases total write throughput. This parallelization is done by partitioning incoming edits by their Region.
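Partitioning edits by Region can be sketched as routing each edit to a WAL stream chosen from its region name (a toy model, not HBase's MultiWAL implementation; the hash-based routing here is only illustrative). The key property is that all edits for one Region always land in the same stream, so per-region ordering is preserved while different Regions write in parallel.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of MultiWAL: edits are partitioned across several WAL
// streams by the Region they belong to.
public class MultiWalSketch {
    final List<List<String>> walStreams = new ArrayList<>();

    public MultiWalSketch(int streams) {
        for (int i = 0; i < streams; i++) {
            walStreams.add(new ArrayList<>());
        }
    }

    // Route an edit to the stream chosen by its region; returns the
    // stream index so the routing can be observed.
    public int append(String regionName, String edit) {
        int stream = Math.floorMod(regionName.hashCode(), walStreams.size());
        walStreams.get(stream).add(edit);
        return stream;
    }
}
```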
HLog : The class which implements the WAL is called HLog.
A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region.
Data is stored in an HFile, which contains sorted key/values. When the MemStore accumulates enough data, the entire sorted KeyValue set is written to a new HFile in HDFS. This is a sequential write and is very fast, as it avoids moving the disk drive head. Key/value pairs are stored in increasing key order in the HFile.
HBase comes integrated with ZooKeeper. When you start HBase, a ZooKeeper instance is also started. The reason is that ZooKeeper helps HBase keep track of all the Region Servers: how many there are, which ones are alive, and which Regions they hold. It tracks this small coordination data that Hadoop itself does not, reducing the metadata-tracking overhead placed on Hadoop. Hence, the HMaster gets the details of the Region Servers by actually contacting ZooKeeper.
You can also start HBase without the inbuilt ZooKeeper. To point HBase at an existing ZooKeeper cluster, one that is not managed by HBase, set HBASE_MANAGES_ZK in conf/hbase-env.sh to false, and then set the ensemble locations and client port (if non-standard) in hbase-site.xml.
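As a sketch of what that configuration looks like (the host names and port below are placeholders for your own ensemble), the shell setting goes in conf/hbase-env.sh:

```shell
# conf/hbase-env.sh: tell HBase not to manage its own ZooKeeper
export HBASE_MANAGES_ZK=false
```

and the ensemble location and client port go in hbase-site.xml:

```xml
<!-- hbase-site.xml: point HBase at an external ZooKeeper ensemble -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```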