ZooKeeper, while being a coordination service for distributed systems, is a distributed application in its own right. ZooKeeper follows a simple client-server model where clients are nodes (i.e., machines) that make use of the service, and servers are nodes that provide the service. A collection of ZooKeeper servers forms a ZooKeeper ensemble. Once a ZooKeeper ensemble starts and completes its leader election, it waits for clients to connect. At any given time, each ZooKeeper client is connected to exactly one ZooKeeper server, and each ZooKeeper server can handle a large number of client connections at the same time. Each client periodically sends pings to the server it is connected to, letting the server know that it is alive and connected. The server responds with an acknowledgment of the ping, indicating that it is alive as well. When the client doesn't receive an acknowledgment within the specified timeout, it connects to another server in the ensemble, and the client session is transparently transferred over to the new ZooKeeper server.
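The ping-and-failover behavior described above happens inside the real client library (org.apache.zookeeper.ZooKeeper); the following is only a self-contained sketch of that logic, where `FailoverSketch` and `ServerProbe` are hypothetical names standing in for the internal ping/ack exchange:

```java
import java.util.List;

// Hypothetical sketch of the client-side failover described above.
// ServerProbe stands in for the real ping/ack exchange with a server.
public class FailoverSketch {
    interface ServerProbe {
        boolean ping(String server); // true if the server acks within the timeout
    }

    // Stay on the current server while it acks pings; on a missed ack,
    // move the session to another reachable server in the ensemble.
    static String connect(List<String> ensemble, String current, ServerProbe probe) {
        if (probe.ping(current)) {
            return current; // session stays where it is
        }
        for (String server : ensemble) {
            if (!server.equals(current) && probe.ping(server)) {
                return server; // session is transparently moved here
            }
        }
        throw new IllegalStateException("no reachable server in the ensemble");
    }

    public static void main(String[] args) {
        List<String> ensemble = List.of("zk1:2181", "zk2:2181", "zk3:2181");
        // Pretend zk1 has stopped acknowledging pings.
        ServerProbe probe = server -> !server.equals("zk1:2181");
        System.out.println(connect(ensemble, "zk1:2181", probe)); // prints zk2:2181
    }
}
```

In the real client the failover is automatic and the session (with its watches and ephemeral state) survives the move as long as it reconnects within the session timeout.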
In a ZooKeeper ensemble, every server must know about every other server. Each server maintains an in-memory image of the state, along with transaction logs and snapshots in a persistent store.
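This mutual knowledge is configured statically: each server's `zoo.cfg` lists every member of the ensemble. A minimal sketch for a three-node ensemble follows; the hostnames and paths are placeholders, and the numeric values are illustrative defaults rather than recommendations:

```
# zoo.cfg — illustrative values; hostnames and paths are placeholders
tickTime=2000
initLimit=10
syncLimit=5
# transaction logs and snapshots live under dataDir
dataDir=/var/lib/zookeeper
clientPort=2181
# every server lists every member of the ensemble:
# server.<id>=<host>:<peer-port>:<leader-election-port>
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```

Each server also needs a `myid` file under `dataDir` containing its own id, so it can find itself in this list.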
According to Wikipedia, a quorum is the minimum number of members of a deliberative assembly necessary to conduct the business of that group. Similarly, in ZooKeeper terms, a quorum is a strict majority of the nodes in a ZooKeeper ensemble.
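For an ensemble of n nodes, a strict majority works out to n/2 + 1 (integer division), which in turn means the ensemble can keep operating with up to (n - 1)/2 nodes down. A small sketch of this arithmetic (class and method names are mine, not part of ZooKeeper):

```java
// Quorum arithmetic for an n-node ensemble; names are illustrative.
public class QuorumMath {
    // A strict majority of an n-node ensemble.
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    // Failures the ensemble survives while still able to form a quorum.
    static int toleratedFailures(int n) {
        return (n - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 5, 7}) {
            System.out.printf("%d nodes: quorum=%d, tolerates %d failure(s)%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
    }
}
```

Note that an even-sized ensemble buys nothing: 4 nodes tolerate 1 failure, the same as 3, while adding write-coordination cost, which is why odd sizes are the norm.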
Reads are always served by the ZooKeeper server the client is connected to, so read performance doesn't change as the number of servers in the ensemble changes. Writes, however, succeed only once they have been written to a quorum of nodes; a successful return code is then returned to the client that initiated the write request. If a quorum of nodes is not available, the ZooKeeper service is nonfunctional. This also means that write performance decreases as the number of nodes in the ensemble increases, since writes have to be written to and coordinated among more servers.
If you would like to run one server, that's fine from ZooKeeper's perspective; you just won't have a highly reliable or available system. A three-node ZooKeeper ensemble will support one failure without loss of service, which is probably fine for most users and arguably the most common deployment topology. However, to be safe, use five nodes in your ensemble. A five-node ensemble allows you to take one server out for maintenance or rolling upgrade and still be able to withstand a second unexpected failure without interruption of service.
In short, three, five, or seven are the most typical ensemble sizes; the more members an ensemble has, the more host failures it can tolerate. Keep in mind that the size of your ZooKeeper ensemble has little to do with the size of your distributed system. The nodes in your distributed system are clients of the ZooKeeper ensemble, and each ZooKeeper server can handle a large number of clients scalably. For example, HBase (a distributed database on Hadoop) relies upon ZooKeeper for leader election and lease management of region servers. You can run a large 50-node HBase cluster against a relatively small (say, five-node) ZooKeeper ensemble.
Note: Because a quorum of servers is necessary for progress, ZooKeeper cannot make progress in the case that enough servers have permanently failed that no quorum can be formed. It is OK if servers are brought down and eventually boot up again, but for progress to be made, a quorum must eventually boot up.
Sincere thanks from CoreJavaGuru to the author for submitting the above article.