The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM).
The YARN framework was intentionally designed to be as simple as possible; as such, it doesn’t know or care about the type of applications that are running. Nor does it care about keeping any historical information about what has executed on the cluster. These design decisions are the primary reasons that YARN can scale beyond the levels of MapReduce.
YARN comprises a framework that’s responsible for resource scheduling and monitoring, and applications that execute application-specific logic in a cluster. The ResourceManager and the NodeManagers running on individual nodes come together to form the core of YARN and constitute the YARN platform. ApplicationMasters and their corresponding containers come together to form a YARN application. Let’s look at the YARN platform and YARN applications separately.
The primary function of YARN Framework/Platform is to schedule resources in a cluster. Applications in a cluster talk to the YARN framework, asking for application-specific containers to be allocated, and the YARN framework evaluates these requests and attempts to fulfill them. An important part of the YARN scheduling also includes monitoring currently executing containers.
There are two reasons that container monitoring is important: Once a container has completed, the scheduler can then use freed-up capacity to schedule more work. Additionally, each container has a contract that specifies the system resources that it’s allowed to use, and in cases where containers overstep these bounds, the scheduler can terminate the container to avoid rogue containers impacting other applications.
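The two monitoring duties described above can be made concrete with a minimal Java sketch. The class and field names here are illustrative, not the real YARN API: a monitoring pass reclaims the capacity of completed containers and kills containers that exceed their resource contract.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of container monitoring; names are illustrative, not YARN classes.
class Container {
    final String id;
    final int memoryLimitMb;   // the container's resource contract
    int memoryUsedMb;
    boolean completed;
    boolean killed;

    Container(String id, int memoryLimitMb) {
        this.id = id;
        this.memoryLimitMb = memoryLimitMb;
    }
}

class SchedulerMonitor {
    int freeMemoryMb;
    final List<Container> running = new ArrayList<>();

    SchedulerMonitor(int clusterMemoryMb) { this.freeMemoryMb = clusterMemoryMb; }

    Container launch(String id, int memoryMb) {
        Container c = new Container(id, memoryMb);
        freeMemoryMb -= memoryMb;      // capacity is reserved while the container runs
        running.add(c);
        return c;
    }

    // One monitoring pass: reclaim finished containers, kill rogue ones.
    void monitor() {
        for (Container c : new ArrayList<>(running)) {
            if (c.completed) {
                freeMemoryMb += c.memoryLimitMb;   // freed-up capacity can be rescheduled
                running.remove(c);
            } else if (c.memoryUsedMb > c.memoryLimitMb) {
                c.killed = true;                   // overstepped its contract
                freeMemoryMb += c.memoryLimitMb;
                running.remove(c);
            }
        }
    }
}
```

In the real system the NodeManager reports container status to the ResourceManager; this sketch collapses both roles into one class purely to show the two outcomes of a monitoring pass.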
Two primary components make up the YARN framework/platform:
The ResourceManager is the YARN master process. A Hadoop cluster has a single ResourceManager (RM) for the entire cluster. Its sole function is to arbitrate all the available resources on the cluster. The ResourceManager tracks resource usage, monitors the health of the various nodes in the cluster, enforces resource-allocation invariants, and arbitrates conflicts among users.
It works together with the per-node NodeManagers and the per-application ApplicationMasters.
The ResourceManager has many core components. Let’s look at some of the main ones:
The Scheduler is responsible for allocating resources to the various running applications, subject to familiar constraints of capacities, queues, etc. The Scheduler is a pure scheduler in the sense that it performs no monitoring or tracking of application status. It also offers no guarantees about restarting failed tasks, whether they fail due to application errors or hardware failures.
The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so based on the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network. The Scheduler has a pluggable policy which is responsible for partitioning the cluster resources among the various queues, applications, etc.
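The abstract resource notion can be pictured as a simple bundle of capacities. The sketch below uses an illustrative class (not the real org.apache.hadoop.yarn.api.records.Resource) to show how a scheduler might check whether a request fits a node's remaining capacity:

```java
// Illustrative model of the abstract Resource notion; not the YARN API.
class ResourceSpec {
    final int memoryMb;
    final int vcores;

    ResourceSpec(int memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    // A request fits only if every dimension fits.
    boolean fitsIn(ResourceSpec available) {
        return memoryMb <= available.memoryMb && vcores <= available.vcores;
    }

    // Remaining capacity after granting this request.
    ResourceSpec minus(ResourceSpec granted) {
        return new ResourceSpec(memoryMb - granted.memoryMb, vcores - granted.vcores);
    }
}
```

A pluggable scheduler policy would decide *which* queue or application gets the next grant; the fit check itself stays this simple.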
YARN ships with pluggable scheduler policies such as the CapacityScheduler and the FairScheduler; see the Hadoop documentation for details.
The ApplicationsManager is responsible for maintaining a collection of submitted applications. It is also responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure. The per-application ApplicationMaster has the responsibility of negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring progress.
After application submission, the ApplicationsManager rejects any application whose ApplicationMaster requests resources that no node in the cluster can satisfy, since the ApplicationMaster itself could never be launched. It also ensures that no other application was already submitted with the same application ID.
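These two admission checks can be sketched as follows (toy code with hypothetical names; the real ApplicationsManager logic lives inside the ResourceManager):

```java
import java.util.HashSet;
import java.util.Set;

// Toy admission logic mirroring the two checks: duplicate IDs and feasibility.
class ApplicationsManagerModel {
    final int maxNodeMemoryMb;               // largest single node in the cluster
    final Set<String> submittedIds = new HashSet<>();

    ApplicationsManagerModel(int maxNodeMemoryMb) {
        this.maxNodeMemoryMb = maxNodeMemoryMb;
    }

    String submit(String appId, int amMemoryMb) {
        if (!submittedIds.add(appId)) {
            return "REJECTED: duplicate application ID";
        }
        if (amMemoryMb > maxNodeMemoryMb) {
            return "REJECTED: no node can host the ApplicationMaster";
        }
        return "ACCEPTED";
    }
}
```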
The ApplicationsManager keeps a cache of completed applications long after they finish, to support users’ requests for application data (via the web UI or the command line).
The ResourceManager has a web application that exposes information about the state of the cluster: metrics; lists of active, healthy, and unhealthy nodes; and lists of applications with their state and status.
In YARN, while every other container’s launch is initiated by an ApplicationMaster, the ApplicationMaster’s own container is allocated and prepared for launch on a NodeManager by the ResourceManager. The ApplicationMaster Launcher is responsible for this job.
It also talks to NodeManagers for cleaning up the ApplicationMaster when an application finishes normally or is forcefully terminated.
The NodeManager is the slave process of YARN. It runs on every node in a cluster. Its job is to create, monitor, and kill containers. It services requests from the ResourceManager and ApplicationMaster to create containers, and it reports on the status of the containers to the ResourceManager. The ResourceManager uses the data contained in these status messages to make scheduling decisions for new container requests.
On start-up, the NodeManager registers with the ResourceManager; it then sends heartbeats with its status and waits for instructions. Its primary goal is to manage application containers assigned to it by the ResourceManager.
Before launching a container, the NodeManager copies all the necessary resources (data files, executables, tarballs, jar files, shell scripts, and so on) to the local file system.
The NodeManager also kills containers as directed by the ResourceManager. Whenever a container exits, the NodeManager cleans up the container’s working directory in local storage.
The YARN framework/platform exists to manage applications, so let’s take a look at what components a YARN application is composed of. A YARN application implements a specific function that runs on Hadoop. A YARN application involves three components: a client, an ApplicationMaster, and one or more containers.
Launching a new YARN application starts with a YARN client communicating with the ResourceManager to create a new YARN ApplicationMaster instance. Part of this process involves the YARN client informing the ResourceManager of the ApplicationMaster’s physical resource requirements.
The ApplicationMaster is the master process of a YARN application. It doesn’t perform any application-specific work itself, as these functions are delegated to the containers. Instead, it’s responsible for managing the application-specific containers: notifying the ResourceManager of its intent to create containers and then liaising with the NodeManagers to actually perform the container creation.
The ApplicationMaster is the process that coordinates an application’s execution in the cluster. Each application has its own unique ApplicationMaster, which negotiates resources (containers) from the ResourceManager and works with the NodeManager(s) to execute and monitor the tasks.
Once the ApplicationMaster is started (as a container), it will periodically send heartbeats to the ResourceManager to affirm its health and to update the record of its resource demands.
As part of the process, the ApplicationMaster must specify the resources that each container requires in terms of which host should launch the container and what the container’s memory and CPU requirements are.
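Such a request can be pictured as a small record bundling placement and resource needs. The class below is an illustrative stand-in, not the real AMRMClient.ContainerRequest:

```java
// Toy container request: which host, how much memory and CPU (names are illustrative).
class ContainerRequestModel {
    final String preferredHost;   // node the AM would like the container placed on
    final int memoryMb;
    final int vcores;

    ContainerRequestModel(String preferredHost, int memoryMb, int vcores) {
        this.preferredHost = preferredHost;
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    @Override
    public String toString() {
        return preferredHost + ":" + memoryMb + "MB/" + vcores + "vcores";
    }
}
```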
The ApplicationMaster is also responsible for the specific fault-tolerance behavior of the application. It receives status messages from the ResourceManager when its containers fail, and it can decide to take action based on these events (by asking the ResourceManager to create a new container), or to ignore these events.
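One common fault-tolerance policy an ApplicationMaster might implement is to re-request a failed container up to a retry limit. The sketch below is illustrative; YARN does not prescribe this behavior, and each application is free to react to failures differently:

```java
import java.util.HashMap;
import java.util.Map;

// Toy retry policy: re-request a failed container until maxAttempts is exhausted.
class RetryPolicy {
    final int maxAttempts;
    final Map<String, Integer> attempts = new HashMap<>();

    RetryPolicy(int maxAttempts) { this.maxAttempts = maxAttempts; }

    // Called when the ResourceManager reports a container failure.
    // Returns true if the AM should ask the RM for a replacement container.
    boolean onContainerFailed(String taskId) {
        int n = attempts.merge(taskId, 1, Integer::sum);
        return n < maxAttempts;
    }
}
```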
A container is an application-specific process that’s created by a NodeManager on behalf of an ApplicationMaster. At the fundamental level, a container is a collection of physical resources such as RAM, CPU cores, and disks on a single node. There can be multiple containers on a single node. The ApplicationMaster itself also runs in a container, allocated by the ResourceManager. A container created at an ApplicationMaster’s request can run an arbitrary process: for example, a container process could simply be a Linux command such as awk, a Python application, or any process that can be launched by the operating system.
A container thus represents a resource (memory, CPU) on a single node in a given cluster. A container is supervised by the NodeManager and scheduled by the ResourceManager. Each application starts out as an ApplicationMaster, which is itself a container (often referred to as container 0). Once started, the ApplicationMaster must negotiate with the ResourceManager for more containers. Container requests (and releases) can take place in a dynamic fashion at run time.