The NodeManager is responsible for launching and managing containers on a node. Containers execute tasks as specified by the AppMaster.
The NodeManager runs services to determine the health of the node it is executing on. The services perform checks on the disk as well as any user specified tests. If any health check fails, the NodeManager marks the node as unhealthy and communicates this to the ResourceManager, which then stops assigning containers to the node. Communication of the node status is done as part of the heartbeat between the NodeManager and the ResourceManager. The intervals at which the disk checker and health monitor(described below) run don’t affect the heartbeat intervals. When the heartbeat takes place, the status of both checks is used to determine the health of the node.
The disk checker checks the state of the disks that the NodeManager is configured to use(local-dirs and log-dirs, configured using yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs respectively). The checks include permissions and free disk space. It also checks that the filesystem isn’t in a read-only state. The checks are run at 2 minute intervals by default but can be configured to run as often as the user desires. If a disk fails the check, the NodeManager stops using that particular disk but still reports the node status as healthy. However if a number of disks fail the check(the number can be configured, as explained below), then the node is reported as unhealthy to the ResourceManager and new containers will not be assigned to the node.
The following configuration parameters can be used to modify the disk checks:
Configuration Name | Allowed Values | Description |
---|---|---|
yarn.nodemanager.disk-health-checker.enable | true, false | Enable or disable the disk health checker service |
yarn.nodemanager.disk-health-checker.interval-ms | Positive integer | The interval, in milliseconds, at which the disk checker should run; the default value is 2 minutes |
yarn.nodemanager.disk-health-checker.min-healthy-disks | Float between 0-1 | The minimum fraction of disks that must pass the check for the NodeManager to mark the node as healthy; the default is 0.25 |
yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage | Float between 0-100 | The maximum percentage of disk space that may be utilized before a disk is marked as unhealthy by the disk checker service. This check is run for every disk used by the NodeManager. The default value is 90 i.e. 90% of the disk can be used. |
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb | Integer | The minimum amount of free space that must be available on the disk for the disk checker service to mark the disk as healthy. This check is run for every disk used by the NodeManager. The default value is 0 i.e. the entire disk can be used. |
Users may specify their own health checker scripts that will be invoked by the health checker service. Users may specify a timeout as well as options to be passed to the script. If the script times out, results in an exception being thrown or outputs a line which begins with the string ERROR, the node is marked as unhealthy. Please note that:
Users can specify up to 4 scripts to run individually with the yarn.nodemanager.health-checker.scripts configuration. Also these options can be configured for all scripts (global configurations):
Configuration Name | Allowed Values | Description |
---|---|---|
yarn.nodemanager.health-checker.script | String | The keywords for the health checker scripts separated by a comma. The default is “script”. |
yarn.nodemanager.health-checker.interval-ms | Positive integer | The interval, in milliseconds, at which health checker service runs; the default value is 10 minutes. |
yarn.nodemanager.health-checker.timeout-ms | Positive integer | The timeout for the health script that’s executed; the default value is 20 minutes. |
The following options can be set for every health checker script. The %s symbol is substituted with each keyword provided in yarn.nodemanager.health-checker.script .
Configuration Name | Allowed Values | Description |
---|---|---|
yarn.nodemanager.health-checker.%s.path | String | Absolute path to the health check script to be run. Mandatory argument for each script. |
yarn.nodemanager.health-checker.%s.opts | String | Arguments to be passed to the script when the script is executed. Mandatory argument for each script. |
yarn.nodemanager.health-checker.%s.interval-ms | Positive integer | The interval, in milliseconds, at which health checker service runs. |
yarn.nodemanager.health-checker.%s.timeout-ms | Positive integer | The timeout for the health script that’s executed. |
The interval and timeout options are not required to be specified. In that case the global configurations will be used.
This document gives an overview of NodeManager (NM) restart, a feature that enables the NodeManager to be restarted without losing the active containers running on the node. At a high level, the NM stores any necessary state to a local state-store as it processes container-management requests. When the NM restarts, it recovers by first loading state for various subsystems and then letting those subsystems perform recovery using the loaded state.
Step 1. To enable NM Restart functionality, set the following property in conf/yarn-site.xml to true.
Property | Value |
---|---|
yarn.nodemanager.recovery.enabled | true , (default value is set to false) |
Step 2. Configure a path to the local file-system directory where the NodeManager can save its run state.
Property | Description |
---|---|
yarn.nodemanager.recovery.dir | The local filesystem directory in which the node manager will store state when recovery is enabled. The default value is set to $hadoop.tmp.dir/yarn-nm-recovery . |
Step 3: Enable NM supervision under recovery to prevent running containers from getting cleaned up when NM exits.
Property | Description |
---|---|
yarn.nodemanager.recovery.supervised | If enabled, NodeManager running will not try to cleanup containers as it exits with the assumption it will be immediately be restarted and recover containers The default value is set to ‘false’. |
Step 4. Configure a valid RPC address for the NodeManager.
Property | Description |
---|---|
yarn.nodemanager.address | Ephemeral ports (port 0, which is default) cannot be used for the NodeManager’s RPC server specified via yarn.nodemanager.address as it can make NM use different ports before and after a restart. This will break any previously running clients that were communicating with the NM before restart. Explicitly setting yarn.nodemanager.address to an address with specific port number (for e.g 0.0.0.0:45454) is a precondition for enabling NM restart. |
Step 5. Auxiliary services.
To launch auxiliary services on a NodeManager, users have to add their jar to NodeManager’s classpath directly, thus put them on the system classloader. But if multiple versions of the plugin are present on the classpath, there is no control over which version actually gets loaded. Or if there are any conflicts between the dependencies introduced by the auxiliary services and the NodeManager itself, they can break the NodeManager, the auxiliary services, or both. To solve this issue, we can instantiate auxiliary services using a classloader that is different from the system classloader.
This section describes the auxiliary service manifest for aux-service classpath isolation. To use a manifest, the property yarn.nodemanager.aux-services.manifest.enabled must be set to true in yarn-site.xml.
An example manifest that configures classpath isolation for a CustomAuxService follows. One or more files may be specified to make up the classpath of a service, with jar or archive formats being supported.
< "services": [ < "name": "mapreduce_shuffle", "version": "2", "configuration": < "properties": < "class.name": "org.apache.hadoop.mapred.ShuffleHandler", "mapreduce.shuffle.transfer.buffer.size": "102400", "mapreduce.shuffle.port": "13562" >> >, < "name": "CustomAuxService", "version": "1", "configuration": < "properties": < "class.name": "org.aux.CustomAuxService" >, "files": [ < "src_file": "$/CustomAuxService.jar", "type": "STATIC" >, < "src_file": "$/CustomAuxService.tgz", "type": "ARCHIVE" > ] > > ] >
This section describes the configuration variables for aux-service classpath isolation. Aux services will only be loaded from the configuration if a manifest file is not specified.
The following settings need to be set in yarn-site.xml.
Configuration Name | Description |
---|---|
yarn.nodemanager.aux-services.%s.classpath | Provide local directory which includes the related jar file as well as all the dependencies’ jar file. We could specify the single jar file or use $/* to load all jars under the dep directory. |
yarn.nodemanager.aux-services.%s.remote-classpath | Provide remote absolute or relative path to jar file(We also support zip, tar.gz, tgz, tar and gz files as well). For the same aux-service class, we can only specify one of the configurations: yarn.nodemanager.aux-services.%s.classpath or yarn.nodemanager.aux-services.%s.remote-classpath. The YarnRuntimeException will be thrown. Please also make sure that the owner of the jar file must be the same as the NodeManager user and the permbits should satisfy (permbits & 0022)==0 (such as 600, it’s not writable by group or other). |
yarn.nodemanager.aux-services.%s.system-classes | Normally, we do not need to set this configuration. The class would be loaded from customized classpath if it does not belongs to system-classes. For example, by default, the package org.apache.hadoop is in the system-classes, if your class CustomAuxService is in the package org.apache.hadoop, it would not be loaded from customized classpath. To solve this, either we could change the package for CustomAuxService, or configure our own system-classes which exclude org.apache.hadoop. |
yarn.nodemanager.aux-services mapreduce_shuffle,CustomAuxService yarn.nodemanager.aux-services.CustomAuxService.classpath $/CustomAuxService.jar yarn.nodemanager.aux-services.CustomAuxService.remote-classpath $/CustomAuxService.jar -->yarn.nodemanager.aux-services.CustomAuxService.class org.aux.CustomAuxService yarn.nodemanager.aux-services.mapreduce_shuffle.class org.apache.hadoop.mapred.ShuffleHandler
This allows a cluster admin to configure a cluster such that a task attempt will be killed if any container log exceeds a configured size. This helps prevent logs from filling disks and also prevent the need to aggregate enormous logs.
The following parameters can be used to configure the container log dir sizes.
Configuration Name | Allowed Values | Description |
---|---|---|
yarn.nodemanager.container-log-monitor.enable | true, false | Flag to enable the container log monitor which enforces container log directory size limits. Default is false. |
yarn.nodemanager.container-log-monitor.interval-ms | Positive integer | How often to check the usage of a container’s log directories in milliseconds. Default is 60000 ms. |
yarn.nodemanager.container-log-monitor.dir-size-limit-bytes | Long | The disk space limit, in bytes, for a single container log directory. Default is 1000000000. |
yarn.nodemanager.container-log-monitor.total-size-limit-bytes | Long | The disk space limit, in bytes, for all of a container’s logs. The default is 10000000000. |
This allows a cluster admin to configure a cluster to allow the heart-beat between the Resource Manager and each NodeManager to be scaled based on the CPU utilization of the node compared to the overall CPU utilization of the cluster.
The following parameters can be used to configure the heart-beat interval and whether and how it scales.
Configuration Name | Allowed Values | Description |
---|---|---|
yarn.resourcemanager.nodemanagers.heartbeat-interval-ms | Long | Specifies the default heart-beat interval in milliseconds for every NodeManager in the cluster. Default is 1000 ms. |
yarn.resourcemanager.nodemanagers.heartbeat-interval-scaling-enable | true, false | Enables heart-beat interval scaling. If true, The NodeManager heart-beat interval will scale based on the difference between the CPU utilization on the node and the cluster-wide average CPU utilization. Default is false. |
yarn.resourcemanager.nodemanagers.heartbeat-interval-min-ms | Positive Long | If heart-beat interval scaling is enabled, this is the minimum heart-beat interval in milliseconds. Default is 1000 ms. |
yarn.resourcemanager.nodemanagers.heartbeat-interval-max-ms | Positive Long | If heart-beat interval scaling is enabled, this is the maximum heart-beat interval in milliseconds. Default is 1000 ms. |
yarn.resourcemanager.nodemanagers.heartbeat-interval-speedup-factor | Positive Float | If heart-beat interval scaling is enabled, this controls the degree of adjustment when speeding up heartbeat intervals. At 1.0, 20% less than the average cluster-wide CPU utilization will result in a 20% decrease in the heartbeat interval. Default is 1.0. |
yarn.resourcemanager.nodemanagers.heartbeat-interval-slowdown-factor | Positive Float | If heart-beat interval scaling is enabled, this controls the degree of adjustment when slowing down heartbeat intervals. At 1.0, 20% greater than the average cluster-wide CPU utilization will result in a 20% increase in the heartbeat interval. Default is 1.0. |
© 2008-2023 Apache Software Foundation - Privacy Policy. Apache Maven, Maven, Apache, the Apache feather logo, and the Apache Maven project logos are trademarks of The Apache Software Foundation.