To ensure that each node in your YARN cluster has access to sufficient resources during processing, you need to update the following YARN-specific Hadoop properties.
You can access these properties from your Hadoop cluster manager (Cloudera Manager, Ambari, or MCS). If you need help locating any of them, refer to your distribution's documentation.
Property | Description |
---|---|
yarn.nodemanager.resource.memory-mb | The total amount of memory that YARN can use on a given node. This should be at least 16 GB (16384, since the value is specified in MB), although you might need to set it higher depending on the amount of data you plan to process. |
yarn.scheduler.maximum-allocation-vcores | The maximum number of virtual CPU cores allocated to each YARN container per request. If your Hadoop cluster contains only one YARN worker node, this should be less than or equal to half of that node's cores. If it contains multiple YARN worker nodes, this should be less than or equal to each node's total number of cores. |
yarn.scheduler.maximum-allocation-mb | The maximum amount of RAM allocated to each YARN container per request. This should be at least 16 GB (16384, since the value is specified in MB). Additionally: |
yarn.scheduler.capacity.maximum-applications | The maximum number of concurrently running jobs allowed on each node. This can be between 2 and 8. Note that setting this value higher could cause jobs submitted at the same time to hang indefinitely. |
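
The recommendations in the table can be checked mechanically. The following is a minimal sketch in Python, not part of any Hadoop tooling: it parses Hadoop-style configuration XML and reports any value that falls outside the ranges described above. The sample values, the helper names `load_properties` and `check_yarn_settings`, and the `cores_per_node` and `worker_nodes` parameters are all assumptions made for illustration. In practice you would export the relevant settings from your cluster manager and pass in your own core and node counts.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample values for a cluster whose worker nodes have 8 cores each.
SAMPLE_CONFIG = """
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16384</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>16384</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>4</value>
  </property>
</configuration>
"""

MIN_MEMORY_MB = 16 * 1024  # 16 GB expressed in MB


def load_properties(xml_text):
    """Return a {name: integer value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): int(prop.findtext("value").strip())
        for prop in root.iter("property")
    }


def check_yarn_settings(props, cores_per_node, worker_nodes):
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # Both memory properties should be at least 16 GB.
    for name in (
        "yarn.nodemanager.resource.memory-mb",
        "yarn.scheduler.maximum-allocation-mb",
    ):
        if props.get(name, 0) < MIN_MEMORY_MB:
            problems.append(f"{name} should be at least {MIN_MEMORY_MB}")

    # Single worker node: at most half of its cores. Multiple nodes: at most all cores.
    name = "yarn.scheduler.maximum-allocation-vcores"
    vcore_limit = cores_per_node / 2 if worker_nodes == 1 else cores_per_node
    if props.get(name, 0) > vcore_limit:
        problems.append(f"{name} should be at most {vcore_limit:g}")

    # The table recommends a value between 2 and 8.
    name = "yarn.scheduler.capacity.maximum-applications"
    if not 2 <= props.get(name, 0) <= 8:
        problems.append(f"{name} should be between 2 and 8")

    return problems


if __name__ == "__main__":
    settings = load_properties(SAMPLE_CONFIG)
    for problem in check_yarn_settings(settings, cores_per_node=8, worker_nodes=3):
        print(problem)
```

With the sample values, the check passes for a three-node cluster but flags `yarn.scheduler.maximum-allocation-vcores` if the same settings are applied to a single 8-core worker node, because the limit drops to half of that node's cores.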