Setting up cgroups

Control groups, or cgroups, are a Linux kernel feature that enables you to allocate resources like CPU time and system memory to specific processes or groups of processes. If you need to host the Dgraph on nodes running Spark, you should use cgroups to ensure sufficient resources are available to it.

Warning: Installing the Dgraph on Spark nodes is not recommended and should only be done if absolutely necessary.
To reserve resources for the Dgraph on a Spark node, you enable cgroups in Hadoop and create one for YARN that limits the amount of CPU and memory it can consume. You then create a separate cgroup for the Dgraph.

To set up cgroups:

  1. If your system doesn't currently have the libcgroup package, install it as root.
    This creates /etc/cgconfig.conf, which is used to configure cgroups.
  2. Enable the cgconfig service to run automatically:
    chkconfig cgconfig on
  3. Create a cgroup for YARN. You must do this within Hadoop. For instructions, refer to the documentation for your Hadoop distribution.
    The YARN cgroup should limit the amounts of CPU and memory allocated to all YARN containers. The appropriate limits to set depend on your system and the amount of data you will process. At a minimum, you should reserve the following for the Dgraph:
    • 5GB of RAM
    • 2 CPU cores
    The number of CPU cores YARN is allowed to use must be specified as a percentage of the machine's total cores. For example, on a quad-core machine, YARN should get only two cores, or 50%. On an eight-core machine, YARN could get up to six of them, or 75%. When setting this amount, remember that allocating more cores to the Dgraph will boost its performance.
  4. Create a cgroup for the Dgraph by adding the following to cgconfig.conf:
    # Create a Dgraph cgroup named "dgraph"
    group dgraph {
        perm {
            # Specify which users can edit this group
            admin {
                uid = $BDD_USER;
            }
            # Specify which users can add tasks for this group
            task {
                uid = $BDD_USER;
            }
        }
        # Set the memory and swap limits for this group
        memory {
            # Sets memory limit to 10GB
            memory.limit_in_bytes = 10000000000;

            # Sets memory + swap limit to 12GB
            memory.memsw.limit_in_bytes = 12000000000;
        }
    }
    Where $BDD_USER is the name of the bdd user.
    Important: The values given for memory.limit_in_bytes and memory.memsw.limit_in_bytes above are the absolute minimum requirements. You should use higher values, if possible.
  5. Restart cgconfig to enable your changes.
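
The CPU percentage described in step 3 is simple arithmetic, and a small shell helper can make it less error-prone. The following is only a sketch: the function name yarn_cpu_percent is hypothetical, the default of 2 reserved cores is taken from the minimum listed in step 3, and rounding down to a whole percentage is an assumption made here so that any remainder goes to the Dgraph rather than to YARN.

```shell
# Hypothetical helper (not part of any product): print the whole-number
# percentage of CPU that YARN may use after cores are reserved for the Dgraph.
# Usage: yarn_cpu_percent TOTAL_CORES [DGRAPH_CORES]
yarn_cpu_percent() {
    if [ "$#" -lt 1 ]; then
        echo "usage: yarn_cpu_percent TOTAL_CORES [DGRAPH_CORES]" >&2
        return 2
    fi

    total_cores=$1
    dgraph_cores=${2:-2}   # assumed default: the 2-core minimum from step 3

    # Refuse inputs that would leave YARN with no cores at all.
    if [ "$total_cores" -le "$dgraph_cores" ]; then
        echo "error: the node needs more than $dgraph_cores cores" >&2
        return 1
    fi

    # Shell integer division rounds down, which errs in the Dgraph's favor.
    echo $(( (total_cores - dgraph_cores) * 100 / total_cores ))
}
```

For example, yarn_cpu_percent 4 prints 50 and yarn_cpu_percent 8 2 prints 75, matching the quad-core and eight-core cases in step 3. Where and how the resulting percentage is entered depends on your Hadoop distribution.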