Managing Resources Using Control Groups

Explains how control groups organize processes, how systemd applies resource policies, and when to manage cgroups manually.

Control groups, referred to as cgroups, are an Oracle Linux kernel feature that organizes systemd services and, if required, individual processes (PIDs) into hierarchical groups for allocating system resources, such as CPU, memory, and I/O.

For example, if you have identified three processes that need to be allocated CPU time in a ratio of 150:100:50, you can create three cgroups, each with a CPU weight corresponding to one of the three values in the ratio, and assign the appropriate process to each cgroup.
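
Each group's share of a weighted resource is its own weight divided by the sum of all weights. A quick shell check of the 150:100:50 ratio (integer arithmetic, so the percentages are truncated):

```shell
# Compute each group's approximate share of CPU time from the
# 150:100:50 weight ratio: share = weight / total of all weights.
total=$((150 + 100 + 50))
echo "$((150 * 100 / total))% $((100 * 100 / total))% $((50 * 100 / total))%"
# prints "50% 33% 16%"
```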

Important

Use systemd to configure cgroups.

Manual creation of cgroup directories in the /sys/fs/cgroup virtual file system (as discussed in this topic) can be helpful for illustrating underlying concepts. However, use this approach only for specific scenarios, such as temporary debugging or testing. For most use cases, use systemd to configure cgroups to ensure correct and persistent resource management.

By default, systemd creates a cgroup for the following:

  • Each systemd service set up on the host.

    For example, a server might have a control group named NetworkManager.service to group processes owned by the NetworkManager service, a control group named firewalld.service to group processes owned by the firewalld service, and so on.

  • Each user (UID) on the host.

The cgroup functionality is mounted as a virtual file system under /sys/fs/cgroup. Each cgroup has a corresponding directory within the /sys/fs/cgroup file system. For example, you can see the cgroups that systemd creates for the services it manages by running the command ls -l /sys/fs/cgroup/system.slice | grep ".service", as shown in the following sample code block:

ls -l /sys/fs/cgroup/system.slice | grep ".service"
            ...root root 0 Mar 22 10:47 atd.service
            ...root root 0 Mar 22 10:47 auditd.service
            ...root root 0 Mar 22 10:47 chronyd.service
            ...root root 0 Mar 22 10:47 crond.service
            ...root root 0 Mar 22 10:47 dbus-broker.service
            ...root root 0 Mar 22 10:47 dtprobed.service
            ...root root 0 Mar 22 10:47 firewalld.service
            ...root root 0 Mar 22 10:47 httpd.service
            ...

You can also create custom cgroups by creating directories under the /sys/fs/cgroup virtual file system and assigning process IDs (PIDs) to different cgroups according to system requirements. However, the recommended practice is to use systemd to configure cgroups instead of creating the cgroups manually under /sys/fs/cgroup.

For the recommended method of managing cgroups through systemd, see Using Systemd to Manage Control Groups.

Two versions of control groups are available in the Linux kernel.

Control groups version 1 (cgroups v1)

Provides a per-resource controller hierarchy. Each resource (CPU, memory, I/O, and so on) has its own control group tree, which can make coordination across resources difficult. Uses a legacy API. Now considered deprecated but available for compatibility on supported systems.

Control groups version 2 (cgroups v2)

Uses a unified, single hierarchy for all controllers, enabling improved cross-resource coordination and simpler management. Uses a modern, simplified API. This is the preferred and actively developed implementation.

The following table summarizes availability per Oracle Linux release:

cgroups Version Support by Oracle Linux Release
Oracle Linux Release cgroups v1 cgroups v2
Oracle Linux 8 Available (default) Available (enable manually)
Oracle Linux 9 Available (compatibility) Available (default)
Oracle Linux 10 Not available (deprecated) Available (default)
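
One quick way to check which version a running host uses is to query the file system type mounted at /sys/fs/cgroup:

```shell
# Report the file system type at the cgroup mount point:
# "cgroup2fs" means cgroups v2; "tmpfs" indicates the cgroups v1 layout.
stat -fc %T /sys/fs/cgroup
```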

For more background on control groups, review the cgroups(7) and sysfs(5) manual pages.

Enable cgroups v2 on Oracle Linux 8

  1. Check the current mounts.
    sudo mount -l | grep cgroup

    If the output already shows cgroup2 on /sys/fs/cgroup, no further action is required.

  2. Add the unified hierarchy boot parameter.
    sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"

    The command appends systemd.unified_cgroup_hierarchy=1 to each kernel entry so that cgroups v2 is mounted at boot.

  3. Reboot to apply the change.
  4. Verify that cgroups v2 is mounted.
    sudo mount -l | grep cgroup

    Look for cgroup2 on /sys/fs/cgroup in the output.

Verify cgroups v2 on Oracle Linux 9 and Oracle Linux 10

Confirm the mount point.
sudo mount -l | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot)

About Kernel Resource Controllers

Control groups manage resource use through kernel resource controllers. A kernel resource controller represents a single resource, such as CPU time, memory, network bandwidth, or disk I/O.

To identify the resource controllers that are mounted on the system, check the contents of the /proc/cgroups file, for example:

less /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled
cpuset  0       103     1
cpu     0       103     1
cpuacct 0       103     1
blkio   0       103     1
memory  0       103     1
devices 0       103     1
freezer 0       103     1
net_cls 0       103     1
perf_event      0       103     1
net_prio        0       103     1
hugetlb 0       103     1
pids    0       103     1
rdma    0       103     1
misc    0       103     1

For a detailed explanation of the kernel resource controllers of cgroups, see the cgroups(7) manual page.
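
The enabled column in this file is 1 for controllers that are active. To list only the names of the enabled controllers, you can filter the file with awk (a small convenience, not specific to Oracle Linux):

```shell
# Print only the names of controllers whose "enabled" column is 1,
# skipping the header line that starts with "#".
awk '!/^#/ && $4 == 1 {print $1}' /proc/cgroups
```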

About the Control Group File System

The cgroup functionality is mounted as a hierarchical file system under /sys/fs/cgroup.

The directory /sys/fs/cgroup is also called the root control group.

The root control group directory contents vary slightly depending on the mounted version, but on systems using cgroups v2 you typically see entries similar to the following:

ls /sys/fs/cgroup
cgroup.controllers      cpuset.mems.effective  memory.stat
cgroup.max.depth        cpu.stat               misc.capacity
cgroup.max.descendants  dev-hugepages.mount    sys-fs-fuse-connections.mount
cgroup.procs            dev-mqueue.mount       sys-kernel-config.mount
cgroup.stat             init.scope             sys-kernel-debug.mount
cgroup.subtree_control  io.pressure            sys-kernel-tracing.mount
cgroup.threads          io.stat                system.slice
cpu.pressure            memory.numa_stat       user.slice
cpuset.cpus.effective   memory.pressure

You can use the mkdir command to create cgroup subdirectories within the root control group. For example, you might create the following cgroup subdirectories:
  • /sys/fs/cgroup/MyGroups/

  • /sys/fs/cgroup/MyGroups/cgroup1

  • /sys/fs/cgroup/MyGroups/cgroup2

Note

Best design practice is to keep child cgroups at least two levels deep inside /sys/fs/cgroup. The examples in the preceding list follow this practice by using the first child group, MyGroups, as a parent that contains the different cgroups needed for the system.

Each cgroup in the hierarchy contains the following files:

cgroup.controllers

This read-only file lists the controllers available in the current cgroup. The contents of this file match the contents of the cgroup.subtree_control file in the parent cgroup.

cgroup.subtree_control

This file contains those controllers in the cgroup.controllers file that are enabled for the current cgroup's immediate child cgroups.

When a controller (for example, pids) is present in the cgroup.subtree_control file, the corresponding controller-interface files (for example, pids.max) are automatically created in the immediate children of the current cgroup.
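
For example, enabling the pids controller in a parent group causes the pids.* interface files to appear in each of its immediate children. A sketch, assuming a parent group MyGroups with an existing child cgroup1 (run with root privileges):

```shell
# Enable the pids controller for the immediate children of MyGroups.
echo "+pids" | sudo tee /sys/fs/cgroup/MyGroups/cgroup.subtree_control
# The child group now exposes pids interface files such as pids.max.
ls /sys/fs/cgroup/MyGroups/cgroup1 | grep '^pids\.'
```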

For a sample procedure that creates child groups where you can implement resource management for an application, see Setting CPU Weight to Regulate Distribution of CPU Time.

To remove a cgroup, ensure that the cgroup doesn't contain other child groups, and then remove its directory. For example, to remove the child group /sys/fs/cgroup/MyGroups/cgroup1, run the following command:

sudo rmdir /sys/fs/cgroup/MyGroups/cgroup1

About Resource Distribution Models

The following distribution models provide ways to control or regulate how resources are distributed to cgroups v2:

Weights

In this model, the weights of all the control groups are totaled. Each group receives a fraction of the resource based on the ratio of the group's weight against the total weight.

Consider 10 control groups, each with a weight of 100 for a combined total of 1000. In this case, each group can use a tenth of a specified resource.

Weights are typically used to distribute stateless resources. To apply this model, use the CPUWeight option.
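
Under systemd, the weight model can be applied to a unit through the CPUWeight property. A sketch, using the httpd.service unit shown in the earlier listing (substitute whichever unit you want to regulate):

```shell
# Give httpd.service a CPU weight of 150 (the default weight is 100).
sudo systemctl set-property httpd.service CPUWeight=150
```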

Limits

In this model, a group can use up to the configured amount of a resource. If a process's usage of a resource, such as memory, exceeds the limit, the kernel might stop the process with an out-of-memory (OOM) error.

You can also overcommit resources so that the sum of the subgroups' limits exceeds the limit of the parent group. Overcommitment assumes that the subgroups aren't likely to all reach their limits at the same time.

To implement this distribution model, the MemoryMax option is often used.
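
With systemd, a memory limit can be set on a unit through the MemoryMax property. A sketch, again using the httpd.service unit as an illustrative target:

```shell
# Cap the memory that httpd.service and its processes can use at 1 GiB.
# If the service exceeds this limit, the kernel's OOM killer can stop it.
sudo systemctl set-property httpd.service MemoryMax=1G
```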

Protections

In this model, a group is assigned a protected boundary. If the group's resource usage remains within the protected amount, the kernel can't deprive the group of the use of the resource in favor of other groups that are competing for the same resource. In this model, an overcommitment of resources is allowed.

To implement this model, the MemoryLow option is often used.
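
The corresponding systemd property for this model is MemoryLow. A sketch, with httpd.service as an illustrative target:

```shell
# Protect 512 MiB of memory for httpd.service: while its usage stays
# below this amount, the kernel avoids reclaiming the service's memory.
sudo systemctl set-property httpd.service MemoryLow=512M
```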

Allocations

In this model, a specific absolute amount of a finite resource, such as a real-time budget, is allocated for the group's use.

Managing cgroups v2 Using sysfs

Shows how to create and tune cgroups v2 hierarchies directly in /sys/fs/cgroup for troubleshooting or temporary tests.

Important

Use systemd to handle all resource management where possible. For more information, see Using Systemd to Manage Control Groups.

The examples here provide context for the actions that systemd performs on a system and show the functionality outside of systemd. The information can be helpful when debugging issues with cgroups.

The example procedure involves allocating CPU time between cgroups that each have different application PIDs assigned to them. The CPU time and application PID values are set in each group's cpu.weight and cgroup.procs files.

The example also includes the steps required to ensure the cpu controller and its associated files, including the cpu.weight file, are available in the cgroups you need to create under /sys/fs/cgroup.

Preparing the Control Group for Distribution of CPU Time

This procedure describes how to manually prepare a control group to manage the distribution of CPU time. Note that the recommended approach to configuring control groups is to use systemd.

  1. Verify that the cpu controller is available at the top of the hierarchy, in the root control group.

    Print the contents of the /sys/fs/cgroup/cgroup.controllers file:

    sudo cat /sys/fs/cgroup/cgroup.controllers
    cpuset cpu io memory hugetlb pids rdma misc

    You can add any controllers listed in the cgroup.controllers file to the cgroup.subtree_control file in the same directory to make them available to the group's immediate child cgroups.

  2. Add the cpu controller to the cgroup.subtree_control file to make it available to immediate child cgroups of the root.

    By default, only the memory and pids controllers are in the file. To add the cpu controller, type:

    echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
    
  3. Optionally, verify that the cpu controller has been added as expected.
    sudo cat /sys/fs/cgroup/cgroup.subtree_control
    cpu memory pids
  4. Create a child group under the root control group to become the new control group for managing CPU resources on applications.
    sudo mkdir /sys/fs/cgroup/MyGroups
  5. Optionally, list the contents of the new subdirectory, or child group, and confirm that the cpu controller is present as expected.
    ls -l /sys/fs/cgroup/MyGroups
    -r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.controllers
    -r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.events
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.freeze
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.max.depth
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.max.descendants
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.procs
    -r--r--r--. 1 root root 0 Jun  1 10:33 cgroup.stat
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cgroup.subtree_control
    ...
    -r--r--r--. 1 root root 0 Jun  1 10:33 cpu.stat
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cpu.weight
    -rw-r--r--. 1 root root 0 Jun  1 10:33 cpu.weight.nice
    ...
    -r--r--r--. 1 root root 0 Jun  1 10:33 memory.events.local
    -rw-r--r--. 1 root root 0 Jun  1 10:33 memory.high
    -rw-r--r--. 1 root root 0 Jun  1 10:33 memory.low
    ...
    -r--r--r--. 1 root root 0 Jun  1 10:33 pids.current
    -r--r--r--. 1 root root 0 Jun  1 10:33 pids.events
    -rw-r--r--. 1 root root 0 Jun  1 10:33 pids.max
  6. Enable the cpu controller in cgroup.subtree_control file in the MyGroups directory to make it available to its immediate child cgroups.
    echo "+cpu" | sudo tee /sys/fs/cgroup/MyGroups/cgroup.subtree_control
  7. Optionally, verify that the cpu controller is enabled for child groups under MyGroups.
    sudo cat /sys/fs/cgroup/MyGroups/cgroup.subtree_control
    cpu
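
The preparation steps above can be condensed into the following sketch (it assumes a cgroups v2 mount at /sys/fs/cgroup and root privileges):

```shell
# Enable the cpu controller for children of the root control group.
echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
# Create the parent group that will hold the per-application cgroups.
sudo mkdir /sys/fs/cgroup/MyGroups
# Enable the cpu controller for the children of MyGroups as well.
echo "+cpu" | sudo tee /sys/fs/cgroup/MyGroups/cgroup.subtree_control
```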

Setting CPU Weight to Regulate Distribution of CPU Time

This procedure describes how to set CPU weight for three different processes by using a control group to manage the distribution of CPU time. Note that the recommended approach to configuring control groups is to use systemd.

This procedure is based on the following assumptions:

  • The application that's consuming CPU resources excessively is sha1sum, as shown in the following sample output of the top command:

    sudo top
    ...
    PID   USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    33301 root      20   0   18720   1756   1468 R  99.0   0.0   0:31.09 sha1sum
    33302 root      20   0   18720   1772   1480 R  99.0   0.0   0:30.54 sha1sum
    33303 root      20   0   18720   1772   1480 R  99.0   0.0   0:30.54 sha1sum
    1 root      20   0  109724  17196  11032 S   0.0   0.1   0:03.28 systemd                     
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd                    
    3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp                      
    4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp                  
              ...
  • The sha1sum processes have PIDs 33301, 33302, and 33303, as listed in the preceding sample output.
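
A load like this can be reproduced for testing by running a CPU-bound command such as sha1sum against /dev/zero (an assumption about how the sample load was generated; any CPU-bound command works):

```shell
# Start one CPU-bound worker in the background and record its PID;
# repeat to create as many workers as the test requires.
sha1sum /dev/zero &
SHA1_PID=$!
echo "started sha1sum with PID $SHA1_PID"
# Stop it later with: kill "$SHA1_PID"
```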

Important

As a prerequisite to the following procedure, you must complete the cgroups v2 preparation described in Preparing the Control Group for Distribution of CPU Time. If you skipped those preparations, you can't complete this procedure.

  1. Create 3 child groups in the MyGroups subdirectory.
    sudo mkdir /sys/fs/cgroup/MyGroups/g1
    sudo mkdir /sys/fs/cgroup/MyGroups/g2
    sudo mkdir /sys/fs/cgroup/MyGroups/g3
  2. Configure the CPU weight for each child group.
    echo "150" | sudo tee /sys/fs/cgroup/MyGroups/g1/cpu.weight
    echo "100" | sudo tee /sys/fs/cgroup/MyGroups/g2/cpu.weight
    echo "50" | sudo tee /sys/fs/cgroup/MyGroups/g3/cpu.weight
  3. Apply the application PIDs to their corresponding child groups.
    echo "33301" | sudo tee /sys/fs/cgroup/MyGroups/g1/cgroup.procs
    echo "33302" | sudo tee /sys/fs/cgroup/MyGroups/g2/cgroup.procs
    echo "33303" | sudo tee /sys/fs/cgroup/MyGroups/g3/cgroup.procs

    These commands make the selected processes members of the MyGroups/g*/ control groups. The CPU time that each sha1sum process receives depends on the CPU time distribution configured for its group.

    The weights of the g1, g2, and g3 groups that have running processes are summed up at the level of MyGroups, which is the parent control group.

    With this configuration, when all processes run at the same time, the kernel allocates to each of the sha1sum processes the proportionate CPU time based on their respective cgroup's cpu.weight file, as follows:

    Child group cpu.weight setting Percent of CPU time allocation
    g1 150 ~50% (150/300)
    g2 100 ~33% (100/300)
    g3 50 ~16% (50/300)

    If one child group has no running processes, then the CPU time allocation for running processes is recalculated based on the total weight of the remaining child groups with running processes. For example, if the g2 child group doesn't have any running processes, then the total weight becomes 200, which is the weight of g1+g3. In this case, the CPU time for g1 becomes 150/200 (~75%) and for g3, 50/200 (~25%).

  4. Check that the applications are running in the specified control groups.
    sudo cat /proc/33301/cgroup /proc/33302/cgroup /proc/33303/cgroup
    0::/MyGroups/g1
    0::/MyGroups/g2
    0::/MyGroups/g3
  5. Check the current CPU consumption after you have set the CPU weights.
    top
    ...
    PID   USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    33301 root      20   0   18720   1748   1460 R  49.5   0.0 415:05.87 sha1sum
    33302 root      20   0   18720   1756   1464 R  32.9   0.0 412:58.33 sha1sum
    33303 root      20   0   18720   1860   1568 R  16.3   0.0 411:03.12 sha1sum
    760 root      20   0  416620  28540  15296 S   0.3   0.7   0:10.23 tuned
    1 root      20   0  186328  14108   9484 S   0.0   0.4   0:02.00 systemd
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.01 kthreadd
    ...
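
When the experiment is finished, you can dismantle the hierarchy by moving the processes back to the root control group and then removing the empty directories (a sketch using the PIDs from this example; requires root privileges):

```shell
# Move each sha1sum process back to the root control group.
echo "33301" | sudo tee /sys/fs/cgroup/cgroup.procs
echo "33302" | sudo tee /sys/fs/cgroup/cgroup.procs
echo "33303" | sudo tee /sys/fs/cgroup/cgroup.procs
# Remove the now-empty child groups, then the parent group.
sudo rmdir /sys/fs/cgroup/MyGroups/g1 /sys/fs/cgroup/MyGroups/g2 /sys/fs/cgroup/MyGroups/g3
sudo rmdir /sys/fs/cgroup/MyGroups
```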