Note:
- This tutorial is available in an Oracle-provided free lab environment.
- It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.
Run Control Group Version 2 on Oracle Linux
Introduction
Control Group (cgroup) is a Linux kernel feature for limiting, prioritizing, and allocating resources such as CPU time, memory, and network bandwidth for running processes.
This tutorial guides you through limiting the CPU time for user processes using cgroup v2.
Objectives
In this tutorial, you will learn how to:
- Enable control group version 2
- Set a soft CPU limit for a user process
- Set a hard CPU limit for a user process
Prerequisites
- Minimum of a single Oracle Linux system
- Each system should have Oracle Linux installed and configured with:
- A non-root user account with sudo access
- Access to the Internet
Deploy Oracle Linux
Note: If running in your own tenancy, read the linux-virt-labs GitHub project README.md and complete the prerequisites before deploying the lab environment.
- Open a terminal on the Luna Desktop.
- Clone the linux-virt-labs GitHub project.
  git clone https://github.com/oracle-devrel/linux-virt-labs.git
- Change into the working directory.
  cd linux-virt-labs/ol
- Install the required collections.
  ansible-galaxy collection install -r requirements.yml
- Deploy the lab environment.
  ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6"
  The free lab environment requires the extra variable localhost_python_interpreter, which sets ansible_python_interpreter for plays running on localhost. This variable is needed because the environment installs the RPM package for the Oracle Cloud Infrastructure SDK for Python, located under the python3.6 modules.
  The default deployment uses an AMD CPU shape and Oracle Linux 8. To use an Intel CPU or Oracle Linux 9, add -e instance_shape="VM.Standard3.Flex" or -e os_version="9" to the deployment command.
  Important: Wait for the playbook to run successfully and reach the pause task. At this stage of the playbook, the instance deployment is complete, and the instances are ready. Take note of the previous play, which prints the public and private IP addresses of the nodes it deploys and any other deployment information needed while running the lab.
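  For example, if you want both a different shape and a newer OS release in a single run (an illustrative combination, not a required step), the deployment command might look like the following:
  ansible-playbook create_instance.yml -e localhost_python_interpreter="/usr/bin/python3.6" -e instance_shape="VM.Standard3.Flex" -e os_version="9"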
Create a Load-generating Script
- Open a terminal and connect via SSH to the ol-node-01 instance.
  ssh oracle@<ip_address_of_instance>
- Create the foo.exe script.
  echo '#!/bin/bash
/usr/bin/sha1sum /dev/zero' > foo.exe
- Copy the foo.exe script to a location in your $PATH and set the proper permissions.
  sudo mv foo.exe /usr/local/bin/foo.exe
  sudo chown root:root /usr/local/bin/foo.exe
  sudo chmod 755 /usr/local/bin/foo.exe
- Fix the SELinux labels after copying and changing permissions on the foo.exe script.
  sudo /sbin/restorecon -v /usr/local/bin/foo.exe
  Note: Oracle Linux runs with SELinux set to enforcing mode by default. You can verify this by running sudo sestatus.
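  If you want to confirm the label that restorecon applied (a check not included in the original steps), listing the file with its SELinux context is a quick way to do it; the type shown should typically be bin_t for files under /usr/local/bin.
  ls -Z /usr/local/bin/foo.exe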
Create a Load-generating Service
- Create the foo.service file.
  echo '[Unit]
Description=the foo service
After=network.target

[Service]
ExecStart=/usr/local/bin/foo.exe

[Install]
WantedBy=multi-user.target' > foo.service
- Copy the foo.service file to the systemd unit directory and set the proper permissions.
  sudo mv foo.service /etc/systemd/system/foo.service
  sudo chown root:root /etc/systemd/system/foo.service
  sudo chmod 644 /etc/systemd/system/foo.service
- Fix the SELinux labels.
  sudo /sbin/restorecon -v /etc/systemd/system/foo.service
- Reload the systemd daemon so it recognizes the new service.
  sudo systemctl daemon-reload
- Start foo.service and check its status.
  sudo systemctl start foo.service
  sudo systemctl status foo.service
Create Users
Additional user accounts let you run the load-generating script under different users, each with a different CPU weight.
- Create users and set passwords.
  sudo useradd -u 8000 ralph
  sudo useradd -u 8001 alice
  echo "ralph:oracle" | sudo chpasswd
  echo "alice:oracle" | sudo chpasswd
- Allow SSH connections.
  Copy the SSH key from the oracle user account for the ralph user.
  sudo mkdir /home/ralph/.ssh
  sudo cp /home/oracle/.ssh/authorized_keys /home/ralph/.ssh/authorized_keys
  sudo chown -R ralph:ralph /home/ralph/.ssh
  sudo chmod 700 /home/ralph/.ssh
  sudo chmod 600 /home/ralph/.ssh/authorized_keys
- Repeat for the alice user.
  sudo mkdir /home/alice/.ssh
  sudo cp /home/oracle/.ssh/authorized_keys /home/alice/.ssh/authorized_keys
  sudo chown -R alice:alice /home/alice/.ssh
  sudo chmod 700 /home/alice/.ssh
  sudo chmod 600 /home/alice/.ssh/authorized_keys
- Open a new terminal and verify both SSH connections work.
  ssh -l ralph -o StrictHostKeyChecking=accept-new <ip_address_of_instance> true
  The -o StrictHostKeyChecking=accept-new option automatically accepts previously unseen keys but refuses connections for changed or invalid host keys. This option is a safer subset of the behavior of StrictHostKeyChecking=no. The true command runs on the remote host and always returns an exit status of 0, which indicates that the SSH connection was successful. If there are no errors, the terminal returns to the command prompt after running the SSH command.
- Repeat for the other user.
  ssh -l alice -o StrictHostKeyChecking=accept-new <ip_address_of_instance> true
- Exit the current terminal and switch to the other existing terminal connected to ol-node-01.
Enable Control Group Version 2
Note: Oracle Linux 9 and higher ships with cgroup v2 enabled by default.
For Oracle Linux 8, you must manually configure the boot kernel parameters to enable cgroup v2 as it mounts cgroup v1 by default.
If you are not using Oracle Linux 8, skip to the next section.
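If you want a quick way to confirm which cgroup version is currently mounted before deciding whether to skip ahead (a check not included in the original steps), the file system type of /sys/fs/cgroup tells you:
stat -fc %T /sys/fs/cgroup
A result of cgroup2fs indicates cgroup v2; tmpfs indicates the legacy cgroup v1 hierarchy.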
- Update GRUB with the cgroup v2 systemd kernel parameter.
  sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
  You can instead specify only your current boot entry by running sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=1".
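  As an additional check (not part of the original steps), grubby can show the kernel arguments recorded for the default boot entry:
  sudo grubby --info=DEFAULT | grep args
  The systemd.unified_cgroup_hierarchy=1 parameter should appear in the args line.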
- Confirm the changes.
  cat /etc/default/grub | grep systemd.unified_cgroup_hierarchy
- Reboot the instance for the changes to take effect.
  sudo systemctl reboot
  Note: Wait a few minutes for the instance to restart.
- Reconnect to the ol-node-01 instance using SSH.
Verify that Cgroup v2 is Enabled
- Check the cgroup controller list.
  cat /sys/fs/cgroup/cgroup.controllers
  The output should return similar results:
  cpuset cpu io memory hugetlb pids rdma
- Check the cgroup2 mounted file system.
  mount | grep cgroup2
  The output should return similar results:
  cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
- Inspect the contents of the cgroup mounted directory.
  ll /sys/fs/cgroup
  Example output:
  total 0
  -r--r--r--.  1 root root 0 Mar 13 21:20 cgroup.controllers
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cgroup.max.depth
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cgroup.max.descendants
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cgroup.procs
  -r--r--r--.  1 root root 0 Mar 13 21:20 cgroup.stat
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cgroup.subtree_control
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cgroup.threads
  -rw-r--r--.  1 root root 0 Mar 13 21:20 cpu.pressure
  -r--r--r--.  1 root root 0 Mar 13 21:20 cpuset.cpus.effective
  -r--r--r--.  1 root root 0 Mar 13 21:20 cpuset.mems.effective
  drwxr-xr-x.  2 root root 0 Mar 13 21:20 init.scope
  -rw-r--r--.  1 root root 0 Mar 13 21:20 io.pressure
  -rw-r--r--.  1 root root 0 Mar 13 21:20 memory.pressure
  drwxr-xr-x. 87 root root 0 Mar 13 21:20 system.slice
  drwxr-xr-x.  4 root root 0 Mar 13 21:24 user.slice
  The output shows the root control group at its default location. The directory contains interface files, all prefixed with cgroup, and directories related to systemd that end in .scope and .slice.
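  One of those interface files, cgroup.subtree_control, lists the controllers that the root cgroup delegates to its children, which is what allows systemd to manage them per slice. Checking it is a quick extra step (not part of the original lab):
  cat /sys/fs/cgroup/cgroup.subtree_control
  The result is a subset of cgroup.controllers, typically something like cpuset cpu io memory pids.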
Work with the Virtual File System
Before we get started, we need to learn a bit about the cgroup virtual file system mounted at /sys/fs/cgroup.
- Show which CPUs participate in the cpuset for everyone.
  cat /sys/fs/cgroup/cpuset.cpus.effective
  The output shows a range starting at 0 that indicates the system’s effective CPUs, which consist of a combination of CPU cores and threads.
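  To relate that range to the hardware (not part of the original steps), compare it with the CPU count the operating system reports:
  nproc
  lscpu | grep '^CPU(s):'
  On the lab's two-CPU shape, both commands should report 2, which corresponds to the 0-1 range shown later in this tutorial.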
- Show which controllers are active.
  cat /sys/fs/cgroup/cgroup.controllers
  Example output:
  cpuset cpu io memory hugetlb pids rdma misc
  It’s good to see the cpuset controller present as we’ll use it later in this tutorial.
- Show processes spawned by oracle.
  First, we need to determine oracle’s user id (UID).
  who
  id
  Example output:
  [oracle@ol-node-01 ~]$ who
  oracle   pts/0        2022-03-13 21:23 (10.39.209.157)
  [oracle@ol-node-01 ~]$ id
  uid=1001(oracle) gid=1001(oracle) groups=1001(oracle),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
  Using the UID, we can find the oracle user’s slice.
  cd /sys/fs/cgroup/user.slice
  ls
  Example output:
  [oracle@ol-node-01 ~]$ cd /sys/fs/cgroup/user.slice
  [oracle@ol-node-01 user.slice]$ ls
  cgroup.controllers      cgroup.subtree_control  memory.events        memory.pressure      pids.max
  cgroup.events           cgroup.threads          memory.events.local  memory.stat          user-0.slice
  cgroup.freeze           cgroup.type             memory.high          memory.swap.current  user-1001.slice
  cgroup.max.depth        cpu.pressure            memory.low           memory.swap.events   user-989.slice
  cgroup.max.descendants  cpu.stat                memory.max           memory.swap.max
  cgroup.procs            io.pressure             memory.min           pids.current
  cgroup.stat             memory.current          memory.oom.group     pids.events
  Systemd assigns every user a slice named user-<UID>.slice. So, what’s under that directory?
  cd user-1001.slice
  ls
  Example output:
  [oracle@ol-node-01 user.slice]$ cd user-1001.slice/
  [oracle@ol-node-01 user-1001.slice]$ ls
  cgroup.controllers  cgroup.max.descendants  cgroup.threads  io.pressure      user-runtime-dir@1001.service
  cgroup.events       cgroup.procs            cgroup.type     memory.pressure
  cgroup.freeze       cgroup.stat             cpu.pressure    session-3.scope
  cgroup.max.depth    cgroup.subtree_control  cpu.stat        user@1001.service
  This is the top-level cgroup for the oracle user. However, there are no processes listed in cgroup.procs. So, where is the list of user processes?
  cat cgroup.procs
  Example output:
  [oracle@ol-node-01 user-1001.slice]$ cat cgroup.procs
  [oracle@ol-node-01 user-1001.slice]$
  When oracle opened the SSH session at the beginning of this tutorial, the user session created a scope sub-unit. Under this scope, we can check cgroup.procs for a list of processes spawned under that session.
  Note: The user might have multiple sessions based on the number of connections to the system; therefore, replace the 3 in the sample below as necessary.
  cd session-3.scope
  ls
  cat cgroup.procs
  Example output:
  [oracle@ol-node-01 user-1001.slice]$ cd session-3.scope/
  [oracle@ol-node-01 session-3.scope]$ ls
  cgroup.controllers  cgroup.max.depth        cgroup.stat             cgroup.type   io.pressure
  cgroup.events       cgroup.max.descendants  cgroup.subtree_control  cpu.pressure  memory.pressure
  cgroup.freeze       cgroup.procs            cgroup.threads          cpu.stat
  [oracle@ol-node-01 session-3.scope]$ cat cgroup.procs
  3189
  3200
  3201
  54217
  Now that we have found the processes the hard way, we can use systemd-cgls to show the same information in a tree-like view.
  Note: When run from within the virtual file system, systemd-cgls limits the cgroup output to the current working directory.
  cd /sys/fs/cgroup/user.slice/user-1001.slice
  systemd-cgls
  Example output:
  [oracle@ol-node-01 user-1001.slice]$ systemd-cgls
  Working directory /sys/fs/cgroup/user.slice/user-1001.slice:
  ├─session-3.scope
  │ ├─ 3189 sshd: oracle [priv]
  │ ├─ 3200 sshd: oracle@pts/0
  │ ├─ 3201 -bash
  │ ├─55486 systemd-cgls
  │ └─55487 less
  └─user@1001.service
    └─init.scope
      ├─3193 /usr/lib/systemd/systemd --user
      └─3195 (sd-pam)
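  A shortcut worth knowing (not part of the original steps): each process records its own cgroup path in /proc, so your current shell can tell you which session scope it belongs to without browsing the hierarchy.
  cat /proc/self/cgroup
  On a cgroup v2 system this prints a single line, such as 0::/user.slice/user-1001.slice/session-3.scope, relative to /sys/fs/cgroup.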
Limit the CPU Cores Used
With cgroup v2, systemd has complete control of the cpuset controller. This level of control enables an administrator to schedule work on only a specific CPU core.
- Check CPUs for user.slice.
  cd /sys/fs/cgroup/user.slice
  ls
  cat ../cpuset.cpus.effective
  Example output:
  [oracle@ol-node-01 cgroup]$ cd /sys/fs/cgroup/user.slice/
  [oracle@ol-node-01 user.slice]$ ls
  cgroup.controllers      cgroup.subtree_control  memory.events        memory.pressure      pids.max
  cgroup.events           cgroup.threads          memory.events.local  memory.stat          user-0.slice
  cgroup.freeze           cgroup.type             memory.high          memory.swap.current  user-1001.slice
  cgroup.max.depth        cpu.pressure            memory.low           memory.swap.events   user-989.slice
  cgroup.max.descendants  cpu.stat                memory.max           memory.swap.max
  cgroup.procs            io.pressure             memory.min           pids.current
  cgroup.stat             memory.current          memory.oom.group     pids.events
  [oracle@ol-node-01 user.slice]$ cat ../cpuset.cpus.effective
  0-1
  The cpuset.cpus.effective file shows the actual cores used by the user.slice. If a parameter does not exist in the specific cgroup directory, or we don’t set it, the value gets inherited from the parent, which in this case is the top-level cgroup root directory.
- Restrict the system and user 0, 1001, and 989 slices to CPU core 0.
  cat /sys/fs/cgroup/system.slice/cpuset.cpus.effective
  sudo systemctl set-property system.slice AllowedCPUs=0
  cat /sys/fs/cgroup/system.slice/cpuset.cpus.effective
  Example output:
  [oracle@ol-node-01 user.slice]$ cat /sys/fs/cgroup/system.slice/cpuset.cpus.effective
  cat: /sys/fs/cgroup/system.slice/cpuset.cpus.effective: No such file or directory
  [oracle@ol-node-01 user.slice]$ sudo systemctl set-property system.slice AllowedCPUs=0
  [oracle@ol-node-01 user.slice]$ cat /sys/fs/cgroup/system.slice/cpuset.cpus.effective
  0
  Note: The No such file or directory error indicates that, by default, the system slice inherits its cpuset.cpus.effective value from the parent.
  sudo systemctl set-property user-0.slice AllowedCPUs=0
  sudo systemctl set-property user-1001.slice AllowedCPUs=0
  sudo systemctl set-property user-989.slice AllowedCPUs=0
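  These systemctl set-property calls persist across reboots because systemd writes them out as drop-in files. If you are curious where they land (not part of the original steps), you can look under /etc/systemd/system.control; the exact file names can vary by systemd version, but they typically follow a 50-<Property>.conf pattern per unit.
  ls /etc/systemd/system.control/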
- Restrict the ralph user to CPU core 1.
  sudo systemctl set-property user-8000.slice AllowedCPUs=1
  cat /sys/fs/cgroup/user.slice/user-8000.slice/cpuset.cpus.effective
  Example output:
  [oracle@ol-node-01 ~]$ sudo systemctl set-property user-8000.slice AllowedCPUs=1
  [oracle@ol-node-01 ~]$ cat /sys/fs/cgroup/user.slice/user-8000.slice/cpuset.cpus.effective
  1
- Open a new terminal and connect via SSH as ralph to the ol-node-01 system.
  ssh ralph@<ip_address_of_instance>
- Test using the foo.exe script.
  foo.exe &
  Verify the results.
  top
  Once top is running, hit the 1 key to show the CPUs individually.
  Example output:
  top - 18:23:55 up 21:03,  2 users,  load average: 1.03, 1.07, 1.02
  Tasks: 155 total,   2 running, 153 sleeping,   0 stopped,   0 zombie
  %Cpu0  :  6.6 us,  7.0 sy,  0.0 ni, 84.8 id,  0.0 wa,  0.3 hi,  0.3 si,  1.0 st
  %Cpu1  : 93.0 us,  6.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.7 hi,  0.3 si,  0.0 st
  MiB Mem :  14707.8 total,  13649.1 free,    412.1 used,    646.6 buff/cache
  MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  13993.0 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   226888 ralph     20   0  228492   1808   1520 R  99.7   0.0 199:34.27 sha1sum
   269233 root      20   0  223724   6388   1952 S   1.3   0.0   0:00.04 pidstat
     1407 root      20   0  439016  41116  39196 S   0.3   0.3   0:17.81 sssd_nss
     1935 root      20   0  236032   3656   3156 S   0.3   0.0   0:34.34 OSWatcher
     2544 root      20   0  401900  40292   9736 S   0.3   0.3   0:10.62 ruby
        1 root      20   0  388548  14716   9508 S   0.0   0.1   0:21.21 systemd
  ...
  Type q to quit top.
- An alternate way to check which processor is running a process.
  ps -eo pid,psr,user,cmd | grep ralph
  Example output:
  [ralph@ol-node-01 ~]$ ps -eo pid,psr,user,cmd | grep ralph
   226715   1 root     sshd: ralph [priv]
   226719   1 ralph    /usr/lib/systemd/systemd --user
   226722   1 ralph    (sd-pam)
   226727   1 ralph    sshd: ralph@pts/2
   226728   1 ralph    -bash
   226887   1 ralph    /bin/bash /usr/local/bin/foo.exe
   226888   1 ralph    /usr/bin/sha1sum /dev/zero
   269732   1 ralph    ps -eo pid,psr,user,cmd
   269733   1 ralph    grep --color=auto ralph
  The psr column shows the CPU number on which each process (cmd) is running.
- Exit and close the current terminal and switch to the other existing terminal connected to ol-node-01.
- Kill the foo.exe job.
  sudo pkill sha1sum
Adjust the CPU Weight for Users
Time to have alice join in the fun. She has some critical work to complete, so we’ll give her twice the normal priority on the CPU.
- Assign alice to the same CPU as ralph.
  sudo systemctl set-property user-8001.slice AllowedCPUs=1
  cat /sys/fs/cgroup/user.slice/user-8001.slice/cpuset.cpus.effective
- Set CPUWeight.
  sudo systemctl set-property user-8001.slice CPUWeight=200
  cat /sys/fs/cgroup/user.slice/user-8001.slice/cpu.weight
  The default weight is 100, so 200 is twice that number.
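  To make the effect concrete: cpu.weight divides CPU time proportionally among the cgroups competing on the same CPU. With alice at a weight of 200 and ralph at the default 100, alice should receive roughly 200 / (200 + 100), or about 67%, of the shared core, and ralph about 33%, which is what the top output in the following steps reflects.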
- Open a new terminal and connect via SSH as ralph to the ol-node-01 system.
  ssh ralph@<ip_address_of_instance>
- Run foo.exe as ralph.
  foo.exe &
- Open another new terminal and connect via SSH as alice to the ol-node-01 system.
  ssh alice@<ip_address_of_instance>
- Run foo.exe as alice.
  foo.exe &
- Verify via top that alice is getting the higher priority.
  top
  Once top is running, hit the 1 key to show the CPUs individually.
  Example output:
  top - 20:10:55 up 25 min,  3 users,  load average: 1.29, 0.46, 0.20
  Tasks: 164 total,   3 running, 161 sleeping,   0 stopped,   0 zombie
  %Cpu0  :  0.0 us,  0.0 sy,  0.0 ni, 96.5 id,  0.0 wa,  0.0 hi,  3.2 si,  0.3 st
  %Cpu1  : 92.4 us,  7.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
  MiB Mem :  15715.8 total,  14744.6 free,    438.5 used,    532.7 buff/cache
  MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  15001.1 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     7934 alice     20   0   15800   1768   1476 R  67.0   0.0   0:36.15 sha1sum
     7814 ralph     20   0   15800   1880   1592 R  33.3   0.0   0:34.60 sha1sum
        1 root      20   0  388476  14440   9296 S   0.0   0.1   0:02.22 systemd
        2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd
  ...
- Switch to the terminal logged in as the oracle user.
- Load the system.slice using the foo.service.
  sudo systemctl start foo.service
  Now look at the top output, which is still running in the alice terminal window. See that the foo.service consumes CPU 0 while the users split CPU 1 based on their weights.
  Example output:
  top - 19:18:15 up 21:57,  3 users,  load average: 2.15, 2.32, 2.25
  Tasks: 159 total,   4 running, 155 sleeping,   0 stopped,   0 zombie
  %Cpu0  : 89.1 us,  7.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.7 hi,  0.3 si,  2.6 st
  %Cpu1  : 93.7 us,  5.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.7 hi,  0.3 si,  0.0 st
  MiB Mem :  14707.8 total,  13640.1 free,    420.5 used,    647.2 buff/cache
  MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  13984.3 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   280921 root      20   0  228492   1776   1488 R  93.4   0.0   0:07.74 sha1sum
   279185 alice     20   0  228492   1816   1524 R  65.6   0.0   7:35.18 sha1sum
   279291 ralph     20   0  228492   1840   1552 R  32.8   0.0   7:00.30 sha1sum
     2026 oracle-+  20   0  935920  29280  15008 S   0.3   0.2   1:03.31 gomon
        1 root      20   0  388548  14716   9508 S   0.0   0.1   0:22.30 systemd
        2 root      20   0       0      0      0 S   0.0   0.0   0:00.10 kthreadd
  ...
Assign a CPU Quota
Lastly, we will cap the CPU time for ralph.
- Return to the terminal logged in as the oracle user.
- Set the quota to 5%.
  sudo systemctl set-property user-8000.slice CPUQuota=5%
  The change takes effect immediately, as seen in the top output, which still runs in the alice user terminal.
  Example output:
  top - 19:24:53 up 22:04,  3 users,  load average: 2.21, 2.61, 2.45
  Tasks: 162 total,   4 running, 158 sleeping,   0 stopped,   0 zombie
  %Cpu0  : 93.0 us,  4.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.7 hi,  0.0 si,  1.7 st
  %Cpu1  : 91.7 us,  5.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  1.0 hi,  1.0 si,  0.7 st
  MiB Mem :  14707.8 total,  13639.4 free,    420.0 used,    648.4 buff/cache
  MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  13984.7 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   280921 root      20   0  228492   1776   1488 R  97.4   0.0   6:26.75 sha1sum
   279185 alice     20   0  228492   1816   1524 R  92.1   0.0  12:21.12 sha1sum
   279291 ralph     20   0  228492   1840   1552 R   5.3   0.0   8:44.84 sha1sum
        1 root      20   0  388548  14716   9508 S   0.0   0.1   0:22.48 systemd
        2 root      20   0       0      0      0 S   0.0   0.0   0:00.10 kthreadd
  ...
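  Behind the scenes (not shown in the original steps), CPUQuota=5% is written to the slice's cpu.max file as an allowed run time per scheduling period. With the default 100000 microsecond period, you would expect to see something like 5000 100000 when you inspect it:
  cat /sys/fs/cgroup/user.slice/user-8000.slice/cpu.max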
- Revert the cap on the ralph user using the oracle terminal window.
  echo "max 100000" | sudo tee -a user-8000.slice/cpu.max
  The quota gets written to the cpu.max file, and the default values are max 100000. This command assumes you are still in the /sys/fs/cgroup/user.slice directory, as shown in the prompt below.
  Example output:
  [oracle@ol-node-01 user.slice]$ echo "max 100000" | sudo tee -a user-8000.slice/cpu.max
  max 100000
  You can now enable cgroup v2, limit users to a specific CPU when the system is under load, and lock them to using only a percentage of that CPU.
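  Alternatively (not part of the original steps), you can clear the quota through systemd itself, which should also remove the persisted setting; assigning an empty value to CPUQuota is documented in systemd.resource-control(5) as unsetting the quota:
  sudo systemctl set-property user-8000.slice CPUQuota=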
Summary
Thank you for completing this tutorial. Hopefully, these steps have given you a better understanding of enabling, configuring, and using control group version 2 on Oracle Linux.
For More Information
More Learning Resources
Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.
For product documentation, visit Oracle Help Center.
Run Control Group Version 2 on Oracle Linux
F54922-02
August 2024