Configuring HugeTLB Pages

Describes how to request HugeTLB pages with kernel boot parameters, sysfs settings, and NUMA-aware procedures.

You can configure HugeTLB pages by using the following types of parameters:

  • Kernel boot parameters
  • File-based configuration parameters

The following sections discuss the parameters in greater detail.

Kernel Boot Parameters for HugeTLB Pages

The kernel boot options enable you to specify values such as the size and the number of pages to be reserved in the kernel's pool. Using the kernel boot parameters is the most reliable method of requesting huge pages.

The following table describes the kernel boot parameters available for HugeTLB page setup.

The Kernel Boot Command Line Parameters for Requesting HugeTLB Pages
Parameter Purpose Accepted Value Option on x86_64 Architecture
default_hugepagesz
  • Defines the default size of persistent huge pages configured in the kernel at boot time.
The accepted values for default_hugepagesz are as follows:
  • 2M (default), 1G

hugepagesz and hugepages
  • Size parameter hugepagesz is used with quantity parameter hugepages to reserve a pool of a specified page size and quantity. For example, to request a pool of 1500 pages of size 2 MB, the command line options would be as follows:

    hugepagesz=2M hugepages=1500
  • If multiple huge page sizes are supported, the "hugepagesz=<size> hugepages=<qty>" pair can be specified multiple times, once for each page size. For example, you can use the following command line options to request one pool of four pages of 1 GB size, and a second pool of 1500 pages of 2 MB size:

    hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1500
The accepted values for hugepagesz and hugepages are as follows:
  • hugepagesz: 2M, 1G
  • hugepages: 0 or greater

Note

In a NUMA system, pages reserved with kernel command line options, as shown in the previous table, are divided equally between the NUMA nodes.

If the requirement is to have a different number of pages on each node, you can use the file-based HugeTLB parameters in the sysfs file system. See File-Based Configuration Parameters for HugeTLB Pages and Configuring HugeTLB Pages Using NUMA Node Specific Parameters Early in the Boot Process.
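
As a rough illustration of how the hugepagesz and hugepages pairs combine, the following shell sketch walks a kernel command line string and prints one line for each requested pool. The function name list_hugepage_pools is hypothetical and isn't part of any package. The sketch assumes that each hugepages value applies to the most recent hugepagesz value, and it labels a count that has no preceding hugepagesz as "default".

```shell
# list_hugepage_pools <cmdline>: print "<size> <count>" for each pool
# requested in a kernel command line string (hypothetical helper).
list_hugepage_pools() {
    size=""
    for arg in $1; do
        case "$arg" in
            hugepagesz=*)
                size=${arg#hugepagesz=}
                ;;
            hugepages=*)
                # A count with no preceding hugepagesz is reported as "default".
                echo "${size:-default} ${arg#hugepages=}"
                size=""
                ;;
        esac
    done
}

# Usage: inspect the command line of the running kernel, if readable.
if [ -r /proc/cmdline ]; then
    list_hugepage_pools "$(cat /proc/cmdline)"
fi
```

The default_hugepagesz option is skipped on purpose, because it sets the default size and doesn't reserve a pool.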

File-Based Configuration Parameters for HugeTLB Pages

The file-based configuration parameters provide runtime access to the configuration settings.

Note

In addition to accessing the settings at runtime, you can also initialize the parameters early in the boot process, for example, by creating a start-up bash script or by setting the parameters in a local rc init script.

Multiple instances of each file-based parameter can exist on a system. For example, on a system that supports both 2 MB and 1 GB HugeTLB page sizes, the nr_hugepages parameter, which defines the number of pages in a pool, has a separate instance for each pool, including the following:
  • File /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages for the number of pages in the pool of 2 MB pages.
  • File /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages for the number of pages in the pool of 1 GB pages.
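
The directory names in these paths express the page size in kB, whereas the kernel boot parameters use values such as 2M and 1G. The following shell sketch shows the conversion. The function name pool_path is hypothetical, and the sketch assumes that only the M and G suffixes need to be handled.

```shell
# pool_path <size>: print the sysfs nr_hugepages path for a page size
# written in boot-parameter style, such as 2M or 1G (hypothetical helper).
pool_path() {
    case "$1" in
        *G) kb=$(( ${1%G} * 1024 * 1024 )) ;;
        *M) kb=$(( ${1%M} * 1024 )) ;;
        *)  echo "unsupported size: $1" >&2; return 1 ;;
    esac
    echo "/sys/kernel/mm/hugepages/hugepages-${kb}kB/nr_hugepages"
}

pool_path 2M
pool_path 1G
```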

The following table outlines commonly used HugeTLB configuration parameters and the multiple file instances that you might find for each parameter.

Commonly Used File-Based HugeTLB Parameters
Parameter Purpose File Paths for Different Instances
nr_hugepages
  • Each instance of nr_hugepages defines the current number of huge pages in the pool associated with that instance.
  • Can be changed at runtime.
  • Example command:

    echo 20 | sudo tee /proc/sys/vm/nr_hugepages
  • Default value is 0.
The file path formats for different instances of nr_hugepages are as follows:
  • File location: /proc/sys/vm/nr_hugepages (present on all systems).
  • File location: /sys/kernel/mm/hugepages/hugepages-<SIZE>kB/nr_hugepages (present on systems that support more than one huge page size).
  • File location: /sys/devices/system/node/node{0,1,2...n}/hugepages/hugepages-<SIZE>kB/nr_hugepages (present on NUMA systems only).

    Use the NUMA node-specific path format if you need to request different quantities of pages, or pages of different sizes, on specific NUMA nodes. If you use any other path format (for example, /proc/sys/vm/nr_hugepages) to request HugeTLB pages, the pages are divided equally between the NUMA nodes.

nr_overcommit_hugepages
  • Each instance of nr_overcommit_hugepages defines the maximum number of additional huge pages, beyond the quantity specified by nr_hugepages, that the system can create at runtime through overcommitting memory.
  • As these additional huge pages become unused, they're freed and returned to the kernel's normal page pool.
  • Example command:
    echo 20 | sudo tee /proc/sys/vm/nr_overcommit_hugepages
The file path formats for different instances of nr_overcommit_hugepages are as follows:
  • File location: /proc/sys/vm/nr_overcommit_hugepages (present on all systems).
  • File location: /sys/kernel/mm/hugepages/hugepages-<SIZE>kB/nr_overcommit_hugepages (present on systems that support more than one huge page size).

The nr_overcommit_hugepages parameter isn't defined at the individual node level, so no node specific file exists for this setting.

free_hugepages
  • Read-only parameter.
  • Each instance of free_hugepages returns the number of huge pages in its associated page pool that have yet to be allocated.

The file path formats for different instances of free_hugepages are as follows:

  • File location: /sys/kernel/mm/hugepages/hugepages-<SIZE>kB/free_hugepages (present on systems that support more than one huge page size).
  • File location: /sys/devices/system/node/node{0,1,2...n}/hugepages/hugepages-<SIZE>kB/free_hugepages (present on NUMA systems only).

surplus_hugepages
  • Read-only parameter.
  • Each instance of surplus_hugepages returns the number of huge pages that have been overcommitted from its associated page pool.

The file path formats for different instances of surplus_hugepages are as follows:

  • File location: /sys/kernel/mm/hugepages/hugepages-<SIZE>kB/surplus_hugepages (present on systems that support more than one huge page size).
  • File location: /sys/devices/system/node/node{0,1,2...n}/hugepages/hugepages-<SIZE>kB/surplus_hugepages (present on NUMA systems only).
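
The parameters in the preceding table can be read together to summarize one pool. The following shell function is a minimal sketch, with pool_status as a hypothetical name. It assumes that the number of allocated pages can be derived as nr_hugepages minus free_hugepages.

```shell
# pool_status <pool_dir>: summarize one HugeTLB pool (hypothetical helper).
# <pool_dir> is a directory such as
# /sys/kernel/mm/hugepages/hugepages-2048kB.
pool_status() {
    total=$(cat "$1/nr_hugepages") || return 1
    free=$(cat "$1/free_hugepages") || return 1
    surplus=$(cat "$1/surplus_hugepages") || return 1
    # Assumption: pages that are no longer free have been allocated.
    echo "total=$total free=$free allocated=$((total - free)) surplus=$surplus"
}

# Usage: report on the 2 MB pool if it exists on this system.
pool=/sys/kernel/mm/hugepages/hugepages-2048kB
if [ -d "$pool" ]; then
    pool_status "$pool"
fi
```

Because all three files also exist under the NUMA node path format, the same function can be pointed at a node-specific directory.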

The following sections show file branches under which different instances of the HugeTLB parameters are stored:

/proc/sys/vm

All systems that support static huge pages contain HugeTLB parameter files under /proc/sys/vm.

Note

On many systems, including many Oracle database servers, the parameter files in the procfs file system are the main set used.

The sysctl parameter vm.nr_hugepages, which is commonly initialized in scripts that request huge pages, also writes to the procfs file /proc/sys/vm/nr_hugepages.

The following are example files under branch /proc/sys/vm:

    ├── ...
    ├── ...
    ├── nr_hugepages
    ├── ...
    ├── nr_overcommit_hugepages
    ├── ...
    ├── ...
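
The relationship between a sysctl parameter name and its procfs file can be sketched as follows: the dots in the name become path separators under /proc/sys. The function name sysctl_path is hypothetical and is used here only for illustration.

```shell
# sysctl_path <name>: print the procfs file behind a sysctl parameter
# name by replacing each dot with a slash (hypothetical helper).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_path vm.nr_hugepages

# Usage: read the current value through the procfs file, if present.
file=$(sysctl_path vm.nr_hugepages)
if [ -r "$file" ]; then
    cat "$file"
fi
```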

/sys/kernel/mm/hugepages/

Systems that support multiple size pools contain HugeTLB parameter files in size-specific folders under /sys/kernel/mm/hugepages/.

The following are example folders under branch /sys/kernel/mm/hugepages/:

├── hugepages-2048kB
│   ├── free_hugepages
│   ├── nr_hugepages
│   ├── ...
│   ├── nr_overcommit_hugepages
│   ├── ...
│   └── surplus_hugepages
│
└── hugepages-1048576kB
    ├── free_hugepages
    ├── nr_hugepages
    ├── ...
    ├── nr_overcommit_hugepages
    ├── ...
    └── surplus_hugepages
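
Because each folder name carries the page size in kB, the folders can be walked to estimate how much memory all pools hold in total. The following shell function is a sketch only. The name reserved_kb and the optional root argument, which allows a dry run against a scratch directory, are assumptions made for this example.

```shell
# reserved_kb [root]: print the memory, in kB, held by all HugeTLB
# pools found under root (hypothetical helper).
reserved_kb() {
    root=${1:-/sys/kernel/mm/hugepages}
    sum=0
    for dir in "$root"/hugepages-*kB; do
        [ -r "$dir/nr_hugepages" ] || continue
        # Extract the size from the folder name, for example 2048.
        size_kb=${dir##*hugepages-}
        size_kb=${size_kb%kB}
        pages=$(cat "$dir/nr_hugepages")
        sum=$((sum + size_kb * pages))
    done
    echo "$sum"
}

echo "Memory in HugeTLB pools: $(reserved_kb) kB"
```

For example, four pages of 1 GB and 1500 pages of 2 MB amount to 4 x 1048576 + 1500 x 2048 = 7266304 kB.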

/sys/devices/system/node/

Only NUMA systems contain HugeTLB parameter files under /sys/devices/system/node/.

The following are example folders under branch /sys/devices/system/node:

      ├── ...
      ├── node0
      │   ├── ...
      │   └── hugepages
      │       ├── hugepages-2048kB
      │       │   ├── free_hugepages
      │       │   ├── nr_hugepages
      │       │   └── surplus_hugepages
      │       │
      │       └── hugepages-1048576kB
      │           ├── free_hugepages
      │           ├── nr_hugepages
      │           └── surplus_hugepages
      └── node1
          ├── ...
          └── hugepages
              ├── hugepages-2048kB
              │   ├── free_hugepages
              │   ├── nr_hugepages
              │   └── surplus_hugepages
              │
              └── hugepages-1048576kB
                  ├── free_hugepages
                  ├── nr_hugepages
                  └── surplus_hugepages
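
To see how a pool of one page size is spread across the nodes, the node folders can be read in a loop. The following shell function is a minimal sketch. The name node_pools and the optional root argument, included so that the function can be tried against a scratch directory, are assumptions for this example.

```shell
# node_pools <size_kB> [root]: print "<node> <pages>" for each NUMA
# node that has a pool of the given page size (hypothetical helper).
node_pools() {
    size_kb=$1
    root=${2:-/sys/devices/system/node}
    for dir in "$root"/node[0-9]*; do
        file=$dir/hugepages/hugepages-${size_kb}kB/nr_hugepages
        [ -r "$file" ] || continue
        echo "${dir##*/} $(cat "$file")"
    done
}

# Usage: list the 2 MB pool on every node of this system.
node_pools 2048
```

On a system without the node-specific files, the function prints nothing.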

Configuring HugeTLB Pages by Using Kernel Parameters at Boot Time

Use the grubby command to configure kernel command line options that the system uses at boot time to set up HugeTLB pages.

The following procedure shows how to set default kernel command line options in the GRUB 2 configuration to specify two pools of HugeTLB pages and a default page size on a system that handles multiple huge page sizes. In this procedure, the following are requested:
  • A default page size of 1 GB.
  • One pool with four HugeTLB pages of 1 GB size.
  • One pool of 1500 HugeTLB pages of 2 MB size.

Before beginning the following procedure, ensure that you have the administrative privileges required.

For more information about configuring kernel command line options and GRUB 2, see Managing Kernels and System Boot on Oracle Linux.

  1. Use the grubby command to add the kernel command line arguments that you require.

    For example, specify 1 GB size for kernel boot parameter default_hugepagesz and two pairs of "hugepagesz=<size> hugepages=<qty>" parameters for the two huge page pools.

    sudo grubby --update-kernel=ALL --args="default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1500"

    When you run grubby with the ALL keyword to update the kernel, the changes apply to all kernels and also update the /etc/default/grub configuration file.

  2. Validate that the configuration is updated.
    sudo grubby --info ALL
  3. Reboot the system for the changes to take effect.
  4. Verify that the new configuration is in effect on the system.
    cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
    cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
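
    The verification in the last step can also be scripted. The following shell function is a sketch that compares the value in a pool file with the quantity that was requested. The name check_pool is hypothetical. A pool can hold fewer pages than requested, for example if the system lacks enough contiguous free memory, so the function reports a mismatch instead of assuming success.

```shell
# check_pool <file> <expected>: compare the value in a pool file with
# the requested number of pages (hypothetical helper).
# Returns 0 on a match, 1 on a mismatch, 2 if the file can't be read.
check_pool() {
    actual=$(cat "$1") || return 2
    if [ "$actual" -eq "$2" ]; then
        echo "OK: $1 = $actual"
    else
        echo "MISMATCH: $1 = $actual, expected $2"
        return 1
    fi
}

# Usage: check the 2 MB pool requested in this procedure, if present.
file=/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
if [ -r "$file" ]; then
    check_pool "$file" 1500 || echo "Review the kernel command line and free memory."
fi
```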

Configuring HugeTLB Pages Using NUMA Node Specific Parameters Early in the Boot Process

Provides a systemd-based workflow for reserving different HugeTLB allocations on specific NUMA nodes during startup.

The precise way to request huge pages at boot time depends upon the system's requirements. The following procedure provides some guidance but isn't exclusive of other approaches to configuring boot options.

Huge pages requested by using the kernel boot-time parameters, as shown in the previous example, are divided equally between the NUMA nodes.

However, you might need to request a different number of huge pages for specific nodes by setting the configuration values in a node-specific file path. The file path is defined as follows:

/sys/devices/system/node/node{0,1,2...n}/hugepages/hugepages-<SIZE>kB/

The following procedure describes how to reserve 299 pages of 2 MB size on node 0, and 300 pages of 2 MB size on node 1 on a NUMA system. This approach uses a custom systemd service to run a shell script early in the boot process, to set the sysfs parameters required.

Before beginning the following procedure, ensure that you have the administrative privileges required for all the steps.

For more information about using systemd units, see Managing the System With systemd.

  1. Create a script file called hugetlb-reserve-pages.sh in the /usr/lib/systemd/ directory and add the following content.
    #!/bin/sh

    nodes_path=/sys/devices/system/node
    if [ ! -d "$nodes_path" ]; then
        echo "ERROR: $nodes_path does not exist" >&2
        exit 1
    fi

    #######################################################
    #                                                     #
    #     FUNCTION                                        #
    #           reserve_pages <number_of_pages> <node_id> #
    #                                                     #
    #######################################################

    # Write the requested page count to the node's pool of 2 MB pages.
    reserve_pages()
    {
        echo "$1" > "$nodes_path/$2/hugepages/hugepages-2048kB/nr_hugepages"
    }

    reserve_pages 299 node0
    reserve_pages 300 node1
    
  2. Make the script executable:
    sudo chmod +x /usr/lib/systemd/hugetlb-reserve-pages.sh
  3. Create a service file called hugetlb-gigantic-pages.service in the /usr/lib/systemd/system/ directory and add the following content to it.
    [Unit]
    Description=HugeTLB Gigantic Pages Reservation
    DefaultDependencies=no
    Before=dev-hugepages.mount
    ConditionPathExists=/sys/devices/system/node
    
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/usr/lib/systemd/hugetlb-reserve-pages.sh
    
    [Install]
    WantedBy=sysinit.target
  4. Enable the service file.
    sudo systemctl enable hugetlb-gigantic-pages
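
    The script in this procedure reserves pages of 2 MB size only. If a node also needs pages of another size, the function could take the page size as a third argument. The following is a sketch of that variation, not part of the original procedure. The name reserve_pages_sized and the write check are assumptions added for illustration.

```shell
nodes_path=/sys/devices/system/node

# reserve_pages_sized <number_of_pages> <node_id> <page_size_kB>
# Hypothetical variant of reserve_pages that also selects the pool size.
reserve_pages_sized()
{
    target=$nodes_path/$2/hugepages/hugepages-${3}kB/nr_hugepages
    # Fail early if the node or the page size doesn't exist on this system.
    if [ ! -w "$target" ]; then
        echo "ERROR: cannot write $target" >&2
        return 1
    fi
    echo "$1" > "$target"
}

# Example calls, to be run as root from the service script:
#   reserve_pages_sized 299 node0 2048
#   reserve_pages_sized 2 node1 1048576
```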

Configuring HugeTLB Pages for a Specific NUMA Node at Runtime

In certain cases, you might need to make a request for huge pages at runtime.

The following procedure shows how to request 20 HugeTLB pages of size 2048 kB for node2 at runtime.

Before starting, ensure you have the administrative privileges required for all the steps. The procedure uses the numastat command, which is available in the numactl package. You might need to install this package beforehand.

  1. Run the numastat command to show memory statistics relating to the NUMA nodes:
    numastat -cm | grep -E 'Node|Huge' | grep -v AnonHugePages
                     Node 0 Node 1 Node 2 Node 3  Total
    HugePages_Total       0      0      0      0      0
    HugePages_Free        0      0      0      0      0
    HugePages_Surp        0      0      0      0      0
    
  2. Add the required number of huge pages of a specified size to the selected node, for example 20 pages of 2 MB size on node 2:
    echo 20 | sudo tee /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
  3. Run the numastat command again to ensure the request was successful and that the requested memory (in this example 20 x 2 MB pages = 40 MB) has been added to HugePages_Total for node2:
    numastat -cm | grep -E 'Node|Huge' | grep -v AnonHugePages
                     Node 0 Node 1 Node 2 Node 3  Total
    HugePages_Total       0      0     40      0     40
    HugePages_Free        0      0     40      0     40
    HugePages_Surp        0      0      0      0      0
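
    The numastat output in this procedure is in MB, whereas the sysfs file holds a page count. The conversion used in the final step can be sketched in shell as follows. The function name pages_to_mb is hypothetical.

```shell
# pages_to_mb <pages> <page_size_kB>: convert a page count to MB
# (hypothetical helper).
pages_to_mb() {
    echo $(( $1 * $2 / 1024 ))
}

# 20 pages of 2048 kB each
pages_to_mb 20 2048

# Usage: report the 2 MB pool on node2 in MB, if the file exists.
file=/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
if [ -r "$file" ]; then
    echo "node2: $(pages_to_mb "$(cat "$file")" 2048) MB"
fi
```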