Memory and Thread Placement Optimization Developer's Guide

Locality Groups and Thread and Memory Placement

This section discusses the APIs used to discover and affect thread and memory placement with respect to lgroups.

Using lgrp_home()

The lgrp_home() function returns the home lgroup for the specified process or thread.

#include <sys/lgrp_user.h>
lgrp_id_t lgrp_home(idtype_t idtype, id_t id);

The lgrp_home() function returns EINVAL when the ID type is not valid. The lgrp_home() function returns EPERM when the effective user of the calling process is not the superuser and the real or effective user ID of the calling process does not match the real or effective user ID of one of the threads. The lgrp_home() function returns ESRCH when the specified process or thread is not found.

Using madvise()

The madvise() function advises the kernel that a region of user virtual memory in the range starting at the address specified in addr and with length equal to the value of the len parameter is expected to follow a particular pattern of use. The kernel uses this information to optimize the procedure for manipulating and maintaining the resources associated with the specified range. Use of the madvise() function can increase system performance when used by programs that have specific knowledge of their access patterns over memory.

#include <sys/types.h>
#include <sys/mman.h>
int madvise(caddr_t addr, size_t len, int advice);

The madvise() function provides the following flags to affect how a thread's memory is allocated among lgroups:

MADV_ACCESS_DEFAULT

This flag resets the kernel's expected access pattern for the specified range to the default.

MADV_ACCESS_LWP

This flag advises the kernel that the next LWP to touch the specified address range is the LWP that will access that range the most. The kernel allocates the memory and other resources for this range and the LWP accordingly.

MADV_ACCESS_MANY

This flag advises the kernel that many processes or LWPs will access the specified address range randomly across the system. The kernel allocates the memory and other resources for this range accordingly.

The madvise() function can return the following values:

EAGAIN

Some or all of the mappings in the specified address range, from addr to addr+len, are locked for I/O.

EINVAL

The value of the addr parameter is not a multiple of the page size as returned by sysconf(3C), the length of the specified address range is less than or equal to zero, or the advice is invalid.

EIO

An I/O error occurs while reading from or writing to the file system.

ENOMEM

Addresses in the specified address range are outside the valid range for the address space of a process or the addresses in the specified address range specify one or more pages that are not mapped.

ESTALE

The NFS file handle is stale.

Using meminfo()

The meminfo() function gives the calling process information about the virtual memory and physical memory that the system has allocated to that process.

#include <sys/types.h>
#include <sys/mman.h>
int meminfo(const uint64_t inaddr[], int addr_count,
    const uint_t info_req[], int info_count, uint64_t outdata[],
    uint_t validity[]);

The meminfo() function can return the following types of information:

MEMINFO_VPHYSICAL

The physical memory address corresponding to the given virtual address

MEMINFO_VLGRP

The lgroup to which the physical page corresponding to the given virtual address belongs

MEMINFO_VPAGESIZE

The size of the physical page corresponding to the given virtual address

MEMINFO_VREPLCNT

The number of replicated physical pages that correspond to the given virtual address

MEMINFO_VREPL|n

The nth physical replica of the given virtual address

MEMINFO_VREPL_LGRP|n

The lgroup to which the nth physical replica of the given virtual address belongs

MEMINFO_PLGRP

The lgroup to which the given physical address belongs

The meminfo() function takes the following parameters:

inaddr

An array of input addresses.

addr_count

The number of addresses that are passed to meminfo().

info_req

An array that lists the types of information that are being requested.

info_count

The number of pieces of information that are requested for each address in the inaddr array.

outdata

An array where the meminfo() function places the results. The array's size is equal to the product of the values of the info_req and addr_count parameters.

validity

An array of size equal to the value of the addr_count parameter. The validity array contains bitwise result codes. The 0th bit of the result code evaluates the validity of the corresponding input address. Each successive bit in the result code evaluates the validity of the response to the members of the info_req array in turn.

The meminfo() function returns EFAULT when the area of memory to which the outdata or validity arrays point cannot be written to. The meminfo() function returns EFAULT when the area of memory to which the info_req or inaddr arrays point cannot be read from. The meminfo() function returns EINVAL when the value of info_count exceeds 31 or is less than 1. The meminfo() function returns EINVAL when the value of addr_count is less than zero.


Example 3–2 Use of meminfo() to Print Out Physical Pages and Page Sizes Corresponding to a Set of Virtual Addresses

void
print_info(void **addrvec, int how_many)
{
        static const int info[] = {
                MEMINFO_VPHYSICAL,
                MEMINFO_VPAGESIZE};
        uint64_t * inaddr = alloca(sizeof(uint64_t) * how_many);
        uint64_t * outdata = alloca(sizeof(uint64_t) * how_many * 2;
        uint_t * validity = alloca(sizeof(uint_t) * how_many);

        int i;

        for (i = 0; i < how_many; i++)
                inaddr[i] = (uint64_t *)addr[i];

        if (meminfo(inaddr, how_many,  info,
                    sizeof (info)/ sizeof(info[0]),
                    outdata, validity) < 0)
                ...

        for (i = 0; i < how_many; i++) {
                if (validity[i] & 1 == 0)
                        printf("address 0x%llx not part of address
                                        space\n",
                                inaddr[i]);

                else if (validity[i] & 2 == 0)
                        printf("address 0x%llx has no physical page
                                        associated with it\n",
                                inaddr[i]);

                else {
                        char buff[80];
                        if (validity[i] & 4 == 0)
                                strcpy(buff, "<Unknown>");
                        else
                                sprintf(buff, "%lld", outdata[i * 2 +
                                                1]);
                        printf("address 0x%llx is backed by physical
                                        page 0x%llx of size %s\n",
                                        inaddr[i], outdata[i * 2], buff);
                }
        }
}

Locality Group Affinity

The kernel assigns a thread to a locality group when the lightweight process (LWP) for that thread is created. That lgroup is called the thread's home lgroup. The kernel runs the thread on the CPUs in the thread's home lgroup and allocates memory from that lgroup whenever possible. If resources from the home lgroup are unavailable, the kernel allocates resources from other lgroups. When a thread has affinity for more than one lgroup, the operating system allocates resources from lgroups chosen in order of affinity strength. Lgroups can have one of three distinct affinity levels:

  1. LGRP_AFF_STRONG indicates strong affinity. If this lgroup is the thread's home lgroup, the operating system avoids rehoming the thread to another lgroup if possible. Events such as dynamic reconfiguration, processor, offlining, processor binding, and processor set binding and manipulation might still result in thread rehoming.

  2. LGRP_AFF_WEAK indicates weak affinity. If this lgroup is the thread's home lgroup, the operating system rehomes the thread if necessary for load balancing purposes.

  3. LGRP_AFF_NONE indicates no affinity. If a thread has no affinity to any lgroup, the operating system assigns a home lgroup to the thread .

The operating system uses lgroup affinities as advice when allocating resources for a given thread. The advice is factored in with the other system constraints. Processor binding and processor sets do not change lgroup affinities, but might restrict the lgroups on which a thread can run.

Using lgrp_affinity_get()

The lgrp_affinity_get(3LGRP) function returns the affinity that a LWP has for a given lgroup.

#include <sys/lgrp_user.h>
lgrp_affinity_t lgrp_affinity_get(idtype_t idtype, id_t id, lgrp_id_t lgrp);

The idtype and id arguments specify the LWP that the lgrp_affinity_get() function examines. If the value of idtype is P_PID, the lgrp_affinity_get() function gets the lgroup affinity for one of the LWPs in the process whose process ID matches the value of the id argument. If the value of idtype is P_LWPID, the lgrp_affinity_get() function gets the lgroup affinity for the LWP of the current process whose LWP ID matches the value of the id argument. If the value of idtype is P_MYID, the lgrp_affinity_get() function gets the lgroup affinity for the current LWP.

The lgrp_affinity_get() function returns EINVAL when the given lgroup or ID type is not valid. The lgrp_affinity_get() function returns EPERM when the effective user of the calling process is not the superuser and the ID of the calling process does not match the real or effective user ID of one of the LWPs. The lgrp_affinity_get() function returns ESRCH when a given lgroup or LWP is not found.

Using lgrp_affinity_set()

The lgrp_affinity_set(3LGRP) function sets the affinity that a LWP or set of LWPs have for a given lgroup.

#include <sys/lgrp_user.h>
int lgrp_affinity_set(idtype_t idtype, id_t id, lgrp_id_t lgrp,
                      lgrp_affinity_t affinity);

The idtype and id arguments specify the LWP or set of LWPs the lgrp_affinity_set() function examines. If the value of idtype is P_PID, the lgrp_affinity_set() function sets the lgroup affinity for all of the LWPs in the process whose process ID matches the value of the id argument to the affinity level specified in the affinity argument. If the value of idtype is P_LWPID, the lgrp_affinity_set() function sets the lgroup affinity for the LWP of the current process whose LWP ID matches the value of the id argument to the affinity level specified in the affinity argument. If the value of idtype is P_MYID, the lgrp_affinity_set() function sets the lgroup affinity for the current LWP or process to the affinity level specified in the affinity argument.

The lgrp_affinity_set() function returns EINVAL when the given lgroup, affinity, or ID type is not valid. The lgrp_affinity_set() function returns EPERM when the effective user of the calling process is not the superuser and the ID of the calling process does not match the real or effective user ID of one of the LWPs. The lgrp_affinity_set() function returns ESRCH when a given lgroup or LWP is not found.