11 Configuring Resource Consumption Management

This chapter describes how to use Resource Consumption Management (RCM) to ensure the fairness of resource allocation and to reduce the contention of shared resources by collocated domain partitions in the server instance. You can create RCM policies using Fusion Middleware Control (FMWC) or WLST to provide consistent performance of domain partitions in MT environments.

This chapter includes the following sections:

Configuring Resource Consumption Management: Overview

Resource Consumption Management (RCM) provides a flexible, dynamic mechanism for WebLogic Server system administrators to manage shared resources and provide consistent performance of domain partitions in MT environments.

RCM policies are configured as resource managers. A resource manager can be created with a global scope at the domain level and then used as the RCM policy for any partition within the domain. You can also create a partition-scoped resource manager if the partition has RCM characteristics specific to that partition. See "Configuring Resource Consumption Management: Main Steps".

Why Do You Need Resource Consumption Management?

When applications are deployed to multiple domain partitions, sharing low-level resources such as CPU, heap, network, and file descriptors can result in unfairness during resource allocation. Abnormal resource consumption requests may happen for various reasons, such as high traffic, application design, or malicious code. These requests can overload the capacity of a shared resource, preventing another collocated domain partition from accessing the resource. By employing appropriate RCM policies at the domain partition level, resource consumption management prevents applications in one partition from negatively affecting applications in other partitions.

Software Requirements for Using RCM

WebLogic RCM requires Oracle JDK 8u60 build 32 or higher.

How To Enable RCM

Set the following JVM arguments to enable WebLogic RCM in your environment:

-XX:+UnlockCommercialFeatures -XX:+ResourceManagement -XX:+UseG1GC

These flags must be applied on every server instance (JVM) where RCM is enabled.

An alternative method is to uncomment these JAVA_OPTIONS in the startWebLogic.sh file, which is generated when a domain is created:

#JAVA_OPTIONS="-XX:+UnlockCommercialFeatures -XX:+ResourceManagement -XX:+UseG1GC ${SAVE_JAVA_OPTIONS}"

You must do this on every server instance, and it must be done prior to starting the WebLogic Server instance.
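Because all three flags must be present on every server instance, a quick check of each server's options string can catch an incomplete setup before startup. The following is a minimal Python sketch; the helper name missing_rcm_flags and the sample options string are hypothetical and not part of WebLogic Server.

```python
# Hypothetical helper, not a WebLogic Server API: reports which of the
# three RCM-enabling JVM flags are absent from a JAVA_OPTIONS string.
REQUIRED_RCM_FLAGS = (
    "-XX:+UnlockCommercialFeatures",
    "-XX:+ResourceManagement",
    "-XX:+UseG1GC",
)


def missing_rcm_flags(java_options):
    """Return the required RCM flags that do not appear in java_options."""
    present = set(java_options.split())
    return [flag for flag in REQUIRED_RCM_FLAGS if flag not in present]


# Sample options string (an assumption for illustration only).
opts = "-Xmx4g -XX:+UnlockCommercialFeatures -XX:+UseG1GC"
print(missing_rcm_flags(opts))  # ['-XX:+ResourceManagement']
```

An empty list means all three flags are present in the string that was checked.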

If a Java Security Manager is used with WebLogic Server, WebLogic Resource Consumption Management requires the granting of the following permission in weblogic.policy:

permission java.lang.RuntimePermission "jdk.management.resource.getResourceContextFactory";

For more information about using the Java Security Manager to protect resources in WebLogic Server, see "Using the Java Security Manager to Protect WebLogic Resources" in Developing Applications with the WebLogic Security Service.

Supported Resources for RCM

The following shared resources can be managed through RCM policies:

  • FileOpen: The number of open file descriptors in use by a partition. This includes files opened through FileInputStream, FileOutputStream, RandomAccessFile, and NIO File channels.

  • HeapRetained: The amount of heap (in MB) retained/in use by a partition.

  • CpuUtilization: The percentage of CPU time utilized by a partition with respect to the available CPU time of the WebLogic process.

Configuring Resource Consumption Management: Main Steps

A system administrator specifies resource consumption management policies on shared resources for each partition in the domain using a resource manager. A resource manager consists of one or more policies for one or more shared resources. Each policy consists of a constraint value for a resource and a specified action a WebLogic Server instance takes when the constraint value is met.

The following RCM policy types are supported:

Triggers

A trigger defines the static constraint value for the allowed usage for a resource. When the consumption of that resource exceeds the constraint value, a specified action is performed. This policy type is best suited for environments where the resource usage by a partition in the domain is predictable.

A system administrator can select the following actions when creating a trigger policy:

  • Notify: A notification is provided to the system administrator that the constraint value has been reached. You can add more than one Notify trigger for a resource. For example, Notify when the Open Files go beyond 20; Notify when the Open Files go beyond 50. Also, you can use the WebLogic Diagnostic Framework (WLDF) to create watch rules to listen to log messages and provide advanced notifications.

  • Slow: Slows the rate at which the resource is consumed. When a Slow action is triggered, the Partition Work Manager's fair share value is reduced, which reduces the thread-usage time made available to the partition. For more information about Partition Work Managers, see Configuring Partition Work Managers.

  • Fail: Fails resource consumption requests for a resource until usage is below the constraint value.

    Note:

    Fail is applicable only for Open Files and not for other resources.

    For example, to limit the number of open files in partition P1 to less than 100 files, create a resource manager with a trigger policy that has a constraint value of 100 units for the Open Files resource and a Fail action for the P1 partition.

  • Shutdown: Initiates the shutdown of a partition while allowing cleanup. This action is useful when a partition exceeds a known constraint value and an adverse impact on shared resources used by other partitions in the domain is expected. A partition is shut down only in the Managed Server where the constraint value has been met, allowing continuous availability in clustered environments. Fail and Shutdown triggers should not be used together.

Note:

Policy actions may be implemented synchronously or asynchronously depending on the action type. For instance, the Fail action synchronously uses the thread that requested the file open. Other actions, such as the Slow action configured for a "Heap Retained" resource proceed asynchronously.
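The way several triggers on one resource combine can be modeled in a few lines of Python. This is only an illustrative sketch, not how WebLogic Server evaluates triggers internally: the Trigger tuple and the fired_triggers function are hypothetical, the thresholds come from the Open Files examples above, and the sketch assumes a trigger fires when usage strictly exceeds its constraint value.

```python
from collections import namedtuple

# Hypothetical model of a trigger policy: a name, a constraint value,
# and the action taken when usage goes beyond that value.
Trigger = namedtuple("Trigger", ["name", "value", "action"])


def fired_triggers(triggers, usage):
    """Return the triggers whose constraint value is exceeded, lowest first."""
    exceeded = [t for t in triggers if usage > t.value]
    return sorted(exceeded, key=lambda t: t.value)


# Two Notify triggers and one Fail trigger for the Open Files resource.
open_files = [
    Trigger("fail-100", 100, "fail"),
    Trigger("notify-20", 20, "notify"),
    Trigger("notify-50", 50, "notify"),
]

for usage in (10, 30, 75, 120):
    print(usage, [t.name for t in fired_triggers(open_files, usage)])
```

With these values, 75 open files exceeds both Notify thresholds, and 120 open files also exceeds the Fail threshold.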

Fair Share

The fair share policy allows a system administrator to allocate a share (a percentage of the available resource) based on a representative load of each partition in the domain. Use fair share policies when the exact usage requirements of a resource cannot be determined, or when fixed limits are not practical to implement, and you still want resource managers to provide efficient and fair utilization of resources. When there is no contention between partitions in a domain for a given resource, each partition is able to utilize the amount of resource required for its immediate workload. If there is contention between partitions for a given resource, then each partition is constrained to utilize only its fair share of the available resource. Ensure that limits are set such that overall memory consumption does not exceed the maximum available memory, which would result in an out-of-memory exception.

Determining Fair Share Allocations for a Resource

A share is an allocation to use a specified amount of a resource. A system administrator allocates a share to a partition by specifying an integer value between 1 and 1000 in the associated resource manager fair share policy. For a given partition, the ratio of its configured fair share value to the sum total of all fair share values for the same resource in the domain determines the amount of resource allocated.

For example, a system administrator specifies a fair share value of 150 for a resource in partition P1 and a value of 100 for the same resource in partition P2. If the workload is heavy enough in both partitions to create contention for that resource, the resource allocation for partition P1 is 150/(150+100), or 60 percent of the available resource. Partition P2 receives the remaining 100/(150+100), or 40 percent.
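The ratio calculation can be sketched in Python as follows. The function name fair_share_fractions is hypothetical; the sketch only reproduces the arithmetic described in this section and does not call any WebLogic Server API.

```python
def fair_share_fractions(shares):
    """Map each partition to its fraction of a contended resource.

    shares maps a partition name to its configured fair share value,
    which must be between 1 and 1000.
    """
    for name, value in shares.items():
        if not 1 <= value <= 1000:
            raise ValueError("fair share for %s must be between 1 and 1000" % name)
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


fractions = fair_share_fractions({"P1": 150, "P2": 100})
print(fractions)  # {'P1': 0.6, 'P2': 0.4}
```

Adding a third partition with its own share value lowers the fractions of the existing partitions, because every fraction is computed against the sum of all share values for that resource in the domain.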

How to Create a Resource Manager

A resource manager consists of one or more policies for one or more shared resources. Each policy consists of a constraint value for a resource and a specified action that a WebLogic Server instance takes when the constraint value is met.

To configure resource managers, see:

Example RCM Configuration in config.xml

The following is an annotated RCM configuration similar to the WLST example displayed in Configuring Resource Consumption Management: WLST Example.

Example 11-1 Example config.xml Configuration for Resource Consumption Management

<domain>
...
   <!--Define RCM Configuration -->
   <resource-management>
        <resource-manager>
            <name>Approved</name>
            <file-open>
                <trigger>
                    <name>Approved2000</name>
                    <value>2000</value><!-- in units-->
                    <action>shutdown</action>
                </trigger>
                <trigger>
                    <name>Approved1700</name>
                    <value>1700</value>
                    <action>slow</action>
                </trigger>
                <trigger>
                    <name>Approved1500</name>
                    <value>1500</value>
                    <action>notify</action>
                </trigger>
            </file-open>           
            <heap-retained>
                <trigger>
                    <name>Approved2GB</name>
                    <value>2097152</value>
                    <action>shutdown</action>
                </trigger>                               
                <fair-share-constraint>
                    <name>FS-ApprovedShare</name>
                    <value>60</value>
                </fair-share-constraint>
            </heap-retained>
        </resource-manager>
        <resource-manager>
            <name>Trial</name>
            <file-open>
                <trigger>
                    <name>Trial1000</name>
                    <value>1000</value><!-- in units-->
                    <action>shutdown</action>
                </trigger>
                <trigger>
                    <name>Trial700</name>
                    <value>700</value>
                    <action>slow</action>
                </trigger>
                <trigger>
                    <name>Trial500</name>
                    <value>500</value>
                    <action>notify</action>
                </trigger>
            </file-open>
                     ...           
        </resource-manager>
    </resource-management>
    <partition>
        <name>Partition-0</name>
        <resource-group>
            <name>ResourceTemplate-0_group</name>
            <resource-group-template>ResourceTemplate-0</resource-group-template>
        </resource-group>
        ...
        <partition-id>1741ad19-8ca7-4339-b6d3-78e56d8a5858</partition-id>
 
        <!-- RCM Managers are
            then targeted to Partitions during partition creation time or later
            by system administrators -->
        <resource-manager-ref>Approved</resource-manager-ref>
    ...
    </partition>
..
</domain>
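To review which triggers a resource manager defines and which partition references it, a configuration of this shape can be summarized with Python's standard xml.etree.ElementTree module. The sketch below is an assumption-laden illustration: it parses a trimmed copy of the example rather than a real config.xml, the summarize function is hypothetical, and it does not handle the XML namespace that a real domain configuration file typically declares.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the example above (elisions removed).
CONFIG = """
<domain>
  <resource-management>
    <resource-manager>
      <name>Approved</name>
      <file-open>
        <trigger><name>Approved2000</name><value>2000</value><action>shutdown</action></trigger>
        <trigger><name>Approved1500</name><value>1500</value><action>notify</action></trigger>
      </file-open>
    </resource-manager>
  </resource-management>
  <partition>
    <name>Partition-0</name>
    <resource-manager-ref>Approved</resource-manager-ref>
  </partition>
</domain>
"""


def summarize(xml_text):
    """Return (triggers per resource manager, resource manager per partition)."""
    root = ET.fromstring(xml_text.strip())
    managers = {}
    for rm in root.iter("resource-manager"):
        managers[rm.findtext("name")] = [
            (t.findtext("name"), int(t.findtext("value")), t.findtext("action"))
            for t in rm.iter("trigger")
        ]
    assignments = {
        p.findtext("name"): p.findtext("resource-manager-ref")
        for p in root.iter("partition")
    }
    return managers, assignments


managers, assignments = summarize(CONFIG)
print(managers)
print(assignments)
```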

Dynamic Reconfiguration of Resource Managers

You can dynamically apply or remove a resource management policy from a domain partition. Changes to a resource management policy will be applied to all domain partitions that use that policy.

If a policy update for an active domain partition sets trigger values for a resource that are lower than the current usage of that resource, the policy's recourse action is applied to subsequent usage of that resource. If a change to a policy would result in an immediate shutdown of an active domain partition based on the current usage value, the change is not accepted as a dynamic reconfiguration change.
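The acceptance rule for dynamic changes can be expressed as a small predicate. This Python sketch is a simplified model under stated assumptions, not the server's validation logic: the function name accepts_dynamic_update is hypothetical, triggers are represented as (value, action) pairs, and a shutdown is assumed to be immediate when current usage already exceeds the new shutdown value.

```python
def accepts_dynamic_update(new_triggers, current_usage):
    """Return False if the update would shut down the partition immediately.

    new_triggers is a list of (value, action) pairs for one resource.
    """
    return not any(
        action == "shutdown" and current_usage > value
        for value, action in new_triggers
    )


# A partition currently holds 1800 open files.
print(accepts_dynamic_update([(1700, "slow"), (2000, "shutdown")], 1800))  # True
print(accepts_dynamic_update([(1700, "shutdown")], 1800))                  # False
```

In the first case the lowered slow trigger is accepted and its action applies to subsequent usage; in the second case the change is rejected because the partition would be shut down at once.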

Configuring Resource Consumption Management: Monitoring Resource Utilization

Resource consumption metrics for shared resources in a partition are available through a PartitionResourceMetricsRuntimeMBean.

Use these metrics to:

  • Monitor the current resource utilization in a partition.

  • Profile and analyze the resource consumption of a partition to generate data such as representative loads, peak loads, and peak load times needed to create effective resource managers and WLDF watches and notifications.

To monitor resource managers in FMWC, see "Monitor resource managers" in Administering Oracle WebLogic Server with Fusion Middleware Control.

By default, eager registration of resource meters is turned off. As a result, they get created lazily the first time the resource consumption metrics are queried for a particular resource. In that case, where the resource accounting is started lazily, the values returned from the resource consumption metrics might be different from the actual values of the resource consumed by the partition.

To get a true reflection of the amount of a resource consumed by a partition, register the meters eagerly on partition startup. To enable eager registration of the resource meters, set the weblogic.rcm.enable-eager-resource-meter-registration property to true as a JVM argument (for example, -Dweblogic.rcm.enable-eager-resource-meter-registration=true) when starting the WebLogic Server instance.

Best Practices and Considerations When Using Resource Consumption Management

The following sections provide best practices and considerations for system administrators developing resource management policies.

General Considerations

Recourse actions must be selected carefully by a system administrator. Many resources have complex interactions between them. For instance, slowing down CPU utilization (resulting in fewer threads allocated to the domain partition) may result in increased heap residency, thereby impacting retained heap usage.

For a slow recourse action to be effective, applications must not create or manage their own threads. Oracle recommends that applications manage their tasks through capabilities provided by WebLogic Server, such as EJB Timers, CommonJ Work Manager and Timers, Managed Executor Service, and the Batch API.

Monitor Average and Peak Resource Utilization

Before specifying resource consumption management policies, Oracle recommends that system administrators monitor average and peak resource usage data and configure policies with sufficient headroom to balance efficient resource usage against meeting their SLAs. See Configuring Resource Consumption Management: Monitoring Resource Utilization.

When to Use a Trigger

Use triggers when an administrator knows the precise limits at which the corresponding action needs to be executed. For some resources, such as open files, the trigger is executed as soon as the configured threshold is exceeded. For other resources, such as heap and CPU, execution may be delayed.

When to Use Fair Share

A fair share policy is typically used by a system administrator to ensure that a bounded-size shared resource is shared effectively (yet fairly) by competing consumers. A fair share policy may also be employed by a system administrator when a clear understanding of the exact usage of a resource by a partition cannot be accurately determined in advance, and the system administrator would like efficient utilization of resources while ensuring fair allocation of shared resources to co-resident partitions. Use fair share policies in your environment when you have complementary workloads for a resource between partitions. See Use Complementary Workloads.

Use Complementary Workloads

When possible, maximize resource density by balancing the peak usage times between partitions so that their peak usage times do not overlap and the sum of their averages is not above their maximum peak value. Antagonistic workloads, on the other hand, have overlapping peak usage times, and the sum of their averages is greater than their maximum peak value.

Also consider collocating partition workloads that exercise resources differently. For instance, hosting a partition that has a predominantly CPU-bound workload with another partition that has a memory-bound workload could help in achieving better density and improving overall resource utilization.
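The two conditions for complementary workloads can be checked against sampled usage data. The following Python sketch is one possible reading of the rule and makes several assumptions: usage is sampled at the same intervals for both partitions, a "peak usage time" is any sample equal to the partition's maximum, and the "maximum peak value" is the larger of the two peaks. The function name and the sample data are hypothetical.

```python
def is_complementary(load_a, load_b):
    """Check two equal-length usage series against the two conditions."""
    peak_a, peak_b = max(load_a), max(load_b)
    # Sample indexes at which each partition reaches its peak.
    peak_times_a = {i for i, v in enumerate(load_a) if v == peak_a}
    peak_times_b = {i for i, v in enumerate(load_b) if v == peak_b}
    no_overlap = peak_times_a.isdisjoint(peak_times_b)
    sum_of_averages = sum(load_a) / len(load_a) + sum(load_b) / len(load_b)
    return no_overlap and sum_of_averages <= max(peak_a, peak_b)


# Hypothetical CPU samples (percent) at four times of day.
daytime = [10, 20, 80, 30]
nightly = [70, 20, 10, 15]
bursty = [20, 85, 90, 75]

print(is_complementary(daytime, nightly))  # True
print(is_complementary(daytime, bursty))   # False
```

The second pair fails because both partitions peak at the same sample, which is the antagonistic case described above.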

When to Use Partition-Scoped RCM Policies

Use partition-scoped (ad hoc) RCM policies if a partition has unique resource requirements. They facilitate easy import and export of partition RCM policies to and from existing domains.

If no resource management policies are explicitly set on a partition, that partition has unconstrained access to available shared resources.

Managing CPU Utilization

CPU utilization is an excellent metric to track contention of CPU by collocated domain partitions, and is especially useful in fair share policies for CPU-bound workloads. Consider the following when using RCM policies to maximize CPU utilization:

  • When considering the workload of all the partitions in a domain (the consolidated workload), the peak CPU utilization should not greatly exceed the average CPU utilization. Minimizing the gap between peak CPU load and average CPU load maximizes the CPU utilization for the domain.

  • Oracle recommends configuring RCM CPU policies so that about 75 percent of CPU utilization is used for applications housed in the partitions of a domain. The remaining 25 percent should be allocated approximately as: 10 percent for operational tasks (backup, scheduled tasks, and other administration) and 15 percent for cluster failover.
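As a planning aid, the 75/10/15 guideline can be combined with fair share ratios to estimate a per-partition CPU target. This Python sketch is one way to do that arithmetic, not an Oracle-prescribed method; the function name and the fair share values for P1 and P2 are hypothetical.

```python
# Guideline from the text: 75% applications, 10% operations, 15% failover.
CPU_SPLIT = {"applications": 75, "operations": 10, "failover": 15}


def partition_cpu_targets(fair_shares, app_percent=CPU_SPLIT["applications"]):
    """Divide the application CPU budget among partitions by fair share ratio."""
    total = sum(fair_shares.values())
    return {name: app_percent * share / total for name, share in fair_shares.items()}


targets = partition_cpu_targets({"P1": 150, "P2": 100})
print(targets)  # {'P1': 45.0, 'P2': 30.0}
```

The resulting targets sum to the 75 percent application budget, leaving the remaining 25 percent for operational tasks and cluster failover.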

Managing Heap

Develop a memory utilization plan that supports the requirements of the partition applications while continuing to provide enough available heap (headroom) for the domain and other system work. When evaluating heap requirements for a domain, consider the low, average, steady-state and peak Heap Retained usage values for each partition's representative workload.
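A memory utilization plan can be sanity-checked by comparing the sum of per-partition peak Heap Retained values, plus the desired headroom, against the maximum heap of the server. The Python sketch below illustrates that check; the function name and all figures (in MB) are hypothetical, and it conservatively assumes that all partitions could reach their peaks at the same time.

```python
def heap_plan_slack(max_heap_mb, headroom_mb, partition_peaks_mb):
    """Return the heap (MB) left after partition peaks and headroom.

    A negative result means the plan does not fit in the maximum heap.
    """
    required = sum(partition_peaks_mb.values()) + headroom_mb
    return max_heap_mb - required


# Hypothetical plan: 8 GB heap, 2 GB reserved for the domain and system work.
print(heap_plan_slack(8192, 2048, {"P1": 3000, "P2": 2500}))  # 644
print(heap_plan_slack(8192, 2048, {"P1": 3000, "P2": 4000}))  # -856
```

A negative slack suggests lowering trigger values, moving a partition to another server, or increasing the maximum heap before the policies are put into use.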

Limitations

Be aware of the following RCM limitations:

  • Heap resource consumption tracking and management is supported only when run with the G1 garbage collector (there is no RCM support for other JDK collectors).

  • There is no support to measure and account for resource consumption metrics for activities happening in JNI/native code.

  • Measurements of Retained Heap and CPU Utilization are performed asynchronously and hence do not represent "current" (a "point-in-time") value.

  • Discriminating heap usage for objects in static fields, and for singleton objects of classes loaded from system and shared classloaders, is problematic, and such usage may not be accurately represented in the final accounting values. If an instance of a class loaded from system and shared classloaders is loaded by a partition, the instance's use of heap is accounted against that partition.

  • Garbage collection activity is not isolated to specific domain partitions in WebLogic Server 12.2.1 with Oracle JDK 8u40.

  • There is a performance impact to enabling the WebLogic Server RCM feature due to the additional tracking and management of resource consumption in the server instance.

Configuring Resource Consumption Management: WLST Example

You can implement and monitor Resource Consumption Management policies using WLST.

RCM WLST Example: Overview

The following is an example of an RCM configuration created using WLST. In this example, a system administrator has defined:

  • An Approved resource manager representing the set of resource consumption management policies the system administrator would like to establish for all production-tier domain partitions in the domain. The Approved resource manager has policies for various resources.

    • For the FileOpen resource type, three triggers are specified. An Approved2000 trigger ensures the partition is shut down when the number of open file descriptors reaches 2000. An Approved1700 trigger specifies that when the number of open file descriptors crosses 1700, the domain partition must be slowed down. An Approved1500 trigger specifies a notify action.

    • For the HeapRetained resource type, an Approved2GB trigger is created to ensure that when the partition's retained heap value reaches 2GB, the partition is shut down. A fair share value of 60 is assigned to the Approved resource manager.

  • A Trial resource manager defines a similar but reduced set of policies.

  • A partition named Partition-0.

At the completion of this script, Partition-0 has been assigned the Approved resource manager.

RCM WLST Example: WLST Script

The policy discussed in RCM WLST Example: Overview can be created using the following WLST script:

Example 11-2 WLST Example for Resource Consumption Management

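# Assumes an online WLST session that is already connected to the
# Administration Server, and that Partition-0 already exists.
edit()
startEdit()

# Create the Approved resource manager at the domain level.
cd('/ResourceManagement')
cd(domainName)
rm=cmo.createResourceManager('Approved')
# FileOpen triggers: notify at 1500, slow at 1700, shut down at 2000.
fo=rm.createFileOpen('Approved-FO')
fo.createTrigger('Approved2000',2000,'shutdown')
fo.createTrigger('Approved1700',1700,'slow')
fo.createTrigger('Approved1500',1500,'notify')
# HeapRetained: shutdown trigger plus a fair share value of 60.
hr=rm.createHeapRetained('Approved-HR')
hr.createTrigger('Approved2GB',2097152,'shutdown')
hr.createFairShareConstraint('FS-ApprovedShare', 60)

# Create the Trial resource manager with a reduced set of policies.
cd('/ResourceManagement')
cd(domainName)
rm=cmo.createResourceManager('Trial')
fo=rm.createFileOpen('Trial-FO')
fo.createTrigger('Trial1000',1000,'shutdown')
fo.createTrigger('Trial700',700,'slow')
fo.createTrigger('Trial500',500,'notify')

save()
activate()

# Assign the Approved resource manager to Partition-0.
startEdit()
cd('/Partitions')
cd('Partition-0')
cmo.setResourceManagerRef(getMBean('/ResourceManagement/'+domainName+'/ResourceManager/Approved'))
save()
activate()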

Configuring Resource Sharing: Related Tasks and Links

This section provides additional information that may be useful when implementing RCM in your environment.

For additional information, see "Multitenancy Tuning Recommendations" in Tuning Performance of Oracle WebLogic Server.