8 Understanding and Managing SLA Budgets

This chapter explains how to configure Oracle Communications Services Gatekeeper budgets and describes their relationship to Service Level Agreement (SLA) settings.

For information on how to configure, manage, and use SLAs, see Services Gatekeeper Accounts and SLAs Guide.

Understanding How the PRM Portals Use SLA Settings

Using the PRM Portals, you create simple SLAs based on these parameters:

  • Group limits:

    • Request Limit

    • Quota

  • Application requests:

    • Request Limit

    • Quota

If these bandwidth settings are appropriate for your implementation, there is no need to use this chapter to create more sophisticated budgets. However, if your implementation requires more fine-grained access and bandwidth settings, you must use the instructions in this chapter to create SLAs.

Note:

Use either the PRM portal SLA settings or SLAs that you create using BudgetServiceMBean, but not both. Both methods create SLAs, and each overwrites the other, producing unexpected results.

Understanding Budgets

In Services Gatekeeper, SLA enforcement is based on budgets maintained by the Budget service. Based on the rates, restrictions, and quotas set up in SLAs, budgets measure the level of access an application has to Services Gatekeeper over time. The budget is continuously decremented based on the application's use of Services Gatekeeper. At the same time, the budget is incremented based on the contract between the service provider and the operator. If the application's use of Services Gatekeeper exactly matches the SLA values, the budget level remains the same. The Platform Test Environment (PTE) includes a budget monitoring tool that tracks the state of budget usage over time.

The budget reflects the current traffic request rate based on traffic history. Each Services Gatekeeper server updates both its own local traffic count and the clusterwide count maintained in one Services Gatekeeper server, the cluster master, based on load and time intervals. From a cluster perspective, the cluster master is a highly available singleton service whose availability is guaranteed by the WebLogic Server infrastructure. The cluster master is also guaranteed to be active on only one server in the cluster. This ensures accurate SLA enforcement with regard to request counters.

By default, budget quotas are enforced within the cluster. The Budget service is also capable of maintaining budget quotas across domains spread across geographic locations.

Budget values for SLAs that span longer periods of time are persisted in the persistent store to minimize the state loss if a cluster master fails.

There are two types of budget caches:

  • In-memory only

  • In-memory cache backed by persistent storage

When a cluster master is restarted, it restores its state from the persistent store. If a cluster master fails, each Services Gatekeeper server continues to enforce the SLA independently, as accurately as possible, until the role of cluster master has been transferred to an operational server. In such a situation, a subset of the budget cache is lost: the in-memory-only budget cache and the parts of the in-memory cache backed by persistent storage that have not yet been flushed to persistent storage. The flush intervals are configurable using the PersistentBudgetFlushInterval and PersistentBudgetTimeThreshold fields of BudgetServiceMBean.

The configuration settings for the following attributes of the BudgetServiceMBean MBean affect accuracy and performance:

  • The higher the value for the AccuracyFactor field, the more granularity you have in enforcing the budgets over the time span. This requires more processing power to synchronize the budgets over the cluster.

  • The higher the value for the PersistentBudgetFlushInterval field, the less impact persisting data has on overall performance, and the more budget data may be lost in case of server failure.

  • The higher the value for the PersistentBudgetTimeThreshold field, the fewer budgets are likely to be persisted. This value is related to the time intervals for which time limits are defined in the SLAs. A high threshold causes less impact on database performance, but more data may be lost in case of server failure.

For information about using BudgetServiceMBean see "BudgetServiceMBean Reference".

Synchronizing Budgets Between Servers

Budgets are synchronized between all servers in a cluster according to the following algorithm:

rt = r / (a * n)

Tt = T / (a * n)

where:

  • rt is the slave request count synchronization threshold value.

  • r is a request limit specified in an SLA.

  • a is the accuracy factor, set using the AccuracyFactor field of BudgetServiceMBean.

  • n is the number of running WebLogic Network Servers in a cluster.

  • Tt is the duration of counter synchronization between the slave and the master.

  • T is a time period specified in the SLA.

Understanding Slave Intervals

The request count is the amount of the budget that has been allocated since the last synchronization with the master. The following scenarios are possible:

  1. When the request count reaches rt on a particular node, it synchronizes with the master.

  2. If the request count does not reach the rt value and if the count is greater than zero, the slave synchronizes with the master if the time since last synchronization reaches Tt.

Synchronization happens as a result of (1) or (2), whichever comes first.

If the request count reaches the threshold value, there will be no explicit synchronization when the timer reaches Tt.
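The two synchronization triggers can be sketched as a small state machine. This is an illustrative model only, not the Budget service's implementation: the SlaveCounter class, its method names, and the use of millisecond timestamps passed in by the caller are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SlaveCounter:
    """Hypothetical per-server counter; not a real Services Gatekeeper class."""
    rt: int                # request count synchronization threshold
    tt_ms: int             # time threshold between synchronizations (ms)
    count: int = 0         # requests counted since the last synchronization
    last_sync_ms: int = 0  # timestamp of the last synchronization (ms)

    def on_request(self, now_ms: int) -> bool:
        """Count one request; return True if rule (1) triggers a sync."""
        self.count += 1
        if self.count >= self.rt:
            self._sync(now_ms)
            return True
        return False

    def on_timer(self, now_ms: int) -> bool:
        """Periodic check; return True if rule (2) triggers a sync."""
        if self.count > 0 and now_ms - self.last_sync_ms >= self.tt_ms:
            self._sync(now_ms)
            return True
        return False

    def _sync(self, now_ms: int) -> None:
        # A real slave would report self.count to the master here.
        # Resetting both fields means a count-triggered sync also
        # restarts the timer, so no extra sync fires at Tt.
        self.count = 0
        self.last_sync_ms = now_ms
```

Because both triggers reset the counter and the timestamp, whichever fires first suppresses the other until new requests arrive.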

Example:

If r = 100, n = 2, T = 1000 milliseconds, and a = 2:

rt = 100 / (2 * 2) = 25 requests

Tt = 1000 / (2 * 2) = 250 milliseconds

The slave synchronizes with the master if the request count reaches 25 or if the time since the last synchronization is 250 milliseconds, whichever comes first, at which point the timer is reset.
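The threshold formulas can be checked with a few lines of code. The following is a minimal sketch; the function name sync_thresholds and its parameter names are illustrative and are not part of any Services Gatekeeper API.

```python
from typing import Tuple


def sync_thresholds(r: int, t_ms: float, a: int, n: int) -> Tuple[float, float]:
    """Return (rt, Tt) for request limit r, time period t_ms (milliseconds),
    accuracy factor a, and n running servers in the cluster."""
    if a <= 0 or n <= 0:
        raise ValueError("accuracy factor and server count must be positive")
    divisor = a * n
    return r / divisor, t_ms / divisor


# Values from the example above.
print(sync_thresholds(r=100, t_ms=1000, a=2, n=2))  # (25.0, 250.0)

# Doubling the accuracy factor halves both thresholds,
# so slaves synchronize with the master twice as often.
print(sync_thresholds(r=100, t_ms=1000, a=4, n=2))  # (12.5, 125.0)
```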

Understanding Masters

The master is responsible for enforcing the budget limits across the cluster by keeping track of the request count across all the servers in the cluster. If there is budget available, the master updates the slaves with the remaining budget whenever the slaves synchronize with the master.

Understanding Failure Conditions

In the absence of the master, each slave individually enforces the budget limit but caps the requests at r/n, thereby guaranteeing that the total budget count across the cluster never exceeds the limit.

If a slave fails before it can update the master, the master cannot account for that server's budget allocation and can be up to rt requests out of sync.

Under certain circumstances, if the master allocates more than the configured budget limit, the budget will be adjusted over time.

For budgets that span longer periods of time, the budget count is persisted in the database to avoid losing all state during master failures. Use the PersistentBudgetTimeThreshold field of BudgetServiceMBean to control which budgets are persisted.

For information about using BudgetServiceMBean see "BudgetServiceMBean Reference".

Understanding Budget Overrides

Budgets can have overrides defined in the SLAs. When a budget is configured with an override, the budget master determines whether a given override is active. If an override is active, the budget master enforces limits based on that override's configuration. If overrides overlap, there is no guarantee as to which override is enforced.

Note:

For an override to be active, all of the following must be true:

  • Today's date must be the same as or later than startDate.

  • Today's date must be earlier than endDate (must not be the same date).

  • Current time must be between startTime and endTime. If endTime is earlier than startTime, the limit spans midnight.

  • Current day of week must be between startDow and endDow, or equal to startDow or endDow. If endDow is less than startDow, the limit spans the end of the week.

Overrides are not enforced across geographically redundant sites.
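The activation conditions can be expressed as a single predicate. The sketch below is an approximation under stated assumptions: the Override class and function names are hypothetical, the time window is treated as inclusive at both ends, days of the week are numbered 1 (Monday) through 7 (Sunday), and each condition is evaluated independently against the current date and time. The actual numbering and boundary handling used by Services Gatekeeper may differ.

```python
from dataclasses import dataclass
from datetime import date, datetime, time


@dataclass
class Override:
    """Hypothetical holder for the override fields named in the note."""
    start_date: date
    end_date: date
    start_time: time
    end_time: time
    start_dow: int  # assumed numbering: 1 = Monday ... 7 = Sunday
    end_dow: int


def in_wrapping_range(value, start, end) -> bool:
    """Inclusive range test; if end < start the range wraps around
    (past midnight for times, past the end of the week for days)."""
    if start <= end:
        return start <= value <= end
    return value >= start or value <= end


def is_override_active(o: Override, now: datetime) -> bool:
    today = now.date()
    return (
        o.start_date <= today < o.end_date  # endDate itself is excluded
        and in_wrapping_range(now.time(), o.start_time, o.end_time)
        and in_wrapping_range(now.isoweekday(), o.start_dow, o.end_dow)
    )
```

For example, an override with startTime 22:00, endTime 06:00, startDow 5, and endDow 1 would, under these assumptions, be active late on Saturday night but not on Wednesday night.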

Budget Calculations and Relationship to SLA Settings

Budgets are calculated based on the following SLA settings:

  • <reqLimit> and <timePeriod> for request limits.

  • <qtaLimit> and <days> for quotas.

  • <reqLimitGuarantee> and <timePeriodGuarantee> for guarantee settings.

The limit divided by the time period yields the budget increase rate, expressed in transactions per second.

Each budget has a maximum value defined in the limits (<reqLimit>, <qtaLimit>, and <reqLimitGuarantee>). This value is also the starting value of the budget. For each request that is processed, the budget is decreased by one (1). Over time, the budget increases by the budget increase rate multiplied by the elapsed time, up to its maximum value, which is the request limit. When a budget reaches zero (0), requests are denied.

A maximum request rate is expressed as the number of requests during a time period, which offers flexible ways to define the limits. For a given budget increase rate, expressed in transactions per second:

  • The longer the time period, the longer it takes for requests to start being rejected when the request rate is higher than allowed, because the maximum budget value is higher.

  • The shorter the time period, the sooner requests start being rejected when the request rate is higher than allowed, because the maximum budget value is lower.

Having a shorter time period means that budget synchronizations are more frequent and this has a performance impact.

If an application has used up its budget by sending more requests than allowed, so that requests are being rejected, and the application then reduces its request rate, the budget starts to increase. The budget increases at a rate equal to the difference between the budget increase rate and the request rate.
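The budget behavior described in this section resembles a token bucket, and can be modeled as one. The class below is a simplified single-server sketch, not the Budget service's implementation: the Budget class and its method names are hypothetical, the time period is assumed to be in milliseconds, and a request is assumed to be denied whenever less than one whole unit of budget remains.

```python
class Budget:
    """Hypothetical single-server model of an SLA budget."""

    def __init__(self, limit: int, time_period_ms: int) -> None:
        self.max_value = float(limit)
        self.value = float(limit)  # the budget starts at its maximum
        self.rate_per_ms = limit / time_period_ms  # budget increase rate
        self.last_ms = 0

    def _refill(self, now_ms: int) -> None:
        # Increase by rate * elapsed time, never above the maximum.
        elapsed = now_ms - self.last_ms
        self.value = min(self.max_value, self.value + elapsed * self.rate_per_ms)
        self.last_ms = now_ms

    def try_request(self, now_ms: int) -> bool:
        """Return True and decrease the budget by one if the request is allowed."""
        self._refill(now_ms)
        if self.value >= 1.0:
            self.value -= 1.0
            return True
        return False


budget = Budget(limit=2, time_period_ms=1000)
print([budget.try_request(0) for _ in range(3)])  # [True, True, False]
```

In this model, a burst of up to the limit is accepted immediately, after which requests are accepted only as fast as the budget increase rate allows.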

Example:

An application sends 250 requests per second.

The SLA settings are:

<reqLimit>2000</reqLimit>

<timePeriod>10000</timePeriod>

The time period is specified in milliseconds, so the budget increase rate is 2000 requests per 10000 milliseconds, or 200 requests per second.

The difference between the request rate and the budget increase rate is 50 (250-200), so the budget will be zero (0) after 40 seconds (2000/50). When the budget is zero, requests are rejected.

Example:

An application sends 250 requests per second.

The SLA settings are:

<reqLimit>200</reqLimit>

<timePeriod>1000</timePeriod>

The budget increase rate is 200 requests per 1000 milliseconds, or 200 requests per second.

The difference between the request rate and the budget increase rate is 50 (250-200) so the budget will be zero (0) after 4 seconds (200/50). When the budget is zero, requests are rejected.

Example:

An application sends 180 requests per second.

The SLA settings are:

<reqLimit>200</reqLimit>

<timePeriod>1000</timePeriod>

The current budget is zero (0) because the application previously sent requests at a rate higher than allowed.

The budget increase rate is 200 requests per 1000 milliseconds, or 200 requests per second.

The difference between the budget increase rate and the request rate is 20 (200-180) so the budget will be at its maximum value (200) after 10 seconds (200/20). When the budget reaches its maximum value it does not increase any further.
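The arithmetic in the three examples can be reproduced with two small helper functions. This is a sketch of the calculations only; the function names are illustrative, and the time period is assumed to be in milliseconds, as the examples imply.

```python
from typing import Optional


def increase_rate_per_second(req_limit: int, time_period_ms: int) -> float:
    """Budget increase rate in requests per second."""
    return req_limit * 1000 / time_period_ms


def seconds_until_empty(req_limit: int, time_period_ms: int,
                        request_rate: float) -> Optional[float]:
    """Time for a full budget to reach zero; None if it never does."""
    drain = request_rate - increase_rate_per_second(req_limit, time_period_ms)
    return req_limit / drain if drain > 0 else None


def seconds_until_full(req_limit: int, time_period_ms: int,
                       request_rate: float) -> Optional[float]:
    """Time for an empty budget to reach its maximum; None if it never does."""
    gain = increase_rate_per_second(req_limit, time_period_ms) - request_rate
    return req_limit / gain if gain > 0 else None


print(seconds_until_empty(2000, 10000, 250))  # 40.0
print(seconds_until_empty(200, 1000, 250))    # 4.0
print(seconds_until_full(200, 1000, 180))     # 10.0
```

The first two calls have the same budget increase rate (200 requests per second) but different maximum budget values, which is why the longer time period takes ten times as long to start rejecting requests.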

Extending SLAs for Budget Services

By default, Services Gatekeeper uses plugin interface names and method names to control the budgets for services. You can fine-tune the rates, restrictions, and quotas set up in SLAs. To extend SLAs in Services Gatekeeper, you add optional elements and attributes to the XSD (XML schema definition).

The SLA shown in Example 8-1 contains a rate for the doStuff method of a plug-in interface called DummyPluginInterface. It uses the <scs> element to specify the interface, and a <methodRestriction> element to specify the service usage restrictions (rate and quota).

Example 8-1 SLA Extended for an Example plug-in

<?xml version="1.0" encoding="UTF-8"?>
<Sla applicationGroupID="default_app_group" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:noNamespaceSchemaLocation='app_sla_file.xsd'> 
   <serviceContract>
      <startDate>2005-07-22</startDate>
      <endDate>9999-12-31</endDate>
      <scs>com.bea.wlcp.wlng.plugin.DummyPluginInterface</scs>
      <contract>
        <methodRestrictions>
           <methodRestriction>
              <methodName>doStuff</methodName>
              <rate>
                <reqLimit>1</reqLimit> 
                <timePeriod>5000</timePeriod> 
              </rate>
              <quota>
                 <qtaLimit>1</qtaLimit> 
                 <days>1</days> 
                 <limitExceedOK>true</limitExceedOK>
              </quota> 
           </methodRestriction>
       </methodRestrictions> 
      </contract>
   </serviceContract> 
</Sla>

When you use the dynamic API framework, incoming HTTP requests for a service are processed using logic defined in a generic servlet set up for the service. You create an XML file in which you expose your communication services as APIs and, in each case, provide their URLs and service usage details in the SLA associated with each application group. Each <scs> element contains the URL to access an API. The <methodRestriction> and <methodAccess> elements in the <contract> section provide the details about setup of the API for the service.

The SLA shown in Example 8-2 defines a OneAPI SMS (REST) service for the dynamic API framework.

Example 8-2 Example Service SLA in XML for Dynamic API Framework

<?xml version="1.0" encoding="UTF-8"?>
<Sla applicationGroupID="default_app_group" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:noNamespaceSchemaLocation='app_sla_file.xsd'> 
   <serviceContract>
      <startDate>2005-07-22</startDate>
      <endDate>9999-12-31</endDate>
      <scs>/1/smsmessaging/outbound/(.*)/sendMessage</scs>  <!--New value-->
      <contract>
        <methodRestrictions>
           <methodRestriction>
              <methodName>POST</methodName>  <!--New value-->
              <rate>
                <reqLimit>5</reqLimit> 
                <timePeriod>1000</timePeriod> 
              </rate>
              <quota>
                 <qtaLimit>1</qtaLimit> 
                 <days>1</days> 
                 <limitExceedOK>true</limitExceedOK>
              </quota> 
           </methodRestriction>
       </methodRestrictions> 
       <methodAccess>
          <blacklistedMethod>
            <methodName>PUT</methodName>
          </blacklistedMethod>
          <blacklistedMethod>
            <methodName>HEAD</methodName>
          </blacklistedMethod>
       </methodAccess>
      </contract>
   </serviceContract> 
   <serviceTypeContract>     
        <serviceTypeName>/1/smsmessaging/</serviceTypeName>        
        <startDate>2012-07-31+08:00</startDate>
        <endDate>2018-12-31+08:00</endDate>
        <rate>
            <reqLimit>200</reqLimit>
            <timePeriod>1</timePeriod>
        </rate>
    </serviceTypeContract>
    <composedServiceContract>     
        <composedServiceName>ComposedService</composedServiceName>     
        <service>
           <serviceTypeName>/1/smsmessaging/</serviceTypeName>           
        </service>
        <service>
           <serviceTypeName>/1/mmsmessaging/</serviceTypeName>           
        </service>   
        <startDate>2012-07-31+08:00</startDate>
        <endDate>2018-12-31+08:00</endDate>
        <rate>
            <reqLimit>400</reqLimit>
            <timePeriod>1</timePeriod>
        </rate>
    </composedServiceContract>
</Sla>

As Example 8-2 shows, the <scs> element contains the request URL for the sendMessage request API of the RESTful service. The <methodRestriction> element specifies the details of the service quota and limits for the supported HTTP method, POST. In this example, the HTTP methods PUT and HEAD are not supported and are therefore blacklisted using <blacklistedMethod> elements under the <methodAccess> element.
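To check an SLA file before loading it, it can be useful to summarize which methods are rate-limited and which are blacklisted. The following sketch uses Python's standard XML parser on a trimmed-down copy of the service contract from Example 8-2. The summarize function is illustrative and is not a Services Gatekeeper tool, and the time period is assumed to be in milliseconds.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the <serviceContract> from Example 8-2.
SLA_XML = """<Sla applicationGroupID="default_app_group">
  <serviceContract>
    <scs>/1/smsmessaging/outbound/(.*)/sendMessage</scs>
    <contract>
      <methodRestrictions>
        <methodRestriction>
          <methodName>POST</methodName>
          <rate>
            <reqLimit>5</reqLimit>
            <timePeriod>1000</timePeriod>
          </rate>
        </methodRestriction>
      </methodRestrictions>
      <methodAccess>
        <blacklistedMethod><methodName>PUT</methodName></blacklistedMethod>
        <blacklistedMethod><methodName>HEAD</methodName></blacklistedMethod>
      </methodAccess>
    </contract>
  </serviceContract>
</Sla>"""


def summarize(sla_xml: str) -> list:
    """Return one dict per <serviceContract> with its rates and blacklist."""
    root = ET.fromstring(sla_xml)
    summary = []
    for contract in root.iter("serviceContract"):
        rates = {}
        for restriction in contract.iter("methodRestriction"):
            rate = restriction.find("rate")
            if rate is None:
                continue  # quota-only restriction; nothing to compute
            limit = int(rate.findtext("reqLimit").strip())
            period_ms = int(rate.findtext("timePeriod").strip())
            # Budget increase rate in requests per second.
            rates[restriction.findtext("methodName").strip()] = limit * 1000 / period_ms
        blocked = [b.findtext("methodName").strip()
                   for b in contract.iter("blacklistedMethod")]
        summary.append({"scs": contract.findtext("scs").strip(),
                        "rates_per_second": rates,
                        "blacklisted": blocked})
    return summary


for entry in summarize(SLA_XML):
    print(entry["scs"], entry["rates_per_second"], entry["blacklisted"])
```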

Configuring and Managing Budgets

Configure the following fields for the BudgetServiceMBean:

  • PersistentBudgetFlushInterval

  • PersistentBudgetTimeThreshold

  • AccuracyFactor

  • ConfigUpdateInterval

No management methods are available.

For information about using BudgetServiceMBean see "BudgetServiceMBean Reference".

BudgetServiceMBean Reference

Set field values and use methods from the Administration Console by selecting Container, then Services, followed by BudgetService. Alternatively, use a Java application. For information on the methods and fields of the supported MBeans, see the "All Classes" section of Services Gatekeeper OAM Java API Reference.

Adding a Datasource

A datasource is an abstraction that handles connections with the persistent store. Under normal operating conditions, Services Gatekeeper, including the Budget service and the WebLogic Server automatic migration framework, shares the common transactional (XA) datasource (wlng.datasource) that has been set up for Services Gatekeeper at large.

Under very heavy traffic, it is possible for the Budget singleton service to be deactivated on all servers. This can happen if the automatic migration mechanism that supports the service becomes starved for connections. In this case, a major severity alarm is raised: Alarm ID 111002, "Budget master unreachable". Note that datasource issues are not the only reason you might receive this alarm.

If you encounter this problem, you can set up a separate singleton datasource for the migration mechanism to ensure that the singleton service always has access to the persistent store. This datasource should be configured to use wlng.datasource. For more information on automatic migration of singleton services, see "Automatic Migration of User-Defined Singleton Services" in Oracle WebLogic Server Administering Clusters for Oracle WebLogic Server.

Also see the section about high-availability database leasing in that document. This is the mechanism underlying migration.

For information on setting up a separate datasource to support migration of singleton services, like the Budget service, see "Configuring JDBC Data Sources" in Oracle Fusion Middleware Configuring and Managing JDBC Data Sources for Oracle WebLogic Server.