Administering a BEA Tuxedo Application at Run Time


Tuning a BEA Tuxedo ATMI Application

This topic includes the following sections:

• When to Use MSSQ Sets
• How to Enable Load Balancing
• How to Measure Service Performance Time
• How to Assign Priorities to Interfaces or Services
• Bundling Services into Servers
• Enhancing Overall System Performance
• Determining Your System IPC Requirements
• Tuning IPC Parameters
• Measuring System Traffic

Note: For detailed information about tuning your applications in the BEA Tuxedo CORBA environment, refer to the Scaling, Distributing, and Tuning CORBA Applications guide.

 


When to Use MSSQ Sets

Note: Multiple Servers, Single Queue (MSSQ) sets are not supported in BEA Tuxedo CORBA servers.

The MSSQ scheme offers additional load balancing in BEA Tuxedo ATMI environments. One queue is served by several servers offering identical services at all times. If the server queue to which a request is sent is part of an MSSQ set, the message is dequeued by the first available server. Thus load balancing is provided at the individual queue level.

When a server is part of an MSSQ set, it must be configured with its own reply queue. When the server makes requests to other servers, the replies must be returned to the original requesting server; they must not be dequeued by other servers in the MSSQ set.

You can configure MSSQ sets to be dynamic so they automatically spawn and eliminate servers based upon a queue load.
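The following UBBCONFIG fragment is a minimal sketch of an MSSQ set; the server, group, and queue names are hypothetical, and the MIN/MAX values are illustrative only:

  *SERVERS
  # Three to six copies of one executable share a single request queue.
  acctsrv  SRVGRP=BANKG1 SRVID=10
           MIN=3 MAX=6       # allows servers to be spawned as the queue load grows
           RQADDR="ACCTQ"    # the same symbolic queue name forms one MSSQ set
           REPLYQ=Y          # each server gets its own reply queue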

The following lists specify when it is, and is not, beneficial to use MSSQ sets.

You Should Use MSSQ Sets If . . .

• You have between 2 and 12 servers.
• Buffer sizes are not large enough to exhaust a queue.
• All servers offer identical sets of services.
• Messages are relatively small.
• Optimization and consistency of service turnaround time are paramount.

You Should Not Use MSSQ Sets If . . .

• There are many servers. (A compromise is to use many MSSQ sets.)
• Buffer sizes are large enough to exhaust one queue.
• Each server offers different services.
• Large messages are being passed to the services, causing the queue to be exhausted. When a queue is exhausted, either nonblocking sends fail or blocking sends block.

The following two analogies illustrate when it is beneficial to use MSSQ sets:

• In a bank, customers wait in a single line for the first available teller. Any teller can handle any customer's request, so no customer is delayed behind a single slow transaction. This is how an MSSQ set works.
• In a supermarket, each cashier has a separate line. Some lines move faster than others, and a customer who picks a slow line waits longer even when another cashier is free. This is how individual server queues work.

 


How to Enable Load Balancing

To alleviate the performance degradation resulting from heavy system traffic, you may want to implement a load balancing algorithm on your entire application. With load balancing, a load factor is applied to each service within the system, and you can track the total load on every server. Every service request is sent to the qualified server that is least loaded.

To implement system-wide load balancing, complete the following procedure.

  1. Run your application for an extended period of time.
  2. Note the average amount of time it takes for each service to be performed.
  3. In the RESOURCES section of the configuration file, set LDBAL to Y.
  4. In the SERVICES section of the configuration file:
    • Assign a LOAD value of 50 (LOAD=50) to any service that takes approximately the average amount of time.
    • For any service that takes longer than the average amount of time, set LOAD>50; for any service that takes less, set LOAD<50.
Note: This algorithm, although effective, is expensive and should be used only when necessary, that is, only when a service is offered by servers that use more than one queue. Services offered by only one server, or by multiple servers, all of which belong to the same MSSQ (Multiple Server, Single Queue) set, do not need load balancing.
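As a sketch, the resulting configuration entries might look like the following; the service names and LOAD values are hypothetical:

  *RESOURCES
  LDBAL  Y               # enable system-wide load balancing

  *SERVICES
  FAST_LOOKUP  LOAD=20   # finishes well under the average time
  AVG_UPDATE   LOAD=50   # takes about the average time
  SLOW_REPORT  LOAD=90   # takes much longer than average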

 


How to Measure Service Performance Time

You can measure service performance time in either of two ways:

• By specifying the -r option of servopts(5) in the server's CLOPT string to log every service request, and then summarizing the log with the txrpt(1) command.
• By inserting calls to time() at the beginning and end of each service routine, as described in Measuring System Traffic.

 


How to Assign Priorities to Interfaces or Services

Assigning priorities enables you to exert significant control over the flow of data in an application, provide faster service to the most important requests, and provide slower service to the less important requests. You can also give priority to specific users—at all times or in specific circumstances.

You can assign priorities to BEA Tuxedo services in either of two ways:

• Statically, with the PRIO parameter in the SERVICES section of the configuration file.
• Dynamically, with the tpsprio() ATMI function, which sets the priority of the next request sent or forwarded by the calling process.

Example of Using Priorities

Server 1 offers Interfaces A, B, and C. Interfaces A and B have a priority of 50; Interface C, a priority of 70. A request for C is always dequeued before a request for A or B. Requests for A and B are dequeued equally with respect to one another. To prevent a message from waiting indefinitely on the queue, the system dequeues every tenth request in first-in, first-out (FIFO) order.
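A corresponding SERVICES entry might look like the following sketch (service names taken from the example above):

  *SERVICES
  A  PRIO=50
  B  PRIO=50
  C  PRIO=70   # dequeued ahead of A and B, except on every tenth (FIFO) dequeue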

Using the PRIO Parameter to Enhance Performance

The PRIO parameter determines the priority of an interface or a service on a server's queue. It should be used cautiously. Once priorities are assigned, some messages may take longer to be dequeued: if higher-priority requests (such as those for C in the previous example) keep arriving, lower-priority requests (for A and B) are dequeued only on every tenth, FIFO-ordered dequeue. This means reduced performance and potentially slow turnaround time for some services.

When you are deciding whether to use the PRIO parameter, keep the following implications in mind:

 


Bundling Services into Servers

The easiest way to package services into servers is to avoid packaging them at all. Unfortunately, if you do not package services, the number of servers, message queues, and semaphores rises beyond an acceptable level. Thus there is a trade-off between no bundling and too much bundling.

When to Bundle Services

We recommend that you bundle services if you have one of the situations or requirements described in the following list.

Do not put two or more services that call each other, that is, call-dependent services, in the same server. If you do so, the server issues a call to itself, causing a deadlock.
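The following C sketch illustrates the problem; SVC_A and SVC_B are hypothetical services advertised by the same single-threaded server:

  #include <atmi.h>
  #include <userlog.h>

  /* SVC_A and SVC_B are advertised by the SAME single-threaded server. */
  void SVC_A(TPSVCINFO *rqst)
  {
      long olen = 0;
      char *obuf = tpalloc("STRING", NULL, 16);

      /* Deadlock: this request lands on the server's own queue, but the
       * server cannot dequeue it until SVC_A returns, so the call blocks
       * until it times out. */
      if (tpcall("SVC_B", rqst->data, 0, &obuf, &olen, 0) == -1)
          userlog("tpcall to SVC_B failed: %s", tpstrerror(tperrno));

      tpreturn(TPSUCCESS, 0, obuf, olen, 0);
  }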

 


Enhancing Overall System Performance

The following performance enhancement controls can be applied to BEA Tuxedo release 8.0 or later.

Service and Interface Caching

BEA Tuxedo release 8.0 or later allows you to cache service and interface entries, and to use the cached copies of the service or interface without locking the bulletin board. This feature represents a significant performance improvement, especially in systems with large numbers of clients and only a few services.

The SICACHEENTRIESMAX option has been added to the MACHINES and SERVERS sections of the configuration file to allow you to define the maximum number of service cache entries that a process or server can hold.

Because caching may not be useful for every client or every application, the TMSICACHEENTRIESMAX environment variable has been added to control the cache size on a per-process basis. The default value of TMSICACHEENTRIESMAX is preconfigured so that no administrative changes are necessary when upgrading from previous releases. TMSICACHEENTRIESMAX also caps the number of cache entries a client holds, since it is not desirable for client processes to grow too large.
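As a sketch, the cache limits might be configured as follows; the machine and server names, and the limit values, are hypothetical:

  *MACHINES
  mach1  LMID=SITE1
         SICACHEENTRIESMAX=500    # per-process cache limit on this machine

  *SERVERS
  appsrv SRVGRP=GRP1 SRVID=1
         SICACHEENTRIESMAX=100    # tighter limit for this server

A client can likewise be limited by setting TMSICACHEENTRIESMAX in its environment before it starts.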

Service Caching Limitations

The following limitations apply to the caching feature:

Note: For more information about the SICACHEENTRIESMAX option, refer to the UBBCONFIG(5) and TM_MIB(5) sections in the File Formats, Data Descriptions, MIBs, and System Processes Reference.
Note: For more information about the TMSICACHEENTRIESMAX variable, refer to the tuxenv(5) section in the File Formats, Data Descriptions, MIBs, and System Processes Reference.

Removing Authorization and Auditing Security

In BEA Tuxedo release 7.1, the AAA (authentication, authorization, and auditing) security features were added so that implementations using the AAA plug-in functions would not need to base security on the BEA Tuxedo administrative option. As a result, the BEA Engine AAA security functions are always called in the main BEA Tuxedo 7.1 code path. Applications that do not use security, however, should not have to pay the overhead of these BEA Engine security calls.

For BEA Tuxedo release 8.0 or later, the NO_AA option has been added to the OPTIONS parameter in the RESOURCES section of the configuration file. The NO_AA option circumvents the calling of the authorization and auditing security functions. Because most applications need authentication, authentication itself cannot be turned off.
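A minimal sketch of the setting:

  *RESOURCES
  OPTIONS  NO_AA    # skip authorization and auditing plug-in calls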

If the NO_AA option is enabled, the following SECURITY parameters may be affected:

Note: For more information about the NO_AA option, refer to the UBBCONFIG(5) and TM_MIB(5) sections in the File Formats, Data Descriptions, MIBs, and System Processes Reference.

Using the Multithreaded Bridge

Because only one Bridge process is running per host machine in a multiple machine Tuxedo domain, all traffic from a host machine passes through a single Bridge process to all other host machines in the domain. The Bridge process supports both single-threaded and multithreaded execution capabilities. The availability of multithreaded Bridge processing improves the data throughput potential. To enable multithreaded Bridge processing, you can configure the BRTHREADS parameter in the MACHINES section of the UBBCONFIG file.

Setting BRTHREADS=Y configures the Bridge process for multithreaded execution. Setting BRTHREADS=N, or accepting the default (N), configures the Bridge process for single-threaded execution.

Configurations with BRTHREADS=Y on the local machine and BRTHREADS=N on the remote machine are allowed, but the throughput between the machines will not be greater than that for the single-threaded Bridge process.
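As a sketch (the machine names are hypothetical):

  *MACHINES
  mach1  LMID=SITE1
         BRTHREADS=Y   # multithreaded Bridge on this machine
  mach2  LMID=SITE2
         BRTHREADS=N   # single-threaded Bridge (the default)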

Other important considerations for using the BRTHREADS parameter include:

Note: In a Tuxedo multiple-machine domain, setting BRTHREADS=Y has no effect for a machine that is running an earlier version of Tuxedo.
Note: For more information about the multithreaded Bridge, see the BRTHREADS parameter in the MACHINES section of UBBCONFIG(5) in the File Formats, Data Descriptions, MIBs, and System Processes Reference.

Turning Off Multithreaded Processing

BEA Tuxedo has a generalized threading feature. Because of the generality of the architecture, all ATMI calls must invoke mutexing functions to protect sensitive state information, and the layering of the engine and the caching schemes used in the libraries cause additional mutexing. For applications that do not use threads, turning threading off can yield significant performance improvements without any changes to the application code.

To turn off multithreaded processing, use the TMNOTHREADS environment variable. Because it is an environment variable, individual processes can turn threading on and off without the need for a new API or flag.

When TMNOTHREADS=Y, the calls to the mutexing functions are avoided.
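For example, a single-threaded client could be started with threading disabled (Bourne shell syntax):

  TMNOTHREADS=Y
  export TMNOTHREADS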

Note: For more information about TMNOTHREADS, refer to the tuxenv(5) section in File Formats, Data Descriptions, MIBs, and System Processes Reference.

Turning Off XA Transactions

Although not all BEA Tuxedo applications use XA transactions, all processes pay the cost of transactional semantics by calling internal transactional verbs. To boost performance for applications that do not use XA transactions, the NO_XA flag has been added to the OPTIONS parameter in the RESOURCES section of the configuration file for BEA Tuxedo release 8.0 or later.

No XA transactions are allowed when the NO_XA flag is set. It is important to remember, though, that any attempt to configure TMS services in the GROUPS section will fail if the NO_XA option has been specified.
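A minimal sketch of the setting:

  *RESOURCES
  OPTIONS  NO_XA    # disallow XA transactions; TMS entries in *GROUPS will fail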

Note: For more information about the NO_XA option, refer to the UBBCONFIG(5) and TM_MIB(5) sections in the File Formats, Data Descriptions, MIBs, and System Processes Reference.

 


Determining Your System IPC Requirements

The IPC requirements for your system are determined by the values of several system parameters: MAXACCESSERS, MAXSERVERS, MAXSERVICES, MAXGTT, and the queue-related kernel parameters.

You can use the tmboot -c command to display the minimum IPC requirements of your configuration.
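For example:

  tmboot -c                # minimum IPC requirements of the booted configuration
  tmloadcf -c ubbconfig    # the same calculation from an ASCII configuration file

(Here ubbconfig stands for the name of your configuration file.)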

The following table describes these system parameters.

Table 8-1 Parameters for Tuning IPC Resources

MAXACCESSERS
Equals the number of semaphores. The number of message queues is almost equal to MAXACCESSERS + the number of servers with reply queues (the number of servers in an MSSQ set, minus the number of MSSQ sets).

MAXSERVERS, MAXSERVICES, and MAXGTT
While MAXSERVERS, MAXSERVICES, MAXGTT, and the overall size of the ROUTING, GROUP, and NETWORK sections affect the size of shared memory, an attempt to devise formulas that correlate these parameters can become complex. Instead, simply run tmboot -c or tmloadcf -c to calculate the minimum IPC resource requirements for your application.

Queue-related kernel parameters
These parameters need to be tuned to manage the flow of buffer traffic between clients and servers. The maximum total size (in bytes) of a queue must be large enough to handle the largest message in the application. A typical queue is not more than 75 to 85 percent full; using a smaller percentage of a queue is wasteful, and using a larger percentage causes message sends to block too frequently.
Set the maximum size for a message to handle the largest buffer that the application sends. The maximum queue length (the largest number of messages that are allowed to sit on a queue at once) must be adequate for the application's operations.
Simulate or run the application to measure the average fullness of a queue, or its average length. This process may require a good deal of trial and error; you may need to estimate values for your tunables before running the application, and then adjust them after running under performance analysis.
For a large system, analyze the effects of parameter settings on the size of the operating system kernel. If they are unacceptable, reduce the number of application processes or distribute the application across more machines to reduce MAXACCESSERS.

 


Tuning IPC Parameters

The application parameters described in the following sections enable you to enhance the efficiency of your system.

Setting the MAXACCESSERS, MAXSERVERS, MAXINTERFACES, and MAXSERVICES Parameters

The MAXACCESSERS, MAXSERVERS, MAXINTERFACES, and MAXSERVICES parameters increase semaphore and shared memory costs, so you should carefully weigh these costs against the expected benefits before using these parameters, and choose the values that best satisfy the needs of your system. You should take into account any increased resources your system may require for a potential migration. You should also allow for variation in the number of clients accessing the system simultaneously. Defaults may be appropriate for a generous allocation of IPC resources; however, it is prudent to set these parameters to the lowest appropriate values for the application.

Setting the MAXGTT, MAXBUFTYPE, and MAXBUFSTYPE Parameters

To determine whether the default is adequate for your application, multiply the number of clients in the system by the percentage of time they are committing a transaction. If the product is close to 100, you should increase the value of the MAXGTT parameter; for example, if 400 clients each spend 25 percent of their time committing transactions, the product is 100, so MAXGTT should be raised. Increasing MAXGTT enlarges the transaction table kept in the bulletin board and therefore consumes additional shared memory.

To limit the number of buffer types and subtypes allowed in the application, set the MAXBUFTYPE and MAXBUFSTYPE parameters, respectively. The current default for MAXBUFTYPE is 16. If you plan to create eight or more user-defined buffer types, you should set MAXBUFTYPE to a higher value. Otherwise, you do not need to specify this parameter; the default value is used.

The current default for MAXBUFSTYPE is 32. You may want to set this parameter to a higher value if you intend to use many different VIEW subtypes.
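A sketch of the corresponding entries (the values shown are hypothetical):

  *RESOURCES
  MAXGTT       150   # raised above the default for many concurrent transactions
  MAXBUFTYPE   20    # needed only with eight or more user-defined buffer types
  MAXBUFSTYPE  64    # raised to cover many VIEW subtypes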

Tuning with the SANITYSCAN, BLOCKTIME, BBLQUERY, and DBBLWAIT Parameters

If a system is running on slow processors (for example, due to heavy usage), you can increase the timing parameters: SANITYSCAN, BLOCKTIME, and individual transaction timeouts.

If networking is slow, you can increase the value of the BLOCKTIME, BBLQUERY, and DBBLWAIT parameters.
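As a sketch, such RESOURCES entries might look like the following; the values are hypothetical, and each of these parameters is expressed as a multiple of SCANUNIT:

  *RESOURCES
  SCANUNIT     10   # base scan unit, in seconds
  SANITYSCAN   12   # sanity scan every 120 seconds
  BLOCKTIME    6    # blocking calls time out after 60 seconds
  BBLQUERY     30   # DBBL checks its BBLs every 300 seconds
  DBBLWAIT     2    # DBBL waits up to 20 seconds for BBL replies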

Recommended Values for Tuning-related Parameters

The following list gives recommended values for the parameters available for tuning an application.

MAXACCESSERS, MAXSERVERS, MAXINTERFACES, and MAXSERVICES
Set the smallest satisfactory value because of the IPC cost. (Allow for extra clients.)

MAXGTT, MAXBUFTYPE, and MAXBUFSTYPE
Increase MAXGTT for many clients; set MAXGTT to 0 for nontransactional applications. Use MAXBUFTYPE only if you create eight or more user-defined buffer types. Increase the value of MAXBUFSTYPE if you use many different VIEW subtypes.

BLOCKTIME, TRANTIME, and SANITYSCAN
Increase the values if the system is slow.

BLOCKTIME, TRANTIME, BBLQUERY, and DBBLWAIT
Increase the values if networking is slow.

 


Measuring System Traffic

As on any road that supports a lot of traffic, bottlenecks can occur in your system. On a highway, cars can be counted with a cable strung across the road that increments a counter each time a car drives over it.

You can use a similar method to measure service traffic. For example, when a server is started (that is, when tpsvrinit() is invoked), you can initialize a global counter and record a starting time. Subsequently, each time a particular service is called, the counter is incremented. When the server is shut down (through the tpsvrdone() function), the final count and the ending time are recorded. This mechanism allows you to determine how busy a particular service is over a specified period of time.
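A minimal sketch of this mechanism follows; the service name and message details are hypothetical:

  #include <time.h>
  #include <atmi.h>
  #include <userlog.h>

  static long svc_count;      /* number of requests handled */
  static time_t start_time;   /* when counting began */

  int tpsvrinit(int argc, char **argv)
  {
      svc_count = 0;
      start_time = time(NULL);   /* record the starting time */
      return 0;
  }

  void WITHDRAW(TPSVCINFO *rqst)   /* hypothetical service */
  {
      svc_count++;                 /* one more request for this service */
      /* ... the actual service work goes here ... */
      tpreturn(TPSUCCESS, 0, rqst->data, 0L, 0);
  }

  void tpsvrdone(void)
  {
      userlog("WITHDRAW: %ld requests in %ld seconds",
              svc_count, (long)(time(NULL) - start_time));
  }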

In the BEA Tuxedo system, bottlenecks can originate from problematic data flow patterns. The quickest way to detect bottlenecks is to measure the amount of time required by relevant services from the client's point of view.

Example of Detecting a System Bottleneck

Client 1 requires 4 seconds to display its results. Calls to time() determine that the tpcall to service A is the culprit, with a 3.7-second delay. Monitoring service A at the top and bottom shows that the service itself takes only 0.5 seconds. This finding implies that a queue may be clogged, a situation that can be verified by running the pq command in tmadmin.

On the other hand, suppose service A takes 3.2 seconds. The individual parts of service A can be bracketed and measured. Perhaps service A issues a tpcall to service B, which requires 2.8 seconds. Knowing this, you should then be able to isolate queue time or message send blocking time. Once the relevant amount of time has been identified, the application can be retuned to handle the traffic.

Using time() in this way, you can bracket and measure any piece of the request path, from an individual tpcall to an entire client transaction.
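The following sketch shows a client-side bracket around a single call; the service name and buffer handling are hypothetical:

  #include <stdio.h>
  #include <time.h>
  #include <atmi.h>

  /* Measure the round-trip time of one request to service A. */
  void timed_call(char *ibuf, long ilen, char **obuf, long *olen)
  {
      time_t before = time(NULL);

      if (tpcall("A", ibuf, ilen, obuf, olen, 0) == -1)
          fprintf(stderr, "tpcall(A) failed: %s\n", tpstrerror(tperrno));

      printf("service A round trip: %ld seconds\n",
             (long)(time(NULL) - before));
  }

For finer resolution than whole seconds, a platform-specific timer such as gettimeofday() can be substituted for time().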

Detecting Bottlenecks on UNIX Platforms

The UNIX system sar(1) command provides valuable performance information that can be used to find system bottlenecks. The following list describes the sar(1) command options.

-u  Gather CPU utilization numbers, including the percentages of time during which the system runs in user mode, runs in system mode, remains idle with some process waiting for block I/O, and otherwise remains idle.

-b  Report buffer activity, including the number of data transfers per second between system buffers and disk (or other block devices).

-c  Report activity of system calls of all types, as well as specific system calls, such as fork(2) and exec(2).

-w  Monitor system swapping activity, including the number of transfers for swapins and swapouts.

-q  Report average queue lengths while queues are occupied, and the percentage of time they are occupied.

-m  Report message and system semaphore activities, including the number of primitives per second.

-p  Report paging activity, including the number of address translation page faults, page faults and protection errors, and valid pages reclaimed for free lists.

-r  Report the number of unused memory pages and disk blocks, including the average number of pages available to user processes and disk blocks available for process swapping.

Note: Some flavors of the UNIX system do not support the sar(1) command, but offer equivalent commands instead. BSD, for example, offers the iostat(1) command; Sun offers perfmeter(1).
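For example, the following commands (the interval and count values are arbitrary) sample overall CPU and queue activity:

  sar -u 5 20   # CPU utilization every 5 seconds, 20 samples
  sar -q 5 20   # run-queue lengths over the same interval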

Detecting Bottlenecks on Windows Platforms

On Windows platforms, you can use the Performance Monitor to collect system information and detect bottlenecks. To open the Performance Monitor, select the following options from the Start menu:

Start —> Settings —> Control Panel —> Administrative Tools —> Performance


