Monitoring Your Oracle Tuxedo Application

Ways to Monitor Your Application
As an administrator, you must ensure that once an application is up and running, it continues to meet the performance, availability, and security requirements set by your company. To perform this task, you need to monitor the resources (such as shared memory), activities (such as transactions), and potential problems (such as security breaches) in your configuration, and take any necessary corrective actions.
To help you meet this responsibility, the Oracle Tuxedo system provides several methods for monitoring system and application events, and dynamically reconfiguring your system to improve performance. The following facilities offer an excellent view of how your system is working:
These tools help make your application capable of responding quickly and efficiently to changing business needs or failure conditions. They also assist you in managing your application’s performance and security.
Figure 2‑1 shows the monitoring tools.
Figure 2‑1 Monitoring Tools
The Oracle Tuxedo system offers the following tools to monitor your application:
Oracle Administration Console—a Web-based graphical user interface you can use to observe the behavior of the application, and to dynamically configure its operation. You can display and change configuration information, determine the state of each component of the system, and obtain statistical information about items such as executed requests, and queued requests.
Command-line utilities—a set of commands (for example, tmboot(1), tmadmin(1), and tmshutdown(1)) you can use to activate, deactivate, configure, and manage your application.
EventBroker—a mechanism that informs administrators of system faults and exceptional happenings such as network failures. When an event is posted by clients or servers, the EventBroker matches the name of the posted event to a list of subscribers for that event, and takes appropriate action, determined by each subscription.
Log files—a set of files that make up a repository for error and warning messages, debugging messages, and informational messages helpful in tracking and resolving problems in the system.
MIB—an interface to a set of procedures for accessing and modifying information in the MIBs. Using the MIB, you can write programs that enable you to monitor your run-time application.
Run-time and User-level tracing facility—software that tracks the execution of an application, thus providing information that is helpful in resolving system problems.
See Also
tmshutdown(1) in the Oracle Tuxedo Command Reference
“Oracle Tuxedo Management Tools” on page 4‑1 in Introducing Oracle Tuxedo ATMI
System and Application Data That You Can Monitor
The Oracle Tuxedo system enables you to monitor system and application data.
Monitoring System Data
To help you monitor a running system, your Oracle Tuxedo system maintains parameter settings and generates statistics for the following system components:
You can access these components using the MIB or tmadmin. You can set up your system so that it can use the statistics in the bulletin board to make decisions and to modify system components dynamically, without your intervention. With proper configuration, your system can perform the following tasks (when bulletin board statistics indicate that they are required):
By monitoring the administrative data for your system, you can prevent and resolve problems that threaten the performance, availability, and security of your application.
Where the System Data Resides
To ensure that you have the information necessary to monitor your system, the Oracle Tuxedo system provides the following three data repositories:
Bulletin board—a segment of shared memory (on each machine in your network) to which your system writes statistics about the components and activities of your configuration
Log files—files to which your system writes messages
UBBCONFIG—a text file in which you define the parameters of your system and application
Monitoring Dynamic and Static Administrative Data
You can monitor two types of administrative data that are available on every running Oracle Tuxedo system: static and dynamic.
What Is Static Data?
Static data about your configuration consists of configuration settings that you assign when you first configure your system and application. These settings never change without intervention (either in real time or through a program you have provided). Examples include system-wide parameters (such as the number of machines used) and the amount of interprocess communication (IPC) resources (such as shared memory) allocated to your system on your local machine. Static data is kept in the UBBCONFIG file and in the bulletin board.
Checking Static Data
At times you may need to check static data about your configuration. For example, you may want to add a large number of machines without exceeding the maximum number of machines allowed in your configuration (or allowed in the machine tables of the bulletin board). You can look up the maximum number of machines allowed by checking the current values of the system-wide parameters for your configuration (one of which is MAXMACHINES).
You may be able to improve the performance of your application by tuning your system. To determine whether tuning is required, you need to check the amount of local IPC resources currently available.
What Is Dynamic Data?
Dynamic data about your configuration consists of information that changes in real time, that is, while an application is running. For example, the load (the number of requests sent to a server) and the state of various configuration components (such as servers) change frequently. Dynamic data is kept in the bulletin board.
Checking Dynamic Data
Dynamic configuration data is useful in resolving many administrative problems, as demonstrated by two examples.
In the first example, suppose your throughput is suffering and you want to know whether you have enough servers running to accommodate the number of clients currently connected. Check the number of running servers and connected clients, and the load on one or more servers. These numbers help you determine whether adding more servers will improve performance.
In the second example, suppose you receive multiple complaints about slow response from users when making particular requests of your application. By checking load statistics, you can determine whether increasing the value of the BLOCKTIME parameter would improve response time.
Common Startup and Shutdown Problems
When evaluating whether your Oracle Tuxedo system is operating normally, you might want to consider the following list of common startup and shutdown problems, and monitor your system periodically.
Common Startup Problems
IPCKEY is already in use
TLOG file is not created
Common Shutdown Problems
Selecting Appropriate Monitoring Tools
To monitor a running application, you need to keep track of the dynamic aspects of your configuration and sometimes check the static data. In other words, you need to be able to watch the bulletin board on an ongoing basis and consult the UBBCONFIG file when necessary. The method you choose depends on the following factors:
Which information you want to view: If you decide to monitor your application by examining the RESOURCES section of the UBBCONFIG file through the tmadmin command, you will have access to only the current values.
Table 2‑1 describes how to use each monitoring method.
Log files (for example, ULOG, TLOG)
Viewing the ULOG with any text editor; checking the ULOG for tlisten messages; and converting the TLOG (a binary file) to a text file by running the tmadmin dumptlog command.
Using the Oracle Administration Console to Monitor Your Application
The Oracle Administration Console is a graphical user interface to the MIB that enables you to tune and modify your application. It is accessed through the World Wide Web and used through a Web browser. Any administrator with a supported browser can monitor an Oracle Tuxedo application.
Using the Toolbar to Monitor Activities
The toolbar is a row of 12 buttons that allow you to run tools for frequently performed administrative and monitoring functions. All buttons are labeled with both icons and names. The following buttons are available for monitoring:
Logfile—displays the ULOG file from a particular machine in the active domain.
Event Tool—helps you monitor system events. When you click the Event Tool button, a window displays four options: subscribe—to request notification of specified system events, unsubscribe—to reject further notification of specified system events, snapshot—to create a record of the data currently held by the Event Tool, and select format—to choose parameters for the information being collected by the Event Tool.
Stats—to display a graphical representation of Oracle Tuxedo system activity.
Search—to look for a particular object class or object in the Tree.
Using Command-line Utilities to Monitor Your Application
To monitor your application through the command-line interface, use the tmadmin(1) or txrpt(1) command.
Inspecting Your Configuration Using tmadmin
The tmadmin command is an interpreter for 53 commands that enable you to view and modify a bulletin board and its associated entities. Using the tmadmin commands, you can monitor statistical information in the system such as the state of services, the number of requests executed, the number of queued requests, and so on.
Using the tmadmin commands, you can also dynamically modify your Oracle Tuxedo system. You can, for example, perform the following types of changes while your system is running:
Change the AUTOTRAN timeout value
Whenever you start a tmadmin session, you can choose the following operating modes for that session: the default operating mode, read-only mode, or configuration mode:
In default operating mode, you can view and change bulletin board data during a tmadmin session, if you have administrator privileges (that is, if your effective UID and GID are those of the administrator).
In read-only mode, you can view the data in the bulletin board, but you cannot make any changes. The advantage of working in read-only mode is that your administrator process is not tied up by tmadmin; the tmadmin process attaches to the bulletin board as a client, leaving your administrator slot available for other work.
In configuration mode, you can view the data in the bulletin board and, if you are the Oracle Tuxedo application administrator, you can make changes. You can start a tmadmin session in configuration mode on any machine, including an inactive machine. On most inactive machines, configuration mode is required in order to run tmadmin. (The only inactive machine on which you can start a tmadmin session without requesting configuration mode is the MASTER machine.)
Generating Reports on Servers and Services Using txrpt
The txrpt command analyzes the standard error output of an Oracle Tuxedo server and provides a summary of service processing time within the server. The report shows the number of times each service was dispatched and the average amount of time it took for each service to process a request during the specified period. txrpt takes its input from the standard input or from a standard error file redirected as input. To create standard error files, have your servers invoked with the -r option from the servopts(5) selection; you can name the file by specifying it with the -e servopts option. Multiple files can be concatenated into a single input stream for txrpt.
Over time, information about service X and server Y (on which service X resides) is accumulated in a file. txrpt processes the file and provides you with a report about the service access and timing characteristics of the server.
How a tmadmin Session Works
The tmadmin command is an interpreter for 53 commands that enable you to view and modify a bulletin board and its associated entities. Figure 2‑2 shows you how a typical tmadmin session works.
Figure 2‑2 Typical tmadmin Session
Monitoring Your System Using tmadmin Commands
Following is a list of run-time system functions that you can monitor with tmadmin commands:
See Also
tmadmin(1) in the Oracle Tuxedo Command Reference
Using EventBroker to Monitor Your Application
The Oracle Tuxedo EventBroker monitors a running application for events (for example, a state change in a MIB object, such as the transition of a client from active to inactive). When the EventBroker detects an event, it reports or posts the event, and then notifies relevant subscribers that the event has occurred. You can be informed automatically when events occur in the MIB by receiving FML data buffers representing MIB objects. To post the event and report it to subscribers, the EventBroker uses the tppost(3c) function. Both administrators and application processes can subscribe to events.
The EventBroker recognizes over 100 meaningful state transitions to a MIB object as system events. A posting for a system event includes the current MIB representation of the object on which the event occurred, and some event-specific fields that identify the event that occurred. For example, if a machine is partitioned, an event is posted with the following:
The name of the affected machine, as specified in the T_MACHINE class, with all the attributes of that machine
To use the EventBroker, you simply subscribe to system events.
Using Log Files to Monitor Activity
To help you identify error conditions quickly and accurately, the Oracle Tuxedo system provides the following log files:
Transaction log (TLOG)—a binary file that is not normally read by you (the administrator), but that is used by the Transaction Manager Server (TMS). A TLOG is created only on machines involved in Oracle Tuxedo global transactions.
User log (ULOG)—a log of messages generated by the Oracle Tuxedo system while your application is running. The ULOGMILLISEC environment variable causes ULOG messages to be time stamped in milliseconds instead of seconds. The ULOGRTNSIZE environment variable specifies the file size used for ULOG file rotation. For more information on ULOGMILLISEC and ULOGRTNSIZE, see userlog(3c) in the Oracle Tuxedo Command Reference.
These logs are maintained and updated constantly while your application is running.
What Is the Transaction Log (TLOG)?
The transaction log (TLOG) keeps track of global transactions during the commit phase. At the end of the first phase of a 2-phase commit protocol, the participants in a global transaction issue a reply to the question of whether to commit or roll back the transaction. This reply is recorded in the TLOG.
The TLOG file is used only by the Transaction Manager Server (TMS) that coordinates global transactions. It is not read by the administrator. The location and size of the TLOG are specified by four parameters that you set in the MACHINES section of the UBBCONFIG file.
You must create a TLOG on each machine that participates in global transactions.
What Is the User Log (ULOG)?
The user log (ULOG) is a file to which all messages generated by the Oracle Tuxedo system—error messages, warning messages, information messages, and debugging messages—are written. Application clients and servers can also write to the user log. A new log is created every day and there can be a different log on each machine. However, a ULOG can be shared by multiple machines when a remote file system is being used.
The ULOG provides an administrator with a record of system events from which the causes of most Oracle Tuxedo system and application failures can be determined. You can view the ULOG, a text file, with any text editor. The ULOG also contains messages generated by the tlisten process. The tlisten process provides remote service connections for other machines in an application. Each machine, including the master machine, should have a tlisten process running on it.
Detecting Errors Using Logs
The Oracle Tuxedo log files can help you detect failures in both your application and your system by:
Analyzing the Transaction Log (TLOG)
The TLOG is a binary file that contains only messages about global transactions that are in the process of being committed. To view the TLOG, you must first convert it to text format so that it is readable. The Oracle Tuxedo system provides two tmadmin operations to do this:
dumptlog (dl) downloads (or dumps) the TLOG (a binary file) to a text file.
loadtlog uploads (or loads) a text version of the TLOG into an existing TLOG (a binary file).
The dumptlog and loadtlog commands are also useful when you need to move the TLOG between machines as part of a server group migration or machine migration.
Detecting Transaction Errors
Use the MIB T_TRANSACTION class to obtain the runtime transaction attributes within the system. The tmadmin command printtrans (pt) can also be used to display this information. Information about each group in the transaction is printed only if tmadmin is running in verbose mode as set by a previous verbose (v) command.
Any serious errors during the transaction commit process, such as a failure while writing the TLOG, are written to the user log (ULOG).
Analyzing the User Log (ULOG)
On each active machine in an application, the Oracle Tuxedo system maintains a log file that contains Oracle Tuxedo system error messages, warning messages, debugging messages, or other helpful information. This file is called the user log or ULOG. The ULOG simplifies the job of finding errors returned by the Oracle Tuxedo ATMI, and provides a central repository in which the Oracle Tuxedo system and applications can store error information.
You can use the information in the ULOG to identify the cause of system or application failures. Multiple messages about a given problem can be placed in the user log. Generally, earlier messages provide more useful diagnostic information than later messages.
ULOG Message Example
In the example in Listing 2‑1, message 358 from the LIBTUX_CAT catalog identifies the cause of the trouble reported in subsequent messages, namely, that there are not enough UNIX system semaphores to boot the application.
Listing 2‑1 Sample ULOG Messages
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:262: std main starting
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:358: reached UNIX limit on semaphore ids
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:248: fatal: system init function ...
151550.gumby!BBL.28040.1.0: CMDTUX_CAT:825: Process BBL at SITE1 failed ...
151550.gumby!BBL.28040.1.0: WARNING: No BBL available on site SITE1.
Will not attempt to boot server processes on that site.
Analyzing tlisten Messages in the ULOG
Part of the ULOG records error messages from the tlisten process. You can view tlisten messages using any text editor. Each machine, including the MASTER machine, runs a separate tlisten process. Though separate tlisten logs are maintained in the ULOG on each machine, they can be shared across remote file systems.
The ULOG records tlisten process failures. tlisten is used, during the boot process, by tmboot and, while an application is running, by tmadmin. tlisten messages are created as soon as the tlisten process is booted. Whenever a tlisten process failure occurs, a message is recorded in the ULOG.
Application administrators are responsible for analyzing the tlisten messages in the ULOG, but programmers may also find it useful to check these messages.
The Oracle Tuxedo System Messages CMDTUX Catalog contains the following information about tlisten messages:
tlisten Message Example
Consider the following example of a tlisten message in the ULOG:
121449.gumby!simpserv.27190.1.0: LIBTUX_CAT:262: std main starting
A ULOG message consists of a tag and text. The tag consists of the following:
Placeholders are printed in the thread_ID and context_ID field of entries for single-threaded and single-contexted applications. (Whether an application is multithreaded is not apparent until more than one thread is used.)
The text consists of the following:
You can find this message in the Oracle Tuxedo System Messages LIBTUX Catalog.
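The tag and text layout can be illustrated with a small parser. The following is a minimal sketch, not part of the Oracle Tuxedo system: the field layout (time, machine name, process name, process ID, thread_ID, context_ID, then the message text) is inferred from the sample messages in this topic, and the names UlogEntry and parse_ulog_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class UlogEntry(NamedTuple):
    """One parsed ULOG line (hypothetical structure for illustration)."""
    time: str                  # time of day as it appears in the tag, e.g. "121449"
    host: str                  # machine name
    process: str               # name of the process that logged the message
    pid: int                   # process ID
    thread_id: str             # placeholder for single-threaded applications
    context_id: str            # placeholder for single-contexted applications
    catalog: Optional[str]     # e.g. "LIBTUX_CAT", or None if no catalog prefix
    msg_number: Optional[int]  # e.g. 262, or None if no catalog prefix
    text: str                  # remaining message text


# Tag layout assumed from the samples: time.host!process.pid.thread.context:
_TAG = re.compile(
    r"^(?P<time>\d{6})\.(?P<host>[^!]+)!(?P<process>.+?)"
    r"\.(?P<pid>\d+)\.(?P<thread>-?\d+)\.(?P<context>-?\d+): (?P<body>.*)$"
)
# Optional "CATALOG:number: " prefix at the start of the message text.
_CATALOG = re.compile(r"^(?P<catalog>[A-Z0-9_]+):(?P<num>\d+): (?P<text>.*)$")


def parse_ulog_line(line: str) -> Optional[UlogEntry]:
    """Return a UlogEntry, or None for lines without a tag (continuations)."""
    tag = _TAG.match(line.rstrip("\n"))
    if tag is None:
        return None
    body = tag.group("body")
    cat = _CATALOG.match(body)
    return UlogEntry(
        time=tag.group("time"),
        host=tag.group("host"),
        process=tag.group("process"),
        pid=int(tag.group("pid")),
        thread_id=tag.group("thread"),
        context_id=tag.group("context"),
        catalog=cat.group("catalog") if cat else None,
        msg_number=int(cat.group("num")) if cat else None,
        text=cat.group("text") if cat else body,
    )
```

Applied to the sample message above, such a parser would separate the tag (121449, gumby, simpserv, 27190, 1, 0) from the text (catalog LIBTUX_CAT, message number 262, and the message itself). Lines that carry no tag, such as wrapped continuation lines, are returned as None so that a caller can attach them to the preceding entry.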
See Also
“Using Transactions” in Tutorials for Developing Oracle Tuxedo ATMI Applications
Estimating Service Workload Using the Application Service Log
An Oracle Tuxedo application server can generate a log of the service requests it handles. The log is written to the server’s standard error (stderr), which is the output that txrpt analyzes. Each record contains a service name, start time, and end time.
You can request such a log when a server is activated. The txrpt facility produces a summary of the time spent by the server, thus giving you a way to analyze the log output. Using this data, you can estimate the relative workload generated by each service, which will help you set workload parameters appropriately for the corresponding services in the MIB.
Using the MIB to Monitor Your Application
There are essentially two operations you can perform using the MIB: you can get information from the MIB (a get operation) or you can update information in the MIB (a set operation) at any time using a set of ATMI functions (for example, tpalloc(3c), tprealloc(3c), tpcall(3c), tpacall(3c), tpgetrply(3c), tpenqueue(3c), and tpdequeue(3c)).
When you query the MIB with a get operation, the MIB responds to your query with a number of matches, and indicates how many more objects match your request. The MIB returns a handle (that is, the cursor) that you can use to get the remaining objects. The operation you use to get the next set of objects is called getnext. This third operation is needed when the reply to a query spans multiple buffers.
Limiting Your MIB Queries
When you query the MIB, which is a virtual database, you are selecting a set of records from the database table. You can control the size of the database table in two ways: by controlling the number of objects about which you want information, or by controlling the amount of information about each object. Using key fields and filters, you can limit the scope of your request to data that is meaningful for your needs. The more limits you specify, the less information is requested from the application, and the faster the data is provided to you.
Querying Global and Local Data
Data in the MIB is stored in a number of different places. Some data is replicated on more than one machine in a distributed application. Other data is not replicated, but is local to particular machines based on the nature of the data or the object represented.
What Is Global Data?
Global data is information about application components such as servers that is replicated on every machine in an application. Most of the data about a server, for example, such as information about its configuration and state, is replicated globally throughout an application, specifically in every bulletin board. An Oracle Tuxedo application can access this information from anywhere.
For example, from any machine in an application called Customer Orders, the administrator can find out that server B6 belongs to Group 1, runs on machine CustOrdA, and is active.
What Is Local Data?
Other information is not replicated globally, but is local to an entity, such as statistics for a server. An example of a local attribute is TA_TOTREQC, which defines the number of times services have been processed in a specified server. This statistic is stored with the server on its host machine. When the server accepts and processes a service request, the counter is incremented. Because this kind of information is managed locally, replicating it would inhibit your system’s performance.
There are also classes in the MIB that are exclusively local, such as clients. When a client logs in, the Oracle Tuxedo system creates an entry for it in the bulletin board, and records all tracking information about the client in that entry. The MIB can determine the state of the client at any time by checking this entry.
Using tpadmcall to Access Information
The Oracle Tuxedo system provides a programming interface that offers direct access to the MIB while your application is not running. This interface, the tpadmcall function, gives the application direct access to the data upon which the MIB is based. tpadmcall allows you access to a subset of information that is local to your process.
Use tpadmcall when you need to query the system or make administrative changes while your system is not running. tpadmcall queries the TUXCONFIG file on behalf of your request. Data buffers that you put in, and data buffers that you receive (containing your queries and the replies to them) are exactly the same.
See Also
MIB(5) in File Formats, Data Descriptions, MIBs, and System Processes Reference
Querying and Updating the MIB with ud32
ud32 is a client program delivered with the Oracle Tuxedo system that reads input consisting of text representation of FML buffers. You can use ud32 for ad hoc queries and updates to the MIB. It creates an FML32 buffer, makes a service call with the buffer, receives a reply (also in an FML32 buffer) from the service call, and displays the results on screen or in a file in text format.
ud32 builds an FML32-type buffer with the FML fields and values that you represent in text format, makes a service call to the identified service in the buffer, and waits for the reply. The reply then comes back in FML32 format as a report. Now, because the MIB is FML32-based, ud32 becomes the scripting tool for the MIB.
For example, suppose you write a small file that contains the following text:
service name=.tmib and ta_operation=get, TACLASSES=T_SERVER
When you supply this file as input to ud32, you receive an FML output buffer listing all the data in the system about the servers.
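The request described above can also be generated from a script. The following is a minimal sketch under stated assumptions: it assumes that ud32 input is one field per line, with the field name and value separated by a tab, that the SRVCNM field names the service to call, that the MIB class is selected with the TA_CLASS field, and that a blank line ends the buffer. Check these details against ud(1) and MIB(5) before relying on them. The function name ud32_request is hypothetical.

```python
def ud32_request(service: str, fields: dict) -> str:
    """Build the text form of one FML32 request buffer for ud32.

    Assumed layout: "NAME<tab>VALUE" per line, SRVCNM first to name the
    service, and a trailing blank line to terminate the buffer.
    """
    lines = [f"SRVCNM\t{service}"]
    lines += [f"{name}\t{value}" for name, value in fields.items()]
    return "\n".join(lines) + "\n\n"


# A get request for every T_SERVER object, addressed to the .TMIB service.
server_query = ud32_request(".TMIB", {
    "TA_OPERATION": "GET",
    "TA_CLASS": "T_SERVER",
})
```

The resulting text can be written to a file and redirected to ud32 as standard input. Adding key fields to the dictionary narrows the query, in line with the advice on limiting MIB queries.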
Using the Run-time and User-level Tracing Utility
The Oracle Tuxedo system provides a run-time and user-level tracing facility that enables you to track the execution of distributed business applications. The system has a set of built-in trace points that mark calls to functions in different categories, such as ATMI functions issued by the application or XA functions issued by the Oracle Tuxedo system to an X/Open compliant resource manager.
To enable tracing, you must specify a tracing expression that contains a category, a filtering expression, and an action. The category indicates the type of function (such as ATMI) to be traced. The filtering expression specifies which particular functions trigger an action. The action indicates the response to the specified functions by the Oracle Tuxedo system.
The system may, for example, write a record in the ULOG, execute a system command, or terminate a trace process. A client process can also propagate the tracing facility with its requests. This capability is called dyeing; the trace dye colors all services that are called by the client.
You can specify a tracing expression in the following ways.
Setting the TMTRACE run-time environment variable
For a simple tracing expression, define TMTRACE=on in the environment of the client. This expression enables tracing of ATMI functions on the client and on any server that performs a service on behalf of that client. The trace records are written to the ULOG file.
You can also specify a tracing expression in the environment of a server using the ulog or utrace tmtrace(5) receivers. For example, you might enter the following:
Run-time Tracing Expression: TMTRACE=atmi:/tpservice/ulog. If you export this setting within a server environment, a record with general run-time trace information is created in the ULOG file for all service requests performed on that server.
User-Level Expression: TMTRACE=atmi:utrace. Specifying the utrace receiver automatically calls the user-defined tputrace(3c). If you export this setting within a server environment, a record with trace information and output location defined by the user is created for the ATMI functions running on that server.
You can activate or deactivate the tracing option using the changetrace command of tmadmin. This command enables you to overwrite the tracing expression on active client or server processes. Administrators can enable global tracing for all clients and servers, or for a particular machine, group, or server.
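When a client or server is launched from a script, the tracing expression can be placed in the environment of that one process rather than exported globally. The following is a minimal sketch: the helper name env_with_trace and the client program named in the comment are hypothetical, and the only tracing expressions used are the ones shown in this topic.

```python
import os
from typing import Mapping, Optional


def env_with_trace(expression: str,
                   base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of an environment with TMTRACE set to `expression`.

    The caller's own environment is left untouched, so tracing applies
    only to processes started with the returned mapping.
    """
    env = dict(os.environ if base is None else base)
    env["TMTRACE"] = expression
    return env


# Simple tracing of ATMI functions for one client run, for example:
#   subprocess.run(["./myclient"], env=env_with_trace("on"))
# where ./myclient is a hypothetical client program.
server_env = env_with_trace("atmi:/tpservice/ulog", base={"PATH": "/usr/bin"})
```

Keeping the setting local to one process avoids turning on tracing for every program started from the same shell.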
See Also
tmtrace(5) in File Formats, Data Descriptions, MIBs, and System Processes Reference
userlog(3c) and tputrace(3c) in Oracle Tuxedo ATMI C Function Reference
Managing Errors Using the DBBL and BBLs
The Oracle Tuxedo system uses the following two administrative servers to distribute the information on the bulletin board to all active machines in the application:
DBBL—the Distinguished Bulletin Board Liaison server propagates global changes to the MIB and maintains the static part of the MIB. Specifically, the DBBL:
Resides (only one DBBL per application) on the MASTER machine and provides periodic status requests to all BBLs
BBL—the Bulletin Board Liaison server maintains the bulletin board on its host machine, coordinating changes to the local MIB, and verifying the integrity of application programs active on its machine. Specifically, the BBL:
Figure 2‑3 shows the diagnosis and repair using the DBBL and BBLs.
Figure 2‑3 Diagnosis and Repair Using the DBBL and BBLs
Both servers have a role in managing faults. The DBBL coordinates the state of other active machines in the application. Each BBL communicates state changes in the MIB, and sometimes sends a message to the DBBL indicating all is OK on its host machine.
The Oracle Tuxedo run-time system records events, along with system errors, warnings, and tracing events, in the user log (ULOG). Programmers can use the ULOG to debug their applications or notify administrators of special conditions or states found (for example, an authorization failure).
Using ATMI to Handle System and Application Errors
Using ATMI, a programmer controls some of the more global aspects of communications. ATMI provides functions for handling both application and system-related errors. When a service routine encounters an application error, such as an invalid account number, the client knows the service performed its task but could not fulfill its request because of an application error.
With a system failure, such as a server crashing while performing a request, the client knows the service routine did not perform its task because of an underlying system error. The Oracle Tuxedo system notifies programs of system errors that occur as it monitors the application’s behavior and its own behavior.
Using Configurable Timeout Mechanisms
At times, a service may get stuck in an infinite loop while processing a request. The client waits, but no reply is forthcoming. To protect a client from endless waiting, the Oracle Tuxedo system has two types of configurable timeout mechanisms: blocking timeouts and transaction timeouts. For more information about these timeout mechanisms, refer to Specifying Domains Transaction and Blocking Timeouts in Using the Oracle Tuxedo Domains Component.
A blocking timeout is a mechanism that ensures a blocked program waits no longer than the specified timeout value for something to occur. Once a timeout is detected, the waiting program is alerted with a system error informing it that a blocking timeout has occurred. The blocking timeout defines the duration of service requests, or how long the application is willing to wait for a reply to a service request. The timeout value is a global value defined in the BLOCKTIME field of the RESOURCES section of the TUXCONFIG file.
A transaction timeout is another type of timeout that can occur because active transactions tend to be resource-intensive. A transaction timeout defines the duration of a transaction, which may involve several service requests. The timeout value is defined when the transaction is started (with tpbegin(3c)). Transaction timeouts are useful when maximizing resources. For example, if database locks are held while a transaction progresses, an application programmer may want to limit the amount of time that the application’s transaction resources are held up. A transaction timeout always overrides a blocking timeout.
There are two UBBCONFIG file transaction timeout parameters:
TRANTIME, which is specified in the SERVICES section of the UBBCONFIG file and controls the timeout value for a specific AUTOTRAN service.
MAXTRANTIME, which is specified in the RESOURCES section of the UBBCONFIG file and lets the administrator place an upper bound on the timeout value of a transaction started via tpbegin(3c) or via an AUTOTRAN service invocation.
For more information about these transaction timeout parameters, refer to UBBCONFIG(5) in File Formats, Data Descriptions, MIBs, and System Processes Reference.
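The interaction between a requested timeout and MAXTRANTIME can be sketched as a small C function. This is an illustration of the capping rule described in UBBCONFIG(5), not a Tuxedo API: the function name effective_tran_timeout is hypothetical, and the sketch assumes the requested timeout (from tpbegin(3c) or TRANTIME) is greater than zero and that a MAXTRANTIME of 0 means no limit is in effect.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of ATMI) showing how MAXTRANTIME
 * bounds a transaction timeout.
 *
 * requested:   timeout in seconds passed to tpbegin(3c) or set via
 *              TRANTIME; this sketch assumes it is greater than 0.
 * maxtrantime: MAXTRANTIME from the RESOURCES section; 0 is assumed
 *              to mean that no upper bound is in effect.
 */
long effective_tran_timeout(long requested, long maxtrantime)
{
    assert(requested > 0);
    assert(maxtrantime >= 0);

    if (maxtrantime == 0) {
        return requested;   /* no administrative limit configured */
    }
    /* The requested value is reduced to MAXTRANTIME if it exceeds it. */
    return requested < maxtrantime ? requested : maxtrantime;
}
```

For example, under these assumptions a transaction started with a 120-second timeout in an application whose MAXTRANTIME is 30 would time out after 30 seconds, while a 20-second timeout would be left unchanged.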
Configuring Redundant Servers to Handle Failures
You can handle some failure situations by configuring an application with redundant servers and the automatic restart capability. Redundant servers provide high availability, and can be used to handle large amounts of work, server failures, or machine failures. The Oracle Tuxedo system continually checks the status of active servers, and when it detects the failure of a restartable server, the system automatically creates a new instance of that server.
By configuring servers with the automatic restart property, you can handle individual server failures. You can also specify the number of restarts that the system will provide. This capability can prevent a recurring application error by limiting the number of times a server is restarted.
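A minimal sketch of such a configuration in the SERVERS section of a UBBCONFIG file might look like the following. The server name simpserv, the group name GROUP1, and all numeric values are illustrative assumptions. RESTART, MAXGEN, GRACE, MIN, and MAX are the parameters documented in UBBCONFIG(5).

```
*SERVERS
# Illustrative values: restartable servers, at most MAXGEN-1 (here 4)
# restarts within GRACE (here 3600) seconds.
DEFAULT:    RESTART=Y MAXGEN=5 GRACE=3600
# Boot two redundant instances of a hypothetical server, allowing up to four.
simpserv    SRVGRP=GROUP1 SRVID=10 MIN=2 MAX=4
```

In this sketch, MIN=2 provides redundancy at boot time, and the MAXGEN and GRACE pair bounds how often a repeatedly failing server is restarted.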
The Oracle Tuxedo system frequently checks the availability of each active machine. A machine is marked as partitioned when it cannot be reached by the system. If this occurs, a system event is generated. A partition can occur due to a network failure, machine failure, or severe performance degradation.
Monitoring Multithreaded and Multicontexted Applications
You can get MIB statistical reports for various aspects of your multithreaded and/or multicontexted application by running the tmadmin(1) command interpreter. Here are a few examples of the information you can request for a multithreaded application:
Application programmers should keep in mind the possibility that individual threads within a process may die. If one thread dies and a signal is issued, the whole process to which the thread belongs usually dies, and that death is detected by the BBL.
If a thread dies as the result of an erroneous call to a thread exit function, however, no signal is generated. If this type of death occurs before the thread calls tpterm(), then the BBL cannot detect the death and does not deallocate the registry table slot for the context associated with the dead thread. (It would not be proper for the BBL to deallocate this registry table slot even if it could detect the death of the thread because, in some application models, another thread might subsequently choose to associate itself with that context.)
There is no solution for this limitation, so programmers should keep it in mind and design their applications accordingly.
How to Retrieve Data About a Multithreaded/Multicontexted Application Using the MIB
You can obtain information about a multithreaded or multicontexted application by:
Issuing selected tmadmin commands
Information is available in the following locations:
The T_SERVERCTXT class of the TM_MIB provides multiple instances of 14 fields if multiple server dispatch threads are active simultaneously. Specifically, the T_SERVERCTXT class includes an instance of each of the following fields for each active server dispatch thread:
TA_CONTEXTID (key field)
TA_SRVGRP (key field)
TA_SRVID (key field)
For example, if 12 server dispatch threads are active simultaneously, then the T_SERVERCTXT class of the MIB for this application will include 12 occurrences of the TA_CONTEXTID field, 12 occurrences of the TA_SRVGRP field, and so on.
When a T_SERVER class field would hold different values for different contexts of a multicontexted server, a “dummy” value is specified in the T_SERVER class field, and the corresponding T_SERVERCTXT field contains the actual value for each context.
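One possible way to retrieve T_SERVERCTXT data is to send a GET request to the .TMIB service with the ud32 client. The input below is a sketch under several assumptions: the group name GROUP1 and server ID 10 are hypothetical, fields are separated from their values by a tab, and the environment is assumed to have FLDTBLDIR32 and FIELDTBLS32 set so that the Usysfl32 and tpadm field tables are found.

```
SRVCNM	.TMIB
TA_OPERATION	GET
TA_CLASS	T_SERVERCTXT
TA_SRVGRP	GROUP1
TA_SRVID	10

```

Assuming this input is saved in a file and redirected to ud32 -C tpsysadm, the reply would be expected to contain one occurrence of each T_SERVERCTXT field per active context of the selected server. The trailing blank line ends the buffer.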
See Also
tmadmin(1) in the Oracle Tuxedo Command Reference
TM_MIB(5) in the File Formats, Data Descriptions, MIBs, and System Processes Reference

Copyright © 1994, 2017, Oracle and/or its affiliates. All rights reserved.