Oracle Tuxedo System and Applications Monitor Plus (TSAM Plus) is an Oracle Tuxedo add-on product. Tuxedo is widely used by enterprises that develop and run mission-critical applications. It acts as the infrastructure layer in distributed computing environments. The complexity of Tuxedo and of the applications running on top of it makes performance measurement difficult.
Oracle TSAM Plus monitors the major performance-sensitive areas of a Tuxedo-supported enterprise computing environment. It can be used to monitor real-time performance bottlenecks and business data fluctuations, determine service models, and provide notification when pre-defined thresholds are violated.
Oracle TSAM Plus Components
This section introduces the following Oracle TSAM Plus components:
Oracle TSAM Plus Agent
Oracle TSAM Plus Manager
Enterprise Manager for Oracle Tuxedo: An Oracle Enterprise Manager plug-in that performs monitoring and management for the Oracle Tuxedo product family.
Oracle TSAM Plus Agent
The Oracle TSAM Plus Agent handles all Tuxedo-side back-end logic. It works in conjunction with the Oracle TSAM Plus Manager, and includes the following sub-components:
Oracle TSAM Plus Framework: The framework is the data collection engine. It is an independent layer working between Tuxedo infrastructure and other TSAM Plus components. This module is responsible for run time metrics collection, alert evaluation and monitoring policy enforcement.
Oracle TSAM Plus Plug-in: An extensible mechanism invoked by the Oracle TSAM Plus Framework. The Oracle TSAM Plus Agent provides default plug-ins that send data to the LMS (Local Monitor Server), which then forwards it to the Oracle TSAM Plus Manager. The mechanism allows custom plug-ins to be hooked in to intercept the metrics. The default plug-in communicates with the LMS through shared memory, so the application is not blocked at the metrics collection point.
You can develop your own plug-ins for additional data processing. A customized plug-in can be linked to an existing plug-in chain, or replace the default plug-in.
Local Monitor Server (LMS): The LMS is an Oracle Tuxedo system server. The Oracle TSAM Plus default plug-in sends data to the LMS. The LMS then passes the data to the Oracle TSAM Plus Manager over HTTP. An LMS is required on each Tuxedo machine that needs to be monitored.
JMX Agent: A component that works in conjunction with Enterprise Manager for Oracle Tuxedo. It enables you to monitor and manage Oracle Tuxedo applications through the JMX interface and, through that interface, from Oracle Enterprise Manager Cloud Control 12c.
The Oracle TSAM Plus Manager is built on J2EE technology. It includes the following components:
Oracle TSAM Plus Data Server: The data server is responsible for:
accepting data from the LMS and storing it in the database
accepting requests from the presentation layer and processing the data
communicating with the LMS for configuration instructions.
Oracle TSAM Plus Console: The Oracle TSAM Plus presentation layer. It is a J2EE Web application and can be accessed via a compatible Web browser. After logging on to the Oracle TSAM Plus Console, you have access to full Oracle TSAM Plus functionality.
Enterprise Manager for Oracle Tuxedo is an Oracle Enterprise Manager plug-in that lets users monitor and manage the Oracle Tuxedo product family in a centralized console, alongside other products such as WebLogic Server and Oracle Database.
Figure 1-1 shows the Oracle TSAM Plus architecture.
Figure 1-1 Oracle TSAM Plus Architecture
Oracle TSAM Plus High Availability and Scalability
Oracle TSAM Plus High Availability
From TSAM Plus 126.96.36.199, you can configure two TSAM Plus managers for one TSAM Plus agent. The first one is the master. When the master manager is not available, the TSAM Plus agent tries to connect to the second one, which acts as a backup.
Figure 1-2 Oracle TSAM Plus High Availability
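The master/backup behavior described above can be pictured with a small Python sketch. It is only an illustration of the selection order, not the agent's actual logic: the function name `choose_manager`, the `is_available` probe, and the manager addresses are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def choose_manager(managers: Sequence[str],
                   is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable manager, trying the master before the backup."""
    for url in managers:
        if is_available(url):
            return url
    return None  # neither manager can be reached


# Hypothetical manager addresses: master first, backup second.
MANAGERS = ["http://tsam-master.example.com", "http://tsam-backup.example.com"]

# Simulate an outage of the master; a real probe would attempt a connection.
down = {"http://tsam-master.example.com"}
print(choose_manager(MANAGERS, lambda url: url not in down))
```

With the master marked as down, the sketch selects the backup address. Whether and when the real agent switches back to the master is not covered here.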
TSAM Plus Manager Scalability
For an enterprise that has a large number of Tuxedo applications and TSAM Plus agents, it is a challenge for a single Oracle TSAM Plus manager to perform the following tasks simultaneously:
Persist monitoring data sent by Oracle TSAM Plus agents.
Serve the GUI requests from end users promptly.
Calculate the call pattern of call path data.
Oracle TSAM Plus 188.8.131.52 enhances TSAM Plus manager scalability, which enables you to configure multiple TSAM Plus managers in one TSAM Plus system.
As shown in Figure 1-3, there are four Oracle TSAM Plus managers configured in one Oracle TSAM Plus system. Managers 1 and 2 are configured as data servers, manager 3 is configured as a TSAM Plus console, and manager 4 is configured for call pattern batch calculation.
Figure 1-3 Oracle TSAM Plus Manager Scalability
Oracle TSAM Plus Concepts
Call Path Monitoring
Tuxedo is typically used by a client program (not necessarily a Tuxedo client process) that calls a service to perform a piece of business logic. The service implementation is completely transparent to the caller. This type of middleware transparency provides many benefits for development, deployment, and system administration. However, from a monitoring perspective, it is difficult for the end user or administrator to see what happens “behind the scenes”. Oracle TSAM Plus call path monitoring helps to alleviate this problem.
Call Path Tree Definition
Even a simple Tuxedo application call can trigger a set of service invocations. The services involved form a tree (the “call path tree”). A call path tree strictly defines the following factors:
Which services are involved in performing the initial service request.
The service invocation depth (that is, the depth of the call path tree).
The service invocation sequence. For example, client A calls SVC1. SVC1 calls SVC2 and SVC3.
Call transportation. Each edge of a call path tree represents how a message travels from the caller to the service provider. The transport could be an IPC queue, a BRIDGE connection, or a DOMAIN connection. The elapsed time of each transportation step is also recorded.
Call path metrics. Many metrics are available as a message propagates through the Tuxedo system, such as the message size, execution status, transaction information, and CPU consumption.
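To make these factors concrete, the following Python sketch models a call path tree as a small recursive data structure. It is an illustration only, not TSAM Plus code: the class `CallNode`, its fields, and the timing values are hypothetical, and the tree mirrors the example in which client A calls SVC1, which in turn calls SVC2 and SVC3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallNode:
    """One node of a call path tree (hypothetical model, not a TSAM Plus type)."""
    name: str                 # client or service name
    transport: str = ""       # edge from the caller: "IPC", "BRIDGE", or "DOMAIN"
    elapsed_ms: float = 0.0   # time spent on that edge (made-up values below)
    children: List["CallNode"] = field(default_factory=list)


def depth(node: CallNode) -> int:
    """Number of levels in the tree rooted at node."""
    return 1 + max((depth(child) for child in node.children), default=0)


def services(node: CallNode) -> List[str]:
    """Service names in invocation order (pre-order walk), excluding the root caller."""
    names: List[str] = []
    for child in node.children:
        names.append(child.name)
        names.extend(services(child))
    return names


tree = CallNode("client A", children=[
    CallNode("SVC1", "IPC", 0.4, children=[
        CallNode("SVC2", "IPC", 0.3),
        CallNode("SVC3", "BRIDGE", 2.1),
    ]),
])

print(depth(tree))     # levels: client A, SVC1, then SVC2/SVC3
print(services(tree))  # invocation sequence
```

The depth and the list of services computed here correspond to the "service invocation depth" and "service invocation sequence" factors listed above.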
A "monitoring initiator" is a process that "initiates" tracking a call path tree. The process can be a Tuxedo client, an application server, a client proxy server (WSH/JSH), the Tuxedo domain gateway server, or the web services proxy server (GWWS). In a typical scenario, call path monitoring begins when the monitoring initiator invokes tpcall/tpacall. All the back-end services involved in this call are displayed on the call path tree representation in the Oracle TSAM Plus Console.
Currently only tpcall/tpacall can trigger call path monitoring. Other communication models are not supported.
A Tuxedo application server performs two functions:
All sub-calls made in the service implementation are a part of the call path tree started by the original monitoring initiator (if the incoming request is already monitored).
It acts as a monitoring initiator for calls made in the service routine, according to the monitoring policy definition.
Service monitoring focuses on Tuxedo service execution status. Unlike call path monitoring, it does not track call correlation. Service monitoring can be used together with call path monitoring or performed independently. Tuxedo CORBA is also based on the service infrastructure, so in this release CORBA interfaces can also be monitored.
System Server Monitoring
Tuxedo has several important system servers: BRIDGE, GWTDOMAIN and GWWS. BRIDGE connects multiple Tuxedo machines within a Tuxedo domain. GWTDOMAIN connects one Tuxedo domain with others. GWWS is the web services gateway. System server monitoring tracks message throughput, messages pending to be sent, and messages awaiting a reply on each network link for BRIDGE and GWTDOMAIN. For GWWS, web service request statistics are collected.
A critical use of Tuxedo is transaction monitoring. Tuxedo coordinates activities in a distributed transaction with an XA-compliant resource manager, such as a database. Oracle TSAM Plus transaction monitoring tracks each XA call triggered in a transaction, allowing you to clearly identify where a global distributed transaction is bottlenecked. TSAM Plus also propagates the transaction monitoring capability: if the transaction initiator is monitored, all XA calls on the transaction path are monitored. The propagation also works for transactions that span domains.
Oracle Tuxedo Application Runtime for CICS and Batch Monitoring
Oracle TSAM Plus allows you to monitor the following Oracle Tuxedo Application Runtime for CICS and Batch components.
CICS Region representation (status and components).
Oracle TSAM Plus provides a comprehensive policy monitoring mechanism. When the proper policy monitoring settings are created, you can collect the exact metrics needed with minimum application performance impact. You define a policy using the TSAM Plus Web console and apply it to an Oracle Tuxedo application automatically. Oracle TSAM Plus Policy Monitoring has the following characteristics:
Monitoring Category. One monitoring policy can focus on one kind of monitoring, such as call path. It can also cover multiple areas of interest.
Enable or Disable. Oracle TSAM Plus monitoring can be dynamically turned on or off. Monitoring policies can be predefined and enabled when monitoring is needed. All enabled monitoring policies are applied automatically to running Tuxedo applications. An application that has not been started receives the policy when it starts.
Interval-Based Monitoring. Monitoring is initiated based on specific time intervals. For example, an interval-based call path monitoring policy can specify that the call path is tracked in 60-second intervals.
Ratio-Based Monitoring. Monitoring is initiated by the number of executions. For example, for service monitoring, a ratio set to 5 indicates that one out of every 5 service executions is monitored. For call path monitoring, a ratio set to 5 indicates that one out of every 5 tpcall/tpacall calls is monitored.
Runtime Condition Filtering. A TSAM Plus monitoring policy supports run-time filters. You can monitor a particular service, requests from a specific client, or a particular process type. The filters support regular expressions.
Flexibility to Reduce Monitoring Performance Impact. Oracle TSAM Plus monitoring control enables you to configure the monitoring policy based on your application size, load and network activity. The monitoring policy can support only alert triggering without raw metrics storage.
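The difference between interval-based and ratio-based monitoring can be sketched in a few lines of Python. This is not how TSAM Plus implements its policies. The class names are hypothetical, and the sketch assumes that a ratio of N selects the first call of every group of N and that an interval policy starts monitoring at most once per interval.

```python
from typing import Optional


class RatioSampler:
    """Select one out of every `ratio` calls (assumption: the first of each group)."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0

    def should_monitor(self) -> bool:
        selected = self.count % self.ratio == 0
        self.count += 1
        return selected


class IntervalSampler:
    """Select at most one call per `interval_s` seconds."""

    def __init__(self, interval_s: float) -> None:
        self.interval_s = interval_s
        self.next_due: Optional[float] = None

    def should_monitor(self, now: float) -> bool:
        if self.next_due is None or now >= self.next_due:
            self.next_due = now + self.interval_s
            return True
        return False


ratio = RatioSampler(5)
monitored = sum(ratio.should_monitor() for _ in range(10))
print(monitored)  # 2 of the 10 calls are selected

interval = IntervalSampler(60)
print([interval.should_monitor(t) for t in (0, 10, 59, 60, 61, 125)])
```

Either way, only the selected calls carry monitoring overhead, which is how a sampling policy limits the performance impact on the application.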
Oracle TSAM Plus performance metrics are listed as follows:
Correlation ID: A unique identifier that represents a call path tree. It is generated by the monitoring initiator plug-in. It uses the following format:
DOMAINID:MASTERHOSTNAME:IPCKEY LMID PROCESSNAME PID TID COUNTER TIMESTAMP
Listing 1 shows an example of a Correlation ID. The monitored call is started by the program “bankclient” with process ID 8089 and thread ID 1 on machine “SITE1” on Tuxedo domain “TUXDOM1”. The master is “bjsol18” and IPCKEY in TUXCONFIG is “72854”.
Service Name: The name of an Oracle Tuxedo Service.
Location: The set of metrics that identifies the process sending the performance metrics. It includes information such as the domain, machine, group, and process name.
IPC Queue Length: The number of messages in an IPC queue.
IPC Queue ID: The Oracle Tuxedo identifier of an IPC queue.
Execution Time: The time, in milliseconds, spent executing an Oracle Tuxedo service or XA call.
Wait Time: The time a message spends in the transportation stage.
CPU Time: The CPU time consumed by the service request processing. It applies only to single-threaded servers.
Message Size: The Oracle Tuxedo message size.
Execution Status: The tpreturn service return code. It is defined by the Oracle Tuxedo ATMI interface.
Call Flags: The flags passed to tpcall/tpacall in the Oracle Tuxedo ATMI interface.
Call Type: tpcall, tpacall, or tpforward.
Elapse Time: The time elapsed while a call is monitored.
GTRID: The Oracle Tuxedo global transaction ID.
Pending Message Number: The number of messages that have been delivered to the Oracle Tuxedo network layer and are waiting to be sent.
Message Throughput: The total number and volume of messages accumulated in system server monitoring intervals.
Waiting Reply Message Number: The number of requests in GWTDOMAIN awaiting a reply from the remote domain.
XA Code: The XA call return code in transaction monitoring.
XA Name: The XA call name.
GWWS Metrics: A set of metrics used to measure GWWS throughput, including:
Inbound Message Throughput
Inbound Message Processing Time
Outbound Message Throughput
Outbound Message Processing Time
Oracle Tuxedo Application Runtime for CICS and Batch Metrics
TCP terminal throughput
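The Correlation ID format listed earlier (DOMAINID:MASTERHOSTNAME:IPCKEY LMID PROCESSNAME PID TID COUNTER TIMESTAMP) can be split into its fields mechanically. The Python sketch below is a hypothetical helper, not part of TSAM Plus. It assumes that fields are separated by single spaces or colons and contain neither character themselves. The sample string reuses the values from the description of the example (TUXDOM1, bjsol18, 72854, SITE1, bankclient, 8089, 1), while the COUNTER and TIMESTAMP values are placeholders.

```python
from typing import NamedTuple


class CorrelationId(NamedTuple):
    domain_id: str
    master_hostname: str
    ipckey: str
    lmid: str
    process_name: str
    pid: int
    tid: int
    counter: str    # kept as text: the encoding is not specified in this document
    timestamp: str  # kept as text for the same reason


def parse_correlation_id(text: str) -> CorrelationId:
    """Split a Correlation ID into its nine fields (assumed layout, see above)."""
    parts = text.split()
    if len(parts) != 7:
        raise ValueError(f"expected 7 space-separated fields, got {len(parts)}")
    head = parts[0].split(":")
    if len(head) != 3:
        raise ValueError("expected DOMAINID:MASTERHOSTNAME:IPCKEY in the first field")
    lmid, process_name, pid, tid, counter, timestamp = parts[1:]
    return CorrelationId(head[0], head[1], head[2], lmid, process_name,
                         int(pid), int(tid), counter, timestamp)


# COUNTER (42) and TIMESTAMP (1700000000) are placeholder values.
sample = "TUXDOM1:bjsol18:72854 SITE1 bankclient 8089 1 42 1700000000"
cid = parse_correlation_id(sample)
print(cid.process_name, cid.pid, cid.lmid)
```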
Oracle TSAM Plus Use Cases
Oracle TSAM Plus is built on top of Oracle Tuxedo and has unique service, call, and transaction tracking capabilities. Enterprise organizations usually have many widely distributed services deployed, and a single client request may require complex back-end service coordination to complete.
It can be difficult for an administrator to figure out what exactly is happening during these interactions. Oracle TSAM Plus call path monitoring helps to alleviate this problem.
The following FAQs will help you better understand how Oracle TSAM Plus works with your applications:
Message routing is an important concept in Tuxedo. It affects the system's performance, the correctness of business logic processing, and application reliability and high availability. Many factors can affect Tuxedo message routing, such as:
Data dependent routing (DDR)
Transaction Affinity (Oracle RAC support)
In the development stage, a large number of tests are needed to verify that your settings take effect. Without TSAM Plus, it is hard to tell whether a request goes to the intended service. TSAM Plus call path monitoring depicts each call path clearly and speeds up your development process.
Understanding Your Applications
What happens behind a simple call?
Enabling call path monitoring for a Tuxedo client or application server allows you to find out all the information behind a simple tpcall/tpacall. The tracking points span multiple machines and multiple domains. You can clearly see the following information in the call path tree:
The service invocation hierarchy that supports your call
The transmission cost for each message flow step, from IPC Queue to Network
The execution status of each service involved
The call type and call flags of all the intermediate calls
The waiting time in queue and response time for each service
The end-to-end response time
What about my services?
Service monitoring enables you to measure your service response time, IPC queue length, and execution status. Service monitoring provides the following information:
Service Execution Status Summary.
Oracle TSAM Plus tells you how many service executions succeeded or failed recently or during a period of time. Oracle TSAM Plus also computes the average response time. These are important factors in measuring the quality of your services.
Service Activity Trends.
Oracle TSAM Plus also displays your service activity trends. It tells you when the peak time is and when service requests are low.
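The kind of summary described above (success and failure counts plus average response time) is simple to compute from raw service records. The Python sketch below illustrates the calculation and does not reflect TSAM Plus internals. The record layout and the sample values are hypothetical, and the status strings borrow the names of the ATMI tpreturn() return values TPSUCCESS and TPFAIL.

```python
from typing import Dict, List, Tuple

# (execution status, response time in milliseconds); values are made up.
Record = Tuple[str, float]


def summarize(records: List[Record]) -> Dict[str, float]:
    """Count successes and failures and compute the average response time."""
    succeeded = sum(1 for status, _ in records if status == "TPSUCCESS")
    total_ms = sum(ms for _, ms in records)
    return {
        "succeeded": succeeded,
        "failed": len(records) - succeeded,
        "avg_response_ms": total_ms / len(records) if records else 0.0,
    }


records = [("TPSUCCESS", 120.0), ("TPSUCCESS", 80.0), ("TPFAIL", 400.0)]
print(summarize(records))
```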
Is my network busy?
Oracle TSAM Plus allows you to monitor the network connections attached to your local domain gateways. You can easily find which link is busy and see its data fluctuation trend. This gives you a more in-depth understanding of the business data flow model between departments and organizations.
Who participates in my transaction?
Oracle TSAM Plus monitors the transaction XA calls. Transaction participants are listed on the transaction monitoring page. For a large distributed transaction, a slow branch can slow down completion of the entire transaction. Oracle TSAM Plus lets you know who the transaction participants are and how much time is used during XA calls. Transaction monitoring can also help find the bottleneck in the two-phase commit stage if multiple resource managers are involved.
Solving Application Performance Problems
Why has the service response time been slow recently?
Turn on the call path monitoring for a particular call to investigate the following:
How much network-side time is used
Which services are the most time-consuming points in the call path tree
Is the service routed to a remote machine or a domain?
Is client wait time a reply problem?
My back-end services failed, but I don’t know which one.
Turn on call path monitoring. You can find the service execution status for this call.
How many kinds of call paths are in my application?
Turn on call path monitoring using an adequate sampling policy. Oracle TSAM Plus will tell you how many kinds of call paths (“call patterns”) exist in your application.
Why is my global distributed transaction completed slowly?
Turn on Oracle TSAM Plus transaction monitoring. You can see the execution time used by the transaction participants.
I want to correlate local transactions with remote transactions.
Turn on Oracle TSAM Plus transaction monitoring for all involved processes and GWTDOMAIN. The Oracle TSAM Plus Console shows you the transaction mapping between local and remote transactions.
I want to know when my local domain's use of remote domain resources peaks, and how busy it is.
Use Oracle TSAM Plus system server monitoring on the GWTDOMAIN. Oracle TSAM Plus records the information for you, and shows you the throughput trends.
Can I check program request information?
Turn on call path monitoring with the proper monitoring policy and then use tpgetcallinfo(). The following information is provided:
The timestamp when the request leaves the caller
The timestamp when the request enters the server IPC queue
The client IP address (workstation client, GWWS client)
In the monitoring initiator process, tpgetcallinfo() can also tell you the total time used.
Improving Application Performance
Are my services too fine grained?
In some cases, too many services supporting a request may add to performance overhead. Use the call path tree to investigate. The number of services and the tree depth are key analysis factors.
Are my services deployed properly?
Some services are called more frequently than others. Use call path monitoring to gather this information, and reconsider the service deployment. It is best to have the most used services located on the local machine or LAN. Cross-domain services should be used carefully.
Do I have too many servers configured?
Oracle TSAM Plus provides a central view of your Tuxedo applications with multiple domain support. Using Oracle TSAM Plus Console allows you to easily see how many domains, machines, servers and services are configured.
I want to be notified when something is wrong with my system
Oracle TSAM Plus provides comprehensive alert configuration based on the metrics collected. Alert evaluation in TSAM Plus is based on Tuxedo FML Boolean expressions, so you can combine complex conditions to compose an alert, such as:
Report an alert when the execution time is greater than 30 seconds and the execution failed.
Report an alert if a heuristic commit happens.
Report an alert if the CPU time consumption of my service is too high.
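The way such conditions combine can be shown with plain Python predicates. This sketch deliberately does not use FML Boolean expression syntax, and it is not how alerts are defined in the TSAM Plus Console. The metric keys, the rule names, and the 5-second CPU threshold are hypothetical, and only the 30-second execution time condition comes from the examples above.

```python
from typing import Any, Callable, Dict, List

Metrics = Dict[str, Any]

# Each rule is a Boolean condition over the collected metrics.
RULES: Dict[str, Callable[[Metrics], bool]] = {
    # Execution time greater than 30 seconds AND the execution failed.
    "slow_and_failed": lambda m: (m["execution_time_ms"] > 30_000
                                  and m["execution_status"] == "TPFAIL"),
    # CPU time too high; the 5-second threshold is an arbitrary example.
    "cpu_too_high": lambda m: m["cpu_time_ms"] > 5_000,
}


def fired_alerts(metrics: Metrics) -> List[str]:
    """Return the names of all rules whose condition holds for these metrics."""
    return [name for name, rule in RULES.items() if rule(metrics)]


sample = {"execution_time_ms": 42_000, "execution_status": "TPFAIL", "cpu_time_ms": 120}
print(fired_alerts(sample))
```

For the sample metrics only the first rule fires, because the call was both slow and unsuccessful while its CPU time stayed low.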
How to drop stale requests?
Sometimes service execution is slow and many request messages wait in the IPC queue for a long time. The client that issued a request might already have received a timeout notification, but the service still goes on to process the request. Oracle TSAM Plus alerts allow you to configure a "drop request" action based on certain metrics. For example, you can drop a request if the waiting time of a message exceeds a large value. You can also configure an alert with a drop request action that fires when the number of messages in the request queue exceeds a threshold.
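The two drop conditions mentioned above (waiting time and queue length) can be illustrated with a short Python sketch. It only models the decision and is not TSAM Plus or Tuxedo code. The function names, the request tuples, and the thresholds are hypothetical.

```python
from typing import List, Tuple

Request = Tuple[str, float]  # (request id, time in seconds when it entered the queue)


def split_stale(queue: List[Request], now: float,
                max_wait_s: float) -> Tuple[List[Request], List[Request]]:
    """Split queued requests into (still worth processing, stale)."""
    fresh: List[Request] = []
    stale: List[Request] = []
    for request in queue:
        _, enqueued_at = request
        if now - enqueued_at > max_wait_s:
            stale.append(request)   # the caller has probably timed out already
        else:
            fresh.append(request)
    return fresh, stale


def queue_too_long(queue: List[Request], max_length: int) -> bool:
    """Second condition: the number of queued messages exceeds a threshold."""
    return len(queue) > max_length


queue = [("req-1", 0.0), ("req-2", 25.0), ("req-3", 58.0)]
fresh, stale = split_stale(queue, now=60.0, max_wait_s=30.0)
print([rid for rid, _ in stale])  # requests that waited longer than 30 seconds
print(queue_too_long(queue, max_length=2))
```

Dropping the stale entries spares the server work whose result no client is still waiting for.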
To add Oracle TSAM Plus functionality to an existing Oracle Tuxedo application, perform the following steps: