About Instruments

Business Transaction Management uses a variety of instruments to measure the performance and usage characteristics of your business transactions and their underlying services and operations. These instruments are displayed in various parts of the Management Console for interactive monitoring. You can also use most of these instruments as a basis for defining service-level agreements (SLAs).

The period over which these instruments operate is either the evaluation period, in the case of an SLA, or the display period, in the case of interactive monitoring in the Management Console. In the following descriptions, the term period means the evaluation period or the display period, depending on the context in which the instrument is used. Some instruments, such as current compliance status, provide only a current value.

Transaction Instruments

The following instruments are available for monitoring transactions.

Average Response Time

The average amount of time a transaction requires to complete. For each instance of the transaction, the instrument measures the time from when the instance's start message is observed until its end message is observed. The instrument keeps a running average of the response time across all instances observed during the period. All completed instances are counted in the response time, regardless of whether condition alerts occurred.

If no transactions are observed during the period, the instrument value is set to -. Response time is measured in milliseconds.
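The running average described above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not part of Business Transaction Management; the sketch only shows one common way to maintain an average incrementally, and it uses None to stand in for the value reported when no transactions are observed.

```python
from typing import Optional


class RunningAverage:
    """Hypothetical helper: incrementally tracks a mean response time in milliseconds."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def observe(self, response_ms: float) -> None:
        # Incremental mean update, so individual samples need not be stored.
        self.count += 1
        self.mean += (response_ms - self.mean) / self.count

    def value(self) -> Optional[float]:
        # None stands in for the "no data" value when nothing was observed.
        return self.mean if self.count else None


average = RunningAverage()
for elapsed_ms in (100.0, 200.0, 300.0):
    average.observe(elapsed_ms)
```

Each call to observe represents one completed instance, timed from its start message to its end message.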

Maximum Response Time

The maximum amount of time a transaction requires to complete. The instrument records the single highest response time from all instances of the transaction observed during the period.

Completed Transactions

The number of instances of a transaction that complete during the period. An instance is considered to have completed when both its start and end messages have been observed, regardless of whether condition alerts occurred. However, if the end message is defined as being in the response phase (for example, submit.response) and the end operation faults, the end message will not exist and the instance will, therefore, not be counted.
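As an illustration of the counting rule above, the following sketch counts an instance only when both its start and end messages were observed. The TransactionInstance type and its field names are assumptions made for this example, not product data structures. An instance whose end operation faulted before producing a response-phase end message would simply have end_observed set to False.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TransactionInstance:
    """Hypothetical record of what was observed for one transaction instance."""
    start_observed: bool
    end_observed: bool
    condition_alerts: int = 0  # does not affect whether the instance counts as completed


def count_completed(instances: Iterable[TransactionInstance]) -> int:
    # An instance is complete only if both its start and end messages were seen.
    return sum(1 for i in instances if i.start_observed and i.end_observed)


observed = [
    TransactionInstance(start_observed=True, end_observed=True),
    TransactionInstance(start_observed=True, end_observed=True, condition_alerts=2),
    TransactionInstance(start_observed=True, end_observed=False),  # end message never seen
]
completed = count_completed(observed)
```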

Completed Transaction Rate

The number of instances of a transaction that complete per hour during the period. This instrument derives its measurements from the completed transactions instrument.
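A per-hour rate of this kind can be sketched as the count divided by the length of the period in hours. The function below is a hypothetical illustration under that assumption; the product's exact calculation is not specified here. The same arithmetic would apply to the other rate instruments that are derived from a count.

```python
def hourly_rate(count: int, period_seconds: float) -> float:
    """Hypothetical helper: normalize a count observed over a period to a per-hour rate."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return count * 3600.0 / period_seconds


# 30 completed instances over a 15-minute (900-second) period
rate = hourly_rate(30, 900)
```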

Started Transactions

The number of instances of a transaction that start during the period. An instance is considered to have started when its start message is observed.

Started Transaction Rate

The number of instances of a transaction that start per hour during the period. This instrument derives its measurements from the started transactions instrument.

Condition Alerts

The number of condition alerts generated on the transaction during the period.

Condition Alert Rate

The number of condition alerts generated on the transaction per hour during the period. This instrument derives its measurements from the transaction condition alerts instrument.

Current Compliance Status

The current compliance status for the transaction.

Violation Alerts

The number of SLA violations or warnings caused by a transaction during the period.

Service and Operation Instruments

The following instruments are available for monitoring services, endpoints, and operations.

Average Response Time (services, endpoints, and operations)

The average amount of time a service or operation requires to respond to a request. For each request, the instrument measures the time from when the service receives the request until it sends a corresponding response to the client. The instrument keeps a running average of the response time across all messages received during the period.

Only successfully processed requests are counted in the response time; the response times for faults are not included in this measurement. The response time is measured individually for each operation. The response time for a service is the average of the response times of all of its operations, weighted according to the number of messages processed by each operation.

If no requests are observed during the period, the value of the instrument is set to -. Response time is measured in milliseconds.
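The weighting of a service's response time by message count can be sketched as follows. The function name and the input shape, a list of (average response time, message count) pairs with one pair per operation, are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def service_average_response_ms(
    operations: List[Tuple[float, int]],
) -> Optional[float]:
    """Hypothetical helper: message-weighted mean of per-operation averages."""
    total_messages = sum(count for _, count in operations)
    if total_messages == 0:
        # None stands in for the "no data" value when no requests were observed.
        return None
    weighted_sum = sum(avg_ms * count for avg_ms, count in operations)
    return weighted_sum / total_messages


# One operation averaged 100 ms over 3 messages, another 200 ms over 1 message.
service_avg = service_average_response_ms([(100.0, 3), (200.0, 1)])
```

Because the first operation handled three times as many messages, the service value (125 ms) sits closer to 100 ms than a plain average of the two operations would.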

Maximum Response Time (services, endpoints, and operations)

The maximum amount of time a service or operation requires to respond to a request. The instrument records the single highest response time for all requests received during the period.

Link Average Response Time

The average response time to outbound requests. For example, imagine a hypothetical orderService that receives a request from some client, and as a result sends a request to a creditCheckService. In this case, orderService is acting as a client to creditCheckService. The response time is measured from the point of view of the service that is acting as a client. In other words, it measures the time from when the client service sends the request until it receives the response, meaning that network latency, if it exists, is included in the response time.

Only successfully processed requests are counted in the response time; the response times for faults are not included in this measurement. If no requests are observed during the period, the value of the instrument is set to -. Response time is measured in milliseconds.

Traffic

The number of requests that a service or operation receives during the period. The traffic count equals the throughput plus the fault count. Traffic count is measured individually for each operation. Traffic count for a service is the total traffic count of all of its operations.

Throughput

The number of requests that a service or operation successfully receives, processes, and responds to during the period (in other words, the number of responses). A message that generates a fault is not counted by the throughput instrument. Throughput is measured individually for each operation. Throughput for a service is the total throughput of all of its operations.

Throughput Rate

The number of successfully handled requests per hour during the period. This instrument derives its measurements from the throughput instrument.

Link Throughput

The number of outbound requests to another service that are successfully received, processed, and responded to during the period; in other words, the number of inbound responses. See the link average response time instrument for an explanation of service-to-service calls.

Faults

The number of faults generated by a service or operation during the period. Fault count is measured individually for each operation. The overall fault count for a service is the total fault count of all its operations.

Fault Rate

The average number of faults generated per hour over the period. This instrument derives its measurements from the faults instrument.

Fault Percentage

The percentage of messages that cause faults during the period. This instrument derives its measurements from the faults and traffic instruments.
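The relationship between these counts can be sketched as follows. Traffic is taken as throughput plus faults, as stated for the traffic instrument. Computing the fault percentage as faults divided by traffic is an assumption based on the statement that it derives from those two instruments, and the function name is hypothetical.

```python
from typing import Optional


def fault_percentage(throughput: int, faults: int) -> Optional[float]:
    """Hypothetical helper: share of all received messages that caused faults."""
    traffic = throughput + faults  # traffic = successful responses + faults
    if traffic == 0:
        # No messages were received, so a percentage is undefined.
        return None
    return 100.0 * faults / traffic


# 95 successful responses and 5 faults give a traffic count of 100.
pct = fault_percentage(throughput=95, faults=5)
```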

Link Faults

The number of faults generated by outbound requests to another service during the period (see the link average response time instrument for an explanation of service-to-service calls).

Current Compliance Status

The current compliance status for the selected service, endpoint, or operation.

Violation Alerts

The number of SLA violations or warnings caused by a service or operation during the period.

Violation Alerts Percentage

The percentage of time that a service or operation is in a state of SLA violation or warning during the period.
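A time-based percentage of this kind can be sketched by summing the time spent in violation or warning and dividing by the length of the period. The function below is a hypothetical illustration; it assumes the violation intervals are given as non-overlapping (start, end) pairs in seconds, and it clips each interval to the period.

```python
from typing import Iterable, Tuple


def violation_time_percentage(
    violation_intervals: Iterable[Tuple[float, float]],
    period_start: float,
    period_end: float,
) -> float:
    """Hypothetical helper: percentage of the period spent in violation or warning."""
    period_length = period_end - period_start
    if period_length <= 0:
        raise ValueError("period_end must be later than period_start")
    in_violation = 0.0
    for start, end in violation_intervals:
        # Count only the part of each interval that falls inside the period.
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            in_violation += clipped_end - clipped_start
    return 100.0 * in_violation / period_length


# A one-hour period with 300 s in violation, plus an interval that overruns the period.
share = violation_time_percentage([(0, 300), (3000, 4000)], 0, 3600)
```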

Failure Alerts

The number of failure violations that occur during the period.

Warning Alerts

The number of warning violations that occur during the period.

Uptime

The percentage of time that an endpoint's container responds successfully to a periodic ping message. See the configureAlivenessCheck command for details on how you can specify the method to be used for aliveness checking.
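As a rough illustration, uptime can be approximated as the share of periodic pings that succeeded. This is an assumption made for the example, since the exact calculation depends on the aliveness-check method in use. The function name and the list-of-booleans input are hypothetical.

```python
from typing import Iterable, Optional


def uptime_percentage(ping_results: Iterable[bool]) -> Optional[float]:
    """Hypothetical helper: percentage of pings that received a successful response."""
    results = list(ping_results)
    if not results:
        # No pings were sent, so uptime cannot be estimated.
        return None
    successes = sum(1 for ok in results if ok)
    return 100.0 * successes / len(results)


# Four evenly spaced pings, one of which failed.
uptime = uptime_percentage([True, True, True, False])
```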