6 Service Management

This chapter discusses Oracle Service Bus service monitoring and management capabilities. It is intended for system administrators and operators who manage and monitor Oracle Service Bus.

This chapter includes the following sections:

  • Section 6.1, "Service Monitoring"

  • Section 6.2, "Message Reporting"

6.1 Service Monitoring

In addition to enterprise service bus capabilities such as service routing and transformation, Oracle Service Bus provides service monitoring and management capabilities that help IT organizations keep their services operating as expected. The following topics describe the service management and monitoring capabilities of Oracle Service Bus.

6.1.1 Dashboard

Oracle Service Bus aggregates runtime statistics and displays them in real time on a customizable dashboard. The dashboard lets you monitor system operational health and flags problems in messaging services, so that problems can be isolated and diagnosed quickly as they occur. You can use the Oracle Service Bus Administration Console to establish service level agreements (SLAs) for system performance and to configure rules that trigger alerts, providing automated responses to SLA violations.

Figure 6-1 Oracle Service Bus Dashboard


Information about system operational health can be organized by server and by service. The Dashboard can display the status of the overall domain or of individual servers within it, using color-coded pie charts. It can also display service summaries that show the number of alerts and the corresponding severity for all services that have alerts defined and monitoring enabled.

In addition to the Dashboard, Oracle Service Bus lets you review operational and performance statistics for an individual service. These statistics can cover the service across the whole domain or on a specified server. Oracle Service Bus also provides performance statistics for the service at the operation level for more granular analysis.

For more information about the Oracle Service Bus Administration Console Dashboard, see "Monitoring" in the Oracle Fusion Middleware Administrator's Guide for Oracle Service Bus.

6.1.2 Metric Aggregation

The information displayed on the Dashboard is based on an asynchronous aggregation of data collected during system operation. In an Oracle Service Bus production cluster domain, the Oracle Service Bus data aggregator runs as a singleton service on one of the Managed Servers in the cluster. Server-specific data aggregation is performed on each of the Managed Servers in the domain. The aggregator is responsible for the collection and aggregation of data from all the Managed Servers at regular, configurable intervals.
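
The aggregation step described above can be illustrated with a minimal sketch. It assumes each Managed Server reports, per interval, a message count, an error count, and a summed execution time for a service. The class, field, and server names are hypothetical and are not part of any Oracle Service Bus API.

```python
from dataclasses import dataclass


@dataclass
class ServerSnapshot:
    """Statistics one Managed Server collected for a service in one interval (hypothetical shape)."""
    server: str
    message_count: int
    error_count: int
    total_execution_ms: float  # sum of execution times, not the average


def aggregate(snapshots):
    """Combine per-server snapshots into one cluster-wide view of the service."""
    messages = sum(s.message_count for s in snapshots)
    errors = sum(s.error_count for s in snapshots)
    total_ms = sum(s.total_execution_ms for s in snapshots)
    return {
        "message_count": messages,
        "error_count": errors,
        # Dividing the summed time by the summed count weights busy servers correctly.
        "avg_execution_ms": total_ms / messages if messages else 0.0,
    }


cluster = aggregate([
    ServerSnapshot("managed1", message_count=300, error_count=3, total_execution_ms=6000.0),
    ServerSnapshot("managed2", message_count=100, error_count=1, total_execution_ms=4000.0),
])
print(cluster)
```

The sketch sums raw totals rather than averaging the per-server averages, because a plain mean of averages would give a lightly loaded server the same weight as a heavily loaded one.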

The following table lists the metrics that the Dashboard displays for each service.

Table 6-1 Oracle Service Bus Service Metrics

Metric Description

Average Execution Time

For a proxy service, the average of the time interval measured between receiving the message at the transport and either handling the exception or sending the response.

For a business service, the average of the time interval measured between sending the message in the outbound transport and receiving an exception or a response.

Total Number of Messages

Number of messages sent to the service. In the case of JMS proxy services, if the transaction aborts due to an exception and places the message back in the queue so it is not lost, each retry dequeue is counted as a separate message. In the case of outbound transactions, each retry or failover is likewise counted as a separate message.

Messages With Errors

Number of messages with error responses.

For a proxy service, it is the number of messages that resulted in an exit with the system error handler or an exit with a reply failure action. If the error is handled in the service itself with a reply with success or a resume action, it is not counted as an error.

For a business service, it is the number of messages that resulted in a transport error or a timeout. Retries and failovers are treated as separate messages.

Success/Failure Ratio

(Total Number of Messages - Messages With Errors) / Messages With Errors

Security

Number of messages with WS-Security errors. This metric is calculated for both proxy services and business services.

Validation

Number of validation actions in the flow that failed. This metric only applies to proxy services.


These metrics are aggregated across the cluster for the configured aggregation interval. The Dashboard displays information about the overall health of the system, refreshing the display at a specified interval.
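
As a worked example of the Success/Failure Ratio formula in Table 6-1, the following sketch computes the ratio from the two counters it depends on. The function name is hypothetical, and returning None when there are no errors is an assumption of this sketch; the source does not say how the Dashboard presents that case.

```python
def success_failure_ratio(total_messages, messages_with_errors):
    """(Total Number of Messages - Messages With Errors) / Messages With Errors."""
    if messages_with_errors == 0:
        # Assumption: the ratio is undefined without errors, so report no value.
        return None
    return (total_messages - messages_with_errors) / messages_with_errors


ratio = success_failure_ratio(400, 4)
print(ratio)
```

With 400 messages and 4 errors, 396 messages succeeded, so the ratio is 396 / 4 = 99.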

6.1.3 SLA Enforcement Through Alerts

Oracle Service Bus provides the ability to set service level agreements (SLAs) on business and proxy services. These SLAs define the precise level and quality of service expected from business and proxy services. Rules can be configured to trigger alerts based on what the SLA measures. Multiple levels of severity can be configured for an alert including normal, warning, minor, major, critical, and fatal. Multiple alert conditions can be combined for each business or proxy service. Each alert can be based on the following parameters:

  • Success rate, success ratio, failure ratio

  • Message count

  • Error count

  • Failover/retry count

  • Validation error count

  • WSS error count

  • Response time, minimum response time, maximum response time

SLA alerts are set to inform the operations team of issues relating to the health of business and proxy services, or to the quality of service provided.

Oracle Service Bus implements SLAs and automated responses to SLA violations using rules that specify unacceptable service performance and the system response required when it occurs. Rules are defined and constructed using the Oracle Service Bus Administration Console. Oracle Service Bus evaluates rules against its aggregated metrics each time it updates that data.

When a rule evaluates to True, Oracle Service Bus raises an alert, displays information about the alert in the Oracle Service Bus Administration Console Dashboard, and executes the action specified for the rule. Any of the following types of actions can be assigned to a rule:

  • Send email notification

  • Send a JMS message

  • Send alert to the Oracle WebLogic Server Logger
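
The rule evaluation cycle described above can be sketched as follows. This is an illustration only: the AlertRule class, the metric names, and the use of a Python callable to stand in for an email, JMS, or logger action are all assumptions, not the Oracle Service Bus rule model.

```python
import operator
from dataclasses import dataclass
from typing import Callable

COMPARATORS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


@dataclass
class AlertRule:
    """A hypothetical SLA rule: compare one aggregated metric with a threshold."""
    name: str
    metric: str
    comparator: str
    threshold: float
    severity: str
    action: Callable[[str], None]  # stands in for email, JMS, or logger delivery


def evaluate(rules, metrics):
    """Run every rule against freshly aggregated metrics and fire the actions."""
    raised = []
    for rule in rules:
        value = metrics[rule.metric]
        if COMPARATORS[rule.comparator](value, rule.threshold):
            rule.action(f"[{rule.severity}] {rule.name}: {rule.metric}={value}")
            raised.append(rule.name)
    return raised


sent = []  # collects alert text in place of a real notification channel
rules = [
    AlertRule("SlowResponses", "avg_execution_ms", ">", 20.0, "major", sent.append),
    AlertRule("TooManyErrors", "error_count", ">=", 10, "critical", sent.append),
]
raised = evaluate(rules, {"avg_execution_ms": 25.0, "error_count": 4})
print(raised, sent)
```

In this run only the first rule evaluates to True, so only one alert is raised and one action is executed.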

It is also possible to configure operating times for alerts. Rule and alert processing is handled by the Oracle Service Bus Alert Manager. The Alert Manager resides on the same single Managed Server as the metric aggregator for the system.
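
One way to picture operating times is as a daily window outside of which a rule does not fire. The sketch below assumes a start time and an end time per rule and allows the window to wrap past midnight. The function name and the half-open interval are assumptions; the source does not describe how Oracle Service Bus evaluates operating times internally.

```python
from datetime import time


def within_operating_time(now, start, end):
    """Return True if `now` falls inside the daily window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, for example 22:00 to 06:00.
    return now >= start or now < end


print(within_operating_time(time(10, 0), time(9, 0), time(17, 0)))
print(within_operating_time(time(23, 0), time(22, 0), time(6, 0)))
```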

In addition to SLA alerts, Oracle Service Bus allows Alert actions to be configured within the message flow (pipeline alerts). These pipeline alert actions generate alerts based on the message context in a pipeline and send them to an alert destination. Alert actions can be configured with an alert name, a description (which can include message elements such as $order), an alert destination, and an alert severity.
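
The parts of a pipeline alert can be illustrated with a loose analogy in Python. The sketch uses string.Template, whose $name placeholders happen to resemble a message element reference such as $order; in Oracle Service Bus itself such references are evaluated by the pipeline, not by Python. The function name, the destination name, and the sample values are hypothetical.

```python
from string import Template


def build_pipeline_alert(name, description, severity, destination, context):
    """Assemble an alert whose description is filled in from the message context."""
    return {
        "name": name,
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        "description": Template(description).safe_substitute(context),
        "severity": severity,
        "destination": destination,
    }


alert = build_pipeline_alert(
    name="LargeOrder",
    description="Order received: $order",
    severity="warning",
    destination="ops-alerts",
    context={"order": "PO-1001"},
)
print(alert["description"])
```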

For information on how to configure Oracle Service Bus alerts, see "Monitoring" in the Oracle Fusion Middleware Administrator's Guide for Oracle Service Bus.

6.2 Message Reporting

Oracle Service Bus can report on message data as messages pass through a proxy service. This is done through a reporting action, which can be placed at any point within a request pipeline, response pipeline, or error pipeline stage. Reporting actions can be used to filter message information as it flows through the proxy service. The data captured by the reporting action can then be accessed by a reporting provider. Reporting actions can help determine whether there is a problem with a message before or after transformation, during routing, and so on.

In the reporting action, it is possible to specify information about each message that needs to be written to the Oracle Service Bus Reporting Data Stream. The following figure shows an example Report action:

Figure 6-2 Example Report Action


Oracle Service Bus is packaged with a built-in JMS Reporting Provider. It picks up reported data and stores it in a message reporting database that acts as the Reporting Data Store. Oracle Service Bus also provides a Java API for customers who wish to use their own reporting provider.
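
The flow from reporting action to data stream to reporting provider can be sketched conceptually as follows. This is not the Oracle Service Bus Java API: the class and function names are hypothetical, a deque stands in for the Reporting Data Stream, and a Python list stands in for the Reporting Data Store.

```python
from collections import deque


class InMemoryReportingProvider:
    """Hypothetical provider that keeps reported records in a list (the 'data store')."""

    def __init__(self):
        self.data_store = []

    def handle(self, record):
        self.data_store.append(record)


def report_action(message, keys, stream):
    """Copy only the selected fields of a message onto the reporting stream."""
    stream.append({key: message[key] for key in keys if key in message})


def drain(stream, provider):
    """Deliver every queued record to the provider, oldest first."""
    while stream:
        provider.handle(stream.popleft())


stream = deque()  # stands in for the Reporting Data Stream
message = {"order_id": "PO-1001", "customer": "ACME", "payload": "<order/>"}
report_action(message, ["order_id", "customer"], stream)

provider = InMemoryReportingProvider()
drain(stream, provider)
print(provider.data_store)
```

Keeping the provider behind a single handle method mirrors the idea that the built-in provider can be replaced by a custom one without changing the reporting actions in the pipeline.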

The Oracle Service Bus Administration Console Message Reporting module displays information from the Reporting Data Store, including summary information. Message Reporting enables you to drill down from summary information to view detailed information about specific messages.

Figure 6-3 Example Message Report Summary in the Oracle Service Bus Dashboard


It is possible to customize displayed Message Reporting information by filtering and sorting the data to meet specific reporting requirements. For information on how to configure reporting actions, see "Adding and Editing Actions in Message Flows" in the Oracle Fusion Middleware Administrator's Guide for Oracle Service Bus.

Note:

Message Reporting displays information only for messages that traverse a pipeline that includes a reporting action.

Oracle Service Bus Administration Console provides purge functionality to help manage message data. For other data management functions, standard database administration practices can be applied to the database hosting the Reporting Data Store. For a list of supported database platforms for the Reporting Data Store, see the "Oracle Fusion Middleware Supported System Configurations" at http://www.oracle.com/technetwork/middleware/ias/downloads/fusion-certification-100350.html.

Using monitoring, SLA alerts, and reporting features of Oracle Service Bus, IT operations departments can manage the health and availability of their service infrastructure in real time, measure SLA compliance, and report efficiently and effectively to their management teams and business executives.