Service Management

This section discusses AquaLogic Service Bus service monitoring and management capabilities. It is intended for system administrators and operators who manage and monitor AquaLogic Service Bus. It covers service monitoring (the Dashboard and metric aggregation), SLA enforcement via alerts, and message reporting.

Service Monitoring

In addition to delivering enterprise service bus capabilities such as service routing and transformation, AquaLogic Service Bus provides service monitoring and management capabilities that help IT organizations keep their services operating successfully. The following topics describe these capabilities.

Dashboard

AquaLogic Service Bus aggregates run-time statistics and displays them in real time on a customizable dashboard. The Dashboard lets operators monitor system operational health and flags problems in messaging services, so that problems can be isolated and diagnosed quickly as they occur. The AquaLogic Service Bus Console can be used to establish service level agreements (SLAs) for the performance of a system and to configure rules that trigger alerts, providing automated responses to SLA violations.

Information about system operational health can be organized by server and by service. The Dashboard can display the status of the overall domain, or of individual servers within it, using color-coded pie charts. It can also show service summaries that list the number of alerts and their severity for all services that have alerts defined and monitoring enabled.

In addition to the Dashboard, AquaLogic Service Bus lets you review operational and performance statistics for an individual service. These statistics can cover the service across the whole domain or on a specified server. Performance statistics are also available for each operation of the service, for more granular analysis.

For more information about the AquaLogic Service Bus Console Dashboard, see Monitoring in Using the AquaLogic Service Bus Console.

Metric Aggregation

The information displayed on the Dashboard is based on an asynchronous aggregation of data collected during system operation. In an AquaLogic Service Bus production cluster domain, the AquaLogic Service Bus data aggregator runs as a singleton service on one of the managed servers in the cluster. Server-specific data aggregation is performed on each of the managed servers in the domain. The aggregator is responsible for the collection and aggregation of data from all the managed servers at regular, configurable intervals.
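The aggregation step described above can be pictured with a minimal Python sketch. It is illustrative only and is not how AquaLogic Service Bus is implemented: the `ServerSnapshot` record, its fields, and the `aggregate` function are hypothetical names chosen for this example, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class ServerSnapshot:
    """Statistics one managed server collected for one service during
    one aggregation interval (hypothetical shape, for illustration)."""
    message_count: int
    error_count: int
    total_execution_ms: float


def aggregate(snapshots):
    """Combine per-server snapshots into a single cluster-wide snapshot."""
    return ServerSnapshot(
        message_count=sum(s.message_count for s in snapshots),
        error_count=sum(s.error_count for s in snapshots),
        total_execution_ms=sum(s.total_execution_ms for s in snapshots),
    )


# Two managed servers report their local statistics for the same interval.
cluster = aggregate([
    ServerSnapshot(message_count=100, error_count=2, total_execution_ms=1500.0),
    ServerSnapshot(message_count=50, error_count=1, total_execution_ms=900.0),
])
print(cluster)
```

Summing raw counts and total execution time, rather than averaging per-server averages, keeps the cluster-wide average correct when servers handle different message volumes.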

The following table lists the metrics that the Dashboard displays for each service.

Table 6-1 AquaLogic Service Bus Service Metrics
Metric	Description
Average Execution Time	For a proxy service, the average of the time interval measured between receiving the message at the transport and either handling the exception or sending the response. For a business service, the average of the time interval measured between sending the message in the outbound transport and receiving an exception or a response.
Total Number of Messages	Number of messages sent to the service. In the case of JMS proxy services, if the transaction aborts due to an exception and places the message back in the queue so it is not lost, each retry dequeue is counted as a separate message. In the case of outbound transactions, each retry or failover is likewise counted as a separate message.
Messages With Errors	Number of messages with error responses. For a proxy service, it is the number of messages that resulted in an exit with the system error handler or an exit with a reply failure action. If the error is handled in the service itself with a reply with success or a resume action, it is not an error. For a business service, it is the number of messages that resulted in a transport error or a timeout. Retries and failovers are treated as separate messages.
Success/Failure Ratio	(Total Number of Messages - Messages With Errors) / Messages With Errors
Security	Number of messages with WS-Security errors. This metric is calculated for both proxy services and business services.
Validation	Number of validation actions in the flow that failed. This metric only applies to proxy services.

These metrics are aggregated across the cluster for the configured aggregation interval. The Dashboard displays information about the overall health of the system, refreshing the display at a specified interval.
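As a worked example of two metrics from Table 6-1, the following Python sketch derives Average Execution Time and Success/Failure Ratio from aggregated totals. The function names and sample numbers are assumptions made for illustration. Returning `None` when the divisor is zero is a choice made for this sketch; the source does not say how the product handles that case.

```python
def average_execution_time(total_execution_ms, message_count):
    """Mean execution time per message, or None if no messages were seen."""
    if message_count == 0:
        return None
    return total_execution_ms / message_count


def success_failure_ratio(total_messages, messages_with_errors):
    """(Total Number of Messages - Messages With Errors) / Messages With Errors.

    Returns None when there are no errors, because the ratio is undefined.
    """
    if messages_with_errors == 0:
        return None
    return (total_messages - messages_with_errors) / messages_with_errors


# Sample interval: 150 messages, 3 with errors, 2400 ms of execution in total.
avg_ms = average_execution_time(2400.0, 150)  # 16.0 ms per message
ratio = success_failure_ratio(150, 3)         # 49.0 successes per failure
print(avg_ms, ratio)
```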

SLA Enforcement via Alerts

AquaLogic Service Bus provides the ability to set service level agreements (SLAs) on business and proxy services. These SLAs define the precise level and quality of service expected from business and proxy services. Rules can be configured to trigger alerts based on what the SLA measures. Multiple levels of severity can be configured for an alert including normal, warning, minor, major, critical, and fatal. Multiple alert conditions can be combined for each business or proxy service. Each alert can be based on the following parameters:

SLA alerts are set to inform the operations team of issues relating to the health of business and proxy services, or to the quality of service provided.

AquaLogic Service Bus implements Service Level Agreements (SLAs) and automated responses to SLA violations using rules that specify unacceptable service performance and the system response required under those circumstances. Rules are defined and constructed using the AquaLogic Service Bus Console. AquaLogic Service Bus evaluates rules against its aggregated metrics each time it updates that data.

When a rule evaluates to True, it raises an alert. In addition to displaying information about the alert in the AquaLogic Service Bus Console Dashboard, AquaLogic Service Bus executes the action specified for the rule when it evaluates to True. Any of the following types of actions can be assigned to a rule:

It is also possible to configure operating times for alerts. Rule and alert processing is handled by the AquaLogic Service Bus Alert Manager. The Alert Manager resides on the same single managed server as the metric aggregator for the system.
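The rule evaluation described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Alert Manager's actual logic: the `Rule` class, its fields, the metric keys, and the single daily operating window are all hypothetical. The severity names are taken from the list earlier in this section.

```python
import operator
from dataclasses import dataclass
from datetime import time

# Comparison operators a rule condition may use (illustrative subset).
OPERATORS = {">": operator.gt, ">=": operator.ge,
             "<": operator.lt, "<=": operator.le}


@dataclass
class Rule:
    """A hypothetical SLA rule: one condition on one aggregated metric."""
    name: str
    metric: str
    op: str
    threshold: float
    severity: str
    active_from: time = time(0, 0)         # start of operating time
    active_until: time = time(23, 59, 59)  # end of operating time

    def is_violated(self, metrics, now):
        """True if the rule is within its operating time and its condition holds."""
        if not (self.active_from <= now <= self.active_until):
            return False
        return OPERATORS[self.op](metrics[self.metric], self.threshold)


def evaluate(rules, metrics, now):
    """Return (severity, rule name) for every rule that raises an alert."""
    return [(r.severity, r.name) for r in rules if r.is_violated(metrics, now)]


rules = [
    Rule("slow-response", "average_execution_ms", ">", 500.0, "major"),
    Rule("too-many-errors", "messages_with_errors", ">=", 10, "critical",
         active_from=time(9, 0), active_until=time(17, 0)),
]
metrics = {"average_execution_ms": 620.0, "messages_with_errors": 12}

after_hours = evaluate(rules, metrics, time(20, 30))
business_hours = evaluate(rules, metrics, time(10, 0))
print(after_hours)
print(business_hours)
```

In this sketch the second rule stays silent at 20:30 because that time is outside its operating window, even though its condition is met.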

In addition to SLA alerts, AquaLogic Service Bus allows Alert actions to be configured within the message flow (pipeline alerts). These pipeline alert actions generate alerts based on the message context in a pipeline and send them to an alert destination. An Alert action can be configured with an alert name, a description (which can include message elements such as $order), an alert destination, and an alert severity.

For information on how to configure AquaLogic Service Bus alerts, see Monitoring in Using the AquaLogic Service Bus Console.

Message Reporting

AquaLogic Service Bus can report on message data as messages pass through a proxy service. This is done with a reporting action, which can be placed at any point within a request/response pipeline or error pipeline stage. Reporting actions can be used to filter message information as it flows through the proxy. The data captured by the reporting action can then be accessed through a reporting provider. Reporting actions can help determine whether a problem with a message occurs before or after transformation, during routing, and so on.

In the reporting action, it is possible to specify information about each message that needs to be written to the AquaLogic Service Bus Reporting Data Stream. The following figure shows an example Report action:

AquaLogic Service Bus is packaged with a built-in JMS Reporting Provider. It picks up reported data and stores it in a message reporting database that acts as the Reporting Data Store. AquaLogic Service Bus also provides a Java API for customers who wish to use their own reporting provider.
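The relationship between a reporting action, the Reporting Data Stream, and a pluggable reporting provider can be illustrated with a small Python model. It does not reproduce the product's Java API: every class and method name below (`ReportingProvider`, `InMemoryProvider`, `ReportingDataStream`, `register`, `report`) is hypothetical, and the in-memory list merely stands in for a Reporting Data Store.

```python
from abc import ABC, abstractmethod


class ReportingProvider(ABC):
    """Hypothetical provider interface: receives each reported record."""

    @abstractmethod
    def handle(self, record: dict) -> None:
        ...


class InMemoryProvider(ReportingProvider):
    """Stand-in for a provider that persists records to a data store."""

    def __init__(self):
        self.store = []

    def handle(self, record):
        self.store.append(record)


class ReportingDataStream:
    """Hypothetical stream that fans reported data out to all providers."""

    def __init__(self):
        self._providers = []

    def register(self, provider):
        self._providers.append(provider)

    def report(self, stage, keys, body):
        """Called by a reporting action at some point in the pipeline."""
        record = {"stage": stage, "keys": dict(keys), "body": body}
        for provider in self._providers:
            provider.handle(record)


stream = ReportingDataStream()
provider = InMemoryProvider()
stream.register(provider)

# A reporting action in the request pipeline reports a message with one key.
stream.report("request", {"orderId": "A-100"}, "<order/>")
print(provider.store)
```

Separating the stream from its providers is what would let a custom provider replace or supplement the built-in one without changing the reporting actions themselves.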

The AquaLogic Service Bus Console Message Reporting module displays information from the Reporting Data Store, including summary information. Message Reporting enables you to drill down from summary information to view detailed information about specific messages.

Displayed Message Reporting information can be customized by filtering and sorting the data to meet specific reporting requirements. For information on how to configure reporting actions, see “Adding an Action” in Proxy Services in Using the AquaLogic Service Bus Console.

AquaLogic Service Bus Console provides purge functionality to help manage message data. For other data management functions, standard database administration practices can be applied to the database hosting the Reporting Data Store. For a list of supported database platforms for the Reporting Data Store, see Supported Database Configurations in Supported Configurations for AquaLogic Service Bus.

Using monitoring, SLA alerts, and reporting features of AquaLogic Service Bus, IT operations departments can manage the health and availability of their service infrastructure in real time, measure SLA compliance, and report efficiently and effectively to their management teams and business executives.