Monitoring ALSB at Run Time

ALSB enables you to monitor and collect run-time information required for system operations. ALSB aggregates run-time statistics, which you can view on the dashboard. The dashboard allows you to monitor the health of the system and notifies you when alerts are generated in your services. With this information, you can quickly and easily isolate and diagnose problems as they occur.

What is Service Monitoring?

You can monitor ALSB at run time to find out how many messages in a particular service have been processed successfully and how many have failed.

The ALSB monitoring framework provides access to information about the number of messages that were processed successfully or failed, which project the service belongs to, the average execution time of message processing, and the number of alerts associated with the service. Using the ALSB Console you can view monitoring statistics for the period of the current aggregation interval or for the period since you last reset statistics for this service or since you last reset statistics for all services. Using the public APIs you can access only the statistics since the last reset.

About the ALSB Monitoring Framework

Monitoring in ALSB involves monitoring of the operational resources, servers, and Service Level Agreements (SLAs). Figure 3-1 is an illustration of the architecture of the ALSB monitoring framework.

Collector– Each managed server in a cluster hosts a Collector. The Collector collects statistics on operational resources at regular intervals, which is managed in an RMI object. It also keeps a history of samples within the aggregation interval for the collected statistics. ALSB run time invokes the Collector at the beginning of each minute. At every system-defined checkpoint interval, the Collector stores a snapshot of the current statistics in a persistent store for recovery purposes and sends the information to the Aggregator in raw format, because raw format is optimized for fast collection and a small footprint.

Note:

An operational resource is defined as the unit for which statistical information can be collected by the monitoring subsystem. Operational resources include proxy services, business services, service level resources such as Web Services Definition Language (WSDL) operations, flow components in a pipeline, and endpoint URIs.

Aggregator– The Aggregator is present on only one managed server. The server on which it resides is selected arbitrarily when you generate the domain using the configuration wizard. It aggregates all the statistics that are collected from all managed servers in a cluster. ALSB run time invokes the Aggregator twenty-five seconds past each minute, to give the Collectors time to collect data and send it to the Aggregator. At system-defined checkpoint intervals, each managed server in the cluster sends a snapshot of its contributions to the Aggregator. The data structures in the Aggregator are optimized for aggregating and retrieving data.
Retriever– The Retriever retrieves the statistics that are stored in memory. It is present only on the managed server that hosts the Aggregator.
Alert Manager– The Alert Manager fires alerts based on the aggregated statistics. It is present only on the managed server that hosts the Aggregator.

The following steps are executed when you monitor a service in ALSB run time:

The Collector collects the updated statistics from ALSB run time and sends them to the Aggregator.
The Aggregator aggregates the statistics over the aggregation interval.
The aggregated statistics are pushed to the Alert Manager.
The Alert Manager triggers alerts based on these statistics.
The aggregated statistics are also stored and can be retrieved by the Retriever.
The alerts are pulled from the managed server that hosts the Aggregator and are displayed in the ALSB Console.

Aggregation Intervals

In ALSB, the monitoring subsystem collects statistics, such as message count, over an aggregation interval. The aggregation interval is the time period over which statistical data is collected and displayed in the ALSB Console. Statistics that are not based on an aggregation interval are meaningless. In addition to statistics collected over a well-defined aggregation interval, you can also collect cumulative statistics.

The Refresh Rate of Monitoring Data

The aggregation interval is a moving window that always refers to an interval of time in minutes, hours, or days. It does not move with infinite granularity or precision, but at regular intervals of time called the sampling interval. This enables an aggregation interval to move smoothly and produce accurate statistics.

Figure 3-2 illustrates the aggregation interval. For example, aggregation interval A1 is set to five minutes and aggregation interval A2 is set to ten minutes. ALSB run time collects statistics for the service with aggregation interval A1 every minute (S1) and aggregates the statistics at the end of the aggregation interval. Similarly, for aggregation interval A2, it collects statistics every five minutes (S2). Intervals S1 and S2 are called sampling intervals. For more information about sample intervals, see Sample Intervals Within Aggregation Intervals.
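The moving-window behavior can be modeled in a few lines of Python. This is only an illustrative sketch, not ALSB code: the class name MovingWindow and its methods are hypothetical, and it assumes that one message count is recorded per sampling interval.

```python
from collections import deque


class MovingWindow:
    """Illustrative moving aggregation window (hypothetical, not an ALSB API)."""

    def __init__(self, aggregation_minutes, sample_minutes):
        if aggregation_minutes % sample_minutes != 0:
            raise ValueError(
                "aggregation interval must be a multiple of the sample interval"
            )
        # The window holds one entry per sampling interval.
        self.samples = deque(maxlen=aggregation_minutes // sample_minutes)

    def add_sample(self, message_count):
        # Appending to a full deque drops the oldest sample, which moves
        # the window forward by exactly one sampling interval.
        self.samples.append(message_count)

    def total(self):
        return sum(self.samples)


# A1 from the example: five-minute aggregation interval, one-minute samples.
window = MovingWindow(aggregation_minutes=5, sample_minutes=1)
for count in [10, 20, 30, 40, 50, 60]:
    window.add_sample(count)

# The first sample (10) has moved out of the window: 20+30+40+50+60.
print(window.total())  # 200
```

In this model the window advances in steps of one sampling interval rather than continuously, which is the behavior the sampling interval is meant to provide.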

Sample Intervals Within Aggregation Intervals

In ALSB run time, statistics are computed at regular intervals within every aggregation interval. These regular subintervals are known as sample intervals. The duration of the sample interval depends on the aggregation interval. Table 3-1 gives the length of the sample interval for different aggregation intervals:

Table 3-1 Sample Interval
Aggregation Interval	Sample Interval
1, 2, 3, 4, and 5 minutes	1 minute
10, 15, 20, 25, and 30 minutes	5 minutes
40, 50, and 60 minutes	10 minutes
90 and 120 minutes	30 minutes
3, 4, 5, and 6 hours	1 hour
8, 10, and 12 hours	2 hours
16, 20, and 24 hours	4 hours
2, 3, 4, 5, 6, and 7 days	1 day
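Table 3-1 can be expressed as a simple lookup. The following Python sketch is illustrative only; the function name sample_interval_minutes is hypothetical, and all values are converted to minutes for comparison.

```python
MINUTE = 1
HOUR = 60 * MINUTE
DAY = 24 * HOUR

# (allowed aggregation intervals, sample interval), all in minutes,
# transcribed from Table 3-1.
SAMPLE_INTERVALS = [
    ((1, 2, 3, 4, 5), 1 * MINUTE),
    ((10, 15, 20, 25, 30), 5 * MINUTE),
    ((40, 50, 60), 10 * MINUTE),
    ((90, 120), 30 * MINUTE),
    (tuple(h * HOUR for h in (3, 4, 5, 6)), 1 * HOUR),
    (tuple(h * HOUR for h in (8, 10, 12)), 2 * HOUR),
    (tuple(h * HOUR for h in (16, 20, 24)), 4 * HOUR),
    (tuple(d * DAY for d in (2, 3, 4, 5, 6, 7)), 1 * DAY),
]


def sample_interval_minutes(aggregation_minutes):
    """Return the sample interval (in minutes) for an aggregation interval."""
    for aggregation_values, sample in SAMPLE_INTERVALS:
        if aggregation_minutes in aggregation_values:
            return sample
    # Arbitrary aggregation intervals are not allowed.
    raise ValueError(f"unsupported aggregation interval: {aggregation_minutes}")


print(sample_interval_minutes(5))         # 1
print(sample_interval_minutes(10))        # 5
print(sample_interval_minutes(6 * HOUR))  # 60
```

The lookup raises an error for any value outside the table, mirroring the rule that you cannot set an arbitrary aggregation interval.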

How to Set the Aggregation Interval for Monitoring Data

You can track statistics for a service over only one aggregation interval.
You cannot set an arbitrary value for an aggregation interval. You must choose from one of the values in the drop-down list.
You can set the aggregation interval for the following:

A service– You must set the aggregation interval for a service on the Operation tab of the View a Proxy Service or View a Business Service page. For more information about how to set the aggregation interval for a service, see Setting the Aggregation Interval for a Service in Monitoring in Using the AquaLogic Service Bus Console.
An alert rule– You must set the aggregation interval for an alert rule by editing the conditions on the View Alert Rule Details page. For more information about how to edit conditions for an alert rule, see Defining Alert Rule Conditions in Monitoring in Using the AquaLogic Service Bus Console.

What are the Consequences Of Changing Aggregation Interval Of A Service?

When you modify the aggregation interval of a service, the statistics of the service in the current aggregation interval are reset. However, the status of the endpoint URI for the service remains unaffected by the change in the aggregation interval. The running count metrics of the service are not reset when you modify the aggregation interval.

What are the Consequences Of Renaming Or Moving A Service?

When you rename or move a service within ALSB, all the monitoring statistics that have been collected in the ALSB Console are lost. All current aggregation interval and cumulative metrics are reset, and monitoring of the service starts over. If an endpoint URI for a service was marked offline before the service was renamed or moved, then after you rename or move the service, the URI is marked online again and its status is displayed as online.

What Statistics are Available for ALSB Services?

You can monitor services in ALSB and collect statistics for all services. The monitoring system in ALSB supports the following types of statistics:

Counter– A counter simply keeps track of the count of events in ALSB run time such as number of messages received and number of failovers. This is scalar and takes on integral values.
Interval– An interval keeps track of time elapsed between two well-defined events. This tracks the total, average, minimum, and maximum of such events in ALSB. This takes on integral and non-integral values.
Status Type– A Status statistic keeps track of the status. Using this you can keep track of the initial status and the current status of the object.
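The three statistic types can be modeled as small data structures. The following Python sketch is purely illustrative: the class names and methods are hypothetical and do not correspond to any ALSB API.

```python
class Counter:
    """Counts events, such as messages received; takes integral values."""

    def __init__(self):
        self.count = 0

    def increment(self, by=1):
        self.count += by


class Interval:
    """Tracks elapsed time between two events: total, average, min, max."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def record(self, elapsed):
        self.count += 1
        self.total += elapsed
        self.minimum = elapsed if self.minimum is None else min(self.minimum, elapsed)
        self.maximum = elapsed if self.maximum is None else max(self.maximum, elapsed)

    @property
    def average(self):
        # Avoid dividing by zero before any sample has been recorded.
        return self.total / self.count if self.count else 0.0


class Status:
    """Tracks the initial status and the current status of an object."""

    def __init__(self, initial):
        self.initial = initial
        self.current = initial

    def update(self, status):
        self.current = status


messages = Counter()
response_time = Interval()
for elapsed_ms in (12.5, 7.5, 10.0):
    messages.increment()
    response_time.record(elapsed_ms)

endpoint = Status("online")
endpoint.update("offline")

print(messages.count, response_time.average)  # 3 10.0
print(endpoint.initial, endpoint.current)     # online offline
```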

Accessing Statistical Information for Services

You can access the statistical information for a service through the ALSB Console or directly by using the JMX monitoring APIs. This section describes accessing the information through the ALSB Console and the JMX monitoring APIs. For more information about accessing the statistical information through JMX monitoring APIs, see JMX Monitoring API Programming Guide.

How to Access Service Statistics from the ALSB Console

You can access the service statistics from the ALSB Console for a standalone server or a cluster. This section describes how to access statistics for a standalone server. For information on how to access service statistics for a cluster, see How to Access Statistics in a Cluster.

How to Access Statistical Information Using the JMX Monitoring APIs

You can also access statistical information directly from a program using the Java Management Extensions (JMX) monitoring APIs. Using the JMX monitoring APIs you can access only the running count statistics. The JMX monitoring APIs provide an efficient lower level support for bulk operations. For more information about using JMX monitoring APIs, see JMX Monitoring API Programming Guide.

How to Access Statistics in a Cluster

In a cluster environment, statistics are available at the individual managed server level and at the cluster level. On the Service Health tab, choose Cluster or the name of a managed server from the Server drop-down list to view statistics for the cluster or for the individual managed server.

To get detailed statistics for a particular service in a cluster, access the service monitoring details page for the service from the Service Health tab. On the Service Monitoring Details page, you can access the cluster wide statistics by setting the Server drop-down list to Cluster. By setting it to the individual managed server value, statistics pertaining to that specific server can be viewed.

Set Display Statistics to Current Aggregation Interval to view cluster statistics in the current aggregation interval or Since Last Reset to view the running count statistics for the cluster.

How to Reset Statistics

You can reset the statistics of business services and proxy services from the Service Health tab of the dashboard or from the Service Monitoring Details page for a service. Table 3-2 describes how to reset statistics in each case.

Table 3-2 How to Reset Statistics
To Reset Statistics from …	Description
Service Health Tab	Click the Service Health tab to go to the Service Health page. Click the Reset Statistics icon to reset statistics for a service. Click the Reset All Statistics link to reset statistics for all the services for which monitoring is enabled. Note: The Reset Statistics icon is available only when you set Display Statistics to Since Last Reset. For more information about how to reset statistics, see Resetting Statistics for Services in Monitoring in Using the AquaLogic Service Bus Console.
Service Monitoring Details Page	Click the name of the service in the alert history table or on the Service Health page to go to the Service Monitoring Details page. On the Service Monitoring Details page, select Since Last Reset in the Display Statistics field. Click Reset Statistics to reset the statistics for the given service.

What are the Consequences of Resetting the Statistics?

When you reset statistics for a service in the ALSB Console, all the statistics collected for the service since the last reset are lost. You cannot undo this action. The status of endpoint URIs is not reset when you reset statistics.

The Role of Alerts in Service Monitoring

SLA alerts are raised in ALSB to indicate potential violation of the Service Level Agreements (SLAs). You can use alerts for:

Monitoring and generating e-mail notification of WS-Security errors.
Monitoring the number of messages passing through a particular pipeline.
Detecting the violation of service level agreements with third-party products.
Detecting a non-responsive endpoint.

Pipeline alerts can be raised in the message flow of the proxy service. You can use the alerts in a message flow for:

Detecting errors in a message flow.
Indicating business occurrences.

Assigning Severity for Alerts

You can configure the severity of an alert in an alert rule for SLA alerts or in the Alert action of a message flow of a proxy service. The severity level of alerts is user configurable and has no absolute meaning. You can configure alerts with one of the following levels of severity:

Normal
Warning
Minor
Major
Critical
Fatal

The alert destinations are notified when an alert is raised. If you do not configure any alert destination in an alert rule or an alert action, the notifications are sent to the ALSB Console. For more information about alert destinations, see What are Alert Destinations?.

What are SLA Alerts?

SLA alerts are automated responses to violations of Service Level Agreements (SLAs). These alerts are displayed on the ALSB dashboard. They are generated when the service violates the service level agreement or a predefined condition. To raise an SLA alert, you must enable SLA alerting both at the service level and at the global level. For more information about how to configure operational settings for services, see How to Configure the Operational Settings for a Service. The Alert History panel contains a customizable table displaying information about violations or occurrences of events in the system.

You must define alert rules to specify unacceptable service performance according to your business and performance requirements. Each alert rule allows you to specify the aggregation interval for that rule when configuring the alert rule. This aggregation interval is not affected by the aggregation interval set for the service. For more information about aggregation interval, see Aggregation Intervals. Alert rules also allow you to send notifications to the configured alert destinations. For information on defining alert rules, see Creating and Editing Alert Rules in Monitoring in Using the AquaLogic Service Bus Console.

A Sample Use Case for SLA Alerts

Assume that a particular proxy service is generating SLA alerts due to slow response time. To investigate this problem, you must log into the ALSB Console and review the detailed statistics for the proxy service. At this level, you are able to identify that a third-party Web service invocation stage in the pipeline is taking a lot of time and is the actual bottleneck. You can use these alerts as the basis for negotiating SLAs. After successfully renegotiating SLAs with the third-party Web service provider, you must configure alert metrics to track the Web service provider's compliance with the new agreement terms.

What are Pipeline Alerts?

Pipeline alerts can be generated in a message flow whenever you define an Alert action available under the reporting category in the message flow.

You can also define conditions under which a pipeline alert is triggered using the conditional constructs available in the pipeline editor, such as the XQuery Editor or an if-then-else construct. The ALSB Console is the default alert destination for pipeline alerts. You can also configure the Alert Destination resource in an alert action to define additional destinations for pipeline alerts.

You have complete control over the alert body, including the pipeline and context variables. You can also extract portions of the message. For more information about how to configure Alert actions in a stage, see Adding Alert Actions in Proxy Service: Actions in Using the AquaLogic Service Bus Console. When the alert action is executed, the alert is sent to the appropriate alert destinations.

You can obtain an integrated view of all the pipeline alerts generated by a service on the dashboard page in the ALSB Console.

A Sample Use Case for Pipeline Alerts

Consider a case in which you want to be notified when special business conditions are encountered in a message flow. You can configure an alert action in a message flow to raise alerts when such predefined conditions are encountered. You can also configure an e-mail alert destination to receive an e-mail notification of the alert, and you can send the details to the e-mail recipient in the form of a payload.

For example, suppose that a proxy service routes orders to a purchase order website and you want to be notified when an order exceeding $10 million is routed. For this, you must configure an alert action with the condition at the appropriate place in the pipeline, configure an e-mail alert destination with the e-mail information, and use it as the target destination in the alert action. You can also include the details of the order in the form of a payload.

You can also use pipeline alerting to detect errors in a message flow. For example, in the case of a proxy service that validates the input document, you want to be notified when the validation fails so that you can contact the client to fix the problem. For this you must configure an alert action within the error handler for the message flow of the proxy service. In the action you can include the actual error message in the fault variable and other details in the SOAP header, to be sent as the payload. You can also configure additional alert destinations using an alert destination resource in the alert action.

How to View or Delete SLA Alerts

In the ALSB Console the extended alert history page for the SLA alerts contains information about all the SLA alerts that have been generated in the domain. You can view all the alerts that were triggered or search for specific alerts from the table. For more information about data displayed in the extended SLA alert history page, see Locating Alerts in Monitoring in Using the AquaLogic Service Bus Console.

You can delete the alerts from the Extended Alert History page or the View Alert Details page. You can filter your search using the Extended Alert History Filters pane. For more information on how to filter your search, see How to Filter a Search for SLA Alerts.

To view a pie or bar chart of the alerts, click View Bar Chart or View Pie Chart on the page. Click Purge SLA Alert History to delete all the SLA alerts. You can also purge the alerts based on the date and time they were raised.

How to View or Delete Pipeline Alerts

The extended alert history page for the pipeline alerts contains information about all the pipeline alerts that have been generated in the domain. You can view all the alerts that were triggered or search for specific alerts from the table. For more information about data displayed in the extended pipeline alert history page, see Locating Alerts in Monitoring in Using the AquaLogic Service Bus Console.

You can delete the alerts from the Extended Pipeline Alerts page or from the View Alert Details page. For more information on how to filter your search, see How to Filter a Search for Pipeline Alerts. To view a pie or bar chart of the alerts, click the View Bar Chart or View Pie Chart link on the page. Click Purge Pipeline Alert History to delete all the pipeline alerts. You can also purge the alerts based on the date and time they were raised.

How to Filter a Search for Specific Alerts

Use Extended Alert History to filter a search for specific alerts. The following sections describe how to search for SLA alerts and pipeline alerts using extended alert history filters.

How to Filter a Search for SLA Alerts

Use Extended Alert History Filters pane to filter a search for SLA alerts. Table 3-3 describes the various criteria on which you can filter SLA alerts.

Table 3-3 Search Criteria for SLA Alerts
Search Criterion	Description
Date Range	Use this to search for SLA alerts that were generated in the given interval of time. You can set the interval in one of the following ways: All; a timestamp interval in MM/DD/YY HH:Min:SS AM/PM format; or alerts generated during a given time interval, specified in the format days–hours–minutes.
Alert Severity	Specify the level of severity. The search result includes all the alerts that have the specified level of severity and above.
Service	Use this to search for a specific service.
Service Type	This is updated automatically when you search for a specific service.
Alert Name	Use this to search by alert name.
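The Alert Severity criterion returns alerts at the specified level and above. The following Python sketch shows one way to read that rule. It assumes that the severity levels are ordered as listed in Assigning Severity for Alerts, from Normal (lowest) to Fatal (highest); the function name and the dictionary keys are hypothetical and are not part of any ALSB API.

```python
# Assumed ascending order, following the list in "Assigning Severity for Alerts".
SEVERITIES = ["Normal", "Warning", "Minor", "Major", "Critical", "Fatal"]


def filter_by_severity(alerts, minimum):
    """Keep alerts whose severity is at or above the given level."""
    threshold = SEVERITIES.index(minimum)
    return [
        alert for alert in alerts
        if SEVERITIES.index(alert["severity"]) >= threshold
    ]


# Hypothetical alert records for illustration only.
alerts = [
    {"name": "SlowResponse", "severity": "Warning"},
    {"name": "EndpointDown", "severity": "Critical"},
    {"name": "HighVolume", "severity": "Minor"},
]

matches = filter_by_severity(alerts, "Minor")
print([alert["name"] for alert in matches])  # ['EndpointDown', 'HighVolume']
```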

How to Filter a Search for Pipeline Alerts

Use Extended Alert History Filters pane to filter a search for pipeline alerts. Table 3-4 describes the various criteria on which you can filter pipeline alerts.

Table 3-4 Search Criteria for Pipeline Alerts
Search Criterion	Description
Date Range	Use this to search for pipeline alerts that were generated in the given interval of time. You can set the interval in one of the following ways: All; a timestamp interval in MM/DD/YY HH:Min:SS AM/PM format; or alerts generated during a given time interval, specified in the format days–hours–minutes.
Alert Severity	Specify the level of severity. The search result includes all the alerts that have the specified level of severity and above.
Service	Use this to search for a specific service.
Service Type	This is updated automatically when you search for a specific service.
Alert Summary	Use this to search by alert summary.

What are Alert Destinations?

Alert destinations are resources to which alerts are sent. The ALSB Console is the default alert destination for the notification of any alert. Alerts are sent to the ALSB Console regardless of whether you configure an alert destination. The console provides information about the alerts generated due to SLA violations or as a result of alert actions configured in the pipeline. The dashboard page displays the overall health of ALSB. It provides an overview of the state of the system comprising server health, services health, and alerts.

For more information about how to interpret the information on the dashboard, see The ALSB Dashboard.

In ALSB you can configure Email, SNMP Traps, Reporting and JMS as alert destinations.

E-mail

The e-mail alert destination allows you to receive messages when alerts are raised in the ALSB Console. To configure this alert destination, you must use the SMTP server global resource or a JavaMail session in WebLogic Server. For more information on configuring a default SMTP Server resource, see Configuring a Default SMTP Server in Global Resource in Using the AquaLogic Service Bus Console. For more information about configuring JavaMail sessions, see Configure Access to JavaMail in WebLogic Server Administration Console Online Help.

The SMTP server global resource captures the address of the SMTP server, the port number, and, if required, the authentication credentials. The authentication credentials are stored inline and are not stored as a service account. The Alert Manager uses the e-mail alert destination to send outbound e-mail messages when pipeline alerts or SLA alerts are generated. When an alert is delivered, e-mail metadata consisting of the details about the alert is prefixed to the details of the configured payload.

You can specify the e-mail ID of the recipients in the Mail Recipients field. For more information about configuring an e-mail alert destination, see Adding E-Mail Recipients in Adding E-mail and JMS Recipients in Alert Destinations in Using the AquaLogic Service Bus Console.

SNMP Traps

Simple Network Management Protocol (SNMP) traps allow any third-party software to interface with the monitoring of service level agreements within ALSB. When the notification of alerts using SNMP is enabled, Web Services Management (WSM) and Enterprise Service Management (ESM) tools can monitor SLA violations and pipeline alerts by monitoring alert notifications.

SNMP is an application-layer protocol which allows the exchange of information on the management of a resource across a network. It enables you to monitor a resource and, if required, take some action based on the data obtained from the resource. Both the SNMP version 1 and SNMP version 2 are supported by ALSB. SNMP includes the following components:

Managed Resource
Management Information Base (MIB)
SNMP Agent
SNMP Manager
Network Management System (NMS)

Reporting

The Reporting destination allows you to send notifications of pipeline alerts or SLA alerts to the custom reporting provider that can be developed using the reporting APIs provided with ALSB. This allows third parties to receive and process alerts in custom Java code.

JMS

Java Message Service (JMS) is another destination for pipeline alerts and SLA alerts. You must configure a JNDI URL for the JMS destination for alerts. When you configure an alert rule to post a message to a JMS destination, you must create a JMS connection factory and a queue or topic, and target them to the appropriate JMS server in the WebLogic Server Administration Console. For information on how to do this, see Configuring a JMS Connection Factory and JMS Resource Naming Rules for Domain Interoperability in Configuring JMS System Resources in Configuring and Managing WebLogic JMS. When you define the JMS alert destination, you can use either a destination queue or a destination topic. The message type can be bytes or text. For more information about how to configure a JMS alert destination, see Adding JMS Recipients in Adding E-mail and JMS Recipients in Alert Destinations in Using the AquaLogic Service Bus Console.

What are Operational Settings for a Service?

Operational settings enable you to control the state of a service in the ALSB Console. Table 3-5 describes operational settings for services in the ALSB Console.

Table 3-5 Operational Settings for Services in the ALSB Console
Operational Settings	Usage	Default Value When a Service is Created
State	Use this to enable or disable a service.	Enabled
Monitoring	Use this to enable or disable service monitoring.	Disabled
Aggregation Interval	Use this to set the aggregation interval for the service.	10 minutes
SLA Alerting	Use this to enable SLA alerting for services at a specific level of severity or above. You can also use this to disable SLA alerting for a service.	Enabled
Pipeline Alerting	Use this to enable pipeline alerting for proxy services at a specific severity level or above. You can also use this to disable pipeline alerting for proxy services.	Enabled at Normal level or higher
Message Reporting	Use this to enable or disable message reporting for proxy services.	Enabled at Normal level or higher
Logging	Use this to enable logging at a specific severity level or above. You can also use this to disable logging for proxy services.	Enabled at Debug level or higher
Tracing	Use this to enable or disable tracing for proxy services.	Disabled
Offline Endpoint URIs	Use this to enable or disable taking non-responsive endpoint URIs offline for business services. You can also specify the interval of time to wait before retrying the offline endpoint URI. You can enable or disable offline URIs for business services only.	Disabled
Throttling State	Use this to enable or disable throttling for a business service.	Disabled
Maximum Concurrency	Use this to restrict the number of messages that can be concurrently processed by a business service.	0
Throttling Queue	Use this to restrict the maximum number of messages in the throttling queue.	0
Message Expiration	Use this to set the maximum time interval (in milliseconds) for which a message can be placed in the throttling queue.	0

How to Configure the Operational Settings for a Service

You can enable or disable the operational settings for an individual service from the Operation Settings view of the View a Proxy Service (see Figure 3-3) or View a Business Service page (see Figure 3-4). For more information, see Creating and Configuring Business Services in Business Services: Creating and Managing and Creating and Configuring Proxy Services in Proxy Services: Creating and Managing in Using the AquaLogic Service Bus Console.

Some operational settings, such as service state, monitoring, SLA alerting, and pipeline alerting, can be enabled or disabled through public APIs. For more information, see Javadoc for AquaLogic Service Bus.

You can configure the following operational settings for proxy services and business services:

State
Aggregation Interval
Monitoring
SLA Alerting
Pipeline Alerting
Message Reporting
Tracing
Pipeline Logging
URI Offline Interval
Throttling settings

How to Configure the Operational Settings at the Global Level

You can access the Global Settings page from the operations module. You can use the Global Settings page (see Figure 3-5) to configure the operational settings for services. Table 3-6 describes the usage of the operational settings at the global level.

Table 3-6 Usage of Operational Settings at the Global Level
Operational Settings	Usage
Monitoring	Use this to enable monitoring for all services in a domain. Click the check box associated with Enable Monitoring to enable or disable monitoring for all the services in the domain.
SLA Alerting	Use this to enable SLA alerting for all services in a domain. Click the check box associated with Enable SLA Alerting to enable or disable SLA alerting for all the services in the domain.
Pipeline Alerting	Use this to enable pipeline alerting for proxy services in a domain. Click the check box associated with Enable Pipeline Alerting to enable or disable pipeline alerting for all the proxy services in the domain.
Reporting	Use this to enable message reporting for proxy services in a domain. Click the check box associated with Enable Reporting to enable or disable message reporting for all the proxy services in the domain.
Logging	Use this to enable logging for proxy services in a domain. Click the check box associated with Enable Logging to enable or disable logging for all the proxy services in the domain.

For more information, see Enabling Global Settings in Configuration in Using the AquaLogic Service Bus Console.

The Enable Monitoring option allows you to enable or disable monitoring of all services that have individually been enabled for monitoring. If monitoring for a particular service has not been enabled, you must first enable it and set the aggregation interval on the Manage Monitoring page before the system starts collecting statistics for that service.
Enable or disable these settings at the global level in conjunction with the settings at the service level to effectively enable or disable them. The operational settings at the global level supersede the operational settings at the service level.
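One way to read the interaction between the two levels is that a setting takes effect for a service only when it is enabled both globally and for that service. The following Python sketch illustrates this reading under that assumption; the function name is hypothetical and is not an ALSB API.

```python
def is_effectively_enabled(global_enabled, service_enabled):
    """A setting is in effect only if enabled at both levels (assumed reading)."""
    return global_enabled and service_enabled


# Monitoring enabled for the service, but disabled globally: not in effect.
print(is_effectively_enabled(global_enabled=False, service_enabled=True))  # False

# Enabled at both levels: in effect.
print(is_effectively_enabled(global_enabled=True, service_enabled=True))   # True
```

Under this reading, disabling a setting at the global level turns it off for every service without changing the individual service-level values, so re-enabling it globally restores the previous per-service behavior.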

Updates to Operational Settings During Import of ALSB Configurations

When a service is overwritten by importing a configuration from a config jar, the operational settings of this service can also be overwritten. To preserve the operational settings during import, you must set the Preserve Operational Settings flag to true while importing the service. For more information, see What Happens to Alert Rules When You Import ALSB Configurations?

Updates to Global Settings During the Import of ALSB Configurations

When you import ALSB configurations, if the config jar that is being imported also contains the global settings of the domain from which it was exported, these domain-level settings can be overwritten. To prevent this, set the Preserve Operational Settings flag to true while importing the service.

How to Preserve Operational Settings During the Import of ALSB Configuration Through APIs

You can preserve operational settings during the import of ALSB configurations using APIs. For more information, see Importing and exporting configuration using the new API in Interface ALSBConfigurationMBean. Modify the MBean as shown in Listing 3-1 to preserve the operational settings during the import.

SLA Alerting Functionality in ALSB

In ALSB you must define conditions based on which alerts are raised. The conditions are configured in an SLA alert rule. The alert rule also configures the severity level and an alert destination for an alert.

How to Configure SLA Alert Rules

SLA alerts are automated responses to SLA violations, which are displayed on the dashboard. You must define alert rules to specify unacceptable service performance according to your business and performance requirements. When you configure an alert rule, you must specify the aggregation interval. The alert aggregation interval is not affected by the aggregation interval set for the service. For more information about aggregation intervals, see Aggregation Intervals.

General Configuration– defines the name, description, summary, duration, severity, frequency, state of the enabled alert rule and other general characteristics.
Define Condition– defines one or more conditions that trigger the alert rule. Additionally, you must define the aggregation interval for the condition on this page.

For more information about creating an alert rule, see Creating And Editing Alert Rules in Monitoring in Using the AquaLogic Service Bus Console.

For a service for which monitoring is enabled, an alert rule is evaluated at discrete intervals. After an alert rule is created, it is first evaluated at the end of its aggregation interval, and after that at the end of each sample interval. For example, if the aggregation interval of an alert rule is five minutes, the rule is evaluated five minutes after it is created, and then every minute after that (because the sample interval for a five-minute aggregation interval is one minute).
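
The evaluation schedule described above can be sketched as follows. This is a minimal illustration, not ALSB code: the function name is hypothetical, and the sample interval is passed in as a parameter because this section states only the pairing of a five-minute aggregation interval with a one-minute sample interval.

```python
def evaluation_times(aggregation_minutes, sample_minutes, count):
    """Return the minutes after rule creation of the first `count` evaluations.

    The first evaluation happens at the end of the aggregation interval;
    each later evaluation happens one sample interval after the previous one.
    """
    return [aggregation_minutes + i * sample_minutes for i in range(count)]


# Five-minute aggregation interval with a one-minute sample interval.
print(evaluation_times(5, 1, 4))  # [5, 6, 7, 8]
```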

If a rule evaluates to false, no alert is generated. If the rule evaluates to true, alert generation is governed by the Alert Frequency. If the frequency is Every Time, an alert is generated every time the alert rule evaluates to true. If the frequency is Notify Once, an alert is generated only if no alert was generated in the previous evaluation. In other words, an alert is generated the first time the alert rule evaluates to true, and no more notifications are generated until the condition resets itself and evaluates to true again.
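
The difference between the two frequencies can be sketched with a short simulation. This is an illustration under stated assumptions, not the ALSB implementation: the function name is hypothetical, and Notify Once is modeled as "alert only when the rule changes from false to true", which follows the "first time the rule evaluates to true after a reset" reading of the paragraph above.

```python
def alerts_generated(evaluations, frequency):
    """Return one boolean per evaluation: True where an alert is generated.

    evaluations: rule results (True/False) in evaluation order.
    frequency:   "Every Time" or "Notify Once".
    """
    fired = []
    previous = False  # result of the previous evaluation; assumed false before the first
    for result in evaluations:
        if frequency == "Every Time":
            fired.append(result)
        elif frequency == "Notify Once":
            # Alert only on a transition from false to true.
            fired.append(result and not previous)
        else:
            raise ValueError(f"unknown frequency: {frequency}")
        previous = result
    return fired


history = [False, True, True, False, True]
print(alerts_generated(history, "Every Time"))   # [False, True, True, False, True]
print(alerts_generated(history, "Notify Once"))  # [False, True, False, False, True]
```

With Notify Once, the second consecutive true result produces no alert; a new alert appears only after the rule has evaluated to false and then to true again.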

How to Look Up or Edit Existing Alert Rules

The View Alert Rule Details page displays complete information about a specific alert rule, as shown in Figure 3-6. You can view the details of the alert rule and edit its configuration from this page. For more information about how to edit an alert rule, see Creating and Editing Alert Rules in Monitoring in Using the AquaLogic Service Bus Console.

How to Rename Alert Rules

You can rename alert rules from the SLA Alert Rules tab of the View a Business Service or View a Proxy Service page. To rename an alert rule, click the Rename Alert Rule icon. Enter the new name for the alert rule in the New Alert Rule Name field of the Rename Alert Rule window. Click Rename. Click Update and activate the session to complete the change. For more information, see Viewing Alert Rules in Monitoring in Using the AquaLogic Service Bus Console. The Rename icon for the renamed alert is now disabled.

What are the Consequences of Renaming an Alert Rule?

When alerts are triggered, they are listed on the alert history page. Click the View Alert Rule Details action icon to access the alert rule page. However, after you rename an alert rule, you cannot access the alert rule through the View Alert Rule Details action for alerts that were raised before the rename; the icon is disabled for those alerts. You can still access the Alert Details page from the alert history page for alerts raised both before and after the rename. In the Alert Details page, the alert name is grayed out for alerts that were raised before the alert rule was renamed.

Similar limitations exist when you attempt to access the alert rule page by clicking the alert name link on the alert details page. The alert name of alerts generated by alert rules that are later renamed refers to an outdated name. You can view the old alert rule, but the name is grayed out, indicating that the alert rule has been renamed.

When you rename an alert rule, the conditions on which the rule is based are preserved. The aggregation interval of the alert rule is also preserved. The alert is raised at the end of the first aggregation interval after the alert rule is renamed. For example, consider an alert rule a1 with an aggregation interval of five minutes. If the alert rule is renamed to a2 after two minutes of execution, the next alert under the name a2 is generated three minutes after the alert rule is renamed.

What Happens to Alert Rules When You Import ALSB Configurations?

You can preserve the alert rule configurations when you import ALSB configurations. When you import ALSB configurations, the operational settings are preserved. When a service with alert rules exists in a jar that you import but does not exist in the target domain, the service is imported as is, along with its alert rules. However, if the same service also exists in the target domain, the import behavior is governed by the state of the Preserve Operational Settings flag during the import operation. For more information on how to preserve operational settings, see Updates to Operational Settings During Import of ALSB Configurations.
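
The import behavior described above can be sketched as a small decision function. This is a simplified model under stated assumptions, not the ALSB import logic: the function name, the dictionary layout, and the service names are hypothetical, and alert rules are assumed to be handled together with the other operational settings governed by the Preserve Operational Settings flag.

```python
import copy


def import_service(target_domain, name, imported, preserve_operational_settings):
    """Apply one imported service to target_domain (a dict keyed by service name).

    - Service absent from the target: imported as is, alert rules included.
    - Service present and flag true: the target's operational settings are kept.
    - Service present and flag false: the imported settings overwrite them.
    """
    existing = target_domain.get(name)
    merged = copy.deepcopy(imported)
    if existing is not None and preserve_operational_settings:
        merged["operational_settings"] = copy.deepcopy(existing["operational_settings"])
    target_domain[name] = merged
    return merged


# Hypothetical target domain and config jar contents.
domain = {"OrderProxy": {"operational_settings": {"monitoring": True, "alert_rules": ["a1"]}}}
from_jar = {"operational_settings": {"monitoring": False, "alert_rules": []}}

# Existing service, flag set to true: the target's settings survive the import.
import_service(domain, "OrderProxy", from_jar, preserve_operational_settings=True)
print(domain["OrderProxy"]["operational_settings"])

# New service: imported as is, regardless of the flag.
import_service(domain, "NewProxy", from_jar, preserve_operational_settings=True)
print(domain["NewProxy"]["operational_settings"])
```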

The ALSB Dashboard

The ALSB dashboard displays service health, server health, and details of all the alerts that have been triggered in the ALSB run time. The dynamic refresh of this display is controlled by the Dashboard Refresh Rate setting in the User Preferences page. The default option for this setting is No Refresh. The alerts can be the result of SLA violations or pipeline alerts. Service Level Agreements (SLAs) are agreements that define the precise level of service expected from the business and proxy services in ALSB. Pipeline alerts are defined in the message flow for business purposes, such as recording the number of messages that flow through the message pipeline, tracking occurrences of certain business events, or reporting errors, but not for monitoring the health of the system.

Each row of the table displays the information that you have configured, such as the severity, timestamp, and associated service. Clicking the Alert Name link displays the Alert Details page, which provides more details about the SLA alert and helps you analyze its cause. Clicking the Alert Summary link displays the Alert Details page, which provides more details about the pipeline alert and helps you analyze its cause.

From the dashboard, you can drill down into the system and easily find specific information, such as the average execution time of a service, the date and time an alert occurred, or the duration for which a server has been running.

The following sections help you to understand the information displayed on the ALSB dashboard.

How to Access Service Statistics for the Current Aggregation Interval

Click the Service Health tab to access the Service Health page. The Service Health page is displayed as shown in Figure 3-8.

This is a dynamic view of statistical data collected by each service. This view is available when you select Current Aggregation Interval in the Display Statistics field. The aggregation interval displayed in this view determines the statistics that are displayed. For example, if the aggregation interval of a particular service is twenty minutes, that service’s row displays the data collected in the last twenty minutes. From this page you can view all services or search for services based on the given criteria. For more information about the statistics displayed in this page, in the Current Aggregation Interval view, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.

The Service Monitoring Details page provides you with two views of detailed information about a specific service. Figure 3-9 shows the Service Monitoring Details page for a business service in the current aggregation interval. Figure 3-10 shows the Service Monitoring Details page for a proxy service. To access this page, click the name of the service in the Service With Most Alerts section, the Alert History table, or the extended alert history table for SLA alerts and pipeline alerts. The name of the service in the Service Health tab is also a link to the Service Monitoring Details page.

This is a dynamic view of statistical data collected by each service. This view is available when you select Current Aggregation Interval in the Display Statistics field. The aggregation interval displayed in this view determines the statistics that are displayed. For example, if the aggregation interval of a particular service is twenty minutes, this page displays the data collected in the last twenty minutes for that service. For more information on different tabs available for a business service, see Service Metrics, Operations, and Endpoint URIs. For more information on different tabs available for a proxy service, see Service Metrics and Flow Components and Operations.

For more information about the statistics displayed in this page, in the Current Aggregation Interval view, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.

How to Access Running Count Statistics for Services

The running count statistics for a service are statistics that are available since the last reset.

This view is a running count of the service health metrics. This view is available when you select Since Last Reset in the Display Statistics field. The statistics displayed in each row are for the period since you last reset the statistics for an individual service or since you last reset the statistics for all services. You can also reset statistics for selected services or for all services. For more information about the statistics displayed in this page, in the Since Last Reset view, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.

This view is a running count of the service monitoring metrics. This view is available when you select Since Last Reset in the Display Statistics field. The statistics displayed in each row are for the period since you last reset the statistics for an individual service or since you last reset the statistics for all services. From this page you can view all services or search for services based on the given criteria. You can also reset statistics for this service. For more information about the statistics displayed in this page, in the Since Last Reset view, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.

You have the following tabs in the Service Monitoring Details page for each of the views:

Service Metrics

The Service Metrics (see Figure 3-14) view displays the metrics for a proxy service or a business service.

General– This section enables you to quickly view the status of the alerts and service level statistics for the service in the current aggregation interval. When you view the service level statistics for the time interval since the last reset, this section displays all the metrics since they were last reset. For more information about the metrics displayed in this view, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.
Throttling – This section enables you to view the throttling statistics for a business service. You can also see the minimum and maximum throttling time in milliseconds. For more information on throttling statistics, see Viewing Service Metrics in Monitoring in Using the AquaLogic Service Bus Console.

Operations

These metrics are displayed for WSDL-based services for which you have defined operations. The Operations tab (see Figure 3-15) displays the statistics for the operations defined in a WSDL-based service. For more information about the statistics displayed in this tab, see Viewing Operations Metrics for WSDL Based Services in Monitoring in Using the AquaLogic Service Bus Console.

Flow Components

This view (see Figure 3-16) gives information on various components of the pipeline of the service. The Flow Components tab is available only for proxy services. For more information about the statistics displayed in this tab, see Viewing Flow Components Metrics in Monitoring in Using the AquaLogic Service Bus Console.

Endpoint URIs

The Endpoint URIs tab of the Service Monitoring page for a business service gives statistics of the various endpoint URIs configured for a business service and their status. For more information about the statistics displayed in this view, see Viewing Business Services Endpoint URIs Metrics in Monitoring in Using the AquaLogic Service Bus Console.

Viewing SLA Alerts in the Dashboard

You can view the details of SLA alerts in the SLA Alerts tab of the dashboard. Table 3-7 describes the dashboard for SLA alerts:

Table 3-7 ALSB Dashboard for SLA Alerts
Section	Description
SLA Alerts	The pie chart shows the distribution of SLA alerts based on their severity for the duration set for alert history in the dashboard settings page. The severity level of alerts is user configurable and has no absolute meaning. For more information about alert severity, see Assigning Severity for Alerts. Click on a specific area in the pie chart to display the Extended SLA Alert History page for alerts for the chosen level of severity and alert history duration.
Services With Most SLA Alerts	This section lists the services with the most SLA alerts in the current aggregation interval.
Alert History	This section gives details for all the SLA alerts generated during the alert history duration. For more information, see Viewing the Alert History for SLA Alerts.

Viewing the Alert History for SLA Alerts

The Alert History table for SLA alerts (Figure 3-7) shows all the SLA alerts that have occurred in the alert history duration you have set in the User Preferences page. For more information about the Alert History table, see Viewing SLA Alerts in Monitoring in Using the AquaLogic Service Bus Console.

To view a complete list of alerts, click Extended Alert History. For more information about Extended Alert History, see How to View or Delete SLA Alerts.

Viewing Pipeline Alerts in the Dashboard

You can view the pipeline alerts in the Pipeline Alerts tab of the dashboard (see Figure 3-18). Table 3-8 describes the dashboard for pipeline alerts.

Table 3-8 Dashboard for Pipeline Alerts
Section	Description
Pipeline Alerts	The pie chart shows the distribution of pipeline alerts based on their severity for the duration set for alert history in the dashboard settings page. The severity level of alerts is user configurable and has no absolute meaning. For more information about alert severity, see Assigning Severity for Alerts. Click on a specific area in the pie chart to display the Extended Pipeline Alert History page for alerts for the chosen level of severity and alert history duration.
Service With Most Alerts	This section lists the services with the most pipeline alerts in the current aggregation interval.
Alert History	This section gives details for all the pipeline alerts generated during the alert history duration. For more information, see Viewing the Alert History for Pipeline Alerts.

Viewing the Alert History for Pipeline Alerts

The Alert History table for pipeline alerts (Figure 3-7) shows all the pipeline alerts that have occurred in the alert history duration you have set in the User Preferences page. For more information about the Alert History table, see Viewing Pipeline Alerts in Monitoring in Using the AquaLogic Service Bus Console.

To view a complete list of alerts, click Extended Alert History. For more information about Extended Alert History, see How to View or Delete Pipeline Alerts.

Viewing Server Health in the Dashboard

Viewing Log Summary

The Log Summary section in the Server Health tab displays the summary log for the servers associated with the domain. The domain log file provides a central location from which to view the overall status of the domain. Each server instance forwards a subset of its messages to a domain-wide log file. By default, servers forward only messages of severity level Notice or higher. You can modify the set of messages that are forwarded. For more information, see Understanding WebLogic Logging Services in Configuring Log Files and Filtering Log Messages.
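
The default forwarding rule can be sketched as a threshold on an ordered list of severities. The snippet below is an illustration only, not WebLogic Server code: the function name is hypothetical, the severity names are those listed in Table 3-9, and their ordering from least to most severe is assumed to follow the standard WebLogic Server severity ranking.

```python
# Severities from Table 3-9, ordered from least to most severe (assumed ordering).
SEVERITIES = ["Info", "Notice", "Warning", "Error", "Critical", "Alert", "Emergency"]


def is_forwarded(severity, threshold="Notice"):
    """Return True if a message of this severity reaches the domain log.

    By default, only messages of severity Notice or higher are forwarded.
    """
    return SEVERITIES.index(severity) >= SEVERITIES.index(threshold)


forwarded = [s for s in SEVERITIES if is_forwarded(s)]
print(forwarded)  # every severity except Info
```

Passing a different threshold models a server that has been configured to forward a different set of messages.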

If you configure a log action in a pipeline, the log messages are forwarded to the admin server log in a single-node domain, and you can view them in the Server Health tab of the ALSB Console. In a cluster, the messages are forwarded to the managed server log, and you cannot view them from the ALSB Console unless you configure WebLogic Server to forward these messages to the domain log. For information on how to do this, see Create Log Filters in the WebLogic Server Administration Console Online Help.

To see the number of messages currently raised by the system, click the View Log Summary link in the Server Summary panel. A table is displayed that contains the number of messages grouped by severity, as shown in Figure 3-19.

You can view the log summary only if you possess administrator privileges in the WebLogic Server Console.

Table 3-9 Log Summary Messages
Message	Description
Alert	This indicates that a particular service is in an unusable state while other parts of the system continue to function. Automatic recovery is not possible; immediate attention of the administrator is required to resolve the problem.
Critical	This indicates that a system or service error has occurred. The system can recover but there might be a momentary loss or permanent degradation of service.
Emergency	This indicates that the server is in an unusable state. This severity indicates a severe system failure.
Error	This indicates that a user error has occurred. The system or application can handle the error with no interruption. Limited degradation of service may occur.
Info	This reports normal operations; a low-level informational message.
Notice	This is an informational message with a higher level of importance than Info messages.
Warning	This indicates that a suspicious operation or configuration has occurred. However, normal operations may not be affected.

This display is based on the health state of the running servers, as defined by the WebLogic Diagnostic Service. For more information about the WebLogic Diagnostic Service, see Configuring and Using the WebLogic Diagnostics Framework.

To view the domain log for a particular message status, click the number corresponding to that status. The domain log file is displayed in the ALSB Console.

For more information about domain log file, see Viewing Domain Log Files in Monitoring in Using the AquaLogic Service Bus Console.

To display details of a single log file on the page, select the appropriate log, then click View. You can also customize the Domain Log File Entries table to view the following additional information:

Machine
Server
Thread
User ID
Transaction ID
Context ID
Timestamp

For a description of these fields, see Viewing Details of Server Log Files in Monitoring in Using the AquaLogic Service Bus Console. For more information about how to customize the Domain Log File Entries table, see Customizing Your View of Domain Log File Entries in Monitoring in Using the AquaLogic Service Bus Console.

Viewing Server Summary

You can view the Server Summary in the Server Health tab of the dashboard. In a single-node domain, the Server Summary displays the summary of the admin server. In a cluster domain, it displays the health of all the servers in the cluster. For more information on Server Summary, see Viewing Server Information in Monitoring in Using the AquaLogic Service Bus Console.

Viewing Server Details

You can access this page by clicking the name of a server under server summary or by clicking the name of a server in the Servers Summary page.

This page enables you to view more server monitoring details, as shown in Figure 3-21.

The information displayed on this page is a subset of the Monitoring tab in the ALSB Console Server Settings page. Table 3-10 describes the available information.

Table 3-10 Server Information
Information	Description
General	This provides general run-time information about the server. Click Advanced to view more information, such as WebLogic Server version or operating system name.
Channels	This provides monitoring information about each channel.
Performance	This provides information about the performance of the server.
Threads	This provides current run-time characteristics and statistics for the server’s active executable queues.
Timers	This provides information about the timer used by the server.
Workload	This provides statistics for work managers, constraints, and policies configured on the server.
Security	This provides security monitoring information for the server.
JMS	This allows you to monitor JMS information about the server.
JTA	This provides the summary of all transaction information for all resource types on the server.

Operations Guide