6 Monitoring Resequencing Groups

This chapter describes how to monitor resequencing groups for running pipelines, and how to recover from any errors that occur while resequencing and processing messages.


6.1 Introduction to Resequencing Groups

Service Bus pipelines can be configured to use a resequencer to re-order messages that arrive in a random order into a new order based on the resequencing strategy chosen.

Resequencer strategies include best effort, standard, and FIFO. Messages are resequenced based on their resequencing group ID and their sequence ID, which are defined in the pipeline configuration. For information about how each strategy orders messages, see Introduction to the Resequencer and Resequencing Order in Developing SOA Applications with Oracle SOA Suite.
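
The role of the two identifiers can be illustrated with a small sketch. It is a toy model only, not Service Bus code: the Message type, its field names, and the order_by_group function are hypothetical, and the sketch does not reproduce the exact behavior of any of the three strategies. It only shows messages being bucketed by group ID and then ordered by sequence ID within each group.

```python
from collections import defaultdict
from typing import NamedTuple


class Message(NamedTuple):
    group_id: str     # resequencing group ID (hypothetical field name)
    sequence_id: int  # position within the group (hypothetical field name)
    payload: str


def order_by_group(messages):
    """Bucket messages by group ID, then sort each bucket by sequence ID."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg.group_id].append(msg)
    return {gid: sorted(msgs, key=lambda m: m.sequence_id)
            for gid, msgs in groups.items()}


# Messages arrive out of order and interleaved across two groups.
arrivals = [
    Message("orders-A", 3, "c"),
    Message("orders-B", 1, "x"),
    Message("orders-A", 1, "a"),
    Message("orders-A", 2, "b"),
]
ordered = order_by_group(arrivals)
```

Each group is ordered independently, so a delay in one group does not affect the order of another.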

The Resequence Messages tab of the Service Bus Home page in Fusion Middleware Control displays resequencing information so you can monitor the health of resequencing groups. This page shows message and group status, along with the Service Bus pipeline and project associated with each message and group. From this page, you can unlock a group that has timed out, resubmit failed messages, and skip a message that is blocking group processing.

No health statistics are exposed for the resequencer. Instead, you can monitor the statistics for the associated service components, such as the pipelines and proxy services.

6.1.1 Oracle Service Bus Resequencing Message States

The Resequence Messages page displays a running status of message processing for resequencing groups and messages. A resequencing message can be in one of the following states.

Running
Faulted
Completed
Aborted

When a group is in the running state, all messages are being processed normally. In the completed state, the group has finished processing all available messages. A resequencing group can be in a faulted state due to a resequencing error, message error, database error, or group timeout. Completed messages only appear in the results if the Purge Completed Messages global setting is not selected. Otherwise, completed messages are purged from the database and cannot be displayed on this page.

6.1.2 Resequencer Error Handling

When message processing is suspended in a resequencing group due to a fault or a timeout, you can view additional information about the suspended group and specify how to restart message processing. Depending on the type of fault, you can cancel processing for the message or you can modify the payload and reprocess the message. When a group times out, you can skip to the next available instance to restart processing.

Resequencing errors can occur during message persistence or message execution. Persistence errors include those that occur when evaluating the group ID or sequence ID, or when persisting the payload and message context variables. Execution errors include those that occur when accessing the database or when processing the message when it is sent to the pipeline. A group timeout can occur for the standard resequencer when a group is waiting for an expected message that does not arrive.

6.1.3 Resequencer Database

The resequencer relies on a database for processing messages. The database tables are created automatically when you run the Repository Creation Utility (RCU) while creating a Service Bus domain. Messages are purged from the database only when you configure the resequencer global settings to do so. For more information about how messages and message metadata are purged, see Automatic Purging of Completed Resequencer Messages.

Service Bus provides scripts to purge and manage the resequencer tables in the database. For more information, see Managing Resequencer Tables.

6.1.4 How Deployment Activities Affect Resequencing

Modifying a resequencer after it has been activated affects how the messages are processed at runtime. Activities that affect resequencers include updating the resequencer configuration, deleting a resequencer, or renaming or moving a pipeline associated with a resequencer. Under normal processing, the resequencer stores messages in the database and, once the messages are re-ordered, the messages are executed in another thread. When you remove a resequencer from a pipeline while messages are being processed, the following occurs:

Messages that have not been picked up from the database for processing remain in the database and are not automatically cleaned up.
Messages currently being picked up from the database for processing (but not yet sent to the pipeline) might generate an error message stating that the resequencer is undeployed. These messages also remain in the database.
Messages currently being processed by the pipeline are executed using the previous resequencer configuration.

If you rename or move the pipeline associated with a resequencer, the resequencer is stopped and a new resequencer instance is created using the new path. If messages are already being processed when the change is made, messages are processed as described above.

Updating a resequencer configuration while it is processing messages may result in messages that are not processed. For example, if you modify the group ID, messages stored under the old group ID are not picked up for processing and remain in the database until they are manually cleared.

6.1.5 How Server Shutdown Affects Resequencing

The resequencer handles messages differently when the server shuts down, depending on where the message is in the process. This section describes three different cases.

6.1.5.1 Server shuts down while a message is being transferred to the resequencer from Service Bus

For message persistence, the resequencer participates in the current transaction if it exists; otherwise it starts its own transaction. If the proxy service is transactional, the transaction is rolled back and the message is redelivered based on the transactional setting on the inbound service. If the proxy service is non-transactional, the message may be lost depending on whether the resequencer could commit its transaction before the server shut down.
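
The cases in this paragraph amount to a small decision table, sketched below. The function name and its parameters are hypothetical and exist only to restate the paragraph; they are not part of any Service Bus API.

```python
def outcome_on_shutdown(proxy_is_transactional: bool,
                        resequencer_committed: bool) -> str:
    """Restate the shutdown cases for a message being handed to the resequencer."""
    if proxy_is_transactional:
        # The resequencer joined the inbound transaction, which is rolled back.
        return "rolled back and redelivered per the inbound transactional setting"
    if resequencer_committed:
        # The resequencer's own transaction committed before the shutdown.
        return "persisted in the resequencer database"
    # Non-transactional proxy and no commit: nothing holds the message.
    return "may be lost"


cases = {
    (True, False): outcome_on_shutdown(True, False),
    (False, True): outcome_on_shutdown(False, True),
    (False, False): outcome_on_shutdown(False, False),
}
```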

6.1.5.2 Server shuts down while a group is locked by the locker thread

When the locker marks a group as locked and the server that is supposed to process the group's messages shuts down, the resequencer attempts to move this group to a different managed server for processing.

6.1.5.3 Server shuts down while a message is being processed by the resequencer

When the server shuts down while a message is being processed, the message remains available to the resequencer and is processed once the server comes online again. There may be instances where the message is sent twice for processing from the resequencer to Service Bus. As an example, if the resequencer starts a transaction and then calls the Service Bus dispatch, after which the server shuts down, the message is again sent for processing when the server comes back online.
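
The duplicate-delivery case can be reproduced with a toy simulation. Everything here is an illustrative assumption: ToyResequencerStore stands in for the resequencer database, process_next stands in for one processing step, and the crash_before_commit flag models a shutdown that happens after dispatch but before the transaction commits.

```python
class ToyResequencerStore:
    """In-memory stand-in for the resequencer database (hypothetical)."""

    def __init__(self, messages):
        self.pending = list(messages)


def process_next(store, dispatch, crash_before_commit=False):
    """Dispatch the next pending message, then commit its removal."""
    msg = store.pending[0]    # transaction starts: read the next message
    dispatch(msg)             # message is handed to the pipeline
    if crash_before_commit:
        return                # server stops; the removal is never committed
    store.pending.pop(0)      # commit: the message is no longer pending


dispatched = []
store = ToyResequencerStore(["m1", "m2"])
process_next(store, dispatched.append, crash_before_commit=True)  # shutdown
process_next(store, dispatched.append)  # after restart, m1 is sent again
process_next(store, dispatched.append)
```

After these three calls, m1 has reached the pipeline twice and m2 once, and no messages remain pending.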

6.2 Configuring Resequencing at Runtime

For services that use a resequencer, you can configure global settings that govern how the resequencers process messages at runtime. Service Bus provides these operational settings for resequencers.

These are global settings only, and cannot be applied at the service level.

Resequencer Locker Thread Sleep: The sleep interval for the locker threads in seconds. When the resequencer is unable to find a group with messages that can be processed, the locker thread sleeps for the specified duration. The locker thread does not sleep between each iteration of a database seek, as long as it finds groups with messages that can be processed.
Resequencer Maximum Groups Locked: The maximum number of resequencer groups that can be retrieved for processing in a single iteration of a database seek. Once retrieved, the groups are assigned to worker threads for processing.
Purge Completed Messages: When this option is selected, Service Bus purges resequenced messages that have completed processing from the resequencer database.
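
The interaction between the two locker settings can be sketched as a polling loop. This is a simplified model under stated assumptions, not the actual locker implementation: run_locker, fetch_ready_groups, and assign_to_worker are hypothetical names, the loop runs for a fixed number of iterations so that the example terminates, and the values 10 and 2 are arbitrary.

```python
import time


def run_locker(fetch_ready_groups, assign_to_worker, sleep_seconds,
               max_groups_locked, iterations, sleep=time.sleep):
    """Poll for processable groups; sleep only when a seek finds none."""
    for _ in range(iterations):
        # One database seek retrieves at most max_groups_locked groups.
        groups = fetch_ready_groups(max_groups_locked)
        if not groups:
            sleep(sleep_seconds)   # nothing to do: wait before the next seek
            continue
        for group in groups:
            assign_to_worker(group)  # worker threads process the groups


# Toy data source: three groups are ready, then nothing.
ready = ["g1", "g2", "g3"]


def fetch(limit):
    batch = ready[:limit]
    del ready[:limit]
    return batch


assigned, naps = [], []
run_locker(fetch, assigned.append, sleep_seconds=10, max_groups_locked=2,
           iterations=3, sleep=naps.append)
```

In this run the first two seeks return groups, so the loop does not sleep between them. Only the third seek, which finds nothing, triggers a sleep.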

For information about configuring global settings, see Viewing and Configuring Operational Settings.

Note:

If you want to monitor successful as well as faulted instances for resequencing, enable execution tracing for the pipeline as well.

6.3 Monitoring Resequencing Groups and Messages

You can monitor resequenced messages from the Service Bus Home page on the Resequence Messages tab.

The Resequence Messages page lets you search for specific groups or components to monitor, and you can filter the results by the message state. Use this page to see whether any messages have faulted or if all groups are processing messages normally.
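
The search behavior can be pictured as a filter over rows of the results table. The sketch below is illustrative only: the Row type, the search function, and the sample pipeline paths are hypothetical, and exact matching on the group and pipeline names is an assumption. The default state filter (Faulted and Running) follows the default selection of the State field on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    group: str     # resequencing group ID
    pipeline: str  # pipeline associated with the message
    state: str     # Running, Faulted, Completed, or Aborted


DEFAULT_STATES = frozenset({"Faulted", "Running"})


def search(rows, group=None, pipeline=None, states=DEFAULT_STATES):
    """Return the rows that match every criterion that was supplied."""
    return [r for r in rows
            if (group is None or r.group == group)
            and (pipeline is None or r.pipeline == pipeline)
            and r.state in states]


rows = [
    Row("g1", "Project/PipelineA", "Running"),
    Row("g2", "Project/PipelineA", "Faulted"),
    Row("g3", "Project/PipelineB", "Completed"),
]
default_view = search(rows)
completed_view = search(rows, states={"Completed"})
faulted_in_a = search(rows, pipeline="Project/PipelineA", states={"Faulted"})
```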

6.3.1 Monitoring Resequencing Groups and Messages

Information you can monitor for resequenced messages includes the group and message ID, the service processing the message and the project to which it belongs, the current message status, and the name of the WSDL operation, if any. The following figure shows the Resequence Messages page.

Figure 6-1 Resequence Messages Page


To monitor resequencing groups and messages:

In Fusion Middleware Control, expand SOA and select service-bus.
Click the Resequence Messages tab.
To list only specific groups, enter any of the following search criteria:
- In the Resequencing Group field, enter the name of the group whose messages you want to monitor.
- In the Name field, click the Browse icon to search for and select a Service Bus pipeline whose associated resequencing messages you want to monitor.
- In the State field, select one or more states from the drop-down list of options.
  
  Note:
  
  By default, Faulted and Running are both selected. You can only view completed messages if Purge Completed Messages is not selected on the Global Settings page. Any messages that were completed while the purge option was selected cannot be viewed here.
- Click Reset to remove the search filters and display all resequencing messages.
Click Search.

A list of resequencing messages matching your criteria appears in the Resequencing Messages table. For information about the fields shown, see the online help for this page. For information about message states, see Oracle Service Bus Resequencing Message States.
Perform any of these additional steps:

6.3.2 Viewing Information About a Resequencing Group

Clicking a group ID in the Resequencing Messages table opens a Resequencing Group dialog, which displays a message indicating whether the group is processing messages successfully. The Resequencing Group dialog provides the following information about a group and varies based on the state of the group:

Whether the group is timed-out or faulted
The blocking message in the group, if any
The next message to be processed after the group is unlocked
The time after which the processing of the messages in the group stopped
The instruction text to unlock the group

To view information about a resequencing group:

Display the Resequence Messages tab, as described in Monitoring Resequencing Groups and Messages.
Click the name of the group with the message ID you want to view.

The Resequencing Group dialog appears.
Perform any of the tasks described in Managing Resequencing Groups at Runtime to handle resequencing issues. If the selected message is being processed, the dialog simply states that the group is now processing messages.

6.4 Managing Resequencing Groups at Runtime

When resequencing groups experience message errors, database errors, or timeouts, Service Bus provides ways to recover from, and in some cases fix, the issues.

You can skip messages in a group that are stuck and are blocking the group from processing additional messages, and you can modify the payload for faulted messages and attempt to reprocess them. The following sections describe ways to fix and recover from resequencing issues.

6.4.1 Skipping Message Sequence IDs

For standard resequencer groups, the Resequencing Group dialog provides an option to skip the next sequence ID and resume processing from the following message in the sequence. This is useful when a group is still running, but might be waiting for a message that will never arrive. The standard resequencer holds back messages in the database until it can produce the right sequence for the different groups. If the message with the next sequence ID for a given group never arrives, the pending messages for that group are held back until someone manually unlocks the group and skips to the next message.

Note:

When you manually skip a sequence ID and the missing message with that ID subsequently arrives, you need to manually execute the message in Fusion Middleware Control if you want to process it. The message is not automatically recovered.
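
The hold-back and skip behavior can be modeled with a short sketch. It is a toy model, not Service Bus code: the StandardGroup class and its methods are hypothetical, and it assumes sequence IDs are consecutive integers that increase by 1.

```python
class StandardGroup:
    """Toy model of one standard resequencing group."""

    def __init__(self, first_sequence_id=1):
        self.next_id = first_sequence_id  # sequence ID the group waits for
        self.held = {}        # sequence ID -> payload, held in the "database"
        self.delivered = []   # payloads released to the pipeline, in order

    def arrive(self, sequence_id, payload):
        """Store an arriving message and release whatever is now in sequence."""
        self.held[sequence_id] = payload
        self._release()

    def skip(self):
        """Give up on the awaited sequence ID and resume from the next one."""
        self.next_id += 1
        self._release()

    def _release(self):
        while self.next_id in self.held:
            self.delivered.append(self.held.pop(self.next_id))
            self.next_id += 1


group = StandardGroup()
group.arrive(1, "a")
group.arrive(3, "c")   # held back: sequence ID 2 has not arrived
group.arrive(4, "d")   # also held back
group.skip()           # operator skips sequence ID 2; 3 and 4 are released
group.arrive(2, "b")   # late arrival stays held; it is not released
```

In the model, the late message with sequence ID 2 remains held after the skip, which mirrors the note above: a skipped message that arrives later is not recovered automatically.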

To skip a message in a resequencing group:

Display the Resequence Messages tab, as described in Monitoring Resequencing Groups and Messages.
Click the name of a standard resequencing group.

The Resequencing Group dialog appears.
To skip the current sequence ID and start processing the next available instances in the group, click Skip.

6.4.2 Recovering when a Resequencing Group Times Out

A group is in the timed-out state when processing of the group stops while waiting for an expected message, blocking any remaining messages in the group. You can skip to the next sequence ID to unblock the group. The following information is displayed for a timed-out group:

The sequence ID of the last processed message
The sequence ID of the next message to be processed, along with its instance ID

Note:

When you manually skip a sequence ID and the missing message with that ID arrives after the timeout, you need to manually execute the message in Fusion Middleware Control. It is not automatically recovered.

To recover from a group timeout:

Display the Resequence Messages tab, as described in Monitoring Resequencing Groups and Messages.
Click the name of a group that has timed out.

The Resequencing Group dialog appears, as shown in Figure 6-2.

Figure 6-2 Resequencing Group is Timed Out

To unlock the group and start processing the next available instances in the group, click Skip.

6.4.3 Recovering from Resequencing Faults

A group is in the faulted state when one of its messages throws an error while being processed. When a fault occurs, you can fix and retry the message, or you can cancel processing of the message. For a faulted group, the Resequencing Group dialog lists the following information:

The last time a message was processed
The sequence ID of the faulted message
The sequence ID of the next message to be processed, along with its instance ID
Payload

To recover from a resequencing fault:

Display the Resequence Messages tab, as described in Monitoring Resequencing Groups and Messages.
Click the name of a group with a status of Faulted.

The Resequencing Group dialog appears, as shown in Figure 6-3.

Figure 6-3 Resequencing Group Is Faulted

Do one of the following:
- To recover the message, modify the text of the payload to correct the error and then click Recover.
- To cancel processing of the message and move on to the next, click Abort.
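
The difference between the two options can be sketched with a toy model. The ToyGroup class and the process function are hypothetical stand-ins, not Service Bus code. The pipeline is modeled as a function that rejects non-numeric payloads, and marking a cancelled message as Aborted is an assumption based on the message states listed earlier in this chapter.

```python
def process(payload):
    """Stand-in for the pipeline: rejects payloads that are not integers."""
    return int(payload)


class ToyGroup:
    """Toy model of a resequencing group that blocks on a faulted message."""

    def __init__(self, payloads):
        self.queue = list(payloads)  # remaining messages, in sequence order
        self.results = []            # (payload, state) pairs
        self.faulted = False

    def run(self):
        while self.queue and not self.faulted:
            try:
                process(self.queue[0])
            except ValueError:
                self.faulted = True  # the group blocks on this message
            else:
                self.results.append((self.queue.pop(0), "Completed"))

    def recover(self, fixed_payload):
        """Replace the faulted payload with a corrected one and retry."""
        self.queue[0] = fixed_payload
        self.faulted = False
        self.run()

    def abort(self):
        """Cancel the faulted message and move on to the next one."""
        self.results.append((self.queue.pop(0), "Aborted"))
        self.faulted = False
        self.run()


recovered = ToyGroup(["1", "two", "3"])
recovered.run()          # faults on "two"
recovered.recover("2")   # corrected payload is processed, then "3"

aborted = ToyGroup(["1", "two", "3"])
aborted.run()            # faults on "two"
aborted.abort()          # "two" is cancelled, then "3" is processed
```

In both cases the group resumes with the message that follows the faulted one. Only the recover path delivers the corrected payload.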