This chapter describes how to manage endpoint URIs in Oracle Service Bus business services, including configuring retries, marking non-responsive endpoints as offline, viewing endpoint metrics, and triggering alerts based on endpoint status.
An endpoint URI is the URL of an external service that is accessed by a business service. In Oracle Service Bus, you must define at least one endpoint URI for a business service. When you define multiple endpoint URIs for a business service, you must select one of the following load balancing algorithms:
Round robin
Random
Random-weighted
None
The load balancing algorithm controls the manner in which a business service tries to access its endpoint URIs. The status of an endpoint URI can be online or offline. For more information, see Section 26.9, "Configuring Operational Settings for Business Services."
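The following Python sketch illustrates one way to think about the four algorithms. It is not Oracle Service Bus code: the function names and the weights mapping are hypothetical, and the behavior shown for None (always start with the first listed URI) is an assumption made for illustration.

```python
import itertools
import random


def round_robin(uris):
    """Return an iterator that cycles through the URIs in listed order."""
    return itertools.cycle(uris)


def pick_random(uris, rng):
    """Pick any URI with equal probability."""
    return rng.choice(uris)


def pick_random_weighted(weights, rng):
    """Pick a URI with probability proportional to its weight.

    `weights` is a hypothetical mapping of URI -> non-negative weight.
    """
    uris = list(weights)
    return rng.choices(uris, weights=[weights[u] for u in uris], k=1)[0]


def pick_none(uris):
    """Assumption: without load balancing, the first listed URI is tried first."""
    return uris[0]
```

Passing the random number generator in as a parameter (for example `random.Random(42)`) keeps the random selections reproducible when you experiment with the sketch.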
This chapter contains the following sections:
Section 48.1, "How to Configure a Business Service to Perform Retries"
Section 48.5, "How to Generate Alerts Based on Endpoint URI Status"
You can define the retry option for business services. The retry option specifies the maximum number of times a business service can attempt to access endpoint URIs after an initial failure. For example, consider the behavior of a business service B with endpoint URIs eu1, eu2, and eu3, when the retry count is set to 1, 2, and 4.
When Retry Count = 1 – If business service B fails to process a request or is unable to access the endpoint URI eu1, it tries to process the request with eu2 (retry 1). If the retry fails, then the business service returns failure. The business service does not retry the third endpoint URI eu3.
When Retry Count = 2 – If business service B fails to process a request or is unable to access the endpoint URI eu1, it tries to process the request with eu2 (retry 1). If the retry fails, then the business service tries to process the request with eu3 (retry 2). If the retry fails, then the business service returns failure.
When Retry Count = 4 – If business service B fails to process a request or is unable to access the endpoint URI eu1, it tries to process the request with eu2 (retry 1). If the retry fails, then the business service tries to process the request with eu3 (retry 2). If that retry also fails, the business service waits for the interval you have configured as the retry iteration interval (in seconds) before trying eu1 again (retry 3). If this fails, the business service retries eu2 (retry 4). If the retry fails, then the business service returns failure.
If the retry count is set to 0, then the business service does not retry after the failure.
Note:
The order in which a business service retries the endpoints is controlled by the load balancing algorithm.
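The retry sequence described above can be sketched in Python as follows. This is an illustrative model only, not the Oracle Service Bus implementation: the function name, the `send` callback, and the use of `ConnectionError` to represent a failed attempt are hypothetical. It assumes that the endpoints are tried in listed order and that the retry iteration interval is applied before each new pass over the list.

```python
import time


def call_with_retries(uris, send, retry_count,
                      retry_iteration_interval=0.0, sleep=time.sleep):
    """Make one initial attempt plus up to `retry_count` retries.

    `send(uri)` is a hypothetical callback that returns a response or
    raises ConnectionError when the attempt fails.
    """
    attempted = []
    for attempt in range(retry_count + 1):
        # Assumption: once every URI has been tried, wait for the retry
        # iteration interval before starting the next pass over the list.
        if attempt > 0 and attempt % len(uris) == 0:
            sleep(retry_iteration_interval)
        uri = uris[attempt % len(uris)]
        attempted.append(uri)
        try:
            return send(uri)
        except ConnectionError:
            continue  # move on to the next endpoint URI
    raise ConnectionError("all attempts failed: " + ", ".join(attempted))
```

With three endpoint URIs and a retry count of 4, a `send` callback that always fails is invoked for eu1, eu2, eu3, eu1, and eu2, with a single wait before the second attempt on eu1, which matches the example in this section.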
A business service fails to process a request due to communication or application errors.
Communication errors occur due to random network problems. Retrying such requests with another endpoint URI can be successful. Application errors occur when a request is malformed or otherwise in error and therefore cannot be processed by any of the endpoints. You can turn off retry behavior for application errors by setting Retry Application Errors to No in the Transport Configuration page for a business service, depending on the transport used.
A communication error occurs each time a business service tries to access a non-responsive URI. You can configure a business service to mark non-responsive URIs offline. Doing so prevents a business service from repeatedly attempting to access a non-responsive URI and therefore avoids these communication errors.
To do so, you must enable the Offline Endpoint URIs operational setting for the business service. You can mark an endpoint URI offline temporarily or permanently as described in the following sections.
Mark an endpoint URI offline temporarily if you want the business service to automatically retry the same endpoint after a short interval of time; mark it offline permanently if you want the business service to treat the endpoint URI as offline until it is reset manually.
Mark an endpoint URI offline temporarily if you want the business service to automatically retry the same endpoint after a specified interval of time.
To mark an endpoint URI offline temporarily, you can specify a Retry Interval value in the Offline Endpoint URI operational setting for the business service. On encountering a communication error, the endpoint URI status is changed to Offline. When the retry interval has passed and this business service attempts to process a new request, it tries to access this endpoint URI. If this attempt is successful, then the endpoint URI is marked online again. If the attempt to access the endpoint URI fails, then the URI is marked offline again for the duration of the retry interval, and the cycle is repeated.
This configuration can be useful when a communication error is temporary and corrects itself. For example, an endpoint that becomes temporarily overloaded can cause communication errors, but then reverts to normal operation without requiring manual intervention.
Mark an endpoint URI offline permanently if you want a business service to treat the endpoint URI as offline until you reset it manually.
To mark an endpoint URI offline permanently, you specify a Retry Interval value of 0 hours 0 min 0 sec in the Offline Endpoint URI operational setting for the business service. On encountering a communication error, the endpoint URI status is changed to Offline and remains offline until you mark the endpoint URI online again.
For more information, see Section 26.9, "Configuring Operational Settings for Business Services" and Chapter 46, "Monitoring Oracle Service Bus at Runtime."
This configuration is useful for a case in which a communication error is caused by a problem with the endpoint URI that must be resolved by manual intervention.
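The difference between the temporary and permanent offline behavior can be modeled with a small state holder. The following Python sketch is an assumption-laden illustration, not product code: the class and method names are hypothetical, and time is passed in explicitly as a number of seconds so that the example stays deterministic.

```python
class EndpointState:
    """Tracks whether one endpoint URI is online or offline."""

    def __init__(self, retry_interval_seconds):
        # A retry interval of 0 models "offline until reset manually".
        self.retry_interval = retry_interval_seconds
        self.online = True
        self.offline_since = None

    def record_communication_error(self, now):
        """Mark the URI offline and (re)start the retry interval."""
        self.online = False
        self.offline_since = now

    def record_success(self):
        """A successful access marks the URI online again."""
        self.online = True
        self.offline_since = None

    def mark_online(self):
        """Manual reset, for example by an administrator."""
        self.record_success()

    def may_attempt(self, now):
        """Return True if a new request may try this URI."""
        if self.online:
            return True
        if self.retry_interval == 0:
            return False  # permanently offline until reset manually
        return now - self.offline_since >= self.retry_interval
```

In this model, a URI with a 30-second retry interval that fails at time 100 is skipped until time 130. A URI with a retry interval of 0 is skipped until `mark_online` is called.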
You can monitor the metrics using the Oracle Service Bus Administration Console or the JMX monitoring APIs. For information on using the Oracle Service Bus Administration Console, see Section 46.5.1, "How to Access Service Statistics from the Oracle Service Bus Administration Console." For information on using the JMX monitoring APIs, see Appendix D, "JMX Monitoring API."
In the Oracle Service Bus Administration Console, the endpoint URI metrics are available on the Endpoint URIs tab within the Service Monitoring Details page for a service. This includes count, response time, and endpoint URI status metrics.
For more information about the statistics displayed in this view, see Section 26.15, "Viewing Business Services Endpoint URIs Metrics."
You can monitor the endpoint URIs using URI status statistics and URI level statistics. The following items describe the expected behavior when you monitor endpoint URIs:
You can obtain the statistics only when you enable monitoring for a business service.
When you rename or move a service, the URI level statistics are reset.
When you change the aggregation interval, all the URI level statistics except the URI status are reset.
When you reset statistics for the service (or reset all statistics), all the URI level statistics except the URI status are reset.
When you add a new URI to an existing business service, the metrics for the new URI are collected automatically.
The Status statistic on the Oracle Service Bus Administration Console indicates whether the endpoint URI is online or offline. You can also obtain the status of an endpoint URI using the JMX monitoring APIs. Table 48-1 describes the possible states of an endpoint URI.
Table 48-1 Status of Endpoint URIs
Status | Description |
---|---|
Online | Implies the URI is online on a given server. In a cluster it implies that the URI is online for all servers. |
Offline | Implies the URI is offline on a given server. In a cluster it implies that the URI is offline for all servers. |
Partial | Implies that at least one server in the cluster reports a problem for that URI. This metric is available for clusters only. |
Note:
When a URI is associated with more than one business service, the same endpoint URI can have a different status for each of the business services.
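The cluster-level statuses in Table 48-1 can be read as an aggregation of the per-server statuses of a URI. The following Python sketch shows that reading. The function name and input shape are hypothetical, and Partial is interpreted here as "offline on some servers but not all", which is an assumption based on the table rather than a documented rule.

```python
def cluster_status(per_server_online):
    """Combine per-server states into a single cluster-level status.

    `per_server_online` is a hypothetical mapping of Managed Server name
    to True (online) or False (offline) for one endpoint URI.
    """
    states = list(per_server_online.values())
    if not states:
        raise ValueError("at least one server is required")
    if all(states):
        return "Online"
    if not any(states):
        return "Offline"
    # Assumption: some, but not all, servers report a problem.
    return "Partial"
```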
The endpoint URI performance metrics show how many messages a given endpoint has processed, how many of those messages failed, and the associated response times. You can use the following metrics for monitoring the endpoint URIs:
Message Count
Error Count
Average Response Time
Min Response Time
Max Response Time
For more information about these statistics, see Section 26.15, "Viewing Business Services Endpoint URIs Metrics."
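As a rough illustration of what the metrics listed above represent, the following Python sketch derives them from a list of raw samples. The data shapes and names are hypothetical, and the sketch assumes that failed messages are included in both the message count and the response time figures. Consult the referenced section for how the product actually computes each statistic.

```python
from dataclasses import dataclass


@dataclass
class UriMetrics:
    message_count: int
    error_count: int
    avg_response_ms: float
    min_response_ms: float
    max_response_ms: float


def summarize(samples):
    """Summarize (response_time_ms, succeeded) samples for one URI."""
    if not samples:
        raise ValueError("no samples collected")
    times = [elapsed for elapsed, _ in samples]
    # Assumption: every sample counts as a message, failed or not.
    errors = sum(1 for _, succeeded in samples if not succeeded)
    return UriMetrics(
        message_count=len(samples),
        error_count=errors,
        avg_response_ms=sum(times) / len(times),
        min_response_ms=min(times),
        max_response_ms=max(times),
    )
```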
The following are the important properties of the endpoint URI statistics:
You can obtain the statistics only when you enable monitoring for a business service.
When you rename or move a service or change the aggregation interval, the URI level statistics are reset.
When you add a new URI to an existing business service, the metrics for the new URI are collected automatically.
You can mark an endpoint URI that is offline as online using the Oracle Service Bus Administration Console or by using the public APIs.
In the Oracle Service Bus Administration Console, you can mark an offline endpoint URI as online from the Service Monitoring Details page. Click the Click to mark this endpoint URI online icon in the Actions column of the Endpoint URIs tab. For more information, see Section 26.15, "Viewing Business Services Endpoint URIs Metrics."
All the endpoint URIs are marked online when:
You add them to a business service
You restart a server
You enable a disabled service
You rename or move a service
A business service is able to successfully access the URI after the retry interval you have configured has passed.
You can also use APIs to mark an offline endpoint URI as online. This is useful when you have not enabled monitoring for a business service but need to mark its endpoint URIs online. For more information, see com.bea.wli.monitoring.ServiceDomainMBean in the Oracle Fusion Middleware Java API Reference for Oracle Service Bus.
When you mark an endpoint URI online in a cluster domain, it is marked online on all the Managed Servers.
If an endpoint URI is not accessible, the business service trying to access it receives a communication error.
In addition to configuring a business service to take a non-responsive URI offline, as described in Section 48.2, "How to Mark a Non-Responsive URI Offline," you can raise an alert when a system encounters non-responsive URIs. You do this by configuring SLA alert rules based on endpoint URI status.
You can configure the alert rule for a business service, based on the status of the Endpoint URI. Complete the tasks as described in Section 26.23.1, "Configuring General Information for Alert Rules."
Then complete the following tasks in the Alert Rule Conditions Configuration page, shown in Figure 48-1.
Figure 48-1 Alert Rule Condition Configuring page
In the Simple Expression section of the Alert Rule Conditions Configuration page, choose Status in the first list.
The endpoint URI status-based alert rule condition consists of:
A state transition clause–The state transition clause supports notification when any endpoint URI or all endpoint URIs change state from online to offline, or from offline to online. Choose one of the following options to identify the status for which you want to create a notification:
All URIs offline
All URIs online
Any URI offline
Any URI online
For example, consider a business service for which two alert rules are configured, one based on the All URIs offline = True condition and another on the Any URI offline = True condition. If an alert based on the All URIs offline = True condition is generated, it signifies a severe problem because all requests to this service are likely to fail until the situation is resolved.
However, if an alert based on Any URI offline = True is generated, it implies that the other endpoint URIs are responsive and subsequent requests may not fail.
Note:
All alert rules are independently evaluated. If alerts based on both (any or all URI) clauses have been configured for the same business service, it is likely that both alerts are generated simultaneously when the last endpoint URI is marked offline.
If a business service has only one URI, the All URIs offline = True and Any URI offline = True clauses mean the same thing and so they behave in an identical manner.
The evaluation of an alert rule condition based on a transition from offline to online behaves in a similar fashion except that it tracks any or all endpoint URIs being marked back to online state.
A server clause–The server clause lets you specify an alert trigger when a state transition occurs on any or all servers. The server clause is significant only in a cluster domain with multiple Managed Servers. Choose one of the following options to create the expression:
Evaluate on all servers
Evaluate on any server
A communication error can occur due to a network problem on a computer hosting a Managed Server. Such an event is interpreted by the business service as the endpoint URI being non-responsive (although the remote endpoint being accessed is responsive). A communication error can also occur because the endpoint URI is not responding.
In the first case, the URIs are marked offline on only one server (on the computer with network problems) and online on all the other servers in the cluster. An alert condition based on the Evaluate on any server clause generates an alert, but an alert condition based on the Evaluate on all servers clause does not generate an alert.
In the second case, the URI is marked offline on all the Managed Servers (one by one as each server tries to access that endpoint). As each Managed Server marks the endpoint URI offline, the alert rule condition based on Evaluate on any server is met and an alert is generated. When the endpoint URI is marked offline on the last of the servers in the cluster domain, the alert rule condition based on Evaluate on all servers is also met and this alert is also generated.
Notes:
All alert rules are independently evaluated. If you have configured alerts based on both (any or all server) clauses for the same business service, it is likely that both alerts are generated simultaneously when the endpoint URI is marked offline on the last server in the cluster.
In a single server domain, the Evaluate on any server and Evaluate on all servers clauses mean the same thing and behave in an identical manner.
When designing the alert rules for your system, you must choose one or more combinations of the clauses in accordance with your requirements.
You must set any one of the conditions to be True or False. These conditions can be evaluated on all servers or any server in a cluster.
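The way a state transition clause combines with a server clause can be sketched as a small evaluation function. The following Python example is a simplified model under stated assumptions: the function names and the nested mapping are hypothetical, and the sketch evaluates a snapshot of URI states rather than tracking transitions or aggregation intervals as the product does.

```python
def uri_clause(uri_online, clause):
    """Evaluate a URI clause against one server's view.

    `uri_online` is a hypothetical mapping of URI -> True (online)
    or False (offline) and is assumed to contain at least one URI.
    """
    states = list(uri_online.values())
    if clause == "All URIs offline":
        return not any(states)
    if clause == "All URIs online":
        return all(states)
    if clause == "Any URI offline":
        return not all(states)
    if clause == "Any URI online":
        return any(states)
    raise ValueError("unknown clause: " + clause)


def condition_met(cluster, clause, expected, server_clause):
    """Combine the URI clause with the server clause.

    `cluster` is a hypothetical mapping of server name -> uri_online mapping.
    `expected` is the True or False value chosen for the condition.
    """
    results = [uri_clause(uris, clause) == expected for uris in cluster.values()]
    if server_clause == "Evaluate on all servers":
        return all(results)
    if server_clause == "Evaluate on any server":
        return any(results)
    raise ValueError("unknown server clause: " + server_clause)
```

For the network problem case described earlier, where a URI is offline on one Managed Server only, this model reports the Any URI offline = True condition as met for Evaluate on any server but not for Evaluate on all servers.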
Note:
To ensure that you do not miss any alert that is triggered due to frequent changes in the status of the URI, Oracle recommends that you set the aggregation interval for alert rules based on the status of the URI to one minute. For more information on aggregation interval, see Section 46.2, "Aggregation Intervals."