Manage failed webhook calls

This section discusses how to manage webhook calls when they fail.

Oracle Commerce provides two mechanisms for resending failed calls, one for event webhooks and one for function webhooks.

Queue event webhooks for resending

As discussed in Understand event webhooks, an event webhook sends a POST request to specified URLs each time a specific event occurs (for example, when an order is submitted). The body of the request contains the data associated with the event. An external system that receives the message returns a 200-level HTTP status code if the data is received successfully.
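
To illustrate the receiving side of this exchange, here is a minimal sketch of an external system that acknowledges an event webhook with a 200 status. It uses only the Python standard library. The names process_event, WebhookHandler, and serve, the port number, and the choice to reject unparseable bodies with a 400 are assumptions made for illustration; a real receiver would also authenticate the request, which this sketch omits.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_event(raw_body: bytes) -> int:
    """Return the HTTP status to send back for a webhook POST body.

    Hypothetical rule: 200 if the body parses as JSON, 400 otherwise.
    Any non-2xx status tells the sender that delivery failed.
    """
    try:
        json.loads(raw_body)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return 400
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender announced.
        length = int(self.headers.get("Content-Length", 0))
        status = process_event(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener. The status returned by process_event is what determines whether the sender treats the message as delivered.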

If the message is not received successfully by one of the URLs (for example, due to a network issue or an external system being down), Oracle Commerce sends the POST request again to that URL after a specified interval, and continues resending it until it succeeds or until the specified limit on the number of attempts is reached. By default, the interval is one hour, and the maximum number of attempts is 5, but you can change these values using either the updateWebHook or updateWebHooks endpoint.
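
The resend behavior described above can be modeled as a simple loop. The following sketch is not Commerce's implementation; it only illustrates the documented defaults (a one-hour interval and a maximum of 5 attempts). The function name and parameters are hypothetical, and it assumes that the first delivery counts toward the attempt limit.

```python
import time
from typing import Callable


def deliver_with_retries(
    send: Callable[[], int],
    max_attempts: int = 5,
    interval_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it returns a 2xx status or attempts run out.

    send() returns an HTTP status code, or raises OSError on a
    network-level failure. Returns True if delivery succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            status = send()
        except OSError:
            status = None  # no response at all counts as a failure
        if status is not None and 200 <= status < 300:
            return True
        if attempt < max_attempts:
            sleep(interval_seconds)  # wait before the next attempt
    return False  # the message would now go to the failed message log
```

Passing a custom sleep function makes it possible to exercise the loop without actually waiting an hour between attempts.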

Messages that are not delivered successfully after the maximum number of attempts are saved to a failed message log for later retrieval. Commerce includes a mechanism for managing failed messages automatically. You can also manage these failed messages manually using endpoints in the Admin REST API, or using the administration interface.

Manage failed messages automatically

To manage failed messages, Commerce monitors each target URL that a webhook is configured to send messages to, and if a URL is unresponsive, disables it as a target. For example, if the Order Submit webhook sends messages to three different URLs, and Commerce detects that calls to one of the URLs are failing consistently (returning non-200-level status codes, or not returning any response), it stops sending messages to this URL, while continuing to send messages to the other two URLs. The messages for the disabled URL are instead added directly to the failed message log.

Commerce continues monitoring the disabled target. When it detects that the URL is responding again, it resumes sending messages to it. Messages to the URL that failed previously (either because they reached the maximum number of retries, or because they were sent directly to the failed message log after Commerce disabled the target) are queued for resending. Note that it may take some time for all of the failed messages to be resent.

Manage failed messages using the REST API

The Admin REST API has several endpoints for viewing, deleting, and resending failed event webhook messages.

You can use the getFailedMessage endpoint to view a failed message that has been stored. You specify the ID of the message in a URL path parameter.

You can use the getFailedMessages endpoint to view all of the failed messages that have been stored. However, because there may be a large number of messages, you may want to return only a subset of them.

You can use the q query parameter with the getFailedMessages endpoint to filter the set of messages to return, based on values of the message properties. Typically you would filter based on serverType (production or publishing) or messageType. For example, the following call returns only those failed messages whose messageType is atg.commerce.fulfillment.SubmitOrder:

GET /ccadmin/v1/webhookFailedMessages?q=messageType="atg.commerce.fulfillment.SubmitOrder" HTTP/1.1
Authorization: Bearer <access_token>

You can also filter messages by when they were saved. For example, to return messages that were saved after a specific time:

GET /ccadmin/v1/webhookFailedMessages?q=savedTime > datetime("2018-9-22  12:05:54 GMT") HTTP/1.1
Authorization: Bearer <access_token>
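
The filter expressions in these examples are shown unencoded for readability. Because they contain quotation marks, spaces, and comparison operators, an HTTP client typically needs to percent-encode the q value before sending the request. The following sketch shows one way to do that with the Python standard library; the helper name failed_messages_path is hypothetical.

```python
from urllib.parse import quote, urlencode


def failed_messages_path(q: str) -> str:
    """Build the request path with a percent-encoded q filter."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"q": q}, quote_via=quote)
    return f"/ccadmin/v1/webhookFailedMessages?{query}"


print(failed_messages_path('messageType="atg.commerce.fulfillment.SubmitOrder"'))
# /ccadmin/v1/webhookFailedMessages?q=messageType%3D%22atg.commerce.fulfillment.SubmitOrder%22
```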

To resend failed messages, you can either specify them individually using the updateFailedMessage endpoint, or use the updateFailedMessages endpoint to queue all of the stored messages for resending.

To resend a single failed webhook message, call the updateFailedMessage endpoint with the ID of the message in the URL path, and set the resend property to true in the body of the request. For example:

PUT /ccadmin/v1/webhookFailedMessages/200001 HTTP/1.1
Authorization: Bearer <access_token>
Content-Type: application/json

{
   "resend": true
}

Setting resend to true causes the message to be added to a queue for resending. If the message was originally sent to multiple URLs, the service that manages the queue ensures that the message is resent to only those URLs for which the webhook failed originally.
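
As a sketch of how a client script might issue this call, the following builds the PUT request with the Python standard library. The function name build_resend_request and the base URL used in the usage line are placeholders, not values defined by Commerce; the path and request body mirror the example above.

```python
import json
import urllib.request


def build_resend_request(
    base_url: str, message_id: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that queues one message for resending."""
    return urllib.request.Request(
        url=f"{base_url}/ccadmin/v1/webhookFailedMessages/{message_id}",
        data=json.dumps({"resend": True}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


# Placeholder host and token; substitute your own environment's values.
request = build_resend_request("https://admin.example.com", "200001", "<access_token>")
```

To send the request, pass it to urllib.request.urlopen and check the response status.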

You can use the updateFailedMessages endpoint to queue all of the stored messages for resending, or use this endpoint with the q parameter to specify a subset of the stored messages for resending. Note, however, that the format of filter expressions for this parameter is different from the format used for the getFailedMessages endpoint. With getFailedMessages, the q parameter accepts expressions in RQL format by default (although it can optionally accept SCIM format instead). With updateFailedMessages, the q parameter accepts expressions in SCIM format only. See REST API query parameters for more information.

For example, the following call adds the failed production messages to the queue for resending:

PUT /ccadmin/v1/webhookFailedMessages?q=serverType eq "production" HTTP/1.1
Authorization: Bearer <access_token>
Content-Type: application/json

{
   "resend": true
}

The following call adds only the production messages that were saved after a specific time:

PUT /ccadmin/v1/webhookFailedMessages?q=serverType eq "production" and savedTime gt "2019-04-11T02:41:00.000Z" HTTP/1.1
Authorization: Bearer <access_token>
Content-Type: application/json

{
   "resend": true
}
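
If you generate these SCIM filters from a script, the timestamp format is the easiest part to get wrong. The following sketch builds the filter string used in the example above from a server type and a timezone-aware datetime. The function name scim_resend_filter is hypothetical, and the sketch assumes that the timestamp format shown in the example (UTC, millisecond precision, trailing Z) is the one to reproduce.

```python
from datetime import datetime, timezone


def scim_resend_filter(server_type: str, saved_after: datetime) -> str:
    """Build a SCIM filter matching messages saved after a given time."""
    if saved_after.tzinfo is None:
        raise ValueError("saved_after must be timezone-aware")
    utc = saved_after.astimezone(timezone.utc)
    # Match the example's format: UTC, milliseconds, trailing "Z".
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"
    return f'serverType eq "{server_type}" and savedTime gt "{stamp}"'


print(scim_resend_filter("production", datetime(2019, 4, 11, 2, 41, tzinfo=timezone.utc)))
# serverType eq "production" and savedTime gt "2019-04-11T02:41:00.000Z"
```

The resulting string still needs to be percent-encoded (for example, with urllib.parse.quote) before it is placed in the q query parameter.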

As an alternative to the updateFailedMessages endpoint, you can use the requeueFailedMessages endpoint, which lets you select the messages to resend using criteria supplied in the request body rather than in a query parameter.

Manage failed event webhooks in the administration interface

In addition to using Admin API endpoints to retrieve and resend failed event webhook messages, you can perform these tasks in the Commerce administration interface.

To view a list of failed event webhook messages in the Commerce administration interface:

1. Click the Service Operations icon. Commerce displays a list of failed event webhook messages.
2. Use the options at the top of the page to sort and filter the list of failed webhook messages. For example, you can sort them from oldest to newest, and filter the list so that it displays only Order Submit messages in your production environment that failed in the last 24 hours.
3. Click a message’s Information icon to see details about why the message failed.

Once you have filtered the list of failed webhook messages, you can resend or delete some or all of them.

To resend a single webhook message, click its Resend icon. To resend all the webhook messages in the filtered list, click the Resend All icon at the top of the page. Commerce adds these messages to a queue for resending. If a message was originally sent to multiple URLs, the service that manages the queue ensures that it is resent only to those URLs for which the webhook originally failed.

To delete all the webhook messages in the list, click the Delete All icon at the top of the page. You cannot delete a message that is queued for retry.

Changes you make on the Service Operations page take effect as soon as you save them. You do not need to publish the changes.

Retry function webhooks

As discussed in Queue event webhooks for resending, Oracle Commerce includes a mechanism for managing failed event webhook calls. Because event webhooks are asynchronous, this mechanism supports queueing the failed messages for periodic retry.

Function webhooks, however, are synchronous, so the queueing mechanism used for event webhooks is not suitable for managing failed function webhook calls. Instead, Commerce provides a synchronous retry mechanism for certain function webhooks. If a webhook call using this mechanism does not initially succeed, it is immediately retried several times until it either succeeds or reaches the maximum number of retries, at which point it fails.

A call succeeds only if it returns an HTTP status code in the 2xx range. If any other status code is returned, or if nothing is returned due to a timeout or network error, the call fails.

Retry is supported for the following function webhooks:

Shipping Calculator
External Tax Calculation
Catalog and Price Group Assignment
Order Approvals
Return Request Validation

Enable retry

Retry is controlled by two JSON properties, supportsSynchronousRetry and synchronousRetries. The supportsSynchronousRetry property is a read-only property that specifies whether the webhook supports the use of retry. It is set to true for the webhooks listed above, and is set to false for all other function webhooks. You cannot change the value of this property on any function webhook.

If a webhook’s supportsSynchronousRetry property is true, you can enable retry for that webhook by setting its synchronousRetries property to an integer greater than zero (0). The value of synchronousRetries specifies the maximum number of times to retry the call. Note that if the value of synchronousRetries is 0, no retry will take place, even if supportsSynchronousRetry is true.

The following example sets the value of synchronousRetries for the Order Approvals webhook:

PUT /ccadmin/v1/functionWebhooks/production-checkOrderApprovalWebhook HTTP/1.1
Authorization: Bearer <access_token>
Content-Type: application/json

{
    "synchronousRetries": 5
}

The retry mechanism has a one-minute limit. If the calls fail with an HTTP error immediately, the retries are executed in rapid succession until a call succeeds or the maximum number of retries is reached. But if the calls time out, the mechanism may reach the one-minute limit before the maximum number of retries is reached.
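
The interaction between the retry count and the time limit can be illustrated with a small model. This sketch is not Commerce's implementation: the function name, the injectable clock, and the assumption that the limit is checked after each failed attempt are all hypothetical. It does follow the rules stated above, in that only a 2xx status counts as success, synchronousRetries counts retries in addition to the initial call, and the whole sequence is bounded by about one minute.

```python
import time
from typing import Callable, Optional


def call_with_sync_retry(
    call: Callable[[], Optional[int]],
    synchronous_retries: int,
    time_limit: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Invoke call() once, then retry up to synchronous_retries times.

    call() returns an HTTP status code, or None for a timeout or
    network error. Returns True as soon as a 2xx status is seen.
    """
    deadline = clock() + time_limit
    attempts_allowed = 1 + max(0, synchronous_retries)
    for _ in range(attempts_allowed):
        status = call()
        if status is not None and 200 <= status < 300:
            return True
        if clock() >= deadline:
            break  # time budget exhausted before all retries were used
    return False
```

With synchronous_retries set to 5, a call that fails instantly is attempted six times in total, whereas a call that takes 25 seconds to time out is attempted only three times before the limit is reached.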

Note that the Shipping Calculator and External Tax Calculation webhooks support a fallback mechanism that returns preconfigured default values if calls to the associated shipping calculator or tax calculator fail. If both fallback and retry are enabled for one of these webhooks, in some cases the fallback values may be returned even if the maximum number of retries has not been reached.

To see a list of all of the available function webhooks, including information about which ones support retry, use the getFunctionWebHooks endpoint in the Admin API. The response includes the values of the supportsSynchronousRetry and synchronousRetries properties for each function webhook.
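
As a sketch of how a script might use that response, the following function picks out the webhooks for which retry is actually in effect. The property names supportsSynchronousRetry and synchronousRetries come from this section; the assumption that the webhooks are returned in an items array and that each one has an id property is not confirmed here, so check it against the actual response. The sample data is invented for illustration.

```python
from typing import Any, Dict, List


def webhooks_with_retry_enabled(response: Dict[str, Any]) -> List[str]:
    """Return the IDs of function webhooks that will actually retry."""
    enabled = []
    # Assumption: the list of webhooks is under an "items" key.
    for hook in response.get("items", []):
        supported = hook.get("supportsSynchronousRetry", False)
        retries = hook.get("synchronousRetries", 0)
        if supported and retries > 0:
            enabled.append(hook.get("id", "<unknown>"))
    return enabled


# Invented sample response; the second and third IDs are hypothetical.
sample = {
    "items": [
        {"id": "production-checkOrderApprovalWebhook",
         "supportsSynchronousRetry": True, "synchronousRetries": 5},
        {"id": "example-retry-supported-but-off",
         "supportsSynchronousRetry": True, "synchronousRetries": 0},
        {"id": "example-retry-not-supported",
         "supportsSynchronousRetry": False, "synchronousRetries": 0},
    ]
}
print(webhooks_with_retry_enabled(sample))
# ['production-checkOrderApprovalWebhook']
```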