System Response After a Map/Reduce Interruption

A map/reduce script can be interrupted at any time. For example, a disruption to the application server immediately stops the script’s execution. An uncaught error does not necessarily stop the script, but it ends the current function invocation even if that invocation’s work is not complete.

For more details, review the following sections:

  • System Response After an Application-Server Disruption

  • System Response After an Uncaught Error

Important:

Regardless of how your script is configured, you should make sure that it includes logic that checks to see whether a restart has occurred. If the function has been restarted, the script should take any actions needed to avoid unwanted duplicate processing. For details, see Adding Logic to Handle Map/Reduce Restarts.
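As a minimal sketch of such a check, the map function below reads the isRestarted property of its context and skips keys that an earlier, interrupted invocation already handled. The makeMap wrapper and the alreadyProcessed and processRecord callbacks are hypothetical placeholders for your own duplicate-detection and business logic; they are not part of the SuiteScript API.

```javascript
// Sketch only: alreadyProcessed and processRecord are hypothetical callbacks
// that stand in for your own lookup and business logic.
function makeMap(alreadyProcessed, processRecord) {
    return function map(context) {
        // isRestarted is true when this invocation follows an interruption.
        if (context.isRestarted && alreadyProcessed(context.key)) {
            // The earlier attempt completed its side effect; do not repeat it.
            return 'skipped';
        }
        processRecord(context.key, context.value);
        context.write({ key: context.key, value: 'done' });
        return 'processed';
    };
}
```

How alreadyProcessed decides is up to the script. One possible approach is to mark each record when its processing completes and to look for that mark after a restart.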

System Response After an Application-Server Disruption

An application-server disruption can occur because of a NetSuite update, NetSuite maintenance, or an unexpected failure of the execution environment. When the application server is disrupted in this way, the script stops executing. After the application server restarts, the script also restarts, resuming at the stage that was in progress when the interruption occurred.

When an application server restart interrupts the map or reduce stage, the system writes the SSS_APP_SERVER_RESTART error code to the relevant iterators. This error code is shown alongside the codes recorded for any uncaught errors that were thrown.

For more details, see the following table.

Stage where interruption occurred: Get Input Data

Script behavior: When the disruption occurs, the script stops executing. After the application server restarts, the system restarts the function.

Stage where interruption occurred: Map

Script behavior: When the disruption occurs, the script stops executing. Any data that was saved during the previous invocation by using the context.write() method is discarded. Afterward, the response is as follows:

  1. The system evaluates the retryCount config setting. If retryCount is set to a value greater than 0 or if the retryCount setting is not used, the script tries to process the same set of key-value pairs it was processing when the application server became unavailable. This data includes all pairs that were flagged for processing but not marked complete. However, if retryCount is set to 0, the script moves on to Step 2 without attempting further processing for these key-value pairs.

  2. The job moves on to other key-value pairs that require processing and were not previously flagged as in progress.

SSS_APP_SERVER_RESTART error code written to:

  • mapContext.errors: Contains the error codes recorded during previous attempts to process the current key-value pair.

  • mapSummary.errors: Contains all error codes recorded during the map stage.

Stage where interruption occurred: Reduce

Script behavior: Same as described for the map stage.

SSS_APP_SERVER_RESTART error code written to:

  • reduceContext.errors: Contains the error codes recorded during previous attempts to process the current key-value pair.

  • reduceSummary.errors: Contains all error codes recorded during the reduce stage.

Stage where interruption occurred: Summarize

Script behavior: When the disruption occurs, the entire script stops executing. After the application server restarts, the system restarts the function.
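To illustrate how the error iterators listed above might be inspected, the sketch below collects the keys whose recorded errors include SSS_APP_SERVER_RESTART. The function name keysInterruptedByRestart is hypothetical. The sketch assumes that each iterator entry is a JSON-serialized error object with a name property, and it falls back to comparing the raw value when the entry is not valid JSON.

```javascript
// Collects the keys whose recorded errors include SSS_APP_SERVER_RESTART.
// `errors` is expected to expose iterator().each(callback), as the
// mapSummary.errors and reduceSummary.errors iterators do.
function keysInterruptedByRestart(errors) {
    var keys = [];
    errors.iterator().each(function (key, error) {
        var name = error;
        try {
            // Assumption: the entry is a JSON-serialized error object.
            var parsed = JSON.parse(error);
            name = (parsed && parsed.name) || parsed;
        } catch (e) {
            // Not JSON; keep the raw value and compare it as an error code.
        }
        if (name === 'SSS_APP_SERVER_RESTART' && keys.indexOf(key) === -1) {
            keys.push(key);
        }
        return true; // Returning true continues the iteration.
    });
    return keys;
}
```

In a summarize function, this could be called as keysInterruptedByRestart(summary.mapSummary.errors) to find the keys that may need a duplicate-processing check.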

System Response After an Uncaught Error

An error that is not caught in a try-catch block does not necessarily end the execution of a map/reduce script, but the error can disrupt the script’s work. Some aspects of this behavior are configurable. For details, see the following table.

Stage where error occurred: Get Input Data

Script behavior: The script ends the function invocation and exits the stage. It proceeds directly to the summarize stage. This behavior cannot be configured.

Errors written to: inputSummary.error

Stage where error occurred: Map

Script behavior: When the error occurs, the function invocation ends, even if its work is not complete. Any data that was saved during the invocation by using the context.write() method is discarded. Afterward, the system responds as follows:

  1. The system evaluates the retryCount setting. If retryCount is set to a value greater than 0, and the maximum number of retries has not yet been used, the script tries to process the same key-value pair again. The rest of the steps in this process are not used.

  2. The system evaluates the exitOnError setting. If exitOnError is set to true, the script exits the current stage and proceeds directly to the summarize stage. The rest of the steps in this process are not used.

  3. The job continues by moving on to other key-value pairs that require processing. It does not do any further work on the pair it was processing when the error occurred.

Errors written to:

  • mapContext.errors: Contains the error codes recorded during previous attempts to process the current key-value pair.

  • mapSummary.errors: Contains all error codes recorded during the map stage.

Stage where error occurred: Reduce

Script behavior: Same as described for the map stage.

Errors written to:

  • reduceContext.errors: Contains the error codes recorded during previous attempts to process the current key-value pair.

  • reduceSummary.errors: Contains all error codes recorded during the reduce stage.

Stage where error occurred: Summarize

Script behavior: The script stops executing. This behavior cannot be configured.

Note:

This table describes the behavior for the majority of errors, but a few errors result in different behavior. For example, if one of the jobs being processed in the map or reduce stage fails with the SSS_USAGE_LIMIT_EXCEEDED error, other jobs in Processing status normally finish without interruption, but jobs in Pending status are canceled immediately. For details, see Hard Limits on Total Persisted Data.
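The three-step evaluation described above for an uncaught error in the map or reduce stage can be modeled as a small decision function. This is only an illustration of the documented order, not NetSuite code. The function name responseToUncaughtError, the retriesUsed parameter, and the returned labels are hypothetical, and the sketch assumes that an unset retryCount behaves like 0 for uncaught errors, which is how the description above reads.

```javascript
// Models the documented order of evaluation after an uncaught error in the
// map or reduce stage. Illustration only; names and labels are hypothetical.
function responseToUncaughtError(config, retriesUsed) {
    // Assumption: an unset retryCount is treated as 0 for uncaught errors.
    var retryCount = config.retryCount || 0;
    // Step 1: retry the same key-value pair while retries remain.
    if (retryCount > 0 && retriesUsed < retryCount) {
        return 'retry';
    }
    // Step 2: exitOnError sends the script directly to the summarize stage.
    if (config.exitOnError === true) {
        return 'exit-to-summarize';
    }
    // Step 3: abandon this pair and continue with the remaining pairs.
    return 'continue';
}
```

The config argument mirrors the retryCount and exitOnError settings that a map/reduce script declares, for example { retryCount: 2, exitOnError: true }. With those values, the first two failures for a pair would be retried and a third failure would send the script to the summarize stage.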

Related Topics

Map/Reduce Script Error Handling
Configuration Options for Handling Map/Reduce Interruptions
Logging Errors
Execution of Restarted Map/Reduce Stages
Adding Logic to Handle Map/Reduce Restarts
