Oracle E-Business Suite Maintenance Procedures
Release 12.1
Part Number E13675-03

Troubleshooting

This chapter contains information about using the AD Controller to monitor and resolve issues that may arise when using other AD utilities.

Managing Worker Processes

AD Administration and AutoPatch can perform processing jobs in parallel to speed the time it takes to complete them. This section describes the procedures for reviewing these processes and handling situations where processing has been interrupted.

Note: For more information, see Using Parallel Processing in Oracle E-Business Suite Maintenance Utilities.

Reviewing Worker Status

Requirement

How can I monitor the progress of parallel processing jobs?

Discussion

When AD Administration and AutoPatch process jobs in parallel, they assign jobs to workers for completion. Some situations can cause a worker to stop processing. AD Controller is a utility that you can use to determine the status of workers and to manage worker tasks. Use it to monitor the actions of workers and the status of the processing jobs they have been assigned.

Note: For more information, see AD Controller in Oracle E-Business Suite Maintenance Utilities.

Actions

To review worker status, perform these steps:

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Review worker status.

    Select "Show worker status" from the AD Controller main menu. AD Controller displays a summary of current worker activity, including the Worker, Filename, and Status columns referred to in this chapter.

    The following table describes the types of status that may be assigned to a worker and reported in the Status column.

    Worker Status Values

    Status          Meaning
    Assigned        The manager assigned a job to the worker, and the worker has not started.
    Completed       The worker completed the job, and the manager has not yet assigned it a new job.
    Failed          The worker encountered a problem.
    Fixed, Restart  The worker should retry the failed operation now that the problem has been fixed.
    Restarted       The worker is retrying a job or has successfully restarted a job (note that the status does not change to Running).
    Running         The worker is running a job.
    Wait            The worker is idle.

    If the worker status shows as Failed, the problem may need to be fixed before the AD utility can complete its processing. This is described next.

Determining Why a Worker Failed

Requirement

One of the workers has failed. How do I determine the cause of the failure?

Discussion

When a worker fails its job, you do not have to wait until the other workers and the manager stop. Use the worker log files (adworknnn.log, where nnn is the worker number) to determine what caused the failure. These log files are written to $APPL_TOP/admin/<SID>/log. Find the log file for the failed worker and copy it to a temporary area so that you can review it. If the job was deferred after the worker failed, there may be no action required on your part.

The first time a job fails, the manager defers the job and assigns a new worker. If the deferred job fails a second time, the manager defers it a second time only if the runtime of the job is less than ten minutes. If the deferred job fails a third time, or if the job’s runtime is greater than ten minutes, the job stays at a failed status and the worker waits for intervention.

Note: For more information, see Log and Restart Files in Oracle E-Business Suite Maintenance Utilities.
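The deferral rule in the Discussion above can be written out as a small decision function. This is only a sketch of the rule as stated in this section, not code taken from the AD utilities. The function name job_disposition and its two arguments are hypothetical, and the treatment of a runtime of exactly ten minutes is an assumption, because the text does not cover that case.

```shell
# Sketch of the deferral rule described in this section (not AD source code).
# Usage: job_disposition FAILURE_COUNT RUNTIME_MINUTES
# Prints "deferred" if the manager would defer the job again,
# or "failed" if the job stays at Failed and waits for intervention.
job_disposition() {
    failures=$1
    runtime_min=$2
    if [ "$failures" -le 1 ]; then
        # First failure: the manager defers the job and assigns a new worker.
        echo "deferred"
    elif [ "$failures" -eq 2 ] && [ "$runtime_min" -lt 10 ]; then
        # Second failure: deferred again only if the job runs under ten minutes.
        echo "deferred"
    else
        # Third failure, or a second failure of a longer job.
        # Assumption: a runtime of exactly ten minutes is treated as long.
        echo "failed"
    fi
}
```

For example, job_disposition 2 25 prints failed, which corresponds to a worker that now waits for you to intervene through AD Controller.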

Actions

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Identify the worker that encountered a problem.

    Workers that have encountered problems stop processing jobs and show a status of Failed. Follow the steps in the Reviewing Worker Status section in this chapter to determine which workers have a status of Failed.

  3. Review the log file to find out why the worker failed.

    The following is an example of a worker failure message:

    AD Worker error:
    The following ORACLE error:
    
    ORA-01630: max # extents (50) reached in temp segment in tablespace TSTEMP
    occurred while executing the SQL statement:
    
    CREATE INDEX AP.AP_INVOICES_N11 ON AP.AP_INVOICES_ALL (PROJECT_ID, TASK_ID)
    NOLOGGING STORAGE (INITIAL 4K NEXT 512K MINEXTENTS 1 MAXEXTENTS 50
    PCTINCREASE 0 FREELISTS 4) PCTFREE 10 MAXTRANS 255 TABLESPACE  APX
    
    AD Worker error:
    Unable to compare or correct tables or indexes or keys because of the error
    above

    In this example, the worker could not create the index AP_INVOICES_N11 because it reached the maximum number of extents in the temporary tablespace.

  4. Determine how to fix the problem that caused the failure.
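When reviewing a worker log, it can help to build the log file name from the worker number and then search the file for ORACLE errors. The following is a minimal sketch under stated assumptions: the function names worker_log_path and find_ora_errors are hypothetical, the APPL_TOP and SID values are passed in as arguments rather than read from the environment, and the search only looks for the ORA- prefix shown in the example message above.

```shell
# Hypothetical helpers for locating and scanning a worker log file.

# Usage: worker_log_path APPL_TOP SID WORKER_NUMBER
# Prints the log path, with the worker number zero-padded to three digits
# (adwork001.log for worker 1).
worker_log_path() {
    awk -v top="$1" -v sid="$2" -v n="$3" \
        'BEGIN { printf "%s/admin/%s/log/adwork%03d.log\n", top, sid, n }'
}

# Usage: find_ora_errors LOGFILE
# Prints each line that contains an ORA- error code, prefixed with its
# line number. The exit status is non-zero if no such line is found.
find_ora_errors() {
    grep -n 'ORA-[0-9][0-9]*' "$1"
}
```

A typical call would be find_ora_errors "$(worker_log_path "$APPL_TOP" PROD 1)", where PROD stands in for your own <SID> value.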

Handling a Failed Job

Requirement

I have reviewed the log file for the failed worker and determined the problem. What do I do next?

Discussion

A worker usually runs continuously in the background. When it fails to complete the job it was assigned, it reports a status of Failed. When the manager displays an error message, use AD Controller to review worker status and confirm that the worker has failed. If the job was deferred after the worker failed, no action may be required.

Note: For more information, see Using Parallel Processing in Oracle E-Business Suite Maintenance Utilities.

Actions

Perform the following steps:

  1. Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Identify the failed file.

    The Worker and Filename columns in the AD Controller worker status screen show the numbers of the workers that failed and the names of the files that failed to run.

  3. Review the worker log file.

    Each worker logs the status of tasks assigned to it in adworknnn.log, where nnn is the worker number. For example, worker 1 writes to adwork001.log and worker 2 writes to adwork002.log. These files are in the $APPL_TOP/admin/<SID>/log directory. Review adworknnn.log for the failed worker to determine the source of the error.

  4. Resolve the error.

    Resolve the error using the information provided in the log files. Contact Oracle Support Services if you do not understand how to resolve the issue.

  5. Restart the failed job.

    Choose Option 2 from the AD Controller main menu to tell the worker to restart a failed job.

  6. Verify worker status.

    Choose Option 1 to review worker status again. The Status column for the worker that failed should now say Restarted or Fixed, Restart.

    Note: When all workers are in either Failed or Wait status, the manager becomes idle. At this point, you must take action to get the failed workers running again.
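The condition in the note above (every worker is in either Failed or Wait status) can be expressed as a simple check. This is an illustrative sketch only: the function name manager_is_idle is hypothetical, and the status values are assumed to be supplied by hand as arguments, since this section does not describe a machine-readable form of the worker status screen.

```shell
# Usage: manager_is_idle STATUS [STATUS ...]
# Returns 0 (true) when every given status is Failed or Wait, which is
# the situation in which the manager becomes idle. Returns 1 otherwise,
# or when no status is given.
manager_is_idle() {
    [ "$#" -gt 0 ] || return 1
    for status in "$@"; do
        case $status in
            Failed|Wait) ;;      # this worker cannot make progress
            *) return 1 ;;       # at least one worker is still active
        esac
    done
    return 0
}
```

For example, manager_is_idle Failed Wait Wait succeeds, while manager_is_idle Failed Running Wait does not, because one worker is still running a job.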

Terminating a Hanging Worker Process

Requirement

A worker process has been running for a long time. What should I do?

Discussion

When running AD utilities, there may be situations when a worker process appears to hang, or stop processing. If this occurs, it may be necessary to terminate the process manually. Once you do, you must also restart that process manually.

Caution: A process that appears to be hanging could actually just be a long-running job.

To terminate a process, start AD Controller, obtain the ID of the worker, and then stop any hanging processes. Once you make the necessary changes, you can restart the job or worker.

Note: For more information, see Restarting a Failed Worker in this chapter. See also AD Command Line Utilities and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

Actions

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Determine what the worker process is doing.

    Use the AD Controller worker status screen to determine the file being processed, and check the worker log file to see what the worker is doing.

  3. Get the worker’s process ID.

    If the job is identified as "hanging," determine the worker’s process ID.

    UNIX:

    $ ps -a | grep adworker

    Windows:

    Invoke the Windows Task Manager (with Ctrl-Alt-Delete or Ctrl-Shift-Esc) to view processes.

  4. Determine what processes the worker has started, if any.

    If there are child processes, get their process IDs. Examples of child processes include SQL*Plus and FNDLOAD.

  5. Stop the hanging process, using the command that is appropriate for your operating system.

  6. Fix the issue that caused the worker to hang. Contact Oracle Support Services if you require assistance doing this.

  7. Restart the job or the worker.

    See Restarting a Failed Worker in this chapter for more information.
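On UNIX, one way to find the child processes of a worker (step 4 above) is to filter a process listing by parent process ID. The sketch below assumes the listing is produced with the POSIX command ps -e -o pid= -o ppid= -o comm=, which prints the process ID, parent process ID, and command name without a header line. The function name child_pids and the process IDs in the example are hypothetical.

```shell
# Usage: ps -e -o pid= -o ppid= -o comm= | child_pids PARENT_PID
# Reads "PID PPID COMMAND" lines on standard input and prints the PID and
# command name of every process whose parent is PARENT_PID.
child_pids() {
    awk -v parent="$1" '$2 == parent { print $1, $3 }'
}
```

If the worker process had the ID 4210, then ps -e -o pid= -o ppid= -o comm= | child_pids 4210 would list any SQL*Plus or FNDLOAD processes it started. Confirm that a process is really hanging, and not just running a long job, before you stop it.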

Restarting Processes

This section describes some situations where you may need to choose the restart option in AD Controller.

Restarting a Failed Worker

Requirement

I need to restart a failed worker.

Discussion

If a worker has failed, or if you have terminated a hanging worker process, you need to restart the worker manually.

Some worker processes spawn other processes called child processes. If you terminate a child process (that is hanging), the worker that spawned the process shows Failed as the status. After you fix the problem, choose to restart the failed job. Once the worker is restarted, the associated child processes are started as well.

Actions

Perform these steps:

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Choose Option 1 to review worker status.

  3. Take the appropriate action for each worker status.

    If the worker shows Failed, choose Option 2 to restart the failed job. When prompted, enter the number of the worker that failed.

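    If the worker shows Running or Restarted status, but the process is not really running, select Option 4 to tell the manager that the worker failed its job, and then select Option 6 to restart the worker. When prompted, enter the number of the worker.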

    Caution: Do not choose Option 6 if the worker process is running. Doing so will create duplicate worker processes with the same worker ID.

    The worker will restart its assigned jobs and spawn the necessary child processes.

Restarting an AD Utility After Machine Failure

Requirement

While I was running an AD utility, the machine crashed. What is the best way to restart the utility?

Discussion

Because the manager cannot automatically detect a machine crash, you must manually notify it that all jobs have failed and manually restart the workers. If you restart the utility without doing this, the utility status and the system status will not be synchronized.

Actions

Perform these steps:

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

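  2. Select Option 4 and enter all for the worker number to tell the manager that every worker failed its job. Then select Option 6 and enter all for the worker number to restart the workers.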

  3. Restart the AD utility that was running when the machine crashed.

Shutting Down and Restarting Managers

This section discusses some reasons for shutting down and reactivating managers.

Shutting Down a Manager

Requirement

How do I stop an AD utility while it is running?

Discussion

There may be situations when you need to shut down an AD utility while it is running. For example, you may need to shut down the database during an AutoPatch or AD Administration session.

You should perform this shutdown in an orderly fashion so that it does not affect your data. The best way to do this is to shut down the workers manually so that the AD utility quits in an orderly fashion.

Actions

Perform these steps:

  1. Start AD Controller.

    Set the environment and enter adctrl on the command line.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Select Option 3 and enter all for the worker number. Each worker stops once it completes or fails its current job.

  3. Verify that no worker processes are running. Use the appropriate command for your platform.

    UNIX:

    $ ps -a | grep adworker

    Windows:

    Invoke the Task Manager (with Ctrl-Alt-Delete or Ctrl-Shift-Esc) to view processes.

  4. When all workers have shut down, the manager and the AD utility quit.
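The verification in step 3 can be repeated automatically on UNIX until no worker processes remain. The following is a sketch under stated assumptions: the function names count_worker_procs and wait_for_workers_exit are hypothetical, worker processes are assumed to have "adworker" in their command name (as in the grep command above), and the listing uses ps -e -o comm= so that all processes are included, not only those attached to a terminal.

```shell
# Usage: ps -e -o comm= | count_worker_procs
# Reads command names on standard input, one per line, and prints how
# many of them contain "adworker". Prints 0 when there are none.
count_worker_procs() {
    grep -c 'adworker' || true
}

# Usage: wait_for_workers_exit [SECONDS_BETWEEN_CHECKS]
# Polls the process list until no worker processes are left.
# The default interval of 30 seconds is an arbitrary choice.
wait_for_workers_exit() {
    interval=${1:-30}
    while [ "$(ps -e -o comm= | count_worker_procs)" -gt 0 ]; do
        sleep "$interval"
    done
}
```

Calling wait_for_workers_exit 10 after selecting Option 3 returns once every worker has completed or failed its current job and exited.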

Restarting a Manager

Requirement

No workers are running jobs when they should be. What is the problem?

Discussion

A restarted worker resumes the failed job immediately as long as the worker process is running. The other workers change to a Wait status if they cannot run any jobs because of dependencies on the failed job, or because there are no jobs left in the phase. When no workers are able to run, the manager becomes idle and messages like the following appear on the screen:

ATTENTION: All workers either have failed or are waiting:

FAILED: file cedropcb.sql on worker 1.
FAILED: file adgrnctx.sql on worker 2.
FAILED: file aftwf01.sql on worker 3.

ATTENTION: Please fix the above failed worker(s) so the manager can continue.
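If you have saved a copy of the manager output, the failed workers and files can be pulled out of messages like those above with a short filter. This is a sketch that assumes the message format is exactly as shown in the example. The function name failed_jobs is hypothetical.

```shell
# Usage: failed_jobs < saved_manager_output
# Reads manager output on standard input and prints one line per failed
# job in the form "WORKER_NUMBER FILENAME". Other lines are ignored.
failed_jobs() {
    sed -n 's/^ *FAILED: file \(.*\) on worker  *\([0-9][0-9]*\)\. *$/\2 \1/p'
}
```

For the example messages above, the filter prints three lines: worker 1 with cedropcb.sql, worker 2 with adgrnctx.sql, and worker 3 with aftwf01.sql. Each worker number identifies the adworknnn.log file to review.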

Actions

Complete the following steps for each failed worker:

  1. Start AD Controller.

    Note: For more information, see Setting the Environment and Monitoring and Controlling Parallel Processes in Oracle E-Business Suite Maintenance Utilities.

  2. Determine the cause of the error.

    Choose Option 1 to view the status. Review the worker log file for the failed worker to determine the source of the error.

  3. Resolve the error.

    Use the information provided in the log files. Contact Oracle Support Services if you do not understand how to resolve the issue.

  4. Restart the failed job.

    Choose Option 2 on the AD Controller menu to tell the worker to restart a failed job. The worker process restarts, causing the AD utility to become active again.