Sun N1 System Manager 1.3 Troubleshooting Guide

Solaris OS Update Deployment Failures

This section describes troubleshooting scenarios and possible solutions for failures during Solaris OS update deployment.

In the following unload command, update can be either the update name shown in the list that appears when you type show update all list, or the actual package name on the target server.


N1-ok> unload server server update update

Always check that the package is built for the architecture of the target server.


Note –

The N1 System Manager does not distinguish 32-bit from 64-bit for the Solaris (x86 or SPARC) OS, so the package or patch might not install successfully if it is installed on an incompatible OS.


If the package or patch does install successfully, but performance decreases, check that the architecture of the patch matches the architecture of the OS.
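As a rough aid for the architecture check, the following shell sketch compares a package's ARCH parameter with the processor type of the target server. The function name arch_compatible and the package file in the usage comment are hypothetical. The sketch assumes that ARCH holds a single value such as sparc, sparc.sun4u, i386, or all, and it compares the processor family only, not 32-bit versus 64-bit.

```shell
# arch_compatible PKG_ARCH HOST_ARCH
# Succeeds (exit 0) when a package ARCH value can be used on the host
# processor type. Assumption: ARCH is a single token such as "sparc",
# "sparc.sun4u", "i386", or "all".
arch_compatible() {
    pkg_arch=$1
    host_arch=$2
    case "$pkg_arch" in
        all) return 0 ;;                           # architecture-neutral package
        "$host_arch"|"$host_arch".*) return 0 ;;   # exact match or sub-architecture
        *) return 1 ;;
    esac
}

# Example use on the target server (hypothetical package file and name):
#   pkg_arch=$(pkgparam -d /var/tmp/SUNWexample.pkg SUNWexample ARCH)
#   arch_compatible "$pkg_arch" "$(uname -p)" || echo "architecture mismatch"
```

To see whether the target server is running a 32-bit or 64-bit kernel, isainfo -kv can be typed on the target server.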

The following are common failures that can occur before the job is submitted:


Target server is not initialized

Solution:

Check that the add server feature osmonitor command was issued and that it succeeded.


Another running job on the target server

Solution:

Only one job is allowed at a time on a server. Try again after the job completes.


Update is incompatible with operating system on target server

Solution:

Check that the OS type of the target server matches one of the update OS types. Type show update update-name at the N1-ok> prompt to view the OS type for the update.


Target server is not in a good state or is powered off

Solution:

Check that the target server is up and running. Type show server server-name at the N1-ok> prompt to view the server status. Type reset server server-name force to force a reboot.

The following are possible causes for Load Update job failures:

Load Update jobs sometimes fail because the same package, or a higher version of it, already exists on the target server. If a job fails, check whether the package is already installed on the target server.


error: Failed dependencies:


A prerequisite package is missing and should be installed.

Solution:

For a Solaris system, configure the idepend= parameter in the admin file.
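For illustration, the sketch below writes a minimal admin file that sets idepend. The function name write_admin_file and the file path in the usage comment are hypothetical. The value nocheck, which skips the install-time dependency check, is only one possible choice; quit is the value that aborts the installation when dependencies are not met. Installing the prerequisite package first is usually the safer fix.

```shell
# write_admin_file PATH
# Writes a small pkgadd admin file that sets only idepend. All other
# admin parameters are omitted here and keep their default behavior.
write_admin_file() {
    cat > "$1" <<'EOF'
idepend=nocheck
EOF
}

# Example use (hypothetical path; pkg-name is a placeholder):
#   write_admin_file /var/tmp/n1sm.admin
#   pkgadd -d pkg-name -a /var/tmp/n1sm.admin
```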


Preinstall or postinstall scripts failure: Non-zero status


pkgadd: ERROR: ... script did not complete successfully

Solution:

Check the preinstallation or postinstallation scripts for errors.


Interactive request script supplied by package

Solution:

This message indicates that the response file is missing or that the setting in the admin file is incorrect. Add a response file to correct this error.


patch-name was installed without backing up the original files

Solution:

This message indicates that the Solaris OS update was installed without backing up the original files. No action is required.


Insufficient disk space

Solution:

Load Update jobs might fail due to insufficient disk space. Check the available disk space by typing df -k, and compare it with the package size. If the package is too large for the available space, free up disk space on the target server.
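The disk space check can be sketched as a small shell helper that reads df -k output and compares the avail column with a required size. The function names, the directory, and the size in the usage comment are hypothetical, and the sketch assumes the usual df -k column order (kbytes, used, avail, capacity, mount point) and a mount point without spaces.

```shell
# avail_kb: reads `df -k` output on stdin and prints the "avail" column
# (in KB) of the first file system row. The column is located by counting
# from the end of the row, so a row whose device name wrapped onto its own
# line is still handled.
avail_kb() {
    awk 'NR > 1 && NF >= 5 { print $(NF - 2); exit }'
}

# enough_space NEEDED_KB AVAIL_KB: succeeds when AVAIL_KB >= NEEDED_KB.
enough_space() {
    [ "$2" -ge "$1" ]
}

# Example use (hypothetical directory and package size in KB):
#   avail=$(df -k /var/tmp | avail_kb)
#   enough_space 204800 "$avail" || echo "not enough disk space"
```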

The following are stop job failures for loading or unloading update operations:

If you stop a Load Update or Unload Update job and the job does not stop, manually check that processes such as the following are no longer running on the management server, and kill any that remain:


# ps -ef | grep swi_pkg_pusher
# ps -ef | egrep 'pkgadd|pkgrm|scp'

Then, check for processes such as the following that are still running on the manageable server:


# ps -ef | egrep 'pkgadd|pkgrm'
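As a possible alternative to reading the ps output by eye, the sketch below filters a list of "PID COMMAND" lines and prints only the PIDs of the update-related processes named above. The function name leftover_pids is hypothetical. The sketch assumes that ps accepts the -o pid= -o comm= format options on your system, and that each process appears under its own command name. If swi_pkg_pusher runs under an interpreter, matching on the full command line, as the grep form does, might be needed instead.

```shell
# leftover_pids: reads "PID COMMAND" lines on stdin and prints the PIDs
# whose command name is exactly one of the update-related processes.
leftover_pids() {
    awk '{
        n = split($2, parts, "/")      # drop any leading directory path
        name = parts[n]
        if (name == "swi_pkg_pusher" || name == "pkgadd" ||
            name == "pkgrm" || name == "scp")
            print $1
    }'
}

# Example use:
#   ps -e -o pid= -o comm= | leftover_pids
```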

The following are common failures for Unload Server and Unload Group jobs:

The rest of this section provides errors and possible solutions for failures related to the following commands: unload server server-name update update-name and unload group group-name update update-name.


Removal of <SUNWssmu> was suspended (interaction required)

Solution:

This message indicates a failed dependency for uninstalling a Solaris package. Check the admin file setting and provide an appropriate response file.
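To check an admin file setting quickly, a helper such as the following can print the value assigned to a parameter. The function name admin_setting is hypothetical. The sketch assumes that rdepend is the relevant parameter, because it is the admin file parameter that controls what happens when other packages depend on the package being removed; the message itself does not name the parameter.

```shell
# admin_setting FILE KEY: prints each value assigned to KEY in an admin
# file, or nothing when the key is not set. Comment lines do not match
# because they start with "#".
admin_setting() {
    sed -n "s/^$2=//p" "$1"
}

# Example use (default admin file location on Solaris):
#   admin_setting /var/sadm/install/admin/default rdepend
```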


Job step failure without error details

Solution:

This message might indicate that the job was not successfully started internally. Contact a Sun Service Representative for more information.


Job step failure with vague error details: Connection to 10.0.0.xx

Solution:

This message might indicate that the uninstallation failed because some packages were not fully installed. In this case, manually install the package in question on the target server. For example:

To manually install a .pkg file, type the following command:


# pkgadd -d pkg-name -a admin-file

To manually install a patch, type the following command:


# patchadd -d patch-name -a admin-file

Then, run the unload command again.


Job hangs

Solution:

If the job appears to hang, stop the job and manually kill the remaining processes. For example:

To manually stop the job, type the following command:


# n1sh stop job job-ID

Then, find the PID of the pkgadd process and kill it by typing the following commands:


# ps -ef | grep pkgadd
# kill pkgadd-PID

Then run the unload command again.
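If a process does not exit after a plain kill, one common approach is to wait briefly and then send SIGKILL. The sketch below shows that pattern for a POSIX shell. The function name stop_pid and the default wait of 5 seconds are assumptions made for illustration, and the escalation itself is a general convention rather than a step prescribed by the N1 System Manager.

```shell
# stop_pid PID [WAIT_SECONDS]
# Sends SIGTERM, waits up to WAIT_SECONDS (default 5) for the process to
# exit, and sends SIGKILL only if the process is still present.
stop_pid() {
    pid=$1
    wait_s=${2:-5}
    kill "$pid" 2>/dev/null || return 0      # process is already gone
    i=0
    while [ "$i" -lt "$wait_s" ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        i=$((i + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null
    fi
    return 0
}

# Example use (pkgadd-PID is the PID shown in the ps output):
#   stop_pid pkgadd-PID
```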