This chapter provides troubleshooting information on the following topics:
This section provides security-based troubleshooting information.
The Sun N1 System Manager Server uses strong encryption techniques to ensure secure communication between the management server and each managed server.
The keys used by the Sun N1 System Manager are stored in the /etc/opt/sun/cacao/security directory on servers running Linux, and in the /etc/opt/SUNWcacao/security directory on servers running the Solaris OS. These keys should be identical across all servers.
Under normal operation, these keys can be left in their default configuration. You might have to regenerate security keys. For example, if there is a risk that the root password of the management server has been exposed or compromised, regenerate the security keys.
On the management server as root, stop the common agent container management daemon.
If the management server is running Linux:
# /opt/sun/cacao/bin/cacaoadm stop
If the management server is running the Solaris OS:
# /opt/SUNWcacao/bin/cacaoadm stop
Regenerate security keys using the create-keys subcommand.
If the management server is running Linux:
# /opt/sun/cacao/bin/cacaoadm create-keys --force
If the management server is running the Solaris OS:
# /opt/SUNWcacao/bin/cacaoadm create-keys --force
As root on the management server, restart the common agent container management daemon.
If the management server is running Linux:
# /opt/sun/cacao/bin/cacaoadm start
If the management server is running the Solaris OS:
# /opt/SUNWcacao/bin/cacaoadm start
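The three steps above can be sketched as one shell routine. The cacaoadm paths are the ones listed in this procedure; choosing the path by which directory layout exists, and the PREFIX parameter (normally empty, used only to make the sketch testable), are assumptions.

```shell
# Sketch: regenerate the cacao security keys in one pass. Run as root.
# cacao_bin PREFIX - print the cacaoadm path for the layout found under PREFIX.
cacao_bin() {
  if [ -d "${1:-}/opt/SUNWcacao" ]; then
    echo "${1:-}/opt/SUNWcacao/bin/cacaoadm"   # Solaris OS layout
  else
    echo "${1:-}/opt/sun/cacao/bin/cacaoadm"   # Linux layout
  fi
}

regen_keys() {
  bin=$(cacao_bin "${1:-}")
  "$bin" stop &&
  "$bin" create-keys --force &&
  "$bin" start
}
```

A typical invocation, as root on the management server, would simply be `regen_keys`.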
The following list provides general security considerations that you should be aware of when you are using the N1 System Manager:
The Java Web Console that is used to launch the N1 System Manager's browser interface uses self-signed certificates. These certificates should be treated with the appropriate level of trust by clients and users.
The terminal emulator applet that is used by the browser interface for the serial console feature does not provide a certificate-based authentication of the applet. The applet also requires that you enable SSHv1 for the management server. For certificate-based authentication or to avoid enabling SSHv1, use the serial console feature by running the connect command from the n1sh shell.
SSH fingerprints that are used to connect from the management server to the provisioning network interfaces on the provisionable servers are automatically acknowledged by the N1 System Manager software. This automation might make the provisionable servers vulnerable to "man-in-the-middle" attacks.
The Web Console (Sun ILOM Web GUI) autologin feature for Sun Fire X4100 and Sun Fire X4200 servers exposes the server's service processor credentials to users who can view the web page source for the Login page. To avoid this security issue, disable the autologin feature by running the n1smconfig utility. See Configuring the N1 System Manager System in Sun N1 System Manager 1.1 Installation and Configuration Guide for details.
This section describes scenarios that cause OS deployment to fail and explains how to correct failures.
If the creation of an OS distribution fails with a copying files error, check the size of the ISO image and ensure that it is not corrupted. You might see output similar to the following in the job details:
bash-3.00# /opt/sun/n1gc/bin/n1sh show job 25
Job ID: 25
Date: 2005-07-20T14:28:43-0600
Type: Create OS Distribution
Status: Error (2005-07-20T14:29:08-0600)
Owner: root
Errors: 1
Warnings: 0
Steps
ID Type         Start                    Completion               Result
1  Acquire Host 2005-07-20T14:28:43-0600 2005-07-20T14:28:43-0600 Completed
2  Run Command  2005-07-20T14:28:43-0600 2005-07-20T14:28:43-0600 Completed
3  Acquire Host 2005-07-20T14:28:46-0600 2005-07-20T14:28:46-0600 Completed
4  Run Command  2005-07-20T14:28:46-0600 2005-07-20T14:29:06-0600 Error 1
Errors
Error 1:
Description:
INFO : Mounting /images/rhel-3-U4-i386-es-disc1.iso at /mnt/loop23308
INFO : Version is 3ES, disc is 1
INFO : Version is 3ES, disc is 1
INFO : type redhat ver: 3ES
cp: /var/opt/SUNWscs/data/allstart/image/3ES-bootdisk.img: Bad address
INFO : Could not copy PXE file bootdisk.img
INFO : umount_exit: mnt is: /mnt/loop23308
INFO : ERROR: Could not add floppy to the Distro
Results
Result 1:
Server: -
Status: -1
Message: Creating OS rh30u4-es failed.
The inability to deploy Solaris 9 OS distributions to servers from a Linux management server is usually due to a problem with NFS mounts. To solve this problem, you need to apply a patch to the mini-root of the Solaris 9 OS distribution. This section provides instructions for applying the required patches. The instructions differ according to the management and patch server configuration scenarios in the following table.
Table 6–1 Task Map for Patching a Solaris 9 Distribution
Management Server | Patch Server | Task
---|---|---
Red Hat 3.0 u2 | Solaris 9 OS on x86 platform | To Patch a Solaris 9 OS Distribution by Using a Solaris 9 OS on x86 Patch Server
Red Hat 3.0 u2 | Solaris 9 OS on SPARC platform | To Patch a Solaris 9 OS Distribution by Using a Solaris 9 OS on SPARC Patch Server
When you use a patch server to perform the following tasks, you need root access to both the management server and the provisionable server at the same time. For some tasks, you first patch the provisionable server, then mount the management server's distribution directory and patch the distribution.
This procedure describes how to patch a Solaris 9 OS distribution in the N1 System Manager. The steps in this procedure need to be performed on both the patch server and the management server. Consider opening two terminal windows to complete the steps. The following steps first guide you through patching the patch server and then provide steps for patching the distribution.
Create a Solaris 9 OS distribution on the management server. See To Copy an OS Distribution From CDs or a DVD or To Copy an OS Distribution From ISO Files. Type show os os-name at the command line to view the ID of the OS distribution. This number is used in place of DISTRO_ID in the instructions.
Install the Solaris 9 OS on x86 platform software on a machine that is not the management server.
Create a /patch directory on the Solaris 9 x86 patch server.
For a Solaris OS on x86 distribution, download and unzip the following patches into the /patch directory on the Solaris 9 OS on x86 patch server: 117172-17 and 117468-02. You can access these patches from http://sunsolve.sun.com.
For a Solaris OS on SPARC distribution, download and unzip the following patches into the /patch directory on the Solaris 9 OS on x86 patch server: 117171-17, 117175-02, and 113318-20. You can access these patches from http://sunsolve.sun.com.
Patch the Solaris 9 OS on x86 patch server.
Log in as root.
% su
Password: password
The root prompt appears.
Reboot the Solaris 9 patch server to single-user mode.
# reboot -- -s
In single-user mode, change to the patch directory.
# cd /patch
Install the patches.
# patchadd -M . 117172-17
# patchadd -M . 117468-02
Press Control+D to return to multiuser mode.
Prepare to patch the distribution on the management server.
Patch the distribution that you copied to the management server.
Log in to the Solaris 9 patch server as root.
% su
Password: password
The root prompt appears.
Mount the management server.
# mount -o rw management-server-IP:/js/DISTRO_ID /mnt
Install the patches by performing one of the following actions:
If you are patching an x86 distribution, type the following commands:
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117172-17
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117468-02
If you are patching a SPARC distribution, type the following commands:
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117171-17
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117175-02
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 113318-20
You will receive a partial error for the first patch installation. Ignore this error.
Unmount the management server.
# umount /mnt
Restart NFS on the management server.
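Restarting NFS can be sketched as follows. The init-script paths match the "Restarting NFS to Resolve Boot Failed Errors" section later in this chapter; wrapping them in a helper that prints the command for each OS is purely illustrative.

```shell
# Sketch: print the NFS restart command for the management server's OS.
# Run the printed command as root. Paths are from this chapter.
nfs_restart_cmd() {
  case "$1" in
    linux)   echo "/etc/init.d/nfs restart" ;;
    solaris) echo "/etc/init.d/nfs.server stop; /etc/init.d/nfs.server start" ;;
  esac
}
```

In this procedure the management server runs Red Hat Linux, so the Linux form applies.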
Fix the Solaris 9 OS on x86 distribution.
If you want to patch another distribution, you might have to delete the /patch/117172-17 directory and re-create it using the unzip 117172-17.zip command. When the first distribution is patched, the patchadd command makes a change to the directory that causes problems with the next patchadd command execution.
This procedure describes how to patch a Solaris 9 OS distribution in the N1 System Manager. The steps in this procedure need to be performed on the provisionable server and the management server. Consider opening two terminal windows to complete the steps. The following steps first guide you through patching the provisionable server and then provide steps for patching the distribution.
Create a Solaris 9 OS distribution on the management server. See To Copy an OS Distribution From CDs or a DVD or To Copy an OS Distribution From ISO Files. Type show os os-name at the command line to view the ID of the OS distribution. This number is used in place of DISTRO_ID in the instructions.
Install the Solaris 9 OS on SPARC software on a machine that is not the management server. See To Load an OS Profile on a Server or a Server Group.
Create a /patch directory on the Solaris 9 SPARC patch server.
For a Solaris OS on x86 distribution, download and unzip the following patches into the /patch directory on the Solaris 9 OS on SPARC patch server: 117172-17 and 117468-02. You can access these patches from http://sunsolve.sun.com.
For a Solaris OS on SPARC distribution, download and unzip the following patches into the /patch directory on the Solaris 9 OS on SPARC patch server: 117171-17, 117175-02, and 113318-20. You can access these patches from http://sunsolve.sun.com.
Set up and patch the Solaris 9 OS on SPARC machine.
Log in to the Solaris 9 machine as root.
% su
Password: password
Reboot the Solaris 9 machine to single-user mode.
# reboot -- -s
In single-user mode, change to the patch directory.
# cd /patch
Install the patches.
# patchadd -M . 117171-17
# patchadd -M . 117175-02
# patchadd -M . 113318-20
Press Control+D to return to multiuser mode.
Patch the distribution that you copied to the management server.
Log in to the Solaris 9 machine as root.
% su
Password: password
Mount the management server.
# mount -o rw management-server-IP:/js/DISTRO_ID /mnt
Install the patches by performing one of the following actions:
If you are patching a Solaris OS on x86 software distribution, type the following commands:
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117172-17
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117468-02
If you are patching a Solaris OS on SPARC software distribution, type the following commands:
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117171-17
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 117175-02
# patchadd -C /mnt/Solaris_9/Tools/Boot/ -M /patch 113318-20
You will receive a partial error for the first patch installation. Ignore this error.
Unmount the management server.
# umount /mnt
Restart NFS on the management server.
Fix the Solaris 9 OS on x86 distribution.
If you want to patch another distribution, you might have to delete the /patch/117172-17 directory and re-create it using the unzip 117172-17.zip command. When the first distribution is patched, the patchadd command makes a change to the directory that causes problems with the next patchadd command execution.
OS profile deployments might fail if any of the following conditions occur:
Partitions are not modified to suit a Sun Fire V40z or SPARC V440 server. See To Modify the Default Solaris OS Profile for a Sun Fire V40z or a SPARC v440 Server.
Scripts are not modified to install the driver needed to recognize the Ethernet interface on a Sun Fire V20z server. See To Modify a Solaris 9 OS Profile for a Sun Fire V20z Server With a K2.0 Motherboard.
DHCP is not correctly configured. See Solaris Deployment Job Times Out or Stops.
OS profile installs only the Solaris Core System Support distribution group. See Solaris OS Profile Installation Fails.
The target server cannot access DHCP information or mount distribution directories. See Invalid Management Server Netmask.
The management server cannot access files during a Load OS operation. See Restarting NFS to Resolve Boot Failed Errors.
The Linux deployment stops. See Linux Deployment Stops.
Use the following graphic as a guide to troubleshooting best practices. The graphic describes steps to take when you initiate provisioning operations. Taking these steps will help you troubleshoot deployments with greater efficiency.
This procedure describes how to modify the Solaris OS profile that is created by default. The following modification is required for successful installation of the default Solaris OS profile on a Sun Fire V40z or a SPARC v440 server.
Log in to the N1 System Manager.
See To Access the N1 System Manager Command Line for details.
Clone the default profile.
N1-ok> create osprofile sol10v40z clone sol10
Remove the root partition.
N1-ok> remove osprofile sol10v40z partition /
Remove the swap partition.
N1-ok> remove osprofile sol10v40z partition swap
Add new root parameters.
N1-ok> add osprofile sol10v40z partition / device c1t0d0s0 sizeoption free type ufs
Add new swap parameters.
N1-ok> add osprofile sol10v40z partition swap device c1t0d0s1 size 2000 type swap sizeoption fixed
To find out how to load the modified OS profile, see To Load an OS Profile on a Server or a Server Group.
This procedure describes how to create and add a script to your Solaris OS profile. This script installs the Broadcom 5704 NIC driver needed for Solaris 9 x86 to recognize the NIC Ethernet interface on a Sun Fire V20z server with a K2.0 motherboard. Earlier versions of the Sun Fire V20z server use the K1.0 motherboard. Newer versions use the K2.0 motherboard.
This patch is needed for K2.0 motherboards but can also be used on K1.0 motherboards without negative consequences.
Log in to the N1 System Manager.
See To Access the N1 System Manager Command Line for details.
Type the following command:
% /opt/sun/n1gc/bin/n1sh show os
The list of available OS distributions appears. Note the name of the Solaris 9 distribution.
Run the as_distro.pl script, and view the output.
# /scs/sbin/as_distro.pl -l
Note down the DISTRO_ID for the Solaris 9 distribution.
You use this ID in the next step.
Type the following command:
# mkdir /js/DISTRO_ID/patch
A patch directory is created for the Solaris 9 distribution.
Download the 116666-04 patch from http://sunsolve.sun.com to the /js/DISTRO_ID/patch directory.
Change to the /js/DISTRO_ID/patch directory.
# cd /js/DISTRO_ID/patch
Unzip the patch file.
# unzip 116666-04.zip
Type the following command:
# mkdir /js/scripts
In the /js/scripts directory, create a script called patch_sol9_k2.sh that includes the following three lines:
#!/bin/sh
echo "Adding patch for bge devices."
patchadd -R /a -M /cdrom/patch 116666-04
Ensure the script is executable. You can use the chmod 775 patch_sol9_k2.sh command.
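The two steps above can be combined into one helper that writes the script with a heredoc and marks it executable. The target directory is a parameter so the sketch can be tried anywhere; in this procedure it would be /js/scripts.

```shell
# Sketch: write the post-install patch script and make it executable.
# write_patch_script DIR - creates DIR/patch_sol9_k2.sh (directory must exist).
write_patch_script() {
  dir=$1
  cat > "$dir/patch_sol9_k2.sh" <<'EOF'
#!/bin/sh
echo "Adding patch for bge devices."
patchadd -R /a -M /cdrom/patch 116666-04
EOF
  chmod 775 "$dir/patch_sol9_k2.sh"
}
```

A typical call would be `write_patch_script /js/scripts`.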
Add the script to the Solaris 9 OS profile.
N1-ok> add osprofile osprofile script /js/scripts/patch_sol9_k2.sh type post
This example shows how to add a script to an OS profile. The type attribute specifies that the script is to be run after the installation.
N1-ok> add osprofile sol9K2 script /js/scripts/patch_sol9_k2.sh type post
To load the modified Solaris OS profile, see To Load an OS Profile on a Server or a Server Group.
If you attempt to load a Solaris OS profile and the OS Deploy job times out or stops, check the output in the job details to ensure that the target server completed a PXE boot. For example:
PXE-M0F: Exiting Broadcom PXE ROM.
Broadcom UNDI PXE-2.1 v7.5.14
Copyright (C) 2000-2004 Broadcom Corporation
Copyright (C) 1997-2000 Intel Corporation
All rights reserved.
CLIENT MAC ADDR: 00 09 3D 00 A5 FC
GUID: 68D3BE2E 6D5D 11D8 BA9A 0060B0B36963
DHCP.
If the PXE boot fails, the /etc/dhcpd.conf file on the management server might not have been set up correctly by the N1 System Manager.
The best diagnostic tool is to open a console window on the target machine and then run the deployment. See To Open a Server's Serial Console.
If you suspect that the /etc/dhcpd.conf file was configured incorrectly, complete the following procedure to modify the configuration.
Log in to the management server as root.
Inspect the dhcpd.conf file for errors.
# vi /etc/dhcpd.conf
If errors exist that need to be corrected, run the following command:
# /usr/bin/n1smconfig
The n1smconfig utility appears.
Modify the provisioning network interface configuration.
See Configuring the N1 System Manager System in Sun N1 System Manager 1.1 Installation and Configuration Guide for detailed instructions.
Load the OS profile on the target server.
OS profiles that install only the Core System Support distribution group do not load successfully. Specify “Entire Distribution plus OEM Support” as the value for the distributiongroup parameter. Doing so configures a profile that will install the needed version of SSH and other tools that are required for servers to be managed by the N1 System Manager.
If the target server cannot access DHCP information or mount the distribution directories on the management server during a Solaris 10 deployment, you might have network problems caused by an invalid netmask. The console output might be similar to the following:
Booting kernel/unix...
krtld: Unused kernel arguments: `install'.
SunOS Release 5.10 Version Generic 32-bit
Copyright 1983-2005 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Unsupported Tavor FW version: expected: 0003.0001.0000, actual: 0002.0000.0000
NOTICE: tavor0: driver attached (for maintenance mode only)
Configuring devices.
Using DHCP for network configuration information.
Beginning system identification...
Searching for configuration file(s)...
Using sysid configuration file /sysidcfg
Search complete.
Discovering additional network configuration...
Completing system identification...
Starting remote procedure call (RPC) services: done.
System identification complete.
Starting Solaris installation program...
Searching for JumpStart directory...
/sbin/dhcpinfo: primary interface requested but no primary interface is set
not found
Warning: Could not find matching rule in rules.ok
Press the return key for an interactive Solaris install program...
To fix the problem, set the management server netmask value to 255.255.255.0. See To Configure the Sun N1 System Manager System in Sun N1 System Manager 1.1 Installation and Configuration Guide.
If you are deploying a Linux OS and the deployment stops, check the console of the target server to see if the installer is in interactive mode. If the installer is in interactive mode, the deployment timed out because of a delay in the transmission of data from the management server to the target server. This delay usually occurs because the switch or switches connecting the two machines have spanning tree enabled. Either turn off spanning tree on the switch or disable spanning tree for the ports that are connected to the management server and the target server.
If spanning tree is already disabled and OS deployment stops, you may have a problem with your network.
Error: boot: lookup /js/4/Solaris_10/Tools/Boot failed boot: cannot open kernel/sparcv9/unix
Solution: The message differs depending on the OS that is being deployed. If the management server cannot access files during a Load OS operation, the cause might be a network problem. To correct this problem, try restarting NFS.
On a Solaris system, type the following:
# /etc/init.d/nfs.server stop
# /etc/init.d/nfs.server start
On a Linux system, type the following:
# /etc/init.d/nfs restart
You must manually install the wget software if the add server feature osmonitor agentip command fails with the following error: Internal error: wget command failed: /usr/bin/wget -O /tmp/hostinstall.pl http://xx.xx.xx.xx/pub/hostinstall.pl, where xx.xx.xx.xx is the IP address of the machine in question.
For a Solaris system, install the SUNWwgetu and SUNWwgetr packages, which provide /usr/sfw/bin/wget.
For a Linux system, install all RPMs whose names begin with wget-, which provide /usr/bin/wget.
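Before retrying the add command, you can confirm that a usable wget is in place by probing the expected paths. The helper below is illustrative; the paths come from the text above.

```shell
# Sketch: print the first executable wget among the candidate paths given as
# arguments; return nonzero if none is found. Helper name is illustrative.
find_wget() {
  for p in "$@"; do
    if [ -x "$p" ]; then
      echo "$p"
      return 0
    fi
  done
  return 1
}
# Typical call: find_wget /usr/bin/wget /usr/sfw/bin/wget
```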
Adding the feature might also fail due to stale SSH entries on the management server. If the add server server-name feature osmonitor agentip command fails and no true security breach has occurred, remove the /root/.ssh/known_hosts file or the specific entry in the file that corresponds to the provisionable server. Then, retry the add command.
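If you prefer not to delete the whole known_hosts file, a line filter can drop only the entry for one provisionable server. This is a sketch: known_hosts entries can also take "name,ip key" forms that a simple prefix match will not catch, and the function name is illustrative.

```shell
# Sketch: remove the known_hosts line that begins with the given host name or
# IP address. In this scenario the file would be /root/.ssh/known_hosts.
remove_host_entry() {
  host=$1; file=$2
  grep -v "^$host " "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

A typical call would be `remove_host_entry 192.168.2.10 /root/.ssh/known_hosts`, followed by retrying the add command.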
Additionally, adding the OS monitoring feature to a server that has the base management feature might fail. The following job output shows the error: Repeat attempts for this operation are not allowed. This error indicates that SSH credentials have previously been supplied and cannot be altered. To avoid this error, issue the add server feature osmonitor command without agentssh credentials. See To Add the OS Monitoring Feature for instructions.
N1-ok> show job 61
Job ID: 61
Date: 2005-08-16T16:14:27-0400
Type: Modify OS Monitoring Support
Status: Error (2005-08-16T16:14:38-0400)
Owner: root
Errors: 1
Warnings: 0
Steps
ID Type         Start                    Completion               Result
1  Acquire Host 2005-08-16T16:14:27-0400 2005-08-16T16:14:28-0400 Completed
2  Run Command  2005-08-16T16:14:28-0400 2005-08-16T16:14:28-0400 Completed
3  Acquire Host 2005-08-16T16:14:29-0400 2005-08-16T16:14:30-0400 Completed
4  Run Command  2005-08-16T16:14:30-0400 2005-08-16T16:14:36-0400 Error
Results
Result 1:
Server: 192.168.2.10
Status: -3
Message: Repeate attempts for this operation are not allowed.
This section describes possible solutions for the following troubleshooting scenarios:
The name that you specify when you create a new OS update must be unique. The OS update itself must also be unique. That is, in addition to the uniqueness of the file name for each OS update, the combination of the internal package name, version, release, and file name must be unique.
For example, if test1.rpm is the source for an RPM named test1, another OS update called test2 cannot have the same file name as test1.rpm. To avoid additional naming issues, do not name an OS update with the same name as the internal package name for any other existing packages on the provisionable server.
You can specify an adminfile value when you create an OS update. For the Solaris OS update packages, a default admin file is located at /opt/sun/n1gc/etc/admin.
mail=
instance=unique
partial=nocheck
runlevel=nocheck
idepend=nocheck
rdepend=nocheck
space=quit
setuid=nocheck
conflict=nocheck
action=nocheck
basedir=default
authentication=nocheck
The default admin file setting used for Solaris package deployments in the N1 System Manager is instance=unique. If you want to report errors for duplicated packages, change the admin file setting to instance=quit. This change causes an error to appear in the Load Update job results if a duplicate package is detected.
See the admin(4) man page for detailed information about admin file parameter settings. Type man -s4 admin as root user on a Solaris system to view the man page.
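The instance=unique to instance=quit change described above can be applied to a copy of the default admin file with sed. The source path is from this section; the destination name and the helper name are examples.

```shell
# Sketch: copy an admin file, switching instance=unique to instance=quit so
# duplicate packages are reported as errors in Load Update job results.
make_strict_admin() {
  sed 's/^instance=unique$/instance=quit/' "$1" > "$2"
}
# Typical call:
#   make_strict_admin /opt/sun/n1gc/etc/admin /opt/sun/n1gc/etc/admin.quit
```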
For Solaris packages, a response file might also be needed. For instructions on how to specify an admin file and a response file when you create an OS update, see To Copy an OS Update.
This section describes troubleshooting scenarios and possible solutions for the following categories of failures:
Failures that occur before the job is submitted
Load Update job failures
Unload Update job failures
Stop Job failures for Load Update
In the following load command, update can be either the update name in the list that appears when you type show update all, or the actual package name on the target server.
N1-ok> load server server update update
Always check that the package targets the correct architecture. The N1 System Manager does not distinguish between 32-bit and 64-bit for the Solaris (x86 or SPARC) OS, so a package or patch might not install successfully on an incompatible OS. If the package or patch does install successfully but performance decreases, check that the architecture of the patch matches the architecture of the OS.
The following are common failures that can occur before the job is submitted:
Target server is not initialized
Solution: Check that the add server feature osmonitor command was issued and that it succeeded.
Another job is already running on the target server
Solution: Only one job is allowed at a time on a server. Try again after the job completes.
Update is incompatible with the operating system on the target server
Solution: Check that the OS type of the target server matches one of the update OS types. Type show update update-name at the N1-ok> prompt to view the OS type for the update.
Target server is not in a good state or is powered off
Solution: Check that the target server is up and running. Type show server server-name at the N1-ok> prompt to view the server status. Type reset server server-name force to force a reboot.
The following are possible causes for Load Update job failures:
Load Update jobs sometimes fail because the same package already exists or because a higher version of the package exists. If the job fails, ensure that the package does not already exist on the target server.
error: Failed dependencies:
A prerequisite package is missing and should be installed.
Solution: Use an RPM tool to address and resolve Linux RPM dependencies. For a Solaris system, configure the idepend= parameter in the admin file.
Preinstall or postinstall script failure: Non-zero status
pkgadd: ERROR: ... script did not complete successfully
Solution: Check the preinstallation or postinstallation scripts for possible errors.
Interactive request script supplied by package
Solution: This message indicates that the response file is missing or that the setting in the admin file is incorrect. Add a response file to correct this error.
patch-name was installed without backing up the original files
Solution: This message indicates that the Solaris OS update was installed without backing up the original files. No action needs to be taken.
Insufficient disk space
Solution: Load Update jobs might fail due to insufficient disk space. Check the available disk space by typing df -k. Also check the package size. If the package size is too large, create more available disk space on the target server.
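The disk-space check above can be scripted: compare the space available in the target directory with the package size before submitting the Load Update job. The helper name and the use of portable df -kP output are assumptions.

```shell
# Sketch: return success if DIR has at least NEED_KB kilobytes free,
# per the df -k check described above.
enough_space() {
  dir=$1; need_kb=$2
  # -P forces single-line, POSIX-format output so awk can read column 4.
  avail_kb=$(df -kP "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge "$need_kb" ]
}
# Typical call: enough_space /var/tmp 51200 || echo "not enough space"
```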
The following are stop job failures for loading or unloading update operations:
If you stop a Load Update or Unload Update job and the job does not stop, manually ensure that the following process is killed on the management server:
# ps -ef | grep swi_pkg_pusher
# ps -ef | grep pkgadd, pkgrm, scp, ...
Then, check any processes that are running on the provisionable server:
# ps -ef | grep pkgadd, pkgrm, ...
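The ps and grep checks above can be condensed with pgrep, which matches against the full command line and prints only PIDs. This is a sketch; the helper name is illustrative, and the matching and killing would be done as root.

```shell
# Sketch: list PIDs whose full command line matches a pattern, so stray
# package processes can be passed to kill(1).
stale_pids() {
  pgrep -f "$1"
}
# Typical use (as root):
#   for pid in $(stale_pids swi_pkg_pusher); do kill "$pid"; done
```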
The following are common failures for Unload Server and Unload Group jobs:
The rest of this section provides errors and possible solutions for failures related to the following commands: unload server server-name update update-name and unload group group-name update update-name.
Removal of <SUNWssmu> was suspended (interaction required)
Solution: This message indicates a failed dependency for uninstalling a Solaris package. Check the admin file setting and provide an appropriate response file.
Job step failure without error details
Solution: This message might indicate that the job was not successfully started internally. Contact a Sun Service Representative for more information.
Job step failure with vague error details: Connection to 10.0.0.xx
Solution: This message might indicate that the uninstallation failed because some packages or RPMs were not fully installed. In this case, manually install the package in question on the target server. For example:
To manually install an RPM, type the following command:
# rpm -Uvh rpm-name
To manually install a .pkg file, type the following command:
# pkgadd -d pkg-name -a admin-file
To manually install a patch, type the following command:
# patchadd -d patch-name -a admin-file
Then, run the unload command again.
Job hangs
Solution: If the job appears to hang, stop the job and manually kill the remaining processes. For example:
To stop the job, type the following command:
# n1sh stop job job-ID
Then, find the PID of the rpm process and kill it by typing the following commands:
# ps -ef | grep rpm-name
# kill rpm-PID
Or, find the PID of the pkgadd process and kill it by typing the following commands:
# ps -ef | grep pkgadd
# kill pkgadd-PID
Then run the unload command again.
This section provides detailed information to help you download and prepare the firmware versions that are required to discover Sun Fire V20z and V40z servers.
Log in as root to the N1 System Manager management server.
The N1-ok prompt appears.
Create directories into which the V20z and V40z firmware update zip files are to be saved.
Create separate directories for each server type firmware download. For example:
# mkdir V20z-firmware V40z-firmware
In a web browser, go to http://www.sun.com/servers/entry/v20z/downloads.html.
The Sun Fire V20z/V40z Server downloads page appears.
Click Current Release.
The Sun Fire V20z/V40z NSV Bundles 2.3.0.11 page appears.
Click Download.
The download Welcome page appears. Type your username and password, and then click Login.
The Terms of Use page appears. Read the license agreement carefully. You must accept the terms of the license to continue and download the firmware. Click Accept and then click Continue.
The Download page appears. Several downloadable files are displayed.
To download the V20z firmware zip file, click V20z BIOS and SP Firmware, English (nsv-v20z-bios-fw_V2_3_0_11.zip).
Save the 10.21-Mbyte file to the directory that you created for the V20z firmware in Step 2.
To download the V40z firmware zip file, click V40z BIOS and SP Firmware, English (nsv-v40z-bios-fw_V2_3_0_11.zip).
Save the 10.22-Mbyte file to the directory that you created for the V40z firmware in Step 2.
Change to the directory where you downloaded the V20z firmware file.
Type unzip nsv-v20z-bios-fw_V2_3_0_11.zip to unpack the file.
Type y to continue.
The sw_images directory is extracted.
The following files in the sw_images directory are used by the N1 System Manager to update V20z provisionable server firmware:
Service Processor:
sw_images/sp/spbase/V2.3.0.11/install.image
BIOS:
sw_images/platform/firmware/bios/V2.33.5.2/bios.sp
Change to the directory where you downloaded the V40z firmware zip file.
Type unzip nsv-v40z-bios-fw_V2_3_0_11.zip to unpack the zip file.
The sw_images directory is extracted.
The following files in the sw_images directory are used by the N1 System Manager to update V40z provisionable server firmware:
Service Processor:
sw_images/sp/spbase/V2.3.0.11/install.image
BIOS:
sw_images/platform/firmware/bios/V2.33.5.2/bios.sp
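Before copying the firmware into the N1 System Manager, you can verify that both extracted files are present. The check below is a sketch; the relative paths are the ones listed above, and the helper name and base-directory parameter are illustrative.

```shell
# Sketch: report any of the expected firmware files missing under a
# download directory (the directory containing sw_images).
check_firmware_files() {
  base=$1
  for f in sp/spbase/V2.3.0.11/install.image \
           platform/firmware/bios/V2.33.5.2/bios.sp; do
    [ -f "$base/sw_images/$f" ] || echo "missing: $base/sw_images/$f"
  done
}
# Typical call: check_firmware_files ~/V40z-firmware
```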
Copy the firmware updates to the N1 System Manager as described in To Copy a Firmware Update.
Update the firmware on a single provisionable server or on a server group as described in To Load a Firmware Update on a Server or a Server Group.
If a threshold value is breached for a monitored attribute, an event is generated. You can create notification rules to warn you about this type of event. Notification of threshold breaches or warnings is done through the event log. This log is most easily viewed through the browser interface.
Notifications can be created using the create notification command and the resulting notification sent by email or to a pager. See create notification in Sun N1 System Manager 1.1 Command Line Reference Manual for syntax details.
If the value of a monitored hardware health attribute or OS resource utilization attribute breaches a threshold value, an event log entry indicates that the threshold has been breached. The event log becomes available from the browser interface. The length of time it takes for the event log to be available from the browser interface depends on the polling interval for the attribute:
t + polling interval
The time at which the breach occurs is indicated by t. The polling interval is in seconds, and is the amount of time between successive polls of the monitored attribute. See Setting Polling Intervals for more information. Use the show log command to verify that the event log has been generated:
N1-ok> show log
Id Date                     Severity Subject              Message
.
.
10 2004-11-22T01:45:02-0800 WARNING  Sun_V20z_XG041105786 A critical high threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.32
13 2004-11-22T01:50:08-0800 WARNING  Sun_V20z_XG041105786 A normal low threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.2
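The t + polling interval rule can be put as a one-line calculation: with a breach at time t and a polling interval in seconds, the event should appear in the log no later than t + interval. A minimal sketch, with an illustrative function name and epoch-second times:

```shell
# Sketch: latest time (epoch seconds) by which a threshold event should
# appear in the log, given breach time t and the attribute's polling interval.
event_deadline() {
  echo $(( $1 + $2 ))
}
# Example: a breach at t=1100000000 with a 300-second polling interval should
# be logged by event_deadline 1100000000 300, i.e. 1100000300.
```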
If the IP addresses of the management server, the monitoring agent, or the data network are unavailable, an event indicates that there is a network connectivity problem. This check is part of network reachability monitoring. See Network Reachability Monitoring for more information. The event log becomes available from the browser interface. The length of time it takes for the event log to be available from the browser interface depends on the polling interval for the attribute:
t + polling interval
The time at which the breach occurs is indicated by t. The polling interval is in seconds, and is the amount of time between successive polls of the monitored attribute. See Setting Polling Intervals for more information. Use the show log command to verify that the event log has been generated:
N1-ok> show log
.
.
13 2004-11-19T10:24:33-0800 INFORMATION Sun_V20z_XGserial_number Ip Address /<ip_address> on server Sun_V20z_XGserial_number is unreachable.
14 2004-11-19T10:24:38-0800 INFORMATION Sun_V20z_XGserial_number Ip Address /<ip_address> on server Sun_V20z_XGserial_number is unreachable.
If monitoring is enabled, as described in Enabling Monitoring, and the status in the output of the show server or show group commands is unknown or unreachable, then the server or server group is not being reached successfully for monitoring. If the status remains unknown or unreachable for fewer than five polling intervals, a transient network problem might be the cause. If the status remains unknown or unreachable for more than five polling intervals, monitoring might have failed, possibly because of a failure in the monitoring agent.
A time stamp is provided in the monitoring data output. The relationship between this time stamp and the polling interval can also be used to judge whether the monitoring agent has failed. If the monitored output for a provisionable server continues to show the same time stamp after several polling intervals have passed, the server is not being successfully polled and is no longer being monitored. This might be the result of a failure in the monitoring agent.
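The five-polling-interval rule of thumb above can be expressed as a small check. This is a sketch: the function name is illustrative, and it assumes the last-sample time stamp has been converted to epoch seconds.

```shell
# Sketch: succeed (return 0) if the last monitoring time stamp is stale,
# i.e. has not advanced in more than five polling intervals.
# agent_stale LAST NOW INTERVAL  (all values in seconds)
agent_stale() {
  last=$1; now=$2; interval=$3
  [ $(( now - last )) -gt $(( 5 * interval )) ]
}
# Example: with a 300-second interval, a time stamp more than 1500 seconds
# old suggests the monitoring agent has failed.
```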