Solaris Common Messages and Troubleshooting Guide

"W"

WARNING: add_spec: No major number for sf

Cause

The system prints the following warning message while booting:


SunOS Release 5.5.1 Version Generic_103640-03 [UNIX(R)
System V Release 4.0]
Copyright (c) 1983-1996, Sun Microsystems, Inc.
WARNING: add_spec: No major number for sf

The sf(7D) driver is specific to the Sun Enterprise Network Array (SENA), also known as a "photon."

Action

If no SENA is attached to the system, the message can be safely ignored. To stop seeing the message, comment out the last line in /kernel/drv/ssd.conf that references sf(7D).

If you do this, and then later attach a SENA to your system, remember to uncomment this line again.
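
One way to keep that edit reviewable is to generate the modified file instead of editing it by hand. The following is a minimal sketch, not a Sun-supplied tool: comment_last_match is a hypothetical helper name, it assumes an awk that supports -v (nawk on Solaris), and it matches sf as a plain substring, so confirm that the line it selects is the intended one.

```shell
# Print FILE ($1) with the last uncommented line containing NEEDLE ($2)
# commented out. Output goes to stdout; FILE itself is never modified.
comment_last_match() {
    awk -v needle="$2" '
        # First pass: remember the number of the last matching, uncommented line.
        NR == FNR { if (index($0, needle) && $0 !~ /^#/) last = FNR; next }
        # Second pass: prefix that one line with "#" and copy everything else.
        FNR == last { print "#" $0; next }
        { print }
    ' "$1" "$1"
}

# Example (keep a copy of the original, then review the result):
#   cp /kernel/drv/ssd.conf /kernel/drv/ssd.conf.orig
#   comment_last_match /kernel/drv/ssd.conf.orig sf > /kernel/drv/ssd.conf
```

To undo the change after attaching a SENA, remove the leading # from that line or restore the saved copy.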

warning:cachefs:invalid cache version

Cause

While running the Solaris 2.5.1 release with AdminSuite 2.3 and AutoClient 2.1, the user added five AutoClient systems. The clients displayed this error message during startup.

Action

The /kernel/fs/cachefs modules on the server and the client are different versions. The cachefs module should be the same version on both, as in the following listings:

On the server:


# cd /kernel/fs 
# ls -al cachefs 
-rwxr-xr-x   1 root     sys       229396 Jul 15  1997 cachefs*

On the client:

# cd /export/root/clientname/kernel/fs  
# ls -al cachefs 
-rwxr-xr-x   1 root     sys       229396 Jul 15  1997 cachefs*  

To solve the problem, load patch 104849-02 or higher.
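
The listings above compare size and date by eye. As a stricter check, the two files can be compared byte for byte. This is a sketch under assumptions: same_module is a hypothetical helper name, and clientname stands for the actual client root directory.

```shell
# Report whether two copies of a kernel module are byte-for-byte identical.
# Prints "match" (exit status 0) or "differ" (exit status 1).
same_module() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}

# Example, run on the server:
#   same_module /kernel/fs/cachefs /export/root/clientname/kernel/fs/cachefs
```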

WARNING: Clock gained int days-- CHECK AND RESET THE DATE!

Cause

Each workstation contains an internal clock powered by a rechargeable battery. After the system is halted and turned off, the internal clock continues to keep time. When the system is powered on and reboots, the system notices that the internal clock has gained time since the workstation was halted.

Action

In most cases, especially if the power has been off for less than a month, the internal clock keeps the correct time, and you do not have to reset the date. Use the date(1) command to check the date and time on your system. If the date or time is wrong, become superuser and use the date(1) command to reset them.

Warning: Could not find matching rule in rules.ok

Cause

After an upgrade to the Solaris 2.5.1 release, jumpstart fails with this message:


Checking rules.ok file... 
Warning: Could not find matching rule in rules.ok

This message can occur even when the rules file is known to work, or appears correct on review, and the check script has been run.

Action

Remove the network rule keyword from the rules file and run the check script again. Jumpstart should then run without error.
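
To locate the rules that need editing, something like the following can list them. It is a sketch only: network_rules is a hypothetical helper name, and it assumes that network appears as a separate whitespace-delimited field on uncommented lines.

```shell
# List uncommented lines of a rules file ($1) that contain the keyword
# "network" as a separate field, prefixed with their line numbers.
network_rules() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            for (i = 1; i <= NF; i++)
                if ($i == "network") { print FNR ": " $0; break }
        }
    ' "$1"
}

# Example: network_rules rules
```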

WARNING: FAN FAILURE check if fans are still spinning

Cause

A SPARCcenter(TM) 2000/2000E might display either of two messages on the console at any time, with a record of the event in /var/adm/messages: WARNING: FAN FAILURE check if fans are still spinning, or WARNING: FAN FAILURE still sensed.

Action

The message is self-explanatory and suggests a hardware problem with the blower or fan assembly at the top rear of the system cabinet.

If the blower turns out to be spinning at a good rate, check whether the "AC Dist to Blower to Filter to Keyswitch Harness" plug/adapter is plugged in correctly. Two cable assemblies connect the blower assembly to the unit's power supply. One is the "power supply" cable and the other is the "AC Dist to Blower to Filter to Keyswitch Harness."

Once the harness is securely connected, you see another message, NOTICE: FAN RECOVERED, logged on the system's console screen, or, if missed, it is in /var/adm/messages.
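
If the console output was missed, the log can be scanned for the most recent fan event. The following is a minimal sketch: fan_state is a hypothetical helper name, and it relies only on the message texts quoted above appearing somewhere in each log line.

```shell
# Print the most recent fan state recorded in a messages file
# ($1, defaulting to /var/adm/messages): "failed", "recovered",
# or "no fan events".
fan_state() {
    awk '
        /FAN FAILURE/   { state = "failed" }
        /FAN RECOVERED/ { state = "recovered" }
        END { print (state == "" ? "no fan events" : state) }
    ' "${1:-/var/adm/messages}"
}

# Example: fan_state
```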

WARNING: FAN FAILURE still sensed

Refer to "WARNING: FAN FAILURE check if fans are still spinning".

WARNING: No network locking on string: contact admin to install server change

Cause

The mount(1M) command issues this message whenever it mounts a file system that does not have NFS locking, such as a standard SunOS 4.1 exported file system. Data loss is possible in applications that depend on locking.

Action

On the remote SunOS 4.1 system, install the appropriate rpc.lockd jumbo patch to implement NFS locking. For the SunOS 4.1.4 system, install patch #102264; for the SunOS 4.1.3 system, install patch #100075; for earlier 4.1 releases, install patch #101817.
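
The release-to-patch mapping above can be written as a small lookup. This is a sketch: lockd_patch_for is a hypothetical helper name, and treating 4.1, 4.1.1, and 4.1.2 as the "earlier 4.1 releases" is an assumption. Any release string not listed is reported as unknown and not guessed.

```shell
# Print the rpc.lockd jumbo patch ID for a SunOS 4.1.x release string ($1).
lockd_patch_for() {
    case "$1" in
        4.1.4)            echo 102264 ;;
        4.1.3)            echo 100075 ;;
        4.1|4.1.1|4.1.2)  echo 101817 ;;   # assumed meaning of "earlier 4.1 releases"
        *)                echo "unknown release: $1" >&2; return 1 ;;
    esac
}

# Example: lockd_patch_for 4.1.3
```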

WARNING: processor level 4 interrupt not serviced

Cause

This message is a diagnostic from the SCSI driver. It can appear on the console about every 10 minutes.

Action

To reduce the frequency of this message, add this line near the bottom of the /etc/system file and reboot:


set esp:esp_use_poll_loop=0
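
To avoid adding the setting twice, the edit can be guarded by a check. The following is a sketch: add_poll_loop_setting is a hypothetical helper name, and it takes the target file as an argument so that it can be tried on a copy of /etc/system first.

```shell
# Append the esp_use_poll_loop setting to the given file ($1) unless an
# identical line is already there. Prints "added" or "already present".
add_poll_loop_setting() {
    esp_file=$1
    esp_line='set esp:esp_use_poll_loop=0'
    if grep "^${esp_line}\$" "$esp_file" >/dev/null 2>&1; then
        echo "already present"
    else
        echo "$esp_line" >> "$esp_file"
        echo "added"
    fi
}

# Example (work on a copy, then install it and reboot):
#   cp /etc/system /tmp/system.new
#   add_poll_loop_setting /tmp/system.new
```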

Technical Notes

You might also see this message repeatedly after manually removing a CD when it was busy. Do not do this! To return the system to normal, reboot the system with the -r (reconfigure) option.

WARNING: /tmp: File system full, swap space limit exceeded

Cause

The system swap area (virtual memory) has filled up. You need to reduce swap space consumption by killing some processes or possibly by rebooting the system.

Action

For information about increasing swap space, refer to "Not enough space".
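
Before killing processes, it can help to see how full swap is. The following sketch computes a percentage from the summary line printed by swap -s. The helper name swap_pct_used is hypothetical, and the field positions assume the usual one-line layout ("total: ... allocated + ... reserved = ... used, ... available"); check the output on the actual system first.

```shell
# Read `swap -s` output on stdin and print the percentage of swap in use.
# Assumes field 9 is the "used" figure and field 11 the "available" figure,
# each with a trailing "k" that awk ignores in numeric conversion.
swap_pct_used() {
    awk '/^total:/ {
        used  = $9 + 0
        avail = $11 + 0
        if (used + avail > 0)
            print int(used * 100 / (used + avail))
    }'
}

# Example: swap -s | swap_pct_used
```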

WARNING: TOD clock not initialized-- CHECK AND RESET THE DATE!

Cause

This message indicates that the Time Of Day (TOD) clock reads zero, so its time is the beginning of the UNIX epoch: 00:00:00 UTC on 1 January 1970, which appears as 31 December 1969 in time zones west of UTC. On a brand-new system, the manufacturer might have neglected to initialize the system clock. On older systems it is more likely that the rechargeable battery has run out and requires replacement.

Action

First replace the battery according to the manufacturer's instructions. Then become superuser and use the date(1) command to set the time and date. On some systems the clock is powered by the same battery as the NVRAM, so a dead battery also causes loss of the machine's Ethernet address and host ID, which are more serious problems for networked systems.

WARNING: Unable to repair the / filesystem. Run fsck

Cause

This message comes at boot time from the /etc/rcS script whenever it gets a bad return code from fsck(1M) after checking a file system. The message recommends an fsck(1M) command line, and instructs you to exit the shell when done to continue booting. Then the script places the system in single-user mode so fsck(1M) can be run effectively.

Action

For information about repairing UFS file systems, refer to "/dev/rdsk/string: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.".

For information about repairing non-UFS file systems, refer to "THE FOLLOWING FILE SYSTEM(S) HAD AN UNEXPECTED INCONSISTENCY:".

WARNING: vxvm:vxio: Illegal vminor encountered

Cause

In this case, the message occurred during booting. The system shared an SSA1XX with an identical system, which booted without error messages. On the failing system, the user also received an error about disk group configuration copies during booting, and vxconfigd died. A vxprivutil scan of one of the disks showed the following:


diskid:  880409237.1043.system_that_comes_up 
hostid: none

Action

The user ran vxinstall on both systems: first on the system that did not boot successfully, and then on the system that did. The user had to run a custom vxinstall, selecting only the disks intended for each system.

Technical Notes


Note - The following attempt to resolve the problem failed.


vxiod set 10 
vxconfigd -m disable 
vxdctl init hostname 
vxdctl enable


Watchdog Reset

Cause

This fatal error usually indicates some kind of hardware problem. Data corruption on the system is possible.

Action

Look for some other message that might help diagnose the problem. By itself, a watchdog reset does not provide enough information; because traps are disabled, all information has been lost. If all that appears on the console is an ok prompt, issue the following PROM command to view the final messages that occurred just before system failure:


ok f8002010 wector p

Yes, that word is wector, not vector.

The result is a display of messages similar to those produced by the dmesg(1M) command. These messages can be useful in finding the cause of system failure.

Technical Notes

This message does not come from the kernel, but from the OpenBoot PROM monitor, a piece of Forth software that gives you the ok prompt before you boot UNIX. If the CPU detects a trap when traps are disabled (an unrecoverable error), it signals a watchdog. The OpenBoot PROM monitor detects the watchdog, issues this message, and shuts down the system.

Who are you?

Cause

Many networking programs can print this message, including from(1B), lpr(1B), lprm(1B), mailx(1), rdist(1), sendmail(1M), talk(1), and rsh(1). The command prints this message when it cannot locate a password file entry for the current user. This error might occur if a user logged in just before the superuser deleted that user's password entry, or if the network naming service fails for a user who has no entry in the local password file.

Action

If a user's password file entry was accidentally deleted, restore it from backups or from another password file. If a user's login name or user ID was changed, ask that user to log out and log in again. If the network naming service failed, check the NIS server(s) and repair or reboot as necessary.
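
A quick way to confirm whether the local password file still has an entry for a given user ID is sketched below. The helper name has_passwd_entry is hypothetical, it assumes an awk that supports -v (nawk on Solaris), and it checks only the local file, not the network naming service.

```shell
# Look up a numeric user ID ($1) in a passwd-format file
# ($2, defaulting to /etc/passwd). Prints the login name and returns 0
# if found; prints nothing and returns 1 if not.
has_passwd_entry() {
    awk -F: -v uid="$1" '
        $3 == uid { found = 1; print $1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/passwd}"
}

# Example: has_passwd_entry 1001 || echo "no local entry for UID 1001"
```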

Technical Notes

A known problem exists with starting hundreds of rsh(1) processes on another machine. This message appears because rsh(1) hangs while binding to a reserved port and responds too slowly to interact with the network naming service.

Window Underflow

Cause

This message often occurs at boot time, sometimes along with a Watchdog Reset error. It comes from the OpenBoot PROM monitor, which was passed a processor trap from the hardware. This error indicates that some program tried to access a register window that was not accessible from the processor.

Action

On some system architectures the problem could be that different capacity memory chips are mixed together. Someone might have placed 1-Mbyte SIMMs in the same bank with 4-Mbyte SIMMs. If this is so, rearrange the memory chips. Make sure to put higher-capacity SIMMs in the first bank(s), and lower-capacity SIMMs in the remaining bank(s); never mix different capacity SIMMs in the same bank.

The problem could also be that cache memory on the motherboard has gone bad and needs replacement. If main memory is installed correctly, try swapping the motherboard.
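
The SIMM placement rules above can be checked against a written-down layout before opening the cabinet. This is an illustrative sketch only: check_simm_layout is a hypothetical helper, and the input format (one line per bank, first bank first, listing each SIMM's capacity in Mbytes) is an assumption made for the example.

```shell
# Read a memory layout on stdin, one bank per line, and check that
# (1) no bank mixes capacities and (2) capacities never increase from
# one bank to the next. Prints "ok" or the problems found; returns 1
# if any rule is violated.
check_simm_layout() {
    awk '
        NF == 0 { next }                       # ignore blank lines
        {
            bank++
            for (i = 2; i <= NF; i++)
                if ($i + 0 != $1 + 0) {
                    print "bank " bank ": mixed capacities"
                    bad = 1
                    next
                }
            if (seen && $1 + 0 > prev) {
                print "bank " bank ": larger than an earlier bank"
                bad = 1
            }
            prev = $1 + 0
            seen = 1
        }
        END { if (!bad) print "ok"; exit (bad ? 1 : 0) }
    '
}

# Example: printf '4 4 4 4\n1 1 1 1\n' | check_simm_layout
```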

Technical Notes

The best way to isolate the problem is to look at the %pc register to see where it got its arguments, and why the arguments were bad. If you can reproduce the condition causing this message, your system vendor might be able to help diagnose the problem.