Solaris Common Messages and Troubleshooting Guide

"C"

Cannot access a needed shared library

Cause

The system is trying to exec(2) an a.out that requires a static shared library, and the static shared library does not exist or the user does not have permission to use it.

Technical Notes

The symbolic name for this error is ELIBACC, errno=83.

Cannot allocate colormap entry for "string"

Cause

This message from libXt (X Intrinsics library) indicates that the system color map was full, even before the color name specified in quotes was requested. Some applications can continue after this message. Other applications, such as workspace properties color, fail to come up when the color map is full.

Action

Exit the programs that make heavy use of the color map, then restart the failed application and try again.

Cannot assign requested address

Cause

An attempt was made to create a transport endpoint with an address not on the current machine.

Technical Notes

The symbolic name for this error is EADDRNOTAVAIL, errno=126.

Cannot bind to domain domainname: can't communicate with ypbind

Cause

While running the ypinit -m script for the setup of an NIS Master Server, you get this error message.

Action

You could be using the wrong nsswitch template for /etc/nsswitch.conf. During setup, you should be using /etc/nsswitch.files as the name services switch template. After setup is complete, you would then want to use /etc/nsswitch.nis. Do the following to verify that you are using nsswitch.files:


# head /etc/nsswitch.conf
#
# /etc/nsswitch.files:

If you are not using nsswitch.files, copy it over as shown below:


# cp /etc/nsswitch.files /etc/nsswitch.conf

Run the ypinit -m script again.
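
The template check can also be scripted. The following is a minimal sketch, assuming the file still carries the header comment of the template it was copied from; the function name uses_files_template is hypothetical and not part of Solaris.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the header comment of the given switch
# file names the nsswitch.files template.
uses_files_template() {
    conf=${1:-/etc/nsswitch.conf}
    # head(1) prints the first ten lines by default, which covers the header.
    head "$conf" 2>/dev/null | grep 'nsswitch\.files' >/dev/null
}

# Example: check before running ypinit -m.
#   uses_files_template || echo "nsswitch.conf is not based on nsswitch.files"
```

A file whose header comment was edited or removed would be reported as not matching, so treat a negative result as a prompt to inspect the file by hand.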

Cannot boot after install, error that points to an .rc file

Cause

The user completes the installation of the Solaris 2.6 IA software. Upon reboot, the user gets an error referencing an .rc file (example: 11045.rc). This file has probably been deleted or placed in a different directory. As the Solaris software looks for this file during the bootup sequence and cannot find it, the system hangs, because it cannot complete the boot process.

Action

During the installation process, there is an option to save the configuration assistant choices to a file. The error is pointing to the saved configuration file. The user was never supposed to have the option to save these choices to a file. Users should exit the setup after making their choices. If the users do save these choices to a file and if this file gets deleted or moved, the system hangs during the boot process. To solve this problem, the user boots in single user mode. From the # prompt, the user should do the following:

  1. cd /platform/i86pc/boot/solaris/machines

  2. Delete all files in this directory.

  3. Reboot the system.

This corrects the problem and allows the Solaris software to complete loading.

cannot change passwd, not correct passwd

Cause

While running yppasswd(1) to change a user's password, the system responded with this message: cannot change passwd, not correct passwd.

The server console also displayed yppasswd user string does not exist, even though running ypcat passwd | grep user returned the user name. It was verified that yppasswdd(1M) was running.

Action

Check the passwd(4) file with pwck(1M) and verify that yppasswdd(1M) is running on the right server. Then verify where the passwd(4) file is located and, if the location has been changed, check that the yppasswdd(1M) process line includes it. For example, if the passwd file is located in /etc/yp, the process line should read /usr/lib/yp/rpc.yppasswdd -D /etc/yp. The -D option, followed by the directory that holds the passwd files, tells yppasswdd(1M) where to update and verify password changes.

cannot establish nfs service over /dev/tcp: transport setup problem

Cause

During boot strap of a SunOS 2.5.2 system, nfsd(1M) displays the following:


netdir_getbyname (transport tcp, host/serv \1/nfs), No such file or directory
Cannot establish NFS service over /dev/tcp: transport setup problem.

The NIS maps have been populated from older systems, and the nfs/tcp entry of the services map is missing. (The user is running NIS+, but this problem can also occur with NIS.)

Action

Either put a files entry before the nis or nisplus in the services line of the /etc/nsswitch.conf file, or, better, merge the changes to the services file into the services map.

It is a good idea to always merge in the new entries to /etc/services, /etc/inet/protocols, and /etc/rpc into their respective maps whenever a new OS is installed.
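
To confirm whether the entry is missing, the services data can be filtered for an nfs line on the tcp transport. This is a minimal sketch, assuming a POSIX awk (nawk on older Solaris releases); the function name has_nfs_tcp is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read services-format lines on standard input and
# succeed if an nfs entry for the tcp transport is present.
has_nfs_tcp() {
    # Field 1 is the service name, field 2 is port/protocol.
    awk '$1 == "nfs" && $2 ~ /\/tcp$/ { found = 1 } END { exit !found }'
}

# Example on an NIS client:
#   ypcat services | has_nfs_tcp || echo "nfs/tcp missing from services map"
# Example against the local file:
#   has_nfs_tcp < /etc/services || echo "nfs/tcp missing from /etc/services"
```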

Cannot exec a shared library directly

Cause

The system is attempting to exec(2) a shared library directly.

Technical Notes

The symbolic name for this error is ELIBEXEC, errno=87.

Cannot find SERVER hostname in network database

Cause

The user is on a different subnet from the license server and is running permanent licenses. For example:


ultra1(50)% cc -o hello hello.c
License Error : Cannot find the license server (fry)
in the network database for product(Sun WorkShop Compiler C)
Cannot find SERVER hostname in network database (-14,7)
cc: acomp failed for hello.c
ultra1(51)%

Action

Check the following:

  1. Make sure that the server is up and running.

  2. Make sure that the server is in the /etc/hosts file of the client system by typing: ping servername.

  3. Make sure the license daemon on the server is running.

  4. Make sure there is an elementary license file on the client:


    cd /etc/opt/licenses
    more sunpro.loc

  5. Make sure there are only text license files, such as sunpro.lic.1, in the sunpro.loc directory.

  6. For the client check, see below:


     % cd /etc
     % more nsswitch.conf | grep hosts
     hosts:      nis [NOTFOUND=return] files
    This means that the client consults the NIS server first to look up the IP address. If nis is listed first and the /etc/hosts file lists the server by name, change the line to:

    hosts:      files nis 
    Then, see if it can be found. If not, try truss and snoop to see what is happening.
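
The lookup order can also be read programmatically. The sketch below prints the first source on the hosts: line; it assumes the usual "database: source ..." layout of the switch file, and the function name first_hosts_source is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first source listed on the hosts: line
# of a name service switch file.
first_hosts_source() {
    conf=${1:-/etc/nsswitch.conf}
    # $1 is the database name ("hosts:"), $2 is the first source.
    awk '$1 == "hosts:" { print $2; exit }' "$conf"
}

# Example: warn when NIS is consulted before the local /etc/hosts file.
#   [ "$(first_hosts_source)" = "files" ] || echo "hosts: does not start with files"
```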

cannot install bootblock

Cause

In this case, the user installs the Solaris IA software on the Intel platform and the install seems fine. When the system is rebooted after the installation, the user receives the above error message at startup. At this point, the user cannot gain access to the system.

Action

This error occurs when you use the fdisk utility in the Solaris operating environment, do a newfs, and then do a restore, but forget to install the boot block. When you do a newfs and then a restore operation, you need to perform an installboot before installing the OS. Otherwise, you get the above error. There is no guarantee that it will work, but you can try the installboot procedure after booting into single-user mode from the CD-ROM.

To install the UFS boot block and the partition boot program on slice 2 of target 0 on controller 1 of the platform where the command is being run, use the following:


# installboot /usr/platform/`uname -i`/lib/fs/ufs/pboot \
/usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2

Cannot open FCC file

Cause

This message is displayed when trying to send mail with Netscape. Netscape is trying to save the outbound message to a file that the user has specified, but the file does not exist.

Action

To correct this problem, go to Options, then Mail and News Preferences, then Compose. A template pops up. One section specifies where to save outgoing mail and news files. Make sure that these files exist, or remove them from the template if you do not care about logging which messages are sent through Netscape.

Cannot send after transport endpoint shutdown

Cause

A request to send data was disallowed, because the transport endpoint has already been shut down.

Technical Notes

The symbolic name for this error is ESHUTDOWN, errno=143.

can't communicate with ypbind

Cause

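The ypcat passwd command returns the error message can't communicate with ypbind, even though ypbind is running. The ypbind.pid file is readable only by root:


ls -l /var/yp/binding/ypbind.pid
-r--------   1 root     root           3 Dec  1 07:40 ypbind.pid

The umask for root is set to 077, so the file was created without read permission for group and others.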

Action

Set umask for root back to 022. /var/yp/binding/ypbind.pid must be readable by owner, group, and others.

Refer to the following example:


ls -l /var/yp/binding/ypbind.pid   
-r--r--r--   1 root     root           3 Dec  1 07:40 ypbind.pid
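
The same check can be made from a script by inspecting the mode string that ls -l prints. This is a minimal sketch; the function name is_world_readable is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file's mode string shows read
# permission for owner, group, and others.
is_world_readable() {
    # ls -l prints the mode in the first field, for example -r--r--r--.
    mode=$(ls -l "$1" 2>/dev/null | awk '{ print $1; exit }')
    case $mode in
        ?r??r??r?*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example:
#   is_world_readable /var/yp/binding/ypbind.pid || echo "check the umask for root"
```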

Can't create public message device (Device busy)

Cause

This message comes from the lp(1) print scheduler, indicating that it is either extremely busy or hanging.

Action

If print jobs are coming out of the printer in question, wait until they are finished and then resubmit this print job. If you see this message again, the lp(1) system is probably hanging.

For a procedure to clear the queue, refer to "lp hang".

Technical Notes

If lp(1) is unable to create a device for printer messages, the message FIFO could already be in use or could be locked by another print job.

See Also

For more information on the print scheduler, see the section on administrating printers in the System Administration Guide, Volume 2.

Can't invoke /etc/init, error int

Cause

This message can appear while a system is booting, indicating that the init(1M) program is missing or corrupted. Note that /etc/init is a symbolic link to /sbin/init.

Action

Boot the mini-root so that you can replace init(1M):

  1. Halt the machine by typing Stop-A or by pressing the reset button.

  2. Reboot as a single user from the CD-ROM, the net, or a diskette. For example, type boot cdrom -s at the ok prompt to boot from a CD-ROM.

  3. After the system comes up and gives you a # prompt, mount the device corresponding to the original root (/) partition somewhere, with a mount(1M) command similar to the one shown below:


    # mount /dev/dsk/c0t3d0s0 /mnt

  4. Copy the init(1M) program from the mini-root to the original root (/) partition:


    # cp /sbin/init /mnt/sbin/init

  5. Reboot the system:


    # reboot

If this does not work, other files might be corrupted, and you might need to reinstall the entire system.

Technical Notes

The error number is 2 if /sbin/init is missing, or 8 if /sbin/init has an incorrect executable format. This message is usually followed by a panic: icode message. The system tries to reboot itself, but goes into a loop, because rebooting is impossible without init(1M).

See Also

For more information on booting the system, see the section on halting and booting the system in the System Administration Guide, Volume 1.

can't open /dev/rdsk/string: (null): UNEXPECTED INCONSISTENCY

Cause

In the SunOS 4.1.x release, this message indicated that the device containing the /dev file system has become disconnected.

A particular response from the Solaris operating environment has not been defined.

can't synchronize with hayes

Cause

This message sometimes appears when using a modem that the system regards as a "Hayes" type modem, which includes most modems manufactured today. The message can be caused by incorrect switch settings, by poor cable connections, or by not turning the modem on.

Action

Check that the modem is on and that the cables between the modem and your system are securely connected. Check the internal and external modem switch settings. If necessary, turn the modem off and then on again.

cd: Too many arguments

Cause

The C shell's cd(1) command takes only one argument. Either more than one directory was specified, or a directory name containing a space was specified. Directory names with spaces are easy to create with File Manager.

Action

Use only one directory name. To change to a directory whose name contains spaces, enclose the directory name in double (") or single (') quotes, or use File Manager.
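
As an illustration of the quoting, the sketch below creates a scratch directory whose name contains a space and changes into it. It is written for the Bourne shell, but the same double quotes work in the C shell; the directory name "My Folder" is made up for the example.

```shell
#!/bin/sh
# Create a scratch directory whose name contains a space.
# The name "My Folder" is only an illustration.
scratch=$(mktemp -d)
mkdir "$scratch/My Folder"

# Quoting makes the whole name a single argument to cd.
# The subshell keeps the caller's working directory unchanged.
(cd "$scratch/My Folder" && pwd)
```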

Channel number out of range

Cause

The system has run out of stream devices. This error results when a stream head attempts to open a minor device that does not exist or is currently in use.

Action

Check that the stream device in question exists and was created with an appropriate number of minor devices. Make sure that the hardware corresponds to this configuration. If the stream device configuration is correct, try again later when more system resources might be available.

Technical Notes

The symbolic name for this error is ECHRNG, errno=37.

chmod: ERROR: invalid mode

Cause

This message from the chmod(1) command indicates a problem in the first non-option argument.

Action

If you are specifying a numeric file mode, you can provide any number of digits (although only the final one-to-four are considered), but all digits must be between 0 and 7. If you are specifying a symbolic file mode, use the syntax provided in the chmod(1) usage message to avoid the "invalid mode" error message: Usage: chmod [ugoa][+-=][rwxlstugo] file ...

Some combinations of symbolic key letters produce no error message, but fail to have any effect. The first group, [ugoa], is truly optional. The second group, [+-=], is mandatory for chmod(1) to have an effect. The third group, [rwxlstugo], is also mandatory for effect and can be used in combination when that combination does not conflict.
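
For comparison, here is one numeric and one symbolic mode applied to a scratch file. This is only a sketch of valid usage; the file is a temporary one created for the example.

```shell
#!/bin/sh
# Demonstrate one numeric and one symbolic mode on a scratch file.
f=$(mktemp)

chmod 644 "$f"          # numeric: rw- for owner, r-- for group and others
ls -l "$f" | cut -c1-10 # prints -rw-r--r--

chmod go-r,u+x "$f"     # symbolic: who, operator, permission
ls -l "$f" | cut -c1-10 # prints -rwx------
```

A numeric mode containing a digit outside 0 through 7, such as 888, is the kind of argument that produces the invalid mode message.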

Command not found

Cause

The C shell could not find the program you gave as a command.

Action

Check the form and spelling of the command line. If that looks correct, use echo $path to see if the user's search path is correct. When communications are garbled, it is possible to unset a search path to such an extent that only built-in shell commands are available. Below is a command to reset a basic search path:


 % set path = (/usr/bin /usr/ccs/bin /usr/openwin/bin .)

If the search path looks correct, check the directory contents along the search path to see if programs are missing or if directories are not mounted.
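
Checking each directory along the search path can be automated. The sketch below is written for the Bourne shell rather than the C shell and takes a colon-separated list such as $PATH; the function name missing_path_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every entry of a colon-separated search path
# that is not an existing directory (for example, an unmounted one).
missing_path_dirs() {
    # Split the list in a subshell so IFS is not changed for the caller.
    (
        IFS=:
        for d in $1; do
            [ -n "$d" ] || continue   # skip empty entries
            [ -d "$d" ] || echo "$d"
        done
    )
}

# Example:
#   missing_path_dirs "$PATH"
```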

See Also

For more information about the C shell, see csh(1).

Communication error on send

Cause

This error occurs when the current process is waiting for a message from a remote machine, but the link connecting the machines breaks.

Technical Notes

The symbolic name for this error is ECOMM, errno=70.

config error: mail loops back to myself.

Cause

The user sees the following when sending mail:


# dle@g3... Connecting to g3.xyz.edu. (ether)... 
220 xyz.edu Sendmail SMI-8.6/SMI-SVR4 ready at Wed, 7 Jan 1998 14:28:20 -0600 
>>> HELO xyz.edu 
250 xyz.edu Hello g1.xyz.edu [129.106.16.1], pleased to meet you 
xyz.edu config error: mail loops back to myself 
>>> QUIT 
221 g1.xyz.edu closing connection 
dle@g3... Local configuration error 
Saving message in /dead.letter 
/dead.letter... Sent

The sending system (see the HELO line) and the receiving system (see the line beginning with 220) both think they are known as "xyz.edu."

Action

Edit the sendmail.cf file as follows:

  1. Type the official host name.

  2. For the domain, you have choices: If you want the gateway machine to identify itself as the domain, use Dj$m; if you want the gateway machine to appear to be inside the domain, use Dj$w.$m; and if you are using sendmail.mx (or have a fully-qualified host name), use Dj$w.

  3. Uncomment Dj$w.$m and comment out Dj$m. This gives each system a unique name. $w is the system host name, and $m is the domain.

connect from hostIP to callit(ypserv): request from non-local host

Refer to "connect from hostIP to callit(ypserv): request from unauthorized host".

connect from hostIP to callit(ypserv): request from unauthorized host

Cause

An example of a message from SunOS:


Jan  5 14:45:37 host1 portmap[86]: connect from 158.175.36.135 to 
callit(ypserv): request from unauthorized host
Other possibilities for the end portion of the error message include request from non-local host, request from unprivileged port, and request not forwarded.

In the Solaris operating environment, the error might look similar to the following:


Jan  5 14:45:37 host1 rpcbind[86]: refused connect from 158.175.36.135 
to callit(ypserv)

In all cases, the ypserv part of the message might actually be any RPC service, such as mount or nfs or status.

Action

The user has a replacement portmap or rpcbind. The version is enhanced to add access controls, and the error in question is reporting an access violation. The replacements are third-party and are not supported by Sun. The user must locate the access control configuration files and change them to the desired access controls.

connect from hostIP to callit(ypserv): request from unprivileged port

Refer to "connect from hostIP to callit(ypserv): request from unauthorized host".

connect from hostIP to callit(ypserv): request not forwarded

Refer to "connect from hostIP to callit(ypserv): request from unauthorized host".

Connection closed.

Cause

When using rlogin(1), this message can appear when the connection to the other system is lost, for example because the other system has gone down.

Data loss is possible if files were modified and not saved before the connection closed.

Action

Try again. If the other system has gone down, wait for it to reboot first.

Connection closed by foreign host.

Cause

When a user connects to another system with telnet(1), this message can appear when the connection is lost, for example because the other system has gone down.

Data loss is possible if files were modified and not saved before the connection closed.

Action

Try again. If the other system has gone down, wait for it to reboot first.

[Connection closed. Exiting]

Cause

After using the talk(1) command to communicate with another user, the other person enters an interrupt (usually Control-C), and this message appears on your screen.

Action

Sending an interrupt is the usual way of exiting the talk program. The talk(1) session is over, and you can return to your work.

Connection refused

Cause

No connection could be made because the target machine actively refused it. This happens either when trying to connect to an inactive service or when a service process is not present at the requested address.

Action

Activate the service on the target machine, or start it up again if it has disappeared. If, for security reasons, you do not intend to provide this service, inform the user community, possibly suggesting an alternative.

Technical Notes

The symbolic name for this error is ECONNREFUSED, errno=146.

Connection reset by peer

Cause

A connection was forcibly closed by a peer. This is normally due to a remote host connection loss from a timeout or a reboot.

Technical Notes

The symbolic name for this error is ECONNRESET, errno=131.

Connection timed out

Cause

This error occurs either when the destination host is down or when problems in the network cause a loss in transmission.

Action

Do the following:

  1. Check the operation of the host system, for example by using ping(1M) and ftp(1).

  2. Repair or reboot as necessary.

  3. If the above does not solve the problem, check the network cabling and connections.

Technical Notes

No connection was established in a specified time. A connect or send request failed because the destination host did not properly respond after a reasonable interval. (The time-out period is dependent on the communication protocol.)

The symbolic name for this error is ETIMEDOUT, errno=145.

console login: ^J^M^Q^K^K^P

Cause

This error usually occurs because OpenWindows exited abnormally, leaving the system's keyboard in the wrong mode. The characters that appear when someone attempts to log in are garbage transliterations of what someone typed.

Action

If you are on a SPARC system, do the following:

  1. From another machine, log in remotely to this system.

  2. Run the following command:


    $ /usr/openwin/bin/kbd_mode -a

This puts the console back into ASCII mode.


Note -

kbd_mode is not a windows program; it fixes the console mode.


If you are on an IA system, do the following:

  1. Log in remotely.

  2. Kill the X server or reboot the system.

Technical Notes

The usual reason for this problem occurring is an automated script run from cron(1M) that clears the /tmp directory periodically. Ensure that any such scripts do not remove the /tmp/.X11-pipe or /tmp/.X11-unix directories, or any files in them.
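
A cleanup script can skip these directories by pruning them from the find(1) search. The following is a minimal sketch, not the script of any particular site; the function name clean_tmp and the age threshold are assumptions.

```shell
#!/bin/sh
# Hypothetical cleanup: delete regular files older than a given number of
# days under a directory, but never descend into .X11-pipe or .X11-unix.
clean_tmp() {
    dir=$1
    days=$2
    find "$dir" \( -name .X11-pipe -o -name .X11-unix \) -prune \
        -o -type f -mtime +"$days" -exec rm -f {} \;
}

# Example, as it might be called from a cron job:
#   clean_tmp /tmp 7
```

Because the two directories are pruned before any other test is evaluated, nothing inside them is ever passed to rm.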

core dumped

Cause

A core(4) file contains an image of memory at the time of software failure and is used by programmers to find the reason for the failure.

Action

To see which program produced a core(4) file, run either the file(1) command or the adb(1) command. The following examples show the output of the file(1) and adb(1) commands on a core file from the dtmail program.


$ file core
core: ELF 32-bit MSB core file SPARC Version 1, from `dtmail'

$ adb core
core file = core -- program `dtmail'
SIGSEGV  11: segmentation violation
^D      (use Control-d to quit the program)

Ask the vendor or author of this program for a debugged version.

Technical Notes

Some signals, such as SIGQUIT, SIGBUS, and SIGSEGV, produce a core dump. See the signal(5) man page for a complete list.

If you have the source code for the program, you can try compiling it with cc -g, and debugging it yourself using dbx or a similar debugger. The where directive of dbx provides a stack trace.

On mixed networks, it can be difficult to discern which machine architecture produced a particular core dump, since adb(1) on one type of system generally cannot read a core(4) file from another type of system and can produce an unrecognized file message. Run adb(1) on various machine architectures until you find the right one.

See Also

For information on saving and viewing crash information, see the System Administration Guide, Volume 2. If you are using AnswerBook online documentation, "system crash" is a good search string.

corrupt label - wrong magic number or corrupt label - label checksum failed

Cause

After a power cycle, the machine displays one of the following error messages: corrupt label - wrong magic number, or corrupt label - label checksum failed.

format(1M) displayed the following:


  0 unassigned    wm       0               0         (0/0/0)          0
  1 unassigned    wm       0               0         (0/0/0)          0
  2     backup    wm       0 - 5460        4.2G    (5460/0/0)   4154160
  3 unassigned    wm       0               0         (0/0/0)          0
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0 - 2730        2.1G      (0/0/0)          0
  7 unassigned    wm       2730 - 5460     2.1G      (0/0/0)          0

The disks were using raw partitions beginning at block 0 (cylinder 0). The disk label (VTOC) is kept in block 0 of cylinder 0. If a raw partition begins at cylinder 0, database programs that use the raw partition eventually overwrite the label. (UNIX® file systems avoid this area of the partition.)

Action

As a workaround, do the following:

  1. Go into format(1M) and get the backup label using the backup command.

  2. Relabel the disk using this backup label. You should then be able to access the disk.

  3. Back up the data on this disk.

  4. Go back to the disk and relabel it, starting the raw partition at cylinder 1. (This loses one cylinder, but prevents corrupting the VTOC.)

  5. Label again.

  6. Restore the data from your backup.

could not grant slave pty

Cause

The user gets the error message could not grant slave pty when attempting a telnet(1), rlogin(1), or rsh(1) session (anything that requires a shell) or when trying to bring up an xterm.

Action

The ownership of /usr/lib/pt_chmod was set incorrectly. The user had:


# ls -la /usr/lib/pt_chmod
---s--x--x   1 bin     bin         3120 May  3  1996

The ownership and permissions should be:


# ls -la /usr/lib/pt_chmod
---s--x--x   1 root     bin         3120 May  3  1996


Note -

The owner should be root; the user had bin as the owner. Also, the setuid bit must be set.


Running chown root pt_chmod corrected the problem.
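
Both conditions from the note (owner root, setuid bit set) can be verified in one step by parsing the ls -l listing. This is a minimal sketch, assuming a POSIX awk (nawk on older Solaris releases); the function name is_setuid_root is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file is owned by root and has the
# setuid bit set, judging by its ls -l listing.
is_setuid_root() {
    # Field 1 is the mode string; its fourth character is s or S when the
    # setuid bit is set. Field 3 is the owner.
    ls -l "$1" 2>/dev/null | awk '
        { ok = ($3 == "root" && substr($1, 4, 1) ~ /[sS]/); exit }
        END { exit !ok }'
}

# Example:
#   is_setuid_root /usr/lib/pt_chmod || echo "pt_chmod needs owner root and setuid"
```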

Could not initialize tooltalk (tt_open): TT_ERR_NOMP

Cause

Various desktop tools display or print this message when the ttsession(1) process is not available. The ToolTalk service generally tries to restart ttsession(1), if it is not running. Thus, this error indicates that the ToolTalk service is either not installed or is not installed correctly.

Action

Verify that the ttsession(1) command exists in /usr/openwin/bin or /usr/dt/bin. If this command is not present, ToolTalk is not installed correctly. The packages constituting ToolTalk are the runtime SUNWtltk, developer support SUNWtltkd, and the manual pages SUNWtltkm. CDE ToolTalk packages have the same names with ".2" appended.

Technical Notes

The full TT_ERR_NOMP message string reads as follows: "No ttsession(1) is running, probably because tt_open(3) has not been called yet. If this is returned from tt_open(3), it means ttsession(1) could not be started, which generally means ToolTalk is not installed on the system."

Could not open ToolTalk Channel

Cause

This error message is displayed while attempting to remotely run workshop.

Action

Do the following:

  1. Make sure workshop is no longer running.

  2. In the telnet/rlogin session window, type: /bin/ps -ef | grep ttsession. If one is running in the system that belongs to the telnet user, type kill pid_of_ttsession.

  3. In the telnet/rlogin session window, type /usr/dt/bin/ttsession -s -d machine_telnetting_from:0.0.

  4. Start workshop.

Could not start new viewer

Cause

This message appears in the AnswerBook navigator window, along with an XView error message on the console.

Action

For details, refer to "answerbook: XView error: NULL pointer passed to xv_set".

Could not start NFS service for any protocol. Exiting

Cause

The following errors occur at boot time:


/usr/lib/nfs/nfsd[478]: t_bind to wrong address
/usr/lib/nfs/nfsd[478]: t_bind to wrong address
/usr/lib/nfs/nfsd[478]: Cannot establish NFS service over /dev/udp: transport setup problem.
/usr/lib/nfs/nfsd[478]: Cannot establish NFS service over /dev/udp: transport setup problem.
/usr/lib/nfs/nfsd[478]: t_bind to wrong address
/usr/lib/nfs/nfsd[478]: t_bind to wrong address
/usr/lib/nfs/nfsd[478]: Cannot establish NFS service over /dev/tcp: transport setup problem.
/usr/lib/nfs/nfsd[478]: Cannot establish NFS service over /dev/tcp: transport setup problem.
/usr/lib/nfs/nfsd[478]: Could not start NFS service for any protocol. Exiting.
/usr/lib/nfs/nfsd[478]: Could not start NFS service for any protocol. Exiting.

In this situation, a backup copy of the S15nfs.server script in /etc/rc3.d was made and named S15nfs.server.BAK. Because the name of the backup copy starts with an uppercase "S," it was also executed at boot time. The errors occurred when a second nfsd was started.

Action

If a backup copy of any startup script is made, it should be renamed with a lower case "s," so as not to be executed at boot.
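
Stray backup copies of this kind can be found by listing the S-prefixed names that end in a typical backup suffix. The sketch below is one way to do that; the function name list_stray_start_scripts and the list of suffixes are assumptions, so extend the patterns to match local naming habits.

```shell
#!/bin/sh
# Hypothetical check: list start scripts in a run-control directory whose
# names look like backup copies but still begin with an uppercase S.
list_stray_start_scripts() {
    dir=${1:-/etc/rc3.d}
    for f in "$dir"/S*; do
        case $f in
            *.BAK|*.bak|*.orig|*.old|*~) echo "$f" ;;
        esac
    done
}

# Example:
#   list_stray_start_scripts /etc/rc3.d
```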

cpio: Bad magic number/header.

Cause

A cpio(1) archive has either become corrupted or was written out with an incompatible version of cpio(1).

Action

Use the -k option to cpio(1) to skip I/O errors and corrupted file headers. This might permit you to extract other files from the cpio(1) archive. To extract files with corrupted headers, try editing the archive with a binary editor such as emacs(1). Each cpio(1) file header contains a filename as a string.

See Also

For more information on magic numbers, see magic(4).

cpio : can't read input : end of file encountered prior to expected end of archive.

Cause

This message appears when trying to read a multi-volume floppy in bar format using the following command:


  # cpio -id -H bar -I /dev/diskette0

Action

Stop /usr/sbin/vold by running /etc/init.d/volmgt stop, and then use the device name /dev/rfd0.

Cross-device link

Cause

An attempt was made to make a hard link to a file on another device, such as on another file system.

Action

Establish a symbolic link using ln -s instead. Symbolic links are permitted across file system boundaries.
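
A script that must work whether or not the two paths share a file system can try the hard link first and fall back to a symbolic link. This is a minimal sketch; the function name link_or_symlink is hypothetical, and the target should be given as an absolute path so that the symbolic link resolves from any directory.

```shell
#!/bin/sh
# Hypothetical helper: try a hard link first and fall back to a symbolic
# link if ln fails (for example, with "Cross-device link").
link_or_symlink() {
    target=$1
    name=$2
    ln "$target" "$name" 2>/dev/null || ln -s "$target" "$name"
}

# Example:
#   link_or_symlink /export/data/report.txt /home/user/report.txt
```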

Technical Notes

The symbolic name for this error is EXDEV, errno=18.