Solaris Naming Administration Guide

NIS Problems and Solutions

This section explains how to resolve problems encountered on networks running NIS. It covers problems seen on an NIS client and those seen on an NIS server.

Before trying to debug an NIS server or client, review Chapter 18, Network Information Service (NIS), which explains the NIS environment. Then look for the subheading in this section that best describes your problem.


Common symptoms of NIS binding problems include:

NIS Problems Affecting One Client

If only one or two clients are experiencing symptoms that indicate NIS binding difficulty, the problems probably are on those clients. If many NIS clients are failing to bind properly, the problem probably exists on one or more of the NIS servers (see "NIS Problems Affecting Many Clients").

ypbind Not Running on Client

One client has problems, but other clients on the same subnet are operating normally. On the problem client, run ls -l on a directory, such as /usr, that contains files owned by many users, including some not in the client /etc/passwd file. If the resulting display lists file owners who are not in the local /etc/passwd as numbers, rather than names, this indicates that NIS service is not working on the client.

These symptoms usually mean that the client ypbind process is not running. Run ps -e and check for ypbind. If you do not find it, log in as superuser and start ypbind by typing:

client# /usr/lib/netsvc/yp/ypstart

Missing or Incorrect Domain Name

One client has problems, the other clients are operating normally, but ypbind is running on the problem client. The client may have an incorrectly set domain.

On the client, run the domainname command to see which domain name is set.

Client#7 domainname

Compare the output with the actual domain name in /var/yp on the NIS master server. The actual NIS domain is shown as a subdirectory in the /var/yp directory.

Client#7 ls /var/yp...
-rwxr-xr-x 1 root Makefile
drwxr-xr-x 2 root binding
drwx------ 2 root

If the domain name returned by running domainname on a machine is not the same as the server domain name listed as a directory in /var/yp, the domain name specified in the machine's /etc/defaultdomain file is incorrect. Log in as superuser and correct the client's domain name in the machine's /etc/defaultdomain file. This assures that the domain name is correct every time the machine boots. Now reboot the machine.

Note -

The domain name is case-sensitive.

Client Not Bound to Server

If your domain name is set correctly, ypbind is running, and commands still hang, then make sure that the client is bound to a server by running the ypwhich command. If you have just started ypbind, then run ypwhich several times (typically, the first one reports that the domain is not bound and the second succeeds normally).

No Server Available

If your domain name is set correctly, ypbind is running, and you get messages indicating that the client cannot communicate with a server, this may indicate a number of different problems:

Note -

For reasons of security and administrative control it is preferable to specify the servers a client is to bind to in the client's ypservers file rather than have the client search for servers through broadcasting. Broadcasting ties up the network, slows the client, and prevents you from balancing server load by listing different servers for different clients.

ypwhich Displays Are Inconsistent

When you use ypwhich several times on the same client, the resulting display varies because the NIS server changes. This is normal. The binding of the NIS client to the NIS server changes over time when the network or the NIS servers are busy. Whenever possible, the network stabilizes at a point where all clients get acceptable response time from the NIS servers. As long as your client machine gets NIS service, it does not matter where the service comes from. For example, an NIS server machine may get its own NIS services from another NIS server on the network.

When Server Binding is Not Possible

In extreme cases where local server binding is not possible, use of the ypset command may temporarily allow binding to another server, if available, on another network or subnet. However, in order to use the -ypset option, ypbind must be started with either the -ypset or -ypsetme options.

Note -

For security reasons, the use of the -ypset and -ypsetme options should be limited to debugging purposes under controlled circumstances. Use of the -ypset and -ypsetme options can result in serious security breaches because while they are operative anyone can then alter server bindings causing trouble for others and permitting unauthorized access to sensitive data. If you must start ypbind with these options, once you have fixed the problem you should kill ypbind and restart it again without those options.

ypbind Crashes

If ypbind crashes almost immediately each time it is started, look for a problem in some other part of the system. Check for the presence of the rpcbind daemon by typing:

% ps -ef | grep rpcbind

If rpcbind is not present or does not stay up or behaves strangely, consult your RPC documentation.

You may be able to communicate with rpcbind on the problematic client from a machine operating normally. From the functioning machine, type:

% rpcinfo client

If rpcbind on the problematic machine is fine, rpcinfo produces the following output:

program	version	netid	address	service	owner
100007	2	udp	ypbind	superuser
100007	1	udp	ypbind	superuser
100007	1	tcp	ypbind	superuser
100007	2	tcp	ypbind	superuser
100007	2	ticotsord	\000\000\020H	ypbind	superuser
100007	2	ticots	\000\000\020K	ypbind	superuser

Your machine will have different addresses. If they are not displayed, ypbind has been unable to register its services. Reboot the machine and run rpcinfo again. If the ypbind processes are there and they change each time you try to restart /usr/lib/netsvc/yp/ypbind, reboot the system, even if the rpcbind daemon is running.

NIS Problems Affecting Many Clients

If only one or two clients are experiencing symptoms that indicate NIS binding difficulty, the problems probably are on those clients (see "NIS Problems Affecting One Client"). If many NIS clients are failing to bind properly, the problem probably exists on one or more of the NIS servers.

Network or Servers are Overloaded

NIS can hang if the network or NIS servers are so overloaded that ypserv cannot get a response back to the client ypbind process within the time-out period.

Under these circumstances, every client on the network experiences the same or similar problems. In most cases, the condition is temporary. The messages usually go away when the NIS server reboots and restarts ypserv, or when the load on the NIS servers or network itself decreases.

Server Malfunction

Make sure the servers are up and running. If you are not physically near the servers, use the ping command.

NIS Daemons Not Running

If the servers are up and running, try to find a client machine behaving normally, and run the ypwhich command. If ypwhich does not respond, kill it. Then log in as root on the NIS server and check if the NIS ypbind process is running by entering:

# ps -e | grep yp

Note -

Do not use the -f option with ps because this option attempts to translate user IDs to names which causes more name service lookups that may not succeed.

If either the ypbind or ypserv daemons are not running, kill them and then restart them by entering:

# /usr/lib/netsvc/yp/ypstop
# /usr/lib/netsvc/yp/ypstart

If both the ypserv and ypbind processes are running on the NIS server, type:

# ypwhich

If ypwhich does not respond, ypserv has probably hung and should be restarted. While logged in as root on the server, kill ypserv and restart it by typing:

# /usr/lib/netsvc/yp/ypstop
# /usr/lib/netsvc/yp/ypstart

Servers Have Different Versions of an NIS Map

Because NIS propagates maps among servers, occasionally you may find different versions of the same map on various NIS servers on the network. This version discrepancy is normal add acceptable if the differences do not last for more than a short time.

The most common cause of map discrepancy is that something is preventing normal map propagation. For example, an NIS server or router between NIS servers is down. When all NIS servers and the routers between them are running, ypxfr should succeed.

If the servers and routers are functioning properly, check the following:

Logging ypxfr Output

If a particular slave server has problems updating maps, log in to that server and run ypxfr interactively. If ypxfr fails, it tells you why it failed, and you can fix the problem. If ypxfr succeeds, but you suspect it has occasionally failed, create a log file to enable logging of messages. To create a log file, enter:

ypslave# cd /var/yp
ypslave# touch ypxfr.log

This creates a ypxfr.log file that saves all output from ypxfr.

The output resembles the output ypxfr displays when run interactively, but each line in the log file is time stamped. (You may see unusual ordering in the time-stamps. That is okay--the time-stamp tells you when ypxfr started to run. If copies of ypxfr ran simultaneously but their work took differing amounts of time, they may actually write their summary status line to the log files in an order different from that which they were invoked.) Any pattern of intermittent failure shows up in the log.

Note -

When you have fixed the problem, turn off logging by removing the log file. If you forget to remove it, it continues to grow without limit.

Check the crontab File and ypxfr Shell Script

Inspect the root crontab file, and check the ypxfr shell script it invokes. Typographical errors in these files can cause propagation problems. Failures to refer to a shell script within the /var/spool/cron/crontabs/root file, or failures to refer to a map within any shell script can also cause errors.

Check the ypservers Map

Also, make sure that the NIS slave server is listed in the ypservers map on the master server for the domain. If it is not, the slave server still operates perfectly as a server, but yppush does not propagate map changes to the slave server.

Work Around

If the NIS slave server problem is not obvious, you can work around it while you debug using rcp or ftp to copy a recent version of the inconsistent map from any healthy NIS server. For instance, here is how you might transfer the problem map:

 ypslave# rcp ypmaster:/var/yp/mydomain/map.\* /var/yp/mydomain

Here the * character has been escaped in the command line, so that it will be expanded on ypmaster, instead of locally on ypslave.

ypserv Crashes

When the ypserv process crashes almost immediately, and does not stay up even with repeated activations, the debug process is virtually identical to that described in "ypbind Crashes". Check for the existence of the rpcbind daemon as follows:

ypserver% ps -e | grep rpcbind

Reboot the server if you do not find the daemon. Otherwise, if the daemon is running, type the following and look for similar output:

% rpcinfo -p ypserver
program 	vers 	proto 	port 	service
100000	4	tcp	111	portmapper
100000	3	tcp	111	portmapper
100068	2	udp	32813	cmsd
100007	1	tcp	34900	ypbind
100004	2	udp	731	ypserv
100004	1	udp	731	ypserv
100004	1	tcp	732	ypserv
100004	2	tcp	32772	ypserv

Your machine may have different port numbers. The four entries representing the ypserv process are:

100004 	2 	udp 	731 	ypserv
100004 	1 	udp 	731 	ypserv
100004 	1 	tcp 	732 	ypserv
100004 	2 	tcp 	32772 	ypserv

If they are not present, and ypserv is unable to register its services with rpcbind, reboot the machine. If they are present, deregister the service from rpcbind before restarting ypserv. To deregister the service from rpcbind, on the server type:

# rpcinfo -d number 1
# rpcinfo -d number 2

Where number is the ID number reported by rpcinfo (100004, in the example above).