System Administration Guide: Naming and Directory Services (FNS and NIS+)

Chapter 24 NIS+ Troubleshooting

In this chapter, problems are grouped according to type. For each problem there is a list of common symptoms, a description of the problem, and one or more suggested solutions.

In addition, Appendix A, Error Messages contains an alphabetic listing of the more common NIS+ error messages.

Note –

NIS+ might not be supported in a future release. Tools to aid the migration from NIS+ to LDAP are available in the Solaris 9 operating environment (see System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP)). For more information, visit http://www.sun.com/directory/nisplus/transition.html.

NIS+ Debugging Options

The NIS_OPTIONS environment variable can be set to control various NIS+ debugging options.

Options are specified after the NIS_OPTIONS command separated by spaces with the option set enclosed in double quotes. Each option has the format name=value. Values can be integers, character strings, or filenames depending on the particular option. If a value is not specified for an integer value option, the value defaults to 1.

NIS_OPTIONS recognizes the following options:

Table 24–1 NIS_OPTIONS Options and Values


Option	Values	Actions
`debug_file`	`filename`	Directs debug output to specified file. If this option is not specified, debug output goes to `stdout`.
`debug_bind`	`Number`	Displays information about the server selection process.
`debug_rpc`	`1` or `2`	If the value is 1, displays RPC calls made to the NIS+ server and the RPC result code. If the value is 2, displays both the RPC calls and the contents of the RPC and arguments and results.
`debug_calls`	`Number`	Displays calls to the NIS+ API and the results that are returned to the application.
`pref_srvr`	`String`	Specifies preferred servers in the same format as that generated by the `nisprefadm` command (see Table 20–1). This will over-ride the preferred server list specified in `nis_cachemgr`.
`server`	`String`	Bind to a particular server.
`pref_type`	`String`	Not currently implemented.

For example, (assuming that you are using a C-Shell):

To display many debugging messages you would enter:

setenv NIS_OPTIONS “debug_calls=2 debug_bind debug_rpc”

To obtain a simple list of API calls and store them in the file /tmp/CALLS you would enter:

setenv NIS_OPTIONS “debug_calls debug_file=/tmp/CALLS”

To obtain a simple list of API calls sent to a particular server you would enter:

setenv NIS_OPTIONS “debug_calls server=sirius”

NIS+ Administration Problems

This section describes problems that may be encountered in the course of routine NIS+ namespace administration work. Common symptoms include:

“Illegal object type” for operation message.
Other “object problem” error messages
Initialization failure
Checkpoint failures
Difficulty adding a user to a group
Logs too large/lack of disk space/difficulty truncating logs
Cannot delete groups_dir or org_dir

Illegal Object Problems

Symptoms

"Illegal object type" for operation message
Other “object problem” error messages

There are a number of possible causes for this error message:

You have attempted to create a table without any searchable columns.
A database operation has returned the status of DB_BADOBJECT (see the nis_db man page for information on the db error codes).
You are trying to add or modify a database object with a length of zero.
You attempted to add an object without an owner.
The operation expected a directory object, and the object you named was not a directory object.
You attempted to link a directory to a LINK object.
You attempted to link a table entry.
An object that was not a group object was passed to the nisgrpadm command.
An operation on a group object was expected, but the type of object specified was not a group object.
An operation on a table object was expected, but the object specified was not a table object.

`nisinit` Fails

Make sure that:

You can ping the NIS+ server to check that it is up and running as a machine.
The NIS+ server that you specified with the -H option is a valid server and that it is running the NIS+ software.
rpc.nisd is running on the server.
The nobody class has read permission for this domain.
The netmask is properly set up on this machine.

Checkpoint Keeps Failing

If checkpoint operations with a nisping -C command consistently fail, make sure you have sufficient swap and disk space. Check for error messages in syslog. Check for core files filling up space.

Cannot Add User to a Group

A user must first be an NIS+ principal client with a LOCAL credential in the domain's cred table before the user can be added as a member of a group in that domain. A DES credential alone is not sufficient.

Logs Grow too Large

Failure to regularly checkpoint your system with nisping -C causes your log files to grow too large. Logs are not cleared on a master until all replicas for that master are updated. If a replica is down or otherwise out of service or unreachable, the master's logs for that replica cannot be cleared. Thus, if a replica is going to be down or out of service for a period of time, you should remove it as a replica from the master as described in Removing a Directory. Keep in mind that you must first remove the directory's org_dir and groups_dir subdirectories before you remove the directory itself.

Lack of Disk Space

Lack of sufficient disk space will cause a variety of different error messages. (See Insufficient Disk Space for additional information.)

Cannot Truncate Transaction Log File

First, check to make sure that the file in question exists and is readable and that you have permission to write to it.

You can use ls /var/nis/trans.log to display the transaction log.
You can use nisls -l and niscat to check for existence, permissions, and readability.
You can use syslog to check for relevant messages.

The most likely cause of inability to truncate an existing log file for which you have the proper permissions is lack of disk space. (The checkpoint process first creates a duplicate temporary file of the log before truncating the log and then removing the temporary file. If there is not enough disk space for the temporary file, the checkpoint process cannot proceed.) Check your available disk space and free up additional space if necessary.

Domain Name Confusion

Domain names play a key role in many NIS+ commands and operations. To avoid confusion, you must remember that, except for root servers, all NIS+ masters and replicas are clients of the domain above the domain that they serve. If you make the mistake of treating a server or replica as if it were a client of the domain that it serves, you may get Generic system error or Possible loop detected in namespace directoryname:domainname error messages.

For example, the machine altair might be a client of the subdoc.doc.com. domain. If the master server of the subdoc.doc.com. subdomain is the machine sirius, then sirius is a client of the doc.com. domain. When using, specifying, or changing domains, remember these rules to avoid confusion:

Client machines belong to a given domain or subdomain.
Servers and replicas that serve a given subdomain are clients of the domain above the domain they are serving.
The only exception to Rule 2 is that the root master server and root replica servers are clients of the same domain that they serve. In other words, the root master and root replicas are all clients of the root domain.

Thus, in the example above, the fully qualified name of the altair machine is alladin.subdoc.doc.com. The fully qualified name of the sirius machine is sirius.doc.com. The name sirius.subdoc.doc.com. is wrong and will cause an error because sirius is a client of doc.com., not subdoc.doc.com.

Cannot Delete `org_dir` or `groups_dir`

Always delete org_dir and groups_dir before deleting their parent directory. If you use nisrmdir to delete the domain before deleting the domain's groups_dir and org_dir, you will not be able to delete either of those two subdirectories.

Removal or Disassociation of NIS+ Directory from Replica Fails

When removing or disassociating a directory from a replica server you must first remove the directory's org_dir and groups_dir subdirectories before removing the directory itself. After each subdirectory is removed, you must run nisping on the parent directory of the directory you intend to remove. (See Removing a Directory.)

If you fail to perform the nisping operation, the directory will not be completely removed or disassociated.

If this occurs, you need to perform the following steps to correct the problem:

Remove /var/nis/rep/org_dir on the replica.
Make sure that org_dir.domain does not appear in /var/nis/rep/serving_list on the replica.
Perform a nisping on domain.
From the master server, run nisrmdir -f replica_directory.

If the replica server you are trying to dissociate is down or out of communication, the nisrmdir -s command will return a Cannot remove replica name: attempt to remove a non-empty table error message.

In such cases, you can run nisrmdir -f -s replicaname on the master to force the dissociation. Note, however, that if you use nisrmdir -f -s to dissociate an out-of-communication replica, you must run nisrmdir -f -s again as soon as the replica is back on line in order to clean up the replica's /var/nis file system. If you fail to rerun nisrmdir -f -s replicaname when the replica is back in service, the old out-of-date information left on the replica could cause problems.

NIS+ Database Problems

This section covers problems related to the namespace database and tables. Common symptoms include error messages with operative clauses such as:

Abort_transaction: Internal database error
Abort_transaction: Internal Error, log entry corrupt
Callback: select failed
CALLBACK_SVC: bad argument

as well as when rpc.nisd fails.

Multiple `rpc.nisd` Parent Processes

Symptoms:

Various Database and transaction log corruption error messages containing the terms:

Log corrupted
Log entry corrupt
Corrupt database
Database corrupted

Possible Causes:

You have multiple independent rpc.nisd daemons running. In normal operation, rpc.nisd can spawn other child rpc.nisd daemons. This causes no problem. However, if two parent rpc.nisd daemons are running at the same time on the same machine, they will overwrite each other's data and corrupt logs and databases. (Normally, this could only occur if someone started running rpc.nisd by hand.)

Diagnosis:

Run ps -ef | grep rpc.nisd. Make sure that you have no more than one parent rpc.nisd process.

Solution:

If you have more than one “parent” rpc.nisd entries, you must kill all but one of them. Use kill -9 process-id, then run the ps command again to make sure it has died.

Note –

If you started rpc.nisd with the -B option, you must also kill the rpc.nisd_resolv daemon.

If an NIS+ database is corrupt, you will also have to restore it from your most recent backup that contains an uncorrupted version of the database. You can then use the logs to update changes made to your namespace since the backup was recorded. However, if your logs are also corrupted, you will have to recreate by hand any namespace modifications made since the backup was taken.

`rpc.nisd` Fails

If an NIS+ table is too large, rpc.nisd may fail.

Diagnosis:

Use nisls to check your NIS+ table sizes. Tables larger than 7k may cause rpc.nisd to fail.

Solution:

Reduce the size of large NIS+ tables. Keep in mind that as a naming service NIS+ is designed to store references to objects, not the objects themselves.

NIS+ and NIS Compatibility Problems

This section describes problems having to do with NIS compatibility with NIS+ and earlier systems and the switch configuration file. Common symptoms include:

The nsswitch.conf file fails to perform correctly.
Error messages with operative clauses.

Error messages with operative clauses include:

Unknown user
Permission denied
Invalid principal name

User Cannot Log In After Password Change

Symptoms:

New users, or users who recently changed their password are unable to log in at all, or able to log in on one or more machines but not on others. The user may see error messages with operative clauses such as:

Unknown user username
Permission denied
Invalid principal name

First Possible Cause:

Password was changed on NIS machine.

If a user or system administrator uses the passwd command to change a password on a Solaris operating environment machine running NIS in a domain served by NIS+ namespace servers, the user's password is changed only in that machine's /etc/passwd file. If the user then goes to some other machine on the network, the user's new password will not be recognized by that machine. The user will have to use the old password stored in the NIS+ passwd table.

Diagnosis:

Check to see if the user's old password is still valid on another NIS+ machine.

Solution:

Use passwd on a machine running NIS+ to change the user's password.

Second Possible Cause:

Password changes take time to propagate through the domain.

Diagnosis:

Namespace changes take a measurable amount of time to propagate through a domain and an entire system. This time might be as short as a few seconds or as long as many minutes, depending on the size of your domain and the number of replica servers.

Solution:

You can simply wait the normal amount of time for a change to propagate through your domain(s). Or you can use the nisping org_dir command to resynchronize your system.

`nsswitch.conf` File Fails to Perform Correctly

A modified (or newly installed) nsswitch.conf file fails to work properly.

Symptoms:

You install a new nsswitch.conf file or make changes to the existing file, but your system does not implement the changes.

Possible Cause:

Each time an nsswitch.conf file is installed or changed, you must reboot the machine for your changes to take effect. This is because nscd caches the nsswitch.conf file.

Solution:

Check your nsswitch.conf file against the information contained in the nsswitch.conf man page. Correct the file if necessary, and then reboot the machine.

NIS+ Object Not Found Problems

This section describes problem in which NIS+ was unable to find some object or principal. Common symptoms include:

Error messages with operative clauses such as:

Not found
Not exist
Can't find suitable transport forname”
Cannot find
Unable to find
Unable to stat

Syntax or Spelling Error

The most likely cause of some NIS+ object not being found is that you mistyped or misspelled its name. Check the syntax and make sure that you are using the correct name.

Incorrect Path

A likely cause of an “object not found” problem is specifying an incorrect path. Make sure that the path you specified is correct. Also make sure that the NIS_PATH environment variable is set correctly.

Domain Levels Not Correctly Specified

Remember that all servers are clients of the domain above them, not the domain they serve. There are two exceptions to this rule:

The root masters and root replicas are clients of the root domain.
NIS+ domain names end with a period. When using a fully qualified name you must end the domain name with a period. If you do not end the domain name with a period, NIS+ assumes it is a partially qualified name. However, the domain name of a machine should not end with a dot in the /etc/defaultdomain file. If you add a dot to a machine's domain name in the /etc/defaultdomain file, you will get Could not bind to server serving domain name error messages and encounter difficulty in connecting to the net on boot up.

Object Does Not Exist

The NIS+ object may not have been found because it does not exist, either because it has been erased or not yet created. Use nisls -l in the appropriate domain to check that the object exists.

Lagging or Out-of-Sync Replica

When you create or modify an NIS+ object, there is a time lag between the completion of your action and the arrival of the new updated information at a given replica. In ordinary operation, namespace information may be queried from a master or any of its replicas. A client automatically distributes queries among the various servers (master and replicas) to balance system load. This means that at any given moment you do not know which machine is supplying you with namespace information. If a command relating to a newly created or modified object is sent to a replica that has not yet received the updated information from the master, you will get an “object not found” type of error or the old out-of-date information. Similarly, a general command such as nisls may not list a newly created object if the system sends the nisls query to a replica that has not yet been updated.

You can use nisping to resync a lagging or out of sync replica server.

Alternatively, you can use the -M option with most NIS+ commands to specify that the command must obtain namespace information from the domain's master server. In this way you can be sure that you are obtaining and using the most up-to-date information. (However, you should use the -M option only when necessary because a main point of having and using replicas to serve the namespace is to distribute the load and thus increase network efficiency.)

Files Missing or Corrupt

One or more of the files in /var/nis/data directory has become corrupted or erased. Restore these files from your most recent backup.

Old `/var/nis` Filenames

In Solaris Release 2.4 and earlier, the /var/nis directory contained two files named hostname.dict and hostname.log. It also contained a subdirectory named /var/nis/hostname. Starting with Solaris Release 2.5, the two files were renamed trans.log and data.dict, and the subdirectory is named /var/nis/data.

Do not rename the /var/nis or /var/nis/data directories or any of the files in these directories that were created by nisinit or any of the other NIS+ setup procedures.

In Solaris Release 2.5, the content of the files were also changed and they are not backward compatible with Solaris Release 2.4 or earlier. Thus, if you rename either the directories or the files to match the Solaris Release 2.4 patterns, the files will not work with either the Solaris Release 2.4 or the Solaris Release 2.5 or later versions of rpc.nisd. Therefore, you should not rename either the directories or the files.

Blanks in Name

Symptoms:

Sometimes an object is there, sometimes it is not. Some NIS+ or UNIX commands report that an NIS+ object does not exist or cannot be found, while other NIS+ or UNIX commands do find that same object.

Diagnoses:

Use nisls to display the object's name. Look carefully at the object's name to see if the name actually begins with a blank space. (If you accidentally enter two spaces after the flag when creating NIS+ objects from the command line with NIS+ commands, some NIS+ commands will interpret the second space as the beginning of the object's name.)

Solution:

If an NIS+ object name begins with a blank space, you must either rename it without the space or remove it and then recreate it from scratch.

Cannot Use Automounter

Symptoms:

You cannot change to a directory on another host.

Possible Cause:

Under NIS+, automounter names must be renamed to meet NIS+ requirements. NIS+ cannot access /etc/auto* tables that contain a period in the name. For example, NIS+ cannot access a file named auto.direct.

Diagnosis:

Use nisls and niscat to determine if the automounter tables are properly constructed.

Solution:

Change the periods to underscores. For example, change auto.direct to auto_direct. (Be sure to change other maps that might reference these.)

Links To or From Table Entries Do Not Work

You cannot use the nisln command (or any other command) to create links between entries in tables. NIS+ commands do not follow links at the entry level.

NIS+ Ownership and Permission Problems

This section describes problems related to user ownership and permissions. Common symptoms include:

Error messages with operative clauses such as:

Unable to stat name
Unable to stat NIS+ directory name
Security exception on LOCAL system
Unable to make request
Insufficient permission to . . .
You name do not have secure RPC credentials

Another Symptom:

User or root unable to perform any namespace task.

No Permission

The most common permission problem is the simplest: you have not been granted permission to perform some task that you try to do. Use niscat -o on the object in question to determine what permissions you have. If you need additional permission, you, the owner of the object, or the system administrator can either change the permission requirements of the object (as described in Chapter 15, Administering NIS+ Access Rights,) or add you to a group that does have the required permissions (as described in Chapter 17, Administering NIS+ Groups).

No Credentials

Without proper credentials for you and your machine, many operations will fail. Use nismatch on your home domain's cred table to make sure you have the right credentials. See Corrupted Credentials for more on credentials-related problems.

Server Running at Security Level 0

A server running at security level 0 does not create or maintain credentials for NIS+ principals.

If you try to use passwd on a server that is running at security level 0, you will get the error message: You name do not have secure RPC credentials in NIS+ domain domainname.

Security level 0 is only to be used by administrators for initial namespace setup and testing purposes. Level 0 should not be used in any environment where ordinary users are active.

User Login Same as Machine Name

A user cannot have the same login ID as a machine name. When a machine is given the same name as a user (or vice versa), the first principal can no longer perform operations requiring secure permissions because the second principal's key has overwritten the first principal's key in the cred table. In addition, the second principal now has whatever permissions were granted to the first principal.

For example, suppose a user with the login name of saladin is granted namespace read-only permissions. Then a machine named saladin is added to the domain. The user saladin will no longer be able to perform any namespace operations requiring any sort of permission, and the root user of the machine saladin will only have read-only permission in the namespace.

Symptoms:

The user or machine gets “permission denied” type error messages.
Either the user or root for that machine cannot successfully run keylogin.
Security exception on LOCAL system. UNABLE TO MAKE REQUEST. error message.
If the first principal did not have read access, the second principal might not be able to view otherwise visible objects.

Note –

When running nisclient or nisaddcred, if the message Changing Key is displayed rather than Adding Key, there is a duplicate user or host name already in existence in that domain.

Diagnosis:

Run nismatch to find the host and user in the hosts and passwd tables to see if there are identical host names and user names in the respective tables:

nismatch username passwd.org_dir

Then run nismatch on the domain's cred table to see what type of credentials are provided for the duplicate host or user name. If there are both LOCAL and DES credentials, the cred table entry is for the user; if there is only a DES credential, the entry is for the machine.

Solution:

Change the machine name. (It is better to change the machine name than to change the user name.) Then delete the machine's entry from the cred table and use nisclient to reinitialize the machine as an NIS+ client. (If you wish, you can use nistbladm to create an alias for the machine's old name in the hosts tables.) If necessary, replace the user's credentials in the cred table.

Bad Credentials

See Corrupted Credentials.

NIS+ Security Problems

This section describes common password, credential, encryption, and other security-related problems.

Security Problem Symptoms

Error messages with operative clauses such as:

Authentication error
Authentication denied
Cannot get public key
Chkey failed
Insufficient permission to
Login incorrect
Keyserv fails to encrypt
No public key
Permission denied
Password [problems]

User or root unable to perform any namespace operations or tasks. (See also NIS+ Ownership and Permission Problems.)

`Login Incorrect` Message

The most common cause of a “login incorrect” message is the user mistyping the password. Have the user try it again. Make sure the user knows the correct password and understands that passwords are case-sensitive and also that the letter “o” is not interchangeable with the numeral “0,” nor is the letter “l” the same as the numeral “1.”

Other possible causes of the “login incorrect” message are:

The password has been locked by an administrator. See Locking a Password and Unlocking a Password.
The password has been locked because the user has exceeded an inactivity maximum. See Specifying Maximum Number of Inactive Days.
The password has expired. See Password Privilege Expiration.

See Chapter 16, Administering Passwords for general information on passwords.

Password Locked, Expired, or Terminated

A common cause of a Permission denied, password expired, type message is that the user's password has passed its age limit or the user's password privileges have expired. See Chapter 16, Administering Passwords for general information on passwords.

See Setting a Password Age Limit.
See Password Privilege Expiration.

Stale and Outdated Credential Information

Occasionally, you may find that even though you have created the proper credentials and assigned the proper access rights, some client requests still get denied. This may be due to out-of-date information residing somewhere in the namespace.

Storing and Updating Credential Information

Credential-related information, such as public keys, is stored in many locations throughout the namespace. NIS+ updates this information periodically, depending on the time-to-live values of the objects that store it, but sometimes, between updates, it gets out of sync. As a result, you may find that operations that should work, don't work. Table 24–2 lists all the objects, tables, and files that store credential-related information and how to reset it.

Table 24–2 Where Credential-Related Information is Stored


Item	Stores	To Reset or Change
cred table	NIS+ principal's secret key and public key. These are the master copies of these keys.	Use `nisaddcred` to create new credentials; it updates existing credentials. An alternative is `chkey`.
Directory object	A copy of the public key of each server that supports it.	Run the `/usr/lib/nis/` `nisupdkeys` command on the directory object.
Keyserver	The secret key of the NIS+ principal that is currently logged in.	Run `keylogin` for a principal user or `keylogin` `-r` for a principal machine.
NIS+ daemon	Copies of directory objects, which in turn contain copies of their servers' public keys.	Kill the daemon and the cache manager. Then restart both.
Directory cache	A copy of directory objects, which in turn contain copies of their servers' public keys.	Kill the NIS+ cache manager and restart it with the `nis_cachemgr` `-i` command. The `-i` option resets the directory cache from the cold-start file and restarts the cache manager.
Cold-start file	A copy of a directory object, which in turn contains copies of its servers' public keys.	On the root master, kill the NIS+ daemon and restart it. The daemon reloads new information into the existing `NIS_COLD_START` file. For a client, first remove the cold-start and shared directory files from `/var/nis`, and kill the cache manager. Then re-initialize the principal with `nisinit` `-c`. The principal's trusted server reloads new information into the principal's existing cold-start file.
passwd table	A user's password or a machine's superuser password.	Use the `passwd -r nisplus` command. It changes the password in the NIS+ passwd table and updates it in the cred table.
`passwd` file	A user's password or a machine's superuser password.	Use the `passwd -r nisplus` command, whether logged in as superuser or as yourself, whichever is appropriate.
`passwd` map (NIS)	A user's password or a machine's superuser password.	Use `passwd -r nisplus`.

Updating Stale Cached Keys

The most commonly encountered out-of-date information is the existence of stale objects with old versions of a server's public key. You can usually correct this problem by running nisupdkeys on the domain you are trying to access. (See Chapter 12, Administering NIS+ Credentials for information on using the nisupdkeys command.)

Because some keys are stored in files or caches, nisupdkeys cannot always correct the problem. At times you might need to update the keys manually. To do that, you must understand how a server's public key, once created, is propagated through namespace objects. The process usually has five stages of propagation. Each stage is described below.

Stage 1: Server's Public Key Is Generated

An NIS+ server is first an NIS+ client. So, its public key is generated in the same way as any other NIS+ client's public key: with the nisaddcred command. The public key is then stored in the cred table of the server's home domain, not of the domain the server will eventually support.

Stage 2: Public Key Is Propagated to Directory Objects

Once you have set up an NIS+ domain and an NIS+ server, you can associate the server with a domain. This association is performed by the nismkdir command. When the nismkdir command associates the server with the directory, it also copies the server's public key from the cred table to the domain's directory object. For example, assume the server is a client of the doc.com. root domain, and is made the master server of the sales.doc.com. domain.

Figure 24–1 Public Key is Propagated to Directory Objects

Its public key is copied from the cred.org_dir.doc.com. domain and placed in the sales.doc.com. directory object. This can be verified with the niscat -o sales.doc.com. command.

Stage 3: Directory Objects Are Propagated Into Client Files

All NIS+ clients are initialized with the nisinit utility or with the nisclient script.

Among other things, nisinit (or nisclient) creates a cold-start file /var/nis/NIS_COLDSTART. The cold-start file is used to initialize the client's directory cache /var/nis/NIS_SHARED_DIRCACHE. The cold-start file contains a copy of the directory object of the client's domain. Since the directory object already contains a copy of the server's public key, the key is now propagated into the cold-start file of the client.

In addition when a client makes a request to a server outside its home domain, a copy of the remote domains directory object is stored in the client's NIS_SHARED_DIRCACHE file. You can examine the contents of the client's cache by using the nisshowcache command, described on page 184.

This is the extent of the propagation until a replica is added to the domain or the server's key changes.

Stage 4: When a Replica is Added to the Domain

When a replica server is added to a domain, the nisping command (described in The nisping Command) is used to download the NIS+ tables, including the cred table, to the new replica. Therefore, the original server's public key is now also stored in the replica server's cred table.

Stage 5: When the Server's Public Key Is Changed

If you decide to change DES credentials for the server (that is, for the root identity on the server), its public key will change. As a result, the public key stored for that server in the cred table will be different from those stored in:

The cred table of replica servers (for a few minutes only)
The main directory object of the domain supported by the server (until its time-to-live expires)
The NIS_COLDSTART and NIS_SHARED_DIRCACHE files of every client of the domain supported by server (until their time-to-live expires, usually 12 hours)
The NIS_SHARED_DIRCACHE file of clients who have made requests to the domain supported by the server (until their time-to-live expires)

Most of these locations will be updated automatically within a time ranging from a few minutes to 12 hours. To update the server's keys in these locations immediately, use the commands:

Table 24–3 Updating a Server's Keys


Location	Command	See
Cred table of replica servers (instead of using `nisping`, you can wait a few minutes until the table is updated automatically)	`nisping`	The `nisping` Command
Directory object of domain supported by server	`nisupdkeys`	The `nisupdkeys` Command
`NIS_COLDSTART` file of clients	`nisinit` `-c`	The `nisinit` Command
`NIS_SHARED_DIRCACHE` file of clients	`nis_cachemgr`	The `nis_cachemgr` Command

Note –

You must first kill the existing nis_cachemgr before restarting nis_cachemgr.

Corrupted Credentials

When a principal (user or machine) has a corrupt credential, that principal is unable to perform any namespace operations or tasks, not even nisls. This is because a corrupt credential provides no permissions at all, not even the permissions granted to the nobody class.

Symptoms:

User or root cannot perform any namespace tasks or operations. All namespace operations fail with a “permission denied” type of error message. The user or root cannot even perform a nisls operation.

Possible Cause:

Corrupted keys or a corrupt, out-of-date, or otherwise incorrect /etc/.rootkey file.

Diagnosis:

Use snoop to identify the bad credential.

Or, if the principal is listed, log in as the principal and try to run an NIS+ command on an object for which you are sure that the principal has proper authorization. For example, in most cases an object grants read authorization to the nobody class. Thus, the nisls object command should work for any principal listed in the cred table. If the command fails with a “permission denied” error, then the principal's credential is likely corrupted.

Solution

Ordinary user. Perform a keylogout and then a keylogin for that principal.
Root or superuser. Run keylogout -f followed by keylogin -r.

`Keyserv` Failure

The keyserv is unable to encrypt a session. There are several possible causes for this type of problem:

Possible Causes and Solutions:

The client has not keylogged in. Make sure that the client is keylogged in. To determine if a client is properly keylogged in, have the client run nisdefaults -v (or run it yourself as the client). If (not authenticated) is returned on the Principal Name line, the client is not properly keylogged in.
The client (host) does not have appropriate LOCAL or DES credentials. Run niscat on the client's cred table to verify that the client has appropriate credentials. If necessary, add credentials as explained in Creating Credential Information for NIS+ Principals.
The keyserv daemon is not running. Use the ps command to see if keyserv is running. If it is not running, restart it and then do a keylogin.
While keyserv is running, other long running processes that make secure RPC or NIS+ calls are not. For example, automountd, rpc.nisd, and sendmail. Verify that these processes are running correctly. If they are not, restart them.

Machine Previously Was an NIS+ Client

If this machine has been initialized before as an NIS+ client of the same domain, try keylogin -r and enter the root login password at the Secure RPC password prompt.

No Entry in the cred Table

To make sure that an NIS+ password for the principal (user or host) exists in the cred table, run the following command in the principal's home domain

nisgrep -A cname=principal cred.org_dir.domainname

If you are running nisgrep from another domain, the domainname must be fully qualified.

Changed Domain Name

Do not change a domain name.

If you change the name of an existing domain you will create authentication problems because the fully qualified original domain name is embedded in objects throughout your network.

If you have already changed a domain name and are experiencing authentication problems, or error messages containing terms like “malformed” or “illegal” in relation to a domain name, change the domain name back to its original name. The recommended procedure for renaming your domains is to create a new domain with the new name, set up your machines as servers and clients of the new domain, make sure they are performing correctly, and then remove the old domain.

When Changing a Machine to a Different Domain

If this machine is an NIS+ client and you are trying to change it to a client of a different domain, remove the /etc/.rootkey file, and then rerun the nisclient script using the network password supplied by your network administrator or taken from the nispopulate script.

NIS+ and Login Passwords in `/etc/passwd` File

Your NIS+ password is stored in the NIS+ passwd table. Your user login password may be stored in NIS+ passwd table or in your /etc/passwd file. (Your user password and NIS+ password can be the same or different.) To change a password in an /etc/passwd file, you must run the passwd command with the nsswitch.conf file set to files or with the -r files flag.

The nsswitch.conf file specifies which password is used for which purpose. If the nsswitch.conf file is directing system queries to the wrong location, you will get password and permission errors.

Secure RPC Password and Login Passwords Are Different

When a principal's login password is different from his or her Secure RPC password, keylogin cannot decrypt it at login time because keylogin defaults to using the principal's login password, and the private key was encrypted using the principal's Secure RPC password.

When this occurs the principal can log in to the system, but for NIS+ purposes is placed in the authorization class of nobody because the keyserver does not have a decrypted private key for that user. Since most NIS+ environments are set up to deny the nobody class create, destroy, and modify rights to most NIS+ objects this results in “permission denied” types errors when the user tries to access NIS+ objects.

Note –

In this context network password is sometimes used as a synonym for Secure RPC password. When prompted for your “network password,” enter your Secure RPC password.

To be placed in one of the other authorization classes, a user in this situation must explicitly run the keylogin program and give the principal's Secure RPC password when keylogin prompts for password. (See Keylogin.)

But an explicit keylogin provides only a temporary solution that is good only for the current login session. The keyserver now has a decrypted private key for the user, but the private key in the user's cred table is still encrypted using the user's Secure RPC password, which is different than the user's login password. The next time the user logs in, the same problem reoccurs. To permanently solve the problem the user needs to change the private key in the cred table to one based on the user's login ID rather than the user's Secure RPC password. To do this, the user need to run the chkey program as described in Changing Keys for an NIS+ Principal.

Thus, to permanently solve a Secure RPC password different than login password problems, the user (or an administrator acting for the user) must perform the following steps:

Log in using the login password.
Run the keylogin program to temporarily get a decrypted private key stored in the keyserver and thus gain temporary NIS+ access privileges.
Run chkey -pto permanently change the encrypted private key in the cred table to one based on the user's login password.

Preexisting `/etc/.rootkey` File

Symptoms:

Various insufficient permission to and permission denied error messages.

Possible Cause:

An /etc/.rootkey file already existed when you set up or initialized a server or client. This could occur if NIS+ had been previously installed on the machine and the .rootkey file was not erased when NIS+ was removed or the machine returned to using NIS or /etc files.

Diagnosis:

Run ls -l on the /etc directory and nisls -l org_dir and compare the date of the /etc/.rootkey to the date of the cred table. If the /etc/.rootkey date is clearly earlier than that of the cred table, it may be a preexisting file.

Solution:

Run keylogin -r as root on the problem machine and then set up the machine as a client again.

Root Password Change Causes Problem

Symptoms:

You change the root password on a machine, and the change either fails to take effect or you are unable to log in as superuser.

Possible Cause:

Note –

For security reasons, you should not have User ID 0 listed in the passwd table.

You changed the root password, but root's key was not properly updated. Either because you forgot to run chkey -p for root or some problem came up.

Solution

Log in as a user with administration privileges (that is, a user who is a member of a group with administration privileges) and use passwd to restore the old password. Make sure that old password works. Now use passwd to change root's password to the new one, and then run chkey -p.

Caution –

Once your NIS+ namespace is set up and running, you can change the root password on the root master machine. But do not change the root master keys, as these are embedded in all directory objects on all clients, replicas, and servers of subdomains. To avoid changing the root master keys, always use the -p option when running chkey as root.

NIS+ Performance and System Hang Problems

This section describes common slow performance and system hang problems.

Performance Problem Symptoms

Error messages with operative clauses such as:

Busy try again later
Not responding

Other common symptoms:

You issue a command and nothing seems to happen for far too long.
Your system, or shell, no longer responds to keyboard or mouse commands.
NIS+ operations seem to run slower than they should or slower than they did before.

Checkpointing

Someone has issued an nisping or nisping -C command. Or the rpc.nisd daemon is performing a checkpoint operation.

Caution –

Do not reboot! Do not issue any more nisping commands.

When performing a nisping or checkpoint, the server will be sluggish or may not immediately respond to other commands. Depending on the size of your namespace, these commands may take a noticeable amount of time to complete. Delays caused by checkpoint or ping commands are multiplied if you, or someone else, enter several such commands at one time. Do not reboot. This kind of problem will solve itself. Just wait until the server finishes performing the nisping or checkpoint command.

During a full master-replica resync, the involved replica server will be taken out of service until the resync is complete. Do not reboot—just wait.

Variable `NIS_PATH`

Make sure that your NIS_PATH variable is set to something clean and simple. For example, the default: org_dir.$:$. A complex NIS_PATH, particularly one that itself contains a variable, will slow your system and may cause some operations to fail. (See NIS_PATH Environment Variablefor more information.)

Do not use nistbladm to set nondefault table paths. Nondefault table paths will slow performance.

Table Paths

Do not use table paths because they will slow performance.

Too Many Replicas

Too many replicas for a domain degrade system performance during replication. There should be no more than 10 replicas in a given domain or subdomain. If you have more than five replicas in a domain, try removing some of them to see if that improves performance.

Recursive Groups

A recursive group is a group that contains the name of some other group. While including other groups in a group reduces your work as system administrator, doing so slows down the system. You should not use recursive groups.

Large NIS+ Database Logs at Start-up

When rpc.nisd starts up it goes through each log. If the logs are long, this process could take a long time. If your logs are long, you may want to checkpoint them using nisping -C before starting rpc.nisd.

The Master `rpc.nisd` Daemon Died

Symptoms:

If you used the -M option to specify that your request be sent to the master server, and the rpc.nisd daemon has died on that machine, you will get a “server not responding” type error message and no updates will be permitted. (If you did not use the -M option, your request will be automatically routed to a functioning replica server.)

Possible Cause:

Using uppercase letters in the name of a home directory or host can sometimes cause rpc.nisd to die.

Diagnosis:

First make sure that the server itself is up and running. If it is, run ps -ef | grep rpc.nisd to see if the daemon is still running.

Solution:

If the daemon has died, restart it. If rpc.nisd frequently dies, contact your service provider.

No `nis_cachemgr`

Symptoms:

It takes too long for a machine to locate namespace objects in other domains.

Possible Cause:

You do not have nis_cachemgr running.

Diagnosis:

Run ps -ef | grep nis_cachemgr to see if it is still running.

Solution

Start nis_cachemgr on that machine.

Server Very Slow at Start-up After NIS+ Installation

Symptoms:

A server performs slowly and sluggishly after using the NIS+ scripts to install NIS+ on it.

Possible Cause:

You forgot to run nisping -C -a after running the nispopulate script.

Solution:

Run nisping -C -a to checkpoint the system as soon as you are able to do so.

`niscat` Returns: `Server busy. Try Again`

Symptoms:

You run niscat and get an error message indicating that the server is busy.

Possible Cause:

The server is busy with a heavy load, such as when doing a resync.
The server is out of swap space.

Diagnosis:

Run swap -s to check your server's swap space.

Solution:

You must have adequate swap and disk space to run NIS+. If necessary, increase your space.

NIS+ Queries Hang After Changing Host Name

Symptoms:

Setting the host name for an NIS+ server to be fully qualified is not recommended. If you do so, and NIS+ queries then just hang with no error messages, check the following possibilities:

Possible Cause:

Fully qualified host names must meet the following criteria:

The domain part of the host name must be the same as the name returned by the domainname command.
After the setting the host name to be fully qualified, you must also update all the necessary /etc and /etc/inet files with the new host name information.
The host name must end in a period.

Solution:

Kill the NIS+ processes that are hanging and then kill rpc.nisd on that host or server. Rename the host to match the two requirements listed above. Then reinitialize the server with nisinit. (If queries still hang after you are sure that the host is correctly named, check other problem possibilities in this section.)

NIS+ System Resource Problems

This section describes problems having to do with lack of system resources such as memory, disk space, and so forth.

Resource Problem Symptoms

Error messages with operative clauses such as:

No memory
Out of disk space
“Cannot [do something] with log” type messages
Unable to fork

Insufficient Memory

Lack of sufficient memory or swap space on the system you are working with will cause a wide variety of NIS+ problems and error messages. As a short-term, temporary solution, try to free additional memory by killing unneeded windows and processes. If necessary, exit your windowing system and work from the terminal command line. If you still get messages indicating inadequate memory, you will have to install additional swap space or memory, or switch to a different system that has enough swap space or memory.

Under some circumstances, applications and processes may develop memory leaks and grow too large. you can check the current size of an application or process by running:

ps -el

The sz (size) column shows the current memory size of each process. If necessary, compare the sizes with comparable processes and applications on a machine that is not having memory problems to see if any have grown too large.

Insufficient Disk Space

Lack of disk space will cause a variety of error messages. A common cause of insufficient disk space is failure to regularly remove tmp files and truncate log files. log and tmp files grow steadily larger unless truncated. The speed at which these files grow varies from system to system and with the system state. log files on a system that is working inefficiently or having namespace problems will grow very fast.

Note –

If you are doing a lot of troubleshooting, check your log and tmp files frequently. Truncate log files and remove tmp files before lack of disk space creates additional problems. Also check the root directory and home directories for core files and delete them.

The way to truncate log files is to regularly checkpoint your system (Keep in mind that a checkpoint process may take some time and will slow down your system while it is being performed, checkpointing also requires enough disk space to create a complete copy of the files before they are truncated.)

To checkpoint a system, run nisping -C.

Insufficient Processes

On a heavily loaded machine it is possible that you could reach the maximum number of simultaneous processes that the machine is configured to handle. This causes messages with clauses like “unable to fork”. The recommended method of handling this problem is to kill any unnecessary processes. If the problem persists, you can reconfigure the machine to handle more processes as described in your system administration documentation.

NIS+ User Problems

This section describes NIS+ problems that a typical user might encounter.

User Problem Symptoms

User cannot log in.
User cannot rlogin to other domain

User Cannot Log In

There are many possible reasons for a user being unable to log in:

User forgot password. To set up a new password for a user who has forgotten the previous one, run passwd for that user on another machine (naturally, you have to be the NIS+ administrator to do this).
Mistyping password. Make sure the user knows the correct password and understands that passwords are case-sensitive and that the letter “o” is not interchangeable with the numeral “0,” nor is the letter “l” the same as the numeral “1.”
“Login incorrect” type message. For causes other than simply mistyping the password, see Login Incorrect Message.
The user's password privileges have expired (see Password Privilege Expiration).
An inactivity maximum has been set for this user, and the user has passed it (see Specifying Maximum Number of Inactive Days).
The user's nsswitch.conf file is incorrect. The passwd entry in that file must be one of the following five permitted configurations:
- passwd: files
- passwd: files nis
- passwd: files nisplus
- passwd: compat
- passwd: compat passwd_compat: nisplus
Any other configuration will prevent a user from logging in.

(See nsswitch.conf File Requirements for further details.)

User Cannot Log In Using New Password

Symptoms:

Users who recently changed their password are unable to log in at all, or are able to log in on some machines but not on others.

Possible Causes:

It may take some time for the new password to propagate through the network. Have users try to log in with the old password.
The password was changed on a machine that was not running NIS+ (see User Cannot Log In Using New Password).

User Cannot Remote Log In to Remote Domain

Symptoms:

User tries to rlogin to a machine in some other domain and is refused with a “Permission denied” type error message.

Possible Cause:

To rlogin to a machine in another domain, a user must have LOCAL credentials in that domain.

Diagnosis:

Run nismatch username.domainname. cred.org_dir in the other domain to see if the user has a LOCAL credential in that domain.

Solution:

Go to the remote domain and use nisaddcred to create a LOCAL credential for the user in the that domain.

User Cannot Change Password

The most common cause of a user being unable to change passwords is that the user is mistyping (or has forgotten) the old password.

Other possible causes:

The password Min value has been set to be greater than the password Max value. See Setting Minimum Password Life.
The password is locked or expired. See Login Incorrect Message and Password Locked, Expired, or Terminated.

Other NIS+ Problems

This section describes problems that do not fit any of the previous categories.

How to Tell if NIS+ Is Running

You may need to know whether a given host is running NIS+. A script may also need to determine whether NIS+ is running.

You can assume that NIS+ is running if:

nis_cachemgr is running.
The host has a /var/nis/NIS_COLD_START file.
nisls succeeds.

Replica Update Failure

Symptoms:

Error messages indicating that the update was not successfully complete. (Note that the message: replica_update: number updates number errors indicates a successful update.)

Possible Causes:

Any of the following error messages indicate that the server was busy and that the update should be rescheduled:

Master server busy, full dump rescheduled
replica_update error result was Master server busy full dump rescheduled, full dump rescheduled
replica_update: master server busy, rescheduling the resync
replica_update: master server busy, will try later
replica_update: nis dump result Master server busy, full dump rescheduled
nis_dump_svc: one replica is already resyncing

(These messages are generated by, or in conjunction with, the NIS+ error code constant: NIS_DUMPLATER one replica is already resyncing.)

These messages indicate that there was some other problem:

replica_update: error result was ...
replica_update: nis dump result nis_perror error string
rootreplica_update: update failednis dump result nis_perror string-variable: could not fetch object from master

(If rpc.nisd is being run with the -C (open diagnostic channel) option, additional information may be entered in either the master server or replica server's system log.

These messages indicate possible problems such as:

The server is out of child processes that can be allocated.
A read-only child process was requested to dump.
Another replica is currently resynching.

Diagnosis:

Check both the replica and server's system log for additional information. How much, if any, additional information is recorded in the system logs depends on your system's error reporting level, and whether or not you are running rpc.nisd with the -C option (diagnostics).

Solution:

In most cases, these messages indicate minor software problems which the system is capable of correcting. If the message was the result of a command, simply wait for a while and then try the command again. If these messages appear often, you can change the threshold level in your /etc/syslog.conf file. See the syslog.conf man page for details.

Chapter 24 NIS+ Troubleshooting

NIS+ Debugging Options

NIS+ Administration Problems

Illegal Object Problems

nisinit Fails

Checkpoint Keeps Failing

Cannot Add User to a Group

Logs Grow too Large

Lack of Disk Space

Cannot Truncate Transaction Log File

Domain Name Confusion

Cannot Delete org_dir or groups_dir

Removal or Disassociation of NIS+ Directory from Replica Fails

NIS+ Database Problems

Multiple rpc.nisd Parent Processes

rpc.nisd Fails

NIS+ and NIS Compatibility Problems

User Cannot Log In After Password Change

nsswitch.conf File Fails to Perform Correctly

NIS+ Object Not Found Problems

Syntax or Spelling Error

Incorrect Path

Domain Levels Not Correctly Specified

Object Does Not Exist

Lagging or Out-of-Sync Replica

Files Missing or Corrupt

Old /var/nis Filenames

Blanks in Name

Cannot Use Automounter

Links To or From Table Entries Do Not Work

NIS+ Ownership and Permission Problems

No Permission

No Credentials

Server Running at Security Level 0

User Login Same as Machine Name

Bad Credentials

NIS+ Security Problems

Security Problem Symptoms

Login Incorrect Message

Password Locked, Expired, or Terminated

Stale and Outdated Credential Information

Storing and Updating Credential Information

Updating Stale Cached Keys

Figure 24–1 Public Key is Propagated to Directory Objects

Corrupted Credentials

Keyserv Failure

Machine Previously Was an NIS+ Client

No Entry in the cred Table

Changed Domain Name

When Changing a Machine to a Different Domain

NIS+ and Login Passwords in /etc/passwd File

Secure RPC Password and Login Passwords Are Different

Preexisting /etc/.rootkey File

Root Password Change Causes Problem

NIS+ Performance and System Hang Problems

Performance Problem Symptoms

Checkpointing

Variable NIS_PATH

Table Paths

Too Many Replicas

Recursive Groups

Large NIS+ Database Logs at Start-up

The Master rpc.nisd Daemon Died

No nis_cachemgr

Server Very Slow at Start-up After NIS+ Installation

niscat Returns: Server busy. Try Again

NIS+ Queries Hang After Changing Host Name

NIS+ System Resource Problems

Resource Problem Symptoms

Insufficient Memory

Insufficient Disk Space

Insufficient Processes

NIS+ User Problems

User Problem Symptoms

User Cannot Log In

User Cannot Log In Using New Password

User Cannot Remote Log In to Remote Domain

User Cannot Change Password

Other NIS+ Problems

How to Tell if NIS+ Is Running

`nisinit` Fails

Cannot Delete `org_dir` or `groups_dir`

Multiple `rpc.nisd` Parent Processes

`rpc.nisd` Fails

`nsswitch.conf` File Fails to Perform Correctly

Old `/var/nis` Filenames

`Login Incorrect` Message

`Keyserv` Failure

NIS+ and Login Passwords in `/etc/passwd` File

Preexisting `/etc/.rootkey` File

Variable `NIS_PATH`

The Master `rpc.nisd` Daemon Died

No `nis_cachemgr`

`niscat` Returns: `Server busy. Try Again`