System Administration Guide: Naming and Directory Services (FNS and NIS+)

Chapter 24 NIS+ Troubleshooting

In this chapter, problems are grouped according to type. For each problem there is a list of common symptoms, a description of the problem, and one or more suggested solutions.

In addition, Appendix A, Error Messages contains an alphabetic listing of the more common NIS+ error messages.


Note –

NIS+ might not be supported in a future release. Tools to aid the migration from NIS+ to LDAP are available in the Solaris 9 operating environment (see System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP)). For more information, visit http://www.sun.com/directory/nisplus/transition.html.


NIS+ Debugging Options

The NIS_OPTIONS environment variable can be set to control various NIS+ debugging options.

Options are specified as the value of the NIS_OPTIONS variable, separated by spaces, with the whole option set enclosed in double quotes. Each option has the format name=value. Values can be integers, character strings, or filenames, depending on the particular option. If a value is not specified for an integer-valued option, the value defaults to 1.

NIS_OPTIONS recognizes the following options:

Table 24–1 NIS_OPTIONS Options and Values

Option | Values | Actions
debug_file | filename | Directs debug output to the specified file. If this option is not specified, debug output goes to stdout.
debug_bind | Number | Displays information about the server selection process.
debug_rpc | 1 or 2 | If the value is 1, displays RPC calls made to the NIS+ server and the RPC result code. If the value is 2, displays both the RPC calls and the contents of the RPC arguments and results.
debug_calls | Number | Displays calls to the NIS+ API and the results that are returned to the application.
pref_srvr | String | Specifies preferred servers in the same format as that generated by the nisprefadm command (see Table 20–1). This overrides the preferred server list specified in nis_cachemgr.
server | String | Binds to a particular server.
pref_type | String | Not currently implemented.

For example (assuming that you are using the C shell):


setenv NIS_OPTIONS "debug_calls=2 debug_bind debug_rpc"

setenv NIS_OPTIONS "debug_calls debug_file=/tmp/CALLS"

setenv NIS_OPTIONS "debug_calls server=sirius"
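
In a Bourne-compatible shell (sh or ksh), the first C shell example could be written as sketched below. The list_nis_options function is a hypothetical helper, not an NIS+ command; it only illustrates the name=value format by printing each option with a value of 1 where none was given. NIS+ itself applies that default only to integer-valued options, a distinction the helper does not make.

```shell
# Bourne shell equivalent of:
#   setenv NIS_OPTIONS "debug_calls=2 debug_bind debug_rpc"
NIS_OPTIONS="debug_calls=2 debug_bind debug_rpc"
export NIS_OPTIONS

# Hypothetical helper (not an NIS+ command): print one option per line,
# showing a value of 1 for options given without an explicit value.
list_nis_options() {
    for opt in $NIS_OPTIONS; do
        case $opt in
            *=*) printf '%s\n' "$opt" ;;
            *)   printf '%s=1\n' "$opt" ;;
        esac
    done
}

list_nis_options
```

Listing the options this way before running an NIS+ command is a quick check that the quoting and spacing of the variable are what you intended.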

NIS+ Administration Problems

This section describes problems that may be encountered in the course of routine NIS+ namespace administration work. Common symptoms include:

Illegal Object Problems

Symptoms

There are a number of possible causes for this error message:

nisinit Fails

Make sure that:

Checkpoint Keeps Failing

If checkpoint operations with a nisping -C command consistently fail, make sure you have sufficient swap and disk space. Check for error messages in syslog. Check for core files filling up space.

Cannot Add User to a Group

A user must first be an NIS+ principal client with a LOCAL credential in the domain's cred table before the user can be added as a member of a group in that domain. A DES credential alone is not sufficient.

Logs Grow too Large

Failure to regularly checkpoint your system with nisping -C causes your log files to grow too large. Logs are not cleared on a master until all replicas for that master are updated. If a replica is down or otherwise out of service or unreachable, the master's logs for that replica cannot be cleared. Thus, if a replica is going to be down or out of service for a period of time, you should remove it as a replica from the master as described in Removing a Directory. Keep in mind that you must first remove the directory's org_dir and groups_dir subdirectories before you remove the directory itself.

Lack of Disk Space

Lack of sufficient disk space will cause a variety of different error messages. (See Insufficient Disk Space for additional information.)

Cannot Truncate Transaction Log File

First, check to make sure that the file in question exists and is readable and that you have permission to write to it.

The most likely cause of inability to truncate an existing log file for which you have the proper permissions is lack of disk space. (The checkpoint process first creates a duplicate temporary file of the log before truncating the log and then removing the temporary file. If there is not enough disk space for the temporary file, the checkpoint process cannot proceed.) Check your available disk space and free up additional space if necessary.
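
Because the checkpoint needs room for a temporary duplicate of the log, a rough pre-check is to compare the log's current size with the free space on its file system. The following is a minimal sketch, assuming a df that accepts the POSIX -P option; has_room_for_copy is a hypothetical helper name, and the example path assumes the log is /var/nis/trans.log.

```shell
# Hypothetical helper: succeed if the file system holding FILE has at least
# as much free space (in KB) as FILE currently occupies, that is, enough
# room for the temporary duplicate that checkpointing creates.
has_room_for_copy() {
    file=$1
    dir=$(dirname "$file")
    # du -k prints the file's size in KB in the first column.
    need_kb=$(du -k "$file" | awk '{ print $1 }')
    # df -P -k prints one header line, then a data line whose fourth
    # column is the available space in KB.
    free_kb=$(df -P -k "$dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$need_kb" ]
}

# Example (assumed log location on an NIS+ server):
# has_room_for_copy /var/nis/trans.log || echo "not enough space to checkpoint"
```

This is only a lower bound: other processes may consume space while the checkpoint runs, so leave additional headroom.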

Domain Name Confusion

Domain names play a key role in many NIS+ commands and operations. To avoid confusion, you must remember that, except for root servers, all NIS+ masters and replicas are clients of the domain above the domain that they serve. If you make the mistake of treating a server or replica as if it were a client of the domain that it serves, you may get Generic system error or Possible loop detected in namespace directoryname:domainname error messages.

For example, the machine altair might be a client of the subdoc.doc.com. domain. If the master server of the subdoc.doc.com. subdomain is the machine sirius, then sirius is a client of the doc.com. domain. When using, specifying, or changing domains, remember these rules to avoid confusion:

  1. Client machines belong to a given domain or subdomain.

  2. Servers and replicas that serve a given subdomain are clients of the domain above the domain they are serving.

  3. The only exception to Rule 2 is that the root master server and root replica servers are clients of the same domain that they serve. In other words, the root master and root replicas are all clients of the root domain.

Thus, in the example above, the fully qualified name of the altair machine is altair.subdoc.doc.com. The fully qualified name of the sirius machine is sirius.doc.com. The name sirius.subdoc.doc.com. is wrong and will cause an error because sirius is a client of doc.com., not subdoc.doc.com.
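
Rule 2 can be expressed as a small string operation: the domain a non-root server belongs to is the domain it serves with the leftmost label removed. The sketch below uses two hypothetical helper functions (parent_domain and server_fqdn, neither of which is an NIS+ command). It does not handle the root-server exception in Rule 3, because a root domain cannot be recognized from its name alone.

```shell
# Hypothetical helper: print the domain that a non-root NIS+ server is a
# client of, given the domain it serves (Rule 2): drop the leftmost label.
parent_domain() {
    printf '%s\n' "${1#*.}"
}

# Hypothetical helper: print the fully qualified name of a non-root
# server HOST that serves DOMAIN.
server_fqdn() {
    printf '%s.%s\n' "$1" "$(parent_domain "$2")"
}

server_fqdn sirius subdoc.doc.com.   # prints sirius.doc.com.
```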

Cannot Delete org_dir or groups_dir

Always delete org_dir and groups_dir before deleting their parent directory. If you use nisrmdir to delete the domain before deleting the domain's groups_dir and org_dir, you will not be able to delete either of those two subdirectories.

Removal or Disassociation of NIS+ Directory from Replica Fails

When removing or disassociating a directory from a replica server you must first remove the directory's org_dir and groups_dir subdirectories before removing the directory itself. After each subdirectory is removed, you must run nisping on the parent directory of the directory you intend to remove. (See Removing a Directory.)

If you fail to perform the nisping operation, the directory will not be completely removed or disassociated.

If this occurs, you need to perform the following steps to correct the problem:

  1. Remove /var/nis/rep/org_dir on the replica.

  2. Make sure that org_dir.domain does not appear in /var/nis/rep/serving_list on the replica.

  3. Perform a nisping on domain.

  4. From the master server, run nisrmdir -f replica_directory.

If the replica server you are trying to dissociate is down or out of communication, the nisrmdir -s command will return a Cannot remove replica name: attempt to remove a non-empty table error message.

In such cases, you can run nisrmdir -f -s replicaname on the master to force the dissociation. Note, however, that if you use nisrmdir -f -s to dissociate an out-of-communication replica, you must run nisrmdir -f -s again as soon as the replica is back on line in order to clean up the replica's /var/nis file system. If you fail to rerun nisrmdir -f -s replicaname when the replica is back in service, the old out-of-date information left on the replica could cause problems.

NIS+ Database Problems

This section covers problems related to the namespace database and tables. Common symptoms include error messages with operative clauses such as:

as well as when rpc.nisd fails.

See also NIS+ Ownership and Permission Problems.

Multiple rpc.nisd Parent Processes

Symptoms:

Various database and transaction log corruption error messages containing the terms:

Possible Causes:

You have multiple independent rpc.nisd daemons running. In normal operation, rpc.nisd can spawn other child rpc.nisd daemons. This causes no problem. However, if two parent rpc.nisd daemons are running at the same time on the same machine, they will overwrite each other's data and corrupt logs and databases. (Normally, this could only occur if someone started running rpc.nisd by hand.)

Diagnosis:

Run ps -ef | grep rpc.nisd. Make sure that you have no more than one parent rpc.nisd process.
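
The grep output lists child daemons (and the grep process itself) alongside the parents, so the parents have to be picked out by eye. As a rough sketch, the parent count can be computed from process IDs instead. The count_nisd_parents function below is a hypothetical helper; it assumes that every child rpc.nisd process reports a parent rpc.nisd process as its PPID, and that your ps accepts the POSIX -o format options.

```shell
# Hypothetical helper: read "PID PPID COMMAND" lines on standard input and
# print how many rpc.nisd processes do not have another rpc.nisd process as
# their parent. A result greater than 1 indicates multiple parent daemons.
count_nisd_parents() {
    awk '
        # Match rpc.nisd with or without a leading path, but not rpc.nisd_resolv.
        $3 ~ /(^|\/)rpc\.nisd$/ { parent[$1] = $2 }
        END {
            n = 0
            for (pid in parent)
                if (!(parent[pid] in parent))
                    n++
            print n
        }
    '
}

# Example:
# ps -e -o pid= -o ppid= -o comm= | count_nisd_parents
```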

Solution:

If you have more than one “parent” rpc.nisd entry, you must kill all but one of them. Use kill -9 process-id, then run the ps command again to make sure that the extra processes have died.


Note –

If you started rpc.nisd with the -B option, you must also kill the rpc.nisd_resolv daemon.


If an NIS+ database is corrupt, you will also have to restore it from your most recent backup that contains an uncorrupted version of the database. You can then use the logs to update changes made to your namespace since the backup was recorded. However, if your logs are also corrupted, you will have to recreate by hand any namespace modifications made since the backup was taken.

rpc.nisd Fails

If an NIS+ table is too large, rpc.nisd may fail.

Diagnosis:

Use nisls to check your NIS+ table sizes. Tables larger than 7k may cause rpc.nisd to fail.

Solution:

Reduce the size of large NIS+ tables. Keep in mind that as a naming service NIS+ is designed to store references to objects, not the objects themselves.

NIS+ and NIS Compatibility Problems

This section describes problems related to compatibility between NIS+ and NIS or earlier systems, and problems with the switch configuration file. Common symptoms include:

Error messages with operative clauses include:

User Cannot Log In After Password Change

Symptoms:

New users, or users who recently changed their password, are unable to log in at all, or are able to log in on one or more machines but not on others. The user may see error messages with operative clauses such as:

First Possible Cause:

Password was changed on NIS machine.

If a user or system administrator uses the passwd command to change a password on a Solaris operating environment machine running NIS in a domain served by NIS+ namespace servers, the user's password is changed only in that machine's /etc/passwd file. If the user then goes to some other machine on the network, the user's new password will not be recognized by that machine. The user will have to use the old password stored in the NIS+ passwd table.

Diagnosis:

Check to see if the user's old password is still valid on another NIS+ machine.

Solution:

Use passwd on a machine running NIS+ to change the user's password.

Second Possible Cause:

Password changes take time to propagate through the domain.

Diagnosis:

Namespace changes take a measurable amount of time to propagate through a domain and an entire system. This time might be as short as a few seconds or as long as many minutes, depending on the size of your domain and the number of replica servers.

Solution:

You can simply wait the normal amount of time for a change to propagate through your domain(s). Or you can use the nisping org_dir command to resynchronize your system.

nsswitch.conf File Fails to Perform Correctly

A modified (or newly installed) nsswitch.conf file fails to work properly.

Symptoms:

You install a new nsswitch.conf file or make changes to the existing file, but your system does not implement the changes.

Possible Cause:

Each time an nsswitch.conf file is installed or changed, you must reboot the machine for your changes to take effect. This is because nscd caches the nsswitch.conf file.

Solution:

Check your nsswitch.conf file against the information contained in the nsswitch.conf man page. Correct the file if necessary, and then reboot the machine.

NIS+ Object Not Found Problems

This section describes problems in which NIS+ was unable to find some object or principal. Common symptoms include:

Error messages with operative clauses such as:

Syntax or Spelling Error

The most likely cause of some NIS+ object not being found is that you mistyped or misspelled its name. Check the syntax and make sure that you are using the correct name.

Incorrect Path

A likely cause of an “object not found” problem is specifying an incorrect path. Make sure that the path you specified is correct. Also make sure that the NIS_PATH environment variable is set correctly.

Domain Levels Not Correctly Specified

Remember that all servers are clients of the domain above them, not the domain they serve. There are two exceptions to this rule:

Object Does Not Exist

The NIS+ object may not have been found because it does not exist, either because it has been erased or not yet created. Use nisls -l in the appropriate domain to check that the object exists.

Lagging or Out-of-Sync Replica

When you create or modify an NIS+ object, there is a time lag between the completion of your action and the arrival of the new updated information at a given replica. In ordinary operation, namespace information may be queried from a master or any of its replicas. A client automatically distributes queries among the various servers (master and replicas) to balance system load. This means that at any given moment you do not know which machine is supplying you with namespace information. If a command relating to a newly created or modified object is sent to a replica that has not yet received the updated information from the master, you will get an “object not found” type of error or the old out-of-date information. Similarly, a general command such as nisls may not list a newly created object if the system sends the nisls query to a replica that has not yet been updated.

You can use nisping to resync a lagging or out of sync replica server.

Alternatively, you can use the -M option with most NIS+ commands to specify that the command must obtain namespace information from the domain's master server. In this way you can be sure that you are obtaining and using the most up-to-date information. (However, you should use the -M option only when necessary because a main point of having and using replicas to serve the namespace is to distribute the load and thus increase network efficiency.)

Files Missing or Corrupt

One or more of the files in the /var/nis/data directory has become corrupted or erased. Restore these files from your most recent backup.

Old /var/nis Filenames

In Solaris Release 2.4 and earlier, the /var/nis directory contained two files named hostname.dict and hostname.log. It also contained a subdirectory named /var/nis/hostname. Starting with Solaris Release 2.5, the two files were renamed data.dict and trans.log, respectively, and the subdirectory is named /var/nis/data.

Do not rename the /var/nis or /var/nis/data directories or any of the files in these directories that were created by nisinit or any of the other NIS+ setup procedures.

In Solaris Release 2.5, the content of the files was also changed, and the files are not backward compatible with Solaris Release 2.4 or earlier. Thus, if you rename either the directories or the files to match the Solaris Release 2.4 patterns, the files will not work with either the Solaris Release 2.4 or the Solaris Release 2.5 or later versions of rpc.nisd. Therefore, you should not rename either the directories or the files.

Blanks in Name

Symptoms:

Sometimes an object is there, sometimes it is not. Some NIS+ or UNIX commands report that an NIS+ object does not exist or cannot be found, while other NIS+ or UNIX commands do find that same object.

Diagnosis:

Use nisls to display the object's name. Look carefully at the object's name to see if the name actually begins with a blank space. (If you accidentally enter two spaces after the flag when creating NIS+ objects from the command line with NIS+ commands, some NIS+ commands will interpret the second space as the beginning of the object's name.)
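
A leading blank is easy to miss on screen. One way to make it visible is to filter the listing and print any offending name inside brackets. This is a sketch only: find_blank_names is a hypothetical helper, and it assumes that the listing being filtered contains one object name per line.

```shell
# Hypothetical helper: read object names, one per line, on standard input
# and print, wrapped in brackets, those that begin with a space or tab.
find_blank_names() {
    awk '/^[ \t]/ { print "[" $0 "]" }'
}

# Example (assumes nisls prints one object name per line):
# nisls org_dir | find_blank_names
```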

Solution:

If an NIS+ object name begins with a blank space, you must either rename it without the space or remove it and then recreate it from scratch.

Cannot Use Automounter

Symptoms:

You cannot change to a directory on another host.

Possible Cause:

Under NIS+, automounter names must be renamed to meet NIS+ requirements. NIS+ cannot access /etc/auto* tables that contain a period in the name. For example, NIS+ cannot access a file named auto.direct.

Diagnosis:

Use nisls and niscat to determine if the automounter tables are properly constructed.

Solution:

Change the periods to underscores. For example, change auto.direct to auto_direct. (Be sure to change other maps that might reference these.)
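
The renaming rule is mechanical, so it can be applied with tr. In this minimal sketch, nisplus_map_name is a hypothetical helper name; it converts only the name and does not update references to that name inside other maps.

```shell
# Hypothetical helper: convert an NIS-style automounter map name to the
# NIS+ form by replacing every period with an underscore.
nisplus_map_name() {
    printf '%s\n' "$1" | tr '.' '_'
}

nisplus_map_name auto.direct   # prints auto_direct
```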

Links To or From Table Entries Do Not Work

You cannot use the nisln command (or any other command) to create links between entries in tables. NIS+ commands do not follow links at the entry level.

NIS+ Ownership and Permission Problems

This section describes problems related to user ownership and permissions. Common symptoms include:

Error messages with operative clauses such as:

Another Symptom:

No Permission

The most common permission problem is the simplest: you have not been granted permission to perform some task that you try to do. Use niscat -o on the object in question to determine what permissions you have. If you need additional permission, you, the owner of the object, or the system administrator can either change the permission requirements of the object (as described in Chapter 15, Administering NIS+ Access Rights) or add you to a group that does have the required permissions (as described in Chapter 17, Administering NIS+ Groups).

No Credentials

Without proper credentials for you and your machine, many operations will fail. Use nismatch on your home domain's cred table to make sure you have the right credentials. See Corrupted Credentials for more on credentials-related problems.

Server Running at Security Level 0

A server running at security level 0 does not create or maintain credentials for NIS+ principals.

If you try to use passwd on a server that is running at security level 0, you will get the error message: You name do not have secure RPC credentials in NIS+ domain domainname.

Security level 0 is only to be used by administrators for initial namespace setup and testing purposes. Level 0 should not be used in any environment where ordinary users are active.

User Login Same as Machine Name

A user cannot have the same login ID as a machine name. When a machine is given the same name as a user (or vice versa), the first principal can no longer perform operations requiring secure permissions because the second principal's key has overwritten the first principal's key in the cred table. In addition, the second principal now has whatever permissions were granted to the first principal.

For example, suppose a user with the login name of saladin is granted namespace read-only permissions. Then a machine named saladin is added to the domain. The user saladin will no longer be able to perform any namespace operations requiring any sort of permission, and the root user of the machine saladin will only have read-only permission in the namespace.

Symptoms:


Note –

When running nisclient or nisaddcred, if the message Changing Key is displayed rather than Adding Key, there is a duplicate user or host name already in existence in that domain.


Diagnosis:

Run nismatch to find the host and user in the hosts and passwd tables to see if there are identical host names and user names in the respective tables:


nismatch username passwd.org_dir

Then run nismatch on the domain's cred table to see what type of credentials are provided for the duplicate host or user name. If there are both LOCAL and DES credentials, the cred table entry is for the user; if there is only a DES credential, the entry is for the machine.
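
The LOCAL and DES rule above can be applied mechanically to the matching cred entries. The sketch below assumes that the entries are printed one per line with colon-separated fields and the credential type in the second field; that layout is an assumption, so verify it against your own nismatch output. The classify_cred_entries function is a hypothetical helper, not an NIS+ command.

```shell
# Hypothetical helper: read cred table entries for one principal name on
# standard input (assumed format: cname:auth_type:...) and report whether
# they describe a user (LOCAL and DES) or a machine (DES only).
classify_cred_entries() {
    awk -F: '
        $2 == "LOCAL" { has_local = 1 }
        $2 == "DES"   { has_des = 1 }
        END {
            if (has_local && has_des) print "user"
            else if (has_des)         print "machine"
            else                      print "unknown"
        }
    '
}

# Example (the domain name is illustrative):
# nismatch saladin.doc.com. cred.org_dir | classify_cred_entries
```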

Solution:

Change the machine name. (It is better to change the machine name than to change the user name.) Then delete the machine's entry from the cred table and use nisclient to reinitialize the machine as an NIS+ client. (If you wish, you can use nistbladm to create an alias for the machine's old name in the hosts tables.) If necessary, replace the user's credentials in the cred table.

Bad Credentials

See Corrupted Credentials.

NIS+ Security Problems

This section describes common password, credential, encryption, and other security-related problems.

Security Problem Symptoms

Error messages with operative clauses such as:

User or root unable to perform any namespace operations or tasks. (See also NIS+ Ownership and Permission Problems.)

Login Incorrect Message

The most common cause of a “login incorrect” message is the user mistyping the password. Have the user try it again. Make sure the user knows the correct password and understands that passwords are case-sensitive and also that the letter “o” is not interchangeable with the numeral “0,” nor is the letter “l” the same as the numeral “1.”

Other possible causes of the “login incorrect” message are:

See Chapter 16, Administering Passwords for general information on passwords.

Password Locked, Expired, or Terminated

A common cause of a “Permission denied, password expired” type of message is that the user's password has passed its age limit or the user's password privileges have expired. See Chapter 16, Administering Passwords for general information on passwords.

Stale and Outdated Credential Information

Occasionally, you may find that even though you have created the proper credentials and assigned the proper access rights, some client requests still get denied. This may be due to out-of-date information residing somewhere in the namespace.

Storing and Updating Credential Information

Credential-related information, such as public keys, is stored in many locations throughout the namespace. NIS+ updates this information periodically, depending on the time-to-live values of the objects that store it, but sometimes, between updates, it gets out of sync. As a result, you may find that operations that should work, don't work. Table 24–2 lists all the objects, tables, and files that store credential-related information and how to reset it.

Table 24–2 Where Credential-Related Information is Stored

Item | Stores | To Reset or Change
cred table | NIS+ principal's secret key and public key. These are the master copies of these keys. | Use nisaddcred to create new credentials; it updates existing credentials. An alternative is chkey.
Directory object | A copy of the public key of each server that supports it. | Run the /usr/lib/nis/nisupdkeys command on the directory object.
Keyserver | The secret key of the NIS+ principal that is currently logged in. | Run keylogin for a principal user or keylogin -r for a principal machine.
NIS+ daemon | Copies of directory objects, which in turn contain copies of their servers' public keys. | Kill the daemon and the cache manager. Then restart both.
Directory cache | A copy of directory objects, which in turn contain copies of their servers' public keys. | Kill the NIS+ cache manager and restart it with the nis_cachemgr -i command. The -i option resets the directory cache from the cold-start file and restarts the cache manager.
Cold-start file | A copy of a directory object, which in turn contains copies of its servers' public keys. | On the root master, kill the NIS+ daemon and restart it. The daemon reloads new information into the existing NIS_COLD_START file. For a client, first remove the cold-start and shared directory files from /var/nis, and kill the cache manager. Then re-initialize the principal with nisinit -c. The principal's trusted server reloads new information into the principal's existing cold-start file.
passwd table | A user's password or a machine's superuser password. | Use the passwd -r nisplus command. It changes the password in the NIS+ passwd table and updates it in the cred table.
passwd file | A user's password or a machine's superuser password. | Use the passwd -r nisplus command, whether logged in as superuser or as yourself, whichever is appropriate.
passwd map (NIS) | A user's password or a machine's superuser password. | Use passwd -r nisplus.

Updating Stale Cached Keys

The most commonly encountered out-of-date information is the existence of stale objects with old versions of a server's public key. You can usually correct this problem by running nisupdkeys on the domain you are trying to access. (See Chapter 12, Administering NIS+ Credentials for information on using the nisupdkeys command.)

Because some keys are stored in files or caches, nisupdkeys cannot always correct the problem. At times you might need to update the keys manually. To do that, you must understand how a server's public key, once created, is propagated through namespace objects. The process usually has five stages of propagation. Each stage is described below.

Stage 1: Server's Public Key Is Generated

An NIS+ server is first an NIS+ client. So, its public key is generated in the same way as any other NIS+ client's public key: with the nisaddcred command. The public key is then stored in the cred table of the server's home domain, not of the domain the server will eventually support.

Stage 2: Public Key Is Propagated to Directory Objects

Once you have set up an NIS+ domain and an NIS+ server, you can associate the server with a domain. This association is performed by the nismkdir command. When the nismkdir command associates the server with the directory, it also copies the server's public key from the cred table to the domain's directory object. For example, assume the server is a client of the doc.com. root domain, and is made the master server of the sales.doc.com. domain.

Figure 24–1 Public Key is Propagated to Directory Objects

The server's public key is copied from the cred.org_dir.doc.com. table and placed in the sales.doc.com. directory object. This can be verified with the niscat -o sales.doc.com. command.

Stage 3: Directory Objects Are Propagated Into Client Files

All NIS+ clients are initialized with the nisinit utility or with the nisclient script.

Among other things, nisinit (or nisclient) creates a cold-start file, /var/nis/NIS_COLD_START. The cold-start file is used to initialize the client's directory cache, /var/nis/NIS_SHARED_DIRCACHE. The cold-start file contains a copy of the directory object of the client's domain. Since the directory object already contains a copy of the server's public key, the key is now propagated into the cold-start file of the client.

In addition, when a client makes a request to a server outside its home domain, a copy of the remote domain's directory object is stored in the client's NIS_SHARED_DIRCACHE file. You can examine the contents of the client's cache by using the nisshowcache command.

This is the extent of the propagation until a replica is added to the domain or the server's key changes.

Stage 4: When a Replica is Added to the Domain

When a replica server is added to a domain, the nisping command (described in The nisping Command) is used to download the NIS+ tables, including the cred table, to the new replica. Therefore, the original server's public key is now also stored in the replica server's cred table.

Stage 5: When the Server's Public Key Is Changed

If you decide to change DES credentials for the server (that is, for the root identity on the server), its public key will change. As a result, the public key stored for that server in the cred table will be different from those stored in:

Most of these locations will be updated automatically within a time ranging from a few minutes to 12 hours. To update the server's keys in these locations immediately, use the commands:

Table 24–3 Updating a Server's Keys

Location | Command | See
Cred table of replica servers (instead of using nisping, you can wait a few minutes until the table is updated automatically) | nisping | The nisping Command
Directory object of domain supported by server | nisupdkeys | The nisupdkeys Command
NIS_COLD_START file of clients | nisinit -c | The nisinit Command
NIS_SHARED_DIRCACHE file of clients | nis_cachemgr | The nis_cachemgr Command


Note –

You must first kill the existing nis_cachemgr before restarting nis_cachemgr.


Corrupted Credentials

When a principal (user or machine) has a corrupt credential, that principal is unable to perform any namespace operations or tasks, not even nisls. This is because a corrupt credential provides no permissions at all, not even the permissions granted to the nobody class.

Symptoms:

User or root cannot perform any namespace tasks or operations. All namespace operations fail with a “permission denied” type of error message. The user or root cannot even perform a nisls operation.

Possible Cause:

Corrupted keys or a corrupt, out-of-date, or otherwise incorrect /etc/.rootkey file.

Diagnosis:

Use snoop to identify the bad credential.

Or, if the principal is listed, log in as the principal and try to run an NIS+ command on an object for which you are sure that the principal has proper authorization. For example, in most cases an object grants read authorization to the nobody class. Thus, the nisls object command should work for any principal listed in the cred table. If the command fails with a “permission denied” error, then the principal's credential is likely corrupted.

Solution

Keyserv Failure

The keyserv is unable to encrypt a session. There are several possible causes for this type of problem:

Possible Causes and Solutions:

Machine Previously Was an NIS+ Client

If this machine has been initialized before as an NIS+ client of the same domain, try keylogin -r and enter the root login password at the Secure RPC password prompt.

No Entry in the cred Table

To make sure that an NIS+ password for the principal (user or host) exists in the cred table, run the following command in the principal's home domain:


nisgrep -A cname=principal cred.org_dir.domainname

If you are running nisgrep from another domain, the domainname must be fully qualified.

Changed Domain Name

Do not change a domain name.

If you change the name of an existing domain you will create authentication problems because the fully qualified original domain name is embedded in objects throughout your network.

If you have already changed a domain name and are experiencing authentication problems, or error messages containing terms like “malformed” or “illegal” in relation to a domain name, change the domain name back to its original name. The recommended procedure for renaming your domains is to create a new domain with the new name, set up your machines as servers and clients of the new domain, make sure they are performing correctly, and then remove the old domain.

When Changing a Machine to a Different Domain

If this machine is an NIS+ client and you are trying to change it to a client of a different domain, remove the /etc/.rootkey file, and then rerun the nisclient script using the network password supplied by your network administrator or taken from the nispopulate script.

NIS+ and Login Passwords in /etc/passwd File

Your NIS+ password is stored in the NIS+ passwd table. Your user login password may be stored in the NIS+ passwd table or in your /etc/passwd file. (Your user password and NIS+ password can be the same or different.) To change a password in an /etc/passwd file, you must run the passwd command with the nsswitch.conf file set to files or with the -r files flag.

The nsswitch.conf file specifies which password is used for which purpose. If the nsswitch.conf file is directing system queries to the wrong location, you will get password and permission errors.

Secure RPC Password and Login Passwords Are Different

When a principal's login password is different from his or her Secure RPC password, keylogin cannot decrypt the principal's private key at login time. This is because keylogin defaults to using the principal's login password, whereas the private key was encrypted using the principal's Secure RPC password.

When this occurs, the principal can log in to the system, but for NIS+ purposes is placed in the authorization class of nobody because the keyserver does not have a decrypted private key for that user. Since most NIS+ environments are set up to deny the nobody class create, destroy, and modify rights to most NIS+ objects, this results in “permission denied” type errors when the user tries to access NIS+ objects.


Note –

In this context network password is sometimes used as a synonym for Secure RPC password. When prompted for your “network password,” enter your Secure RPC password.


To be placed in one of the other authorization classes, a user in this situation must explicitly run the keylogin program and give the principal's Secure RPC password when keylogin prompts for password. (See Keylogin.)

However, an explicit keylogin is only a temporary solution that lasts for the current login session. The keyserver now has a decrypted private key for the user, but the private key in the cred table is still encrypted using the user's Secure RPC password, which differs from the user's login password. The next time the user logs in, the same problem recurs. To solve the problem permanently, the user must change the private key in the cred table to one based on the user's login password rather than the user's Secure RPC password. To do this, the user needs to run the chkey program as described in Changing Keys for an NIS+ Principal.

Thus, to permanently solve the problem of a Secure RPC password that differs from the login password, the user (or an administrator acting for the user) must perform the following steps:

  1. Log in using the login password.

  2. Run the keylogin program to temporarily get a decrypted private key stored in the keyserver and thus gain temporary NIS+ access privileges.

  3. Run chkey -p to permanently change the encrypted private key in the cred table to one based on the user's login password.
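Steps 2 and 3 can be sketched as a small shell function. This is an illustration only: the names run and resync_secure_rpc_key are hypothetical, and the sketch prints the commands by default rather than running them, because both keylogin and chkey -p prompt for passwords interactively.

```shell
# run: print a command instead of executing it unless DRY_RUN=0 is set.
# (Hypothetical helper, used so the sequence can be reviewed safely first.)
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# resync_secure_rpc_key: steps 2 and 3 of the procedure above.
# Step 1 (logging in with the login password) has already happened.
resync_secure_rpc_key() {
    run keylogin &&   # prompts for the Secure RPC (network) password
    run chkey -p      # re-encrypts the private key using the login password
}

# Review the sequence:   resync_secure_rpc_key
# Actually run it:       DRY_RUN=0 resync_secure_rpc_key
```

Chaining the two commands with && means chkey -p is attempted only if keylogin succeeds, which matches the order of the numbered steps.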

Preexisting /etc/.rootkey File

Symptoms:

Various “insufficient permission to” and “permission denied” error messages.

Possible Cause:

An /etc/.rootkey file already existed when you set up or initialized a server or client. This could occur if NIS+ had been previously installed on the machine and the .rootkey file was not erased when NIS+ was removed or the machine returned to using NIS or /etc files.

Diagnosis:

Run ls -l on the /etc directory and nisls -l org_dir and compare the date of the /etc/.rootkey to the date of the cred table. If the /etc/.rootkey date is clearly earlier than that of the cred table, it may be a preexisting file.

Solution:

Run keylogin -r as root on the problem machine and then set up the machine as a client again.

Root Password Change Causes Problem

Symptoms:

You change the root password on a machine, and the change either fails to take effect or you are unable to log in as superuser.

Possible Cause:


Note –

For security reasons, you should not have User ID 0 listed in the passwd table.


You changed the root password, but root's key was not properly updated, either because you forgot to run chkey -p for root or because some other problem occurred.

Solution:

Log in as a user with administration privileges (that is, a user who is a member of a group with administration privileges) and use passwd to restore the old password. Make sure that the old password works. Then use passwd to change root's password to the new one, and run chkey -p.


Caution –

Once your NIS+ namespace is set up and running, you can change the root password on the root master machine. But do not change the root master keys, as these are embedded in all directory objects on all clients, replicas, and servers of subdomains. To avoid changing the root master keys, always use the -p option when running chkey as root.


NIS+ Performance and System Hang Problems

This section describes common slow performance and system hang problems.

Performance Problem Symptoms

Error messages with operative clauses such as:

Other common symptoms:

Checkpointing

Someone has issued a nisping or nisping -C command, or the rpc.nisd daemon is performing a checkpoint operation.


Caution –

Do not reboot! Do not issue any more nisping commands.


When performing a nisping or checkpoint, the server will be sluggish or may not immediately respond to other commands. Depending on the size of your namespace, these commands may take a noticeable amount of time to complete. Delays caused by checkpoint or ping commands are multiplied if you, or someone else, enter several such commands at one time. Do not reboot. This kind of problem will solve itself. Just wait until the server finishes performing the nisping or checkpoint command.

During a full master-replica resync, the involved replica server will be taken out of service until the resync is complete. Do not reboot—just wait.

Variable NIS_PATH

Make sure that your NIS_PATH variable is set to something clean and simple, for example the default: org_dir.$:$. A complex NIS_PATH, particularly one that itself contains a variable, will slow your system and may cause some operations to fail. (See NIS_PATH Environment Variable for more information.)

Do not use nistbladm to set nondefault table paths. Nondefault table paths will slow performance.

Table Paths

Do not use table paths because they will slow performance.

Too Many Replicas

Too many replicas for a domain degrade system performance during replication. There should be no more than 10 replicas in a given domain or subdomain. If you have more than five replicas in a domain, try removing some of them to see if that improves performance.

Recursive Groups

A recursive group is a group that contains the name of some other group. While including other groups in a group reduces your work as system administrator, doing so slows down the system. You should not use recursive groups.

Large NIS+ Database Logs at Start-up

When rpc.nisd starts up it goes through each log. If the logs are long, this process could take a long time. If your logs are long, you may want to checkpoint them using nisping -C before starting rpc.nisd.

The Master rpc.nisd Daemon Died

Symptoms:

If you used the -M option to specify that your request be sent to the master server, and the rpc.nisd daemon has died on that machine, you will get a “server not responding” type error message and no updates will be permitted. (If you did not use the -M option, your request will be automatically routed to a functioning replica server.)

Possible Cause:

Using uppercase letters in the name of a home directory or host can sometimes cause rpc.nisd to die.

Diagnosis:

First make sure that the server itself is up and running. If it is, run ps -ef | grep rpc.nisd to see if the daemon is still running.
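When this check is scripted, a plain ps -ef | grep rpc.nisd can match the grep process itself. The sketch below avoids grep and instead compares the base name of each field in the ps output with the daemon name. The function name daemon_listed is hypothetical, and the sketch assumes the daemon appears in the listing under its own name (with or without a leading path).

```shell
# daemon_listed NAME: succeed if a process whose command base name is NAME
# appears in `ps -ef` style output read from standard input.
# (Hypothetical helper; any field whose base name equals NAME counts as a match.)
daemon_listed() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, part, "/")
                if (part[n] == name) found = 1
            }
        }
        END { exit found ? 0 : 1 }'
}

# Typical use on a live system:
#   ps -ef | daemon_listed rpc.nisd && echo "rpc.nisd is running"
#   ps -ef | daemon_listed nis_cachemgr || echo "nis_cachemgr is not running"
```

The same function works for the nis_cachemgr check later in this section. Note that another process which merely takes the daemon name as an argument would also match, so treat a positive result as a hint and confirm it by reading the ps output.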

Solution:

If the daemon has died, restart it. If rpc.nisd frequently dies, contact your service provider.

No nis_cachemgr

Symptoms:

It takes too long for a machine to locate namespace objects in other domains.

Possible Cause:

You do not have nis_cachemgr running.

Diagnosis:

Run ps -ef | grep nis_cachemgr to see if it is still running.

Solution:

Start nis_cachemgr on that machine.

Server Very Slow at Start-up After NIS+ Installation

Symptoms:

A server performs slowly and sluggishly after you use the NIS+ scripts to install NIS+ on it.

Possible Cause:

You forgot to run nisping -C -a after running the nispopulate script.

Solution:

Run nisping -C -a to checkpoint the system as soon as you are able to do so.

niscat Returns: Server busy. Try Again

Symptoms:

You run niscat and get an error message indicating that the server is busy.

Possible Cause:

Diagnosis:

Run swap -s to check your server's swap space.
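To use the swap -s result in a script, the available figure can be pulled out of the summary line. The sketch below assumes the summary ends with a phrase of the form "NNNNk available"; that format and the function name swap_available_kb are assumptions to verify against the output on your own system.

```shell
# swap_available_kb: print the "available" figure (in kilobytes) from
# `swap -s` style output read on standard input.
# (Hypothetical helper; assumes the line contains "<number>k available".)
swap_available_kb() {
    sed -n 's/.* \([0-9][0-9]*\)k available.*/\1/p'
}

# Typical use on a live system:
#   swap -s | swap_available_kb
```

If the function prints nothing, the output did not match the assumed format and should be read by hand.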

Solution:

You must have adequate swap and disk space to run NIS+. If necessary, increase your space.

NIS+ Queries Hang After Changing Host Name

Symptoms:

Setting the host name for an NIS+ server to be fully qualified is not recommended. If you do so, and NIS+ queries then just hang with no error messages, check the following possibilities:

Possible Cause:

Fully qualified host names must meet the following criteria:

Solution:

Kill the NIS+ processes that are hanging and then kill rpc.nisd on that host or server. Rename the host to match the two requirements listed above. Then reinitialize the server with nisinit. (If queries still hang after you are sure that the host is correctly named, check other problem possibilities in this section.)

NIS+ System Resource Problems

This section describes problems having to do with lack of system resources such as memory, disk space, and so forth.

Resource Problem Symptoms

Error messages with operative clauses such as:

Insufficient Memory

Lack of sufficient memory or swap space on the system you are working with will cause a wide variety of NIS+ problems and error messages. As a short-term, temporary solution, try to free additional memory by killing unneeded windows and processes. If necessary, exit your windowing system and work from the terminal command line. If you still get messages indicating inadequate memory, you will have to install additional swap space or memory, or switch to a different system that has enough swap space or memory.

Under some circumstances, applications and processes may develop memory leaks and grow too large. You can check the current size of an application or process by running:


ps -el

The sz (size) column shows the current memory size of each process. If necessary, compare the sizes with comparable processes and applications on a machine that is not having memory problems to see if any have grown too large.
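On a busy machine the ps -el listing is long, so it can help to sort it by size. The sketch below finds the SZ and PID columns from the header line and prints the largest entries first. The function name largest_processes is hypothetical, and the sketch assumes that no column to the left of SZ is blank and that the command name is the last field on each line.

```shell
# largest_processes [N]: from `ps -el` style output on standard input,
# print "SZ PID CMD" for the N largest processes (default 5).
# (Hypothetical helper; column positions are taken from the header line.)
largest_processes() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "SZ")  sz = i
                if ($i == "PID") pid = i
            }
            next
        }
        sz && pid { print $sz, $pid, $NF }
    ' | sort -rn | head -n "${1:-5}"
}

# Typical use on a live system:
#   ps -el | largest_processes 10
```

Running the same command on a machine that is not having memory problems gives a baseline for the comparison described above.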

Insufficient Disk Space

Lack of disk space will cause a variety of error messages. A common cause of insufficient disk space is failure to regularly remove tmp files and truncate log files. Log and tmp files grow steadily larger unless truncated. The speed at which these files grow varies from system to system and with the system state. Log files on a system that is working inefficiently or having namespace problems will grow very fast.


Note –

If you are doing a lot of troubleshooting, check your log and tmp files frequently. Truncate log files and remove tmp files before lack of disk space creates additional problems. Also check the root directory and home directories for core files and delete them.


The way to truncate log files is to regularly checkpoint your system. (Keep in mind that a checkpoint process may take some time and will slow down your system while it is being performed. Checkpointing also requires enough disk space to create a complete copy of the files before they are truncated.)

To checkpoint a system, run nisping -C.

Insufficient Processes

On a heavily loaded machine it is possible that you could reach the maximum number of simultaneous processes that the machine is configured to handle. This causes messages with clauses like “unable to fork”. The recommended method of handling this problem is to kill any unnecessary processes. If the problem persists, you can reconfigure the machine to handle more processes as described in your system administration documentation.
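Before killing processes, it is useful to know which users own the most of them. The following sketch counts processes per user from ps -ef output. The function name procs_per_user is hypothetical, and the sketch assumes the owner is the first column and that the first line is a header.

```shell
# procs_per_user: from `ps -ef` style output on standard input,
# print "COUNT USER" lines, largest count first.
# (Hypothetical helper; skips the header line and counts by first column.)
procs_per_user() {
    awk 'NR > 1 { count[$1]++ }
         END { for (user in count) print count[user], user }' | sort -rn
}

# Typical use on a live system:
#   ps -ef | procs_per_user | head -n 5
```

A user with an unexpectedly high count is a reasonable place to start looking for unnecessary processes.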

NIS+ User Problems

This section describes NIS+ problems that a typical user might encounter.

User Problem Symptoms

User Cannot Log In

There are many possible reasons for a user being unable to log in:

(See nsswitch.conf File Requirements for further details.)

User Cannot Log In Using New Password

Symptoms:

Users who recently changed their password are unable to log in at all, or are able to log in on some machines but not on others.

Possible Causes:

User Cannot Remote Log In to Remote Domain

Symptoms:

User tries to rlogin to a machine in some other domain and is refused with a “Permission denied” type error message.

Possible Cause:

To rlogin to a machine in another domain, a user must have LOCAL credentials in that domain.

Diagnosis:

Run nismatch username.domainname. cred.org_dir in the other domain to see if the user has a LOCAL credential in that domain.
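If this check needs to be scripted, the nismatch output can be filtered for a LOCAL entry. The sketch below assumes that each cred table entry is printed as colon-separated columns with the credential type in the second column. That layout, the function name has_local_cred, and the principal name in the usage comment are assumptions for illustration.

```shell
# has_local_cred: succeed if any colon-separated cred entry on standard
# input has LOCAL as its second column.
# (Hypothetical helper; assumes "principal:type:..." output from nismatch.)
has_local_cred() {
    awk -F: '$2 == "LOCAL" { found = 1 }
             END { exit found ? 0 : 1 }'
}

# Typical use in the other domain (principal name is an example only):
#   nismatch jdoe.sales.doc.com. cred.org_dir | has_local_cred \
#       || echo "no LOCAL credential in this domain"
```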

Solution:

Go to the remote domain and use nisaddcred to create a LOCAL credential for the user in that domain.

User Cannot Change Password

The most common cause of a user being unable to change passwords is that the user is mistyping (or has forgotten) the old password.

Other possible causes:

Other NIS+ Problems

This section describes problems that do not fit any of the previous categories.

How to Tell if NIS+ Is Running

You may need to know whether a given host is running NIS+. A script may also need to determine whether NIS+ is running.

You can assume that NIS+ is running if:

Replica Update Failure

Symptoms:

Error messages indicating that the update was not successfully completed. (Note that the message: replica_update: number updates number errors indicates a successful update.)

Possible Causes:

Any of the following error messages indicate that the server was busy and that the update should be rescheduled:

(These messages are generated by, or in conjunction with, the NIS+ error code constant: NIS_DUMPLATER one replica is already resyncing.)

These messages indicate that there was some other problem:

(If rpc.nisd is being run with the -C (open diagnostic channel) option, additional information may be entered in either the master server's or the replica server's system log.)

These messages indicate possible problems such as:

Diagnosis:

Check the system logs on both the replica and the master server for additional information. How much, if any, additional information is recorded in the system logs depends on your system's error reporting level, and whether or not you are running rpc.nisd with the -C option (diagnostics).

Solution:

In most cases, these messages indicate minor software problems which the system is capable of correcting. If the message was the result of a command, simply wait for a while and then try the command again. If these messages appear often, you can change the threshold level in your /etc/syslog.conf file. See the syslog.conf man page for details.