Solaris Naming Administration Guide

Appendix A Problems and Solutions

This appendix describes some of the problems you may encounter while administering Solaris operating environment namespaces and how to correct them.

Troubleshooting NIS+

In this appendix, problems are grouped according to type. For each problem there is a list of common symptoms, a description of the problem, and one or more suggested solutions.

In addition to this appendix, there is an appendix containing an alphabetic listing of the more common NIS+ error messages. If you are responding to a specific error message, check Appendix B, Error Messages, first. If the problem is simple, or specific to a single error message, its solution is usually described in Appendix B, Error Messages.

NIS+ Debugging Options

The NIS_OPTIONS environment variable can be set to control various NIS+ debugging options.

Options are specified in the value of the NIS_OPTIONS variable, separated by spaces, with the entire option set enclosed in double quotes. Each option has the format name=value. Values can be integers, character strings, or filenames, depending on the particular option. If no value is specified for an integer-valued option, the value defaults to 1.

NIS_OPTIONS recognizes the following options:

Table A-1 NIS_OPTIONS Options and Values

| Option | Values | Actions |
|---|---|---|
| debug_file | filename | Directs debug output to the specified file. If this option is not specified, debug output goes to stdout. |
| debug_bind | Number | Displays information about the server selection process. |
| debug_rpc | 1 or 2 | If the value is 1, displays RPC calls made to the NIS+ server and the RPC result code. If the value is 2, displays both the RPC calls and the contents of the RPC arguments and results. |
| debug_calls | Number | Displays calls to the NIS+ API and the results that are returned to the application. |
| pref_srvr | String | Specifies preferred servers in the same format as that generated by the nisprefadm command (see Table 15-1). This overrides the preferred server list specified in nis_cachemgr. |
| server | String | Binds to a particular server. |
| pref_type | String | Not currently implemented. |

For example (assuming that you are using the C shell):


setenv NIS_OPTIONS "debug_calls=2 debug_bind debug_rpc"

setenv NIS_OPTIONS "debug_calls debug_file=/tmp/CALLS"

setenv NIS_OPTIONS "debug_calls server=sirius"
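The option syntax can also be illustrated with a small Bourne-family shell sketch. The function name show_nis_options is hypothetical and is not part of NIS+; it only splits an option string and fills in the default value of 1 for options written without a value. It assumes that every option written without a value is an integer-valued option, as in the examples above.

```shell
# Hypothetical helper (not an NIS+ command): list the options in an
# NIS_OPTIONS-style string, one per line, in name=value form.
# Assumption: an option written without "=value" is integer-valued,
# so the documented default of 1 applies.
show_nis_options() {
    for opt in $1; do          # unquoted on purpose: split on spaces
        case "$opt" in
            *=*) echo "$opt" ;;
            *)   echo "$opt=1" ;;
        esac
    done
}

# Bourne/Korn shell equivalent of the first setenv example.
NIS_OPTIONS="debug_calls=2 debug_bind debug_rpc"
export NIS_OPTIONS

show_nis_options "$NIS_OPTIONS"
```

Run as written, the last line prints debug_calls=2, debug_bind=1, and debug_rpc=1 on separate lines, which makes the implied defaults visible before you start an NIS+ command with these settings.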

NIS+ Administration Problems

This section describes problems that may be encountered in the course of routine NIS+ namespace administration work. Common symptoms include:

Illegal Object Problems

Symptoms

There are a number of possible causes for this error message:

nisinit Fails

Make sure that:

Checkpoint Keeps Failing

If checkpoint operations with a nisping -C command consistently fail, make sure you have sufficient swap and disk space. Check for error messages in syslog. Check for core files filling up space.

Cannot Add User to a Group

A user must first be an NIS+ principal client with a LOCAL credential in the domain's cred table before the user can be added as a member of a group in that domain. A DES credential alone is not sufficient.

Logs Grow too Large

Failure to regularly checkpoint your system with nisping -C causes your log files to grow too large. Logs are not cleared on a master until all replicas for that master are updated. If a replica is down or otherwise out of service or unreachable, the master's logs for that replica cannot be cleared. Thus, if a replica is going to be down or out of service for a period of time, you should remove it as a replica from the master as described in "Removing a Directory". Keep in mind that you must first remove the directory's org_dir and groups_dir subdirectories before you remove the directory itself.

Lack of Disk Space

Lack of sufficient disk space will cause a variety of different error messages. (See "Insufficient Disk Space" for additional information.)

Cannot Truncate Transaction Log File

First, check to make sure that the file in question exists and is readable and that you have permission to write to it.

The most likely cause of inability to truncate an existing log file for which you have the proper permissions is lack of disk space. (The checkpoint process first creates a duplicate temporary file of the log before truncating the log and then removing the temporary file. If there is not enough disk space for the temporary file, the checkpoint process cannot proceed.) Check your available disk space and free up additional space if necessary.
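The disk-space check can be sketched as a small shell function. The name log_has_room is hypothetical. The sketch assumes a POSIX shell, a df that accepts the -P and -k options, and that the temporary copy is written to the same file system as the log itself; none of these assumptions comes from the NIS+ documentation.

```shell
# Hypothetical pre-check (not an NIS+ command): succeed only if the file
# system holding the log has more free space than the log's current size.
# Assumptions: the temporary copy lands on the same file system as the log,
# and df supports the POSIX -P (portable format) and -k (kilobytes) options.
log_has_room() {
    log=$1
    # Size of the log in kilobytes (first column of du -k output).
    size_kb=$(du -k "$log" | awk '{ print $1 }')
    # Available kilobytes: fourth column of the second line of df -Pk output.
    free_kb=$(df -Pk "$(dirname "$log")" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -gt "$size_kb" ]
}
```

For example, log_has_room /var/nis/trans.log || echo "free up disk space before checkpointing" warns before a checkpoint is attempted. The comparison is deliberately conservative: it requires room for a full copy of the log.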

Domain Name Confusion

Domain names play a key role in many NIS+ commands and operations. To avoid confusion, you must remember that, except for root servers, all NIS+ masters and replicas are clients of the domain above the domain that they serve. If you make the mistake of treating a server or replica as if it were a client of the domain that it serves, you may get Generic system error or Possible loop detected in namespace directoryname:domainname error messages.

For example, the machine altair might be a client of the subdoc.doc.com. domain. If the master server of the subdoc.doc.com. subdomain is the machine sirius, then sirius is a client of the doc.com. domain. When using, specifying, or changing domains, remember these rules to avoid confusion:

  1. Client machines belong to a given domain or subdomain.

  2. Servers and replicas that serve a given subdomain are clients of the domain above the domain they are serving.

  3. The only exception to Rule 2 is that the root master server and root replica servers are clients of the same domain that they serve. In other words, the root master and root replicas are all clients of the root domain.

Thus, in the example above, the fully qualified name of the altair machine is altair.subdoc.doc.com. The fully qualified name of the sirius machine is sirius.doc.com. The name sirius.subdoc.doc.com. is wrong and will cause an error because sirius is a client of doc.com., not subdoc.doc.com.
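The three rules can be expressed as a short shell sketch. The function server_home_domain is hypothetical and is not an NIS+ command; it assumes a POSIX shell and that you pass the root domain name explicitly, because the root domain cannot be recognized from a name alone.

```shell
# Hypothetical helper (not an NIS+ command): print the domain that a server
# is a client of, given the domain it serves ($1) and the root domain ($2).
# Both names are assumed to be fully qualified, with the trailing dot.
server_home_domain() {
    served=$1
    root=$2
    if [ "$served" = "$root" ]; then
        # Rule 3: root servers are clients of the root domain itself.
        echo "$root"
    else
        # Rule 2: strip the leftmost label to get the domain above.
        echo "${served#*.}"
    fi
}

server_home_domain subdoc.doc.com. doc.com.
```

The call on the last line prints doc.com., matching the sirius example: the master of subdoc.doc.com. is a client of doc.com.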

Cannot Delete org_dir or groups_dir

Always delete org_dir and groups_dir before deleting their parent directory. If you use nisrmdir to delete the domain before deleting the domain's groups_dir and org_dir, you will not be able to delete either of those two subdirectories.

Removal or Disassociation of NIS+ Directory from Replica Fails

When removing or disassociating a directory from a replica server you must first remove the directory's org_dir and groups_dir subdirectories before removing the directory itself. After each subdirectory is removed, you must run nisping on the parent directory of the directory you intend to remove. (See "Removing a Directory".)

If you fail to perform the nisping operation, the directory will not be completely removed or disassociated.

If this occurs, you need to perform the following steps to correct the problem:

  1. Remove /var/nis/rep/org_dir on the replica.

  2. Make sure that org_dir.domain does not appear in /var/nis/rep/serving_list on the replica.

  3. Perform a nisping on domain.

  4. From the master server, run nisrmdir -f replica_directory.

If the replica server you are trying to dissociate is down or out of communication, the nisrmdir -s command will return a Cannot remove replica name: attempt to remove a non-empty table error message.

In such cases, you can run nisrmdir -f -s replicaname on the master to force the dissociation. Note, however, that if you use nisrmdir -f -s to dissociate an out-of-communication replica, you must run nisrmdir -f -s again as soon as the replica is back on line in order to clean up the replica's /var/nis file system. If you fail to rerun nisrmdir -f -s replicaname when the replica is back in service, the old out-of-date information left on the replica could cause problems.

NIS+ Database Problems

This section covers problems related to the namespace database and tables. Common symptoms include error messages with operative clauses such as:

as well as when rpc.nisd fails.

See also "NIS+ Ownership and Permission Problems".

Multiple rpc.nisd Parent Processes

Symptoms:

Various database and transaction log corruption error messages containing the terms:

Possible Causes:

You have multiple independent rpc.nisd daemons running. In normal operation, rpc.nisd can spawn other child rpc.nisd daemons. This causes no problem. However, if two parent rpc.nisd daemons are running at the same time on the same machine, they will overwrite each other's data and corrupt logs and databases. (Normally, this could only occur if someone started running rpc.nisd by hand.)

Diagnosis:

Run ps -ef | grep rpc.nisd. Make sure that you have no more than one parent rpc.nisd process.
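Telling parent daemons from their children by eye can be error-prone, so the check can be sketched as a filter. The function count_nisd_parents is hypothetical. It assumes that your ps supports the -o option, so that the input has exactly three columns (process ID, parent process ID, command name), and it treats an rpc.nisd process as a parent when its own parent is not another rpc.nisd process.

```shell
# Hypothetical helper (not an NIS+ command): count rpc.nisd processes whose
# parent is not itself an rpc.nisd process.
# Input (stdin): lines of "PID PPID COMMAND", for example from
#   ps -e -o pid= -o ppid= -o comm=
# (assumes a ps that supports -o).
count_nisd_parents() {
    awk '
        { sub(/^.*\//, "", $3) }                 # drop any leading path
        $3 == "rpc.nisd" { parent_of[$1] = $2 }  # ignores rpc.nisd_resolv
        END {
            n = 0
            for (pid in parent_of)
                if (!(parent_of[pid] in parent_of)) n++
            print n
        }
    '
}
```

A typical invocation would be ps -e -o pid= -o ppid= -o comm= | count_nisd_parents. Any result greater than 1 indicates the multiple-parent condition described above.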

Solution:

If you have more than one parent rpc.nisd process, you must kill all but one of them. Use kill -9 process-id, then run the ps command again to make sure that the process has died.


Note -

If you started rpc.nisd with the -B option, you must also kill the rpc.nisd_resolv daemon.


If an NIS+ database is corrupt, you will also have to restore it from your most recent backup that contains an uncorrupted version of the database. You can then use the logs to update changes made to your namespace since the backup was recorded. However, if your logs are also corrupted, you will have to recreate by hand any namespace modifications made since the backup was taken.

rpc.nisd Fails

If an NIS+ table is too large, rpc.nisd may fail.

Diagnosis:

Use nisls to check your NIS+ table sizes. Tables larger than 7k may cause rpc.nisd to fail.

Solution:

Reduce the size of large NIS+ tables. Keep in mind that as a naming service NIS+ is designed to store references to objects, not the objects themselves.

NIS+ and NIS Compatibility Problems

This section describes problems having to do with NIS compatibility with NIS+ and earlier systems and the switch configuration file. Common symptoms include:

Error messages with operative clauses include:

User Cannot Log In After Password Change

Symptoms:

New users, or users who recently changed their password, are unable to log in at all, or are able to log in on one or more machines but not on others. The user may see error messages with operative clauses such as:

First Possible Cause:

Password was changed on NIS machine.

If a user or system administrator uses the passwd command to change a password on a Solaris operating environment machine running NIS in a domain served by NIS+ namespace servers, the user's password is changed only in that machine's /etc/passwd file. If the user then goes to some other machine on the network, the user's new password will not be recognized by that machine. The user will have to use the old password stored in the NIS+ passwd table.

Diagnosis:

Check to see if the user's old password is still valid on another NIS+ machine.

Solution:

Use passwd on a machine running NIS+ to change the user's password.

Second Possible Cause:

Password changes take time to propagate through the domain.

Diagnosis:

Namespace changes take a measurable amount of time to propagate through a domain and an entire system. This time might be as short as a few seconds or as long as many minutes, depending on the size of your domain and the number of replica servers.

Solution:

You can simply wait the normal amount of time for a change to propagate through your domain(s). Or you can use the nisping org_dir command to resynchronize your system.

nsswitch.conf File Fails to Perform Correctly

A modified (or newly installed) nsswitch.conf file fails to work properly.

Symptoms:

You install a new nsswitch.conf file or make changes to the existing file, but your system does not implement the changes.

Possible Cause:

Each time an nsswitch.conf file is installed or changed, you must reboot the machine for your changes to take effect. This is because nscd caches the nsswitch.conf file.

Solution:

Check your nsswitch.conf file against the information contained in the nsswitch.conf man page. Correct the file if necessary, and then reboot the machine.

NIS+ Object Not Found Problems

This section describes problems in which NIS+ was unable to find some object or principal. Common symptoms include:

Error messages with operative clauses such as:

Syntax or Spelling Error

The most likely cause of some NIS+ object not being found is that you mistyped or misspelled its name. Check the syntax and make sure that you are using the correct name.

Incorrect Path

A likely cause of an "object not found" problem is specifying an incorrect path. Make sure that the path you specified is correct. Also make sure that the NIS_PATH environment variable is set correctly.

Domain Levels Not Correctly Specified

Remember that all servers are clients of the domain above them, not the domain they serve. There are two exceptions to this rule:

Object Does Not Exist

The NIS+ object may not have been found because it does not exist, either because it has been erased or because it has not yet been created. Use nisls -l in the appropriate domain to check that the object exists.

Lagging or Out-of-Sync Replica

When you create or modify an NIS+ object, there is a time lag between the completion of your action and the arrival of the new updated information at a given replica. In ordinary operation, namespace information may be queried from a master or any of its replicas. A client automatically distributes queries among the various servers (master and replicas) to balance system load. This means that at any given moment you do not know which machine is supplying you with namespace information. If a command relating to a newly created or modified object is sent to a replica that has not yet received the updated information from the master, you will get an "object not found" type of error or the old out-of-date information. Similarly, a general command such as nisls may not list a newly created object if the system sends the nisls query to a replica that has not yet been updated.

You can use nisping to resync a lagging or out of sync replica server.

Alternatively, you can use the -M option with most NIS+ commands to specify that the command must obtain namespace information from the domain's master server. In this way you can be sure that you are obtaining and using the most up-to-date information. (However, you should use the -M option only when necessary because a main point of having and using replicas to serve the namespace is to distribute the load and thus increase network efficiency.)

Files Missing or Corrupt

One or more of the files in /var/nis/data directory has become corrupted or erased. Restore these files from your most recent backup.

Old /var/nis Filenames

In Solaris Release 2.4 and earlier, the /var/nis directory contained two files named hostname.dict and hostname.log. It also contained a subdirectory named /var/nis/hostname. Starting with Solaris Release 2.5, the two files are named data.dict and trans.log, respectively, and the subdirectory is named /var/nis/data.

Do not rename the /var/nis or /var/nis/data directories or any of the files in these directories that were created by nisinit or any of the other NIS+ setup procedures.

In Solaris Release 2.5, the contents of the files were also changed, and they are not backward compatible with Solaris Release 2.4 or earlier. Thus, if you rename either the directories or the files to match the Solaris Release 2.4 patterns, the files will not work with either the Solaris Release 2.4 or the Solaris Release 2.5 or later versions of rpc.nisd. Therefore, you should not rename either the directories or the files.

Blanks in Name

Symptoms:

Sometimes an object is there, sometimes it is not. Some NIS+ or UNIX commands report that an NIS+ object does not exist or cannot be found, while other NIS+ or UNIX commands do find that same object.

Diagnosis:

Use nisls to display the object's name. Look carefully at the object's name to see if the name actually begins with a blank space. (If you accidentally enter two spaces after the flag when creating NIS+ objects from the command line with NIS+ commands, some NIS+ commands will interpret the second space as the beginning of the object's name.)

Solution:

If an NIS+ object name begins with a blank space, you must either rename it without the space or remove it and then recreate it from scratch.

Cannot Use Automounter

Symptoms:

You cannot change to a directory on another host.

Possible Cause:

Under NIS+, automounter names must be renamed to meet NIS+ requirements. NIS+ cannot access /etc/auto* tables that contain a period in the name. For example, NIS+ cannot access a file named auto.direct.

Diagnosis:

Use nisls and niscat to determine if the automounter tables are properly constructed.

Solution:

Change the periods to underscores. For example, change auto.direct to auto_direct. (Be sure to change other maps that might reference these.)
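The renaming rule is mechanical, so it can be sketched as a one-line shell function. The name nisplus_map_name is hypothetical; the function only converts a map name (not a full path) and does not touch any files or tables.

```shell
# Hypothetical helper (not an NIS+ command): convert an automounter map
# name to the form NIS+ can access by replacing periods with underscores.
# Pass the bare map name, not a path such as /etc/auto.direct.
nisplus_map_name() {
    echo "$1" | tr . _
}

nisplus_map_name auto.direct
```

The call on the last line prints auto_direct. The same conversion needs to be applied to any references to the old names inside other maps.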

Links To or From Table Entries Do Not Work

You cannot use the nisln command (or any other command) to create links between entries in tables. NIS+ commands do not follow links at the entry level.

NIS+ Ownership and Permission Problems

This section describes problems related to user ownership and permissions. Common symptoms include:

Error messages with operative clauses such as:

Another Symptom:

No Permission

The most common permission problem is the simplest: you have not been granted permission to perform the task that you are trying to do. Use niscat -o on the object in question to determine what permissions you have. If you need additional permission, you, the owner of the object, or the system administrator can either change the permission requirements of the object (as described in Chapter 10, Administering NIS+ Access Rights) or add you to a group that does have the required permissions (as described in Chapter 12, Administering NIS+ Groups).

No Credentials

Without proper credentials for you and your machine, many operations will fail. Use nismatch on your home domain's cred table to make sure you have the right credentials. See "Corrupted Credentials" for more on credentials-related problems.

Server Running at Security Level 0

A server running at security level 0 does not create or maintain credentials for NIS+ principals.

If you try to use passwd on a server that is running at security level 0, you will get the error message: You name do not have secure RPC credentials in NIS+ domain domainname.

Security level 0 is only to be used by administrators for initial namespace setup and testing purposes. Level 0 should not be used in any environment where ordinary users are active.

User Login Same as Machine Name

A user cannot have the same login ID as a machine name. When a machine is given the same name as a user (or vice versa), the first principal can no longer perform operations requiring secure permissions because the second principal's key has overwritten the first principal's key in the cred table. In addition, the second principal now has whatever permissions were granted to the first principal.

For example, suppose a user with the login name of saladin is granted namespace read-only permissions. Then a machine named saladin is added to the domain. The user saladin will no longer be able to perform any namespace operations requiring any sort of permission, and the root user of the machine saladin will only have read-only permission in the namespace.

Symptoms:


Note -

When running nisclient or nisaddcred, if the message Changing Key is displayed rather than Adding Key, there is a duplicate user or host name already in existence in that domain.


Diagnosis:

Run nismatch to find the host and user in the hosts and passwd tables to see if there are identical host names and user names in the respective tables:


nismatch username passwd.org_dir

Then run nismatch on the domain's cred table to see what type of credentials are provided for the duplicate host or user name. If there are both LOCAL and DES credentials, the cred table entry is for the user; if there is only a DES credential, the entry is for the machine.
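The LOCAL-versus-DES rule can be sketched as a filter over cred table entries. The function classify_cred_entries is hypothetical. It assumes that the entries arrive as colon-separated lines with the credential type in the second column (cname:auth_type:auth_name:...); check the actual output on your system before relying on this layout.

```shell
# Hypothetical helper (not an NIS+ command): decide whether a set of cred
# table entries for one name belongs to a user or to a machine.
# Assumption: input lines are colon-separated with the credential type
# (LOCAL or DES) in the second column.
classify_cred_entries() {
    awk -F: '
        $2 == "LOCAL" { has_local = 1 }
        $2 == "DES"   { has_des = 1 }
        END {
            if (has_local && has_des) print "user"      # LOCAL and DES
            else if (has_des)         print "machine"   # DES only
            else                      print "unknown"
        }
    '
}
```

Feed the function the cred table entries for the duplicate name. A result of user means both credential types are present, and machine means only a DES credential was found.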

Solution:

Change the machine name. (It is better to change the machine name than to change the user name.) Then delete the machine's entry from the cred table and use nisclient to reinitialize the machine as an NIS+ client. (If you wish, you can use nistbladm to create an alias for the machine's old name in the hosts tables.) If necessary, replace the user's credentials in the cred table.

Bad Credentials

See "Corrupted Credentials".

NIS+ Security Problems

This section describes common password, credential, encryption, and other security-related problems.

Security Problem Symptoms

Error messages with operative clauses such as:

User or root unable to perform any namespace operations or tasks. (See also "NIS+ Ownership and Permission Problems".)

Login Incorrect Message

The most common cause of a "login incorrect" message is the user mistyping the password. Have the user try it again. Make sure the user knows the correct password and understands that passwords are case-sensitive and also that the letter "o" is not interchangeable with the numeral "0," nor is the letter "l" the same as the numeral "1."

Other possible causes of the "login incorrect" message are:

See Chapter 11, Administering Passwords for general information on passwords.

Password Locked, Expired, or Terminated

A common cause of a "Permission denied, password expired" type of message is that the user's password has passed its age limit or the user's password privileges have expired. See Chapter 11, Administering Passwords for general information on passwords.

Stale and Outdated Credential Information

Occasionally, you may find that even though you have created the proper credentials and assigned the proper access rights, some client requests still get denied. This may be due to out-of-date information residing somewhere in the namespace.

Storing and Updating Credential Information

Credential-related information, such as public keys, is stored in many locations throughout the namespace. NIS+ updates this information periodically, depending on the time-to-live values of the objects that store it, but sometimes, between updates, it gets out of sync. As a result, you may find that operations that should work, don't work. Table A-2 lists all the objects, tables, and files that store credential-related information and how to reset it.

Table A-2 Where Credential-Related Information is Stored

| Item | Stores | To Reset or Change |
|---|---|---|
| cred table | NIS+ principal's secret key and public key. These are the master copies of these keys. | Use nisaddcred to create new credentials; it updates existing credentials. An alternative is chkey. |
| Directory object | A copy of the public key of each server that supports it. | Run the /usr/lib/nis/nisupdkeys command on the directory object. |
| Keyserver | The secret key of the NIS+ principal that is currently logged in. | Run keylogin for a principal user or keylogin -r for a principal workstation. |
| NIS+ daemon | Copies of directory objects, which in turn contain copies of their servers' public keys. | Kill the daemon and the cache manager. Then restart both. |
| Directory cache | A copy of directory objects, which in turn contain copies of their servers' public keys. | Kill the NIS+ cache manager and restart it with the nis_cachemgr -i command. The -i option resets the directory cache from the cold-start file and restarts the cache manager. |
| Cold-start file | A copy of a directory object, which in turn contains copies of its servers' public keys. | On the root master, kill the NIS+ daemon and restart it. The daemon reloads new information into the existing NIS_COLD_START file. For a client, first remove the cold-start and shared directory files from /var/nis, and kill the cache manager. Then re-initialize the principal with nisinit -c. The principal's trusted server reloads new information into the principal's existing cold-start file. |
| passwd table | A user's password or a workstation's superuser password. | Use the passwd -r nisplus command. It changes the password in the NIS+ passwd table and updates it in the cred table. |
| passwd file | A user's password or a workstation's superuser password. | Use the passwd -r nisplus command, whether logged in as superuser or as yourself, whichever is appropriate. |
| passwd map (NIS) | A user's password or a workstation's superuser password. | Use passwd -r nisplus. |

Updating Stale Cached Keys

The most commonly encountered out-of-date information is the existence of stale objects with old versions of a server's public key. You can usually correct this problem by running nisupdkeys on the domain you are trying to access. (See Chapter 7, Administering NIS+ Credentials for information on using the nisupdkeys command.)

Because some keys are stored in files or caches, nisupdkeys cannot always correct the problem. At times you might need to update the keys manually. To do that, you must understand how a server's public key, once created, is propagated through namespace objects. The process usually has five stages of propagation. Each stage is described below.

Stage 1: Server's Public Key Is Generated

An NIS+ server is first an NIS+ client. So, its public key is generated in the same way as any other NIS+ client's public key: with the nisaddcred command. The public key is then stored in the cred table of the server's home domain, not of the domain the server will eventually support.

Stage 2: Public Key Is Propagated to Directory Objects

Once you have set up an NIS+ domain and an NIS+ server, you can associate the server with a domain. This association is performed by the nismkdir command. When the nismkdir command associates the server with the directory, it also copies the server's public key from the cred table to the domain's directory object. For example, assume the server is a client of the doc.com. root domain, and is made the master server of the sales.doc.com. domain.

Figure A-1 Public Key is Propagated to Directory Objects

The server's public key is copied from the cred.org_dir.doc.com. table and placed in the sales.doc.com. directory object. This can be verified with the niscat -o sales.doc.com. command.

Stage 3: Directory Objects Are Propagated Into Client Files

All NIS+ clients are initialized with the nisinit utility or with the nisclient script.

Among other things, nisinit (or nisclient) creates a cold-start file, /var/nis/NIS_COLD_START. The cold-start file is used to initialize the client's directory cache, /var/nis/NIS_SHARED_DIRCACHE. The cold-start file contains a copy of the directory object of the client's domain. Since the directory object already contains a copy of the server's public key, the key is now propagated into the cold-start file of the client.

In addition, when a client makes a request to a server outside its home domain, a copy of the remote domain's directory object is stored in the client's NIS_SHARED_DIRCACHE file. You can examine the contents of the client's cache by using the nisshowcache command.

This is the extent of the propagation until a replica is added to the domain or the server's key changes.

Stage 4: When a Replica is Added to the Domain

When a replica server is added to a domain, the nisping command (described in "The nisping Command") is used to download the NIS+ tables, including the cred table, to the new replica. Therefore, the original server's public key is now also stored in the replica server's cred table.

Stage 5: When the Server's Public Key Is Changed

If you decide to change DES credentials for the server (that is, for the root identity on the server), its public key will change. As a result, the public key stored for that server in the cred table will be different from those stored in:

Most of these locations will be updated automatically within a time ranging from a few minutes to 12 hours. To update the server's keys in these locations immediately, use the commands:

Table A-3 Updating a Server's Keys

| Location | Command | See |
|---|---|---|
| Cred table of replica servers (instead of using nisping, you can wait a few minutes until the table is updated automatically) | nisping | "The nisping Command" |
| Directory object of domain supported by server | nisupdkeys | "The nisupdkeys Command" |
| NIS_COLD_START file of clients | nisinit -c | "The nisinit Command" |
| NIS_SHARED_DIRCACHE file of clients | nis_cachemgr | "The nis_cachemgr Command" |


Note -

You must first kill the existing nis_cachemgr before restarting nis_cachemgr.


Corrupted Credentials

When a principal (user or machine) has a corrupt credential, that principal is unable to perform any namespace operations or tasks, not even nisls. This is because a corrupt credential provides no permissions at all, not even the permissions granted to the nobody class.

Symptoms:

User or root cannot perform any namespace tasks or operations. All namespace operations fail with a "permission denied" type of error message. The user or root cannot even perform a nisls operation.

Possible Cause:

Corrupted keys or a corrupt, out-of-date, or otherwise incorrect /etc/.rootkey file.

Diagnosis:

Use snoop to identify the bad credential.

Or, if the principal is listed, log in as the principal and try to run an NIS+ command on an object for which you are sure that the principal has proper authorization. For example, in most cases an object grants read authorization to the nobody class. Thus, the nisls object command should work for any principal listed in the cred table. If the command fails with a "permission denied" error, then the principal's credential is likely corrupted.

Solution

Keyserv Failure

The keyserv is unable to encrypt a session. There are several possible causes for this type of problem:

Possible Causes and Solutions:

Machine Previously Was an NIS+ Client

If this machine has been initialized before as an NIS+ client of the same domain, try keylogin -r and enter the root login password at the Secure RPC password prompt.

No Entry in the cred Table

To make sure that an NIS+ password for the principal (user or host) exists in the cred table, run the following command in the principal's home domain:


nisgrep -A cname=principal cred.org_dir.domainname

If you are running nisgrep from another domain, the domainname must be fully qualified.

Changed Domain Name

Do not change a domain name.

If you change the name of an existing domain you will create authentication problems because the fully qualified original domain name is embedded in objects throughout your network.

If you have already changed a domain name and are experiencing authentication problems, or error messages containing terms like "malformed" or "illegal" in relation to a domain name, change the domain name back to its original name. The recommended procedure for renaming your domains is to create a new domain with the new name, set up your machines as servers and clients of the new domain, make sure they are performing correctly, and then remove the old domain.

When Changing a Machine to a Different Domain

If this machine is an NIS+ client and you are trying to change it to a client of a different domain, remove the /etc/.rootkey file, and then rerun the nisclient script using the network password supplied by your network administrator or taken from the nispopulate script.

NIS+ and Login Passwords in /etc/passwd File

Your NIS+ password is stored in the NIS+ passwd table. Your user login password may be stored in NIS+ passwd table or in your /etc/passwd file. (Your user password and NIS+ password can be the same or different.) To change a password in an /etc/passwd file, you must run the passwd command with the nsswitch.conf file set to files or with the -r files flag.

The nsswitch.conf file specifies which password is used for which purpose. If the nsswitch.conf file is directing system queries to the wrong location, you will get password and permission errors.

Secure RPC Password and Login Passwords Are Different

When a principal's login password is different from his or her Secure RPC password, keylogin cannot decrypt the principal's private key at login time, because keylogin defaults to using the principal's login password and the private key was encrypted using the principal's Secure RPC password.

When this occurs the principal can log in to the system, but for NIS+ purposes is placed in the authorization class of nobody because the keyserver does not have a decrypted private key for that user. Since most NIS+ environments are set up to deny the nobody class create, destroy, and modify rights to most NIS+ objects, this results in "permission denied" type errors when the user tries to access NIS+ objects.


Note -

In this context network password is sometimes used as a synonym for Secure RPC password. When prompted for your "network password," enter your Secure RPC password.


To be placed in one of the other authorization classes, a user in this situation must explicitly run the keylogin program and give the principal's Secure RPC password when keylogin prompts for a password. (See "Keylogin".)

But an explicit keylogin provides only a temporary solution that is good only for the current login session. The keyserver now has a decrypted private key for the user, but the private key in the cred table is still encrypted using the user's Secure RPC password, which is different from the user's login password. The next time the user logs in, the same problem recurs. To permanently solve the problem, the user needs to change the private key in the cred table to one based on the user's login password rather than the user's Secure RPC password. To do this, the user needs to run the chkey program as described in "Changing Keys for an NIS+ Principal".

Thus, to permanently solve the problem of a Secure RPC password that is different from the login password, the user (or an administrator acting for the user) must perform the following steps:

  1. Log in using the login password.

  2. Run the keylogin program to temporarily get a decrypted private key stored in the keyserver and thus gain temporary NIS+ access privileges.

  3. Run chkey -p to permanently change the encrypted private key in the cred table to one based on the user's login password.

Preexisting /etc/.rootkey File

Symptoms:

Various "insufficient permission" and "permission denied" error messages.

Possible Cause:

An /etc/.rootkey file already existed when you set up or initialized a server or client. This could occur if NIS+ had been previously installed on the machine and the .rootkey file was not erased when NIS+ was removed or the machine returned to using NIS or /etc files.

Diagnosis:

Run ls -l on the /etc directory and nisls -l org_dir and compare the date of the /etc/.rootkey to the date of the cred table. If the /etc/.rootkey date is clearly earlier than that of the cred table, it may be a preexisting file.

Solution:

Run keylogin -r as root on the problem machine and then set up the machine as a client again.

Root Password Change Causes Problem

Symptoms:

You change the root password on a machine, and the change either fails to take effect or you are unable to log in as superuser.

Possible Cause:


Note -

For security reasons, you should not have User ID 0 listed in the passwd table.


You changed the root password, but root's key was not properly updated, either because you forgot to run chkey -p for root or because some other problem came up.

Solution:

Log in as a user with administration privileges (that is, a user who is a member of a group with administration privileges) and use passwd to restore the old password. Make sure that the old password works. Now use passwd to change root's password to the new one, and then run chkey -p.


Caution -

Once your NIS+ namespace is set up and running, you can change the root password on the root master machine. But do not change the root master keys, as these are embedded in all directory objects on all clients, replicas, and servers of subdomains. To avoid changing the root master keys, always use the -p option when running chkey as root.


NIS+ Performance and System Hang Problems

This section describes common slow performance and system hang problems.

Performance Problem Symptoms

Error messages with operative clauses such as:

Other common symptoms:

Checkpointing

Someone has issued an nisping or nisping -C command, or the rpc.nisd daemon is performing a checkpoint operation.


Caution -

Do not reboot! Do not issue any more nisping commands.


When performing a nisping or checkpoint, the server will be sluggish or may not immediately respond to other commands. Depending on the size of your namespace, these commands may take a noticeable amount of time to complete. Delays caused by checkpoint or ping commands are multiplied if you, or someone else, enter several such commands at one time. Do not reboot. This kind of problem will solve itself. Just wait until the server finishes performing the nisping or checkpoint command.

During a full master-replica resync, the involved replica server will be taken out of service until the resync is complete. Do not reboot--just wait.

Variable NIS_PATH

Make sure that your NIS_PATH variable is set to something clean and simple, for example, the default: org_dir.$:$. A complex NIS_PATH, particularly one that itself contains a variable, will slow your system and may cause some operations to fail. (See "NIS_PATH Environment Variable" for more information.)

Do not use nistbladm to set nondefault table paths. Nondefault table paths will slow performance.

Table Paths

Do not use table paths because they will slow performance.

Too Many Replicas

Too many replicas for a domain degrade system performance during replication. There should be no more than 10 replicas in a given domain or subdomain. If you have more than five replicas in a domain, try removing some of them to see if that improves performance.

Recursive Groups

A recursive group is a group that contains the name of some other group. While including other groups in a group reduces your work as system administrator, doing so slows down the system. You should not use recursive groups.

Large NIS+ Database Logs at Start-up

When rpc.nisd starts up it goes through each log. If the logs are long, this process could take a long time. If your logs are long, you may want to checkpoint them using nisping -C before starting rpc.nisd.

The Master rpc.nisd Daemon Died

Symptoms:

If you used the -M option to specify that your request be sent to the master server, and the rpc.nisd daemon has died on that machine, you will get a "server not responding" type error message and no updates will be permitted. (If you did not use the -M option, your request will be automatically routed to a functioning replica server.)

Possible Cause:

Using uppercase letters in the name of a home directory or host can sometimes cause rpc.nisd to die.

Diagnosis:

First make sure that the server itself is up and running. If it is, run ps -ef | grep rpc.nisd to see if the daemon is still running.

Solution:

If the daemon has died, restart it. If rpc.nisd frequently dies, contact your service provider.
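
The diagnosis step above can be scripted. One pitfall of piping ps through grep is that the grep process itself can appear in the listing and be mistaken for the daemon. The following is a minimal sketch, assuming a POSIX shell and awk; the function name daemon_listed is hypothetical and not part of NIS+. It reads a process listing on standard input and matches only on the command name in the last column, so it is intended for ps -e style output.

```shell
# Hypothetical helper: succeeds if the process listing on standard input
# (for example, the output of "ps -e") contains a line whose last field,
# with any leading path removed, is exactly the given command name.
daemon_listed() {
    awk -v name="$1" '
        {
            cmd = $NF
            sub(/.*\//, "", cmd)      # strip any leading directory path
            if (cmd == name) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Usage sketch:
#   if ps -e | daemon_listed rpc.nisd; then
#       echo "rpc.nisd is running"
#   else
#       echo "rpc.nisd is not running"
#   fi
```

Because the match is on the whole command name, a line for grep or for a differently named process does not count as the daemon.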

No nis_cachemgr

Symptoms:

It takes too long for a machine to locate namespace objects in other domains.

Possible Cause:

You do not have nis_cachemgr running.

Diagnosis:

Run ps -ef | grep nis_cachemgr to see if it is still running.

Solution:

Start nis_cachemgr on that machine.

Server Very Slow at Start-up After NIS+ Installation

Symptoms:

A server performs slowly and sluggishly after using the NIS+ scripts to install NIS+ on it.

Possible Cause:

You forgot to run nisping -C -a after running the nispopulate script.

Solution:

Run nisping -C -a to checkpoint the system as soon as you are able to do so.

niscat Returns: Server busy. Try Again

Symptoms:

You run niscat and get an error message indicating that the server is busy.

Possible Cause:

Diagnosis:

Run swap -s to check your server's swap space.

Solution:

You must have adequate swap and disk space to run NIS+. If necessary, increase your space.

NIS+ Queries Hang After Changing Host Name

Symptoms:

Setting the host name for an NIS+ server to be fully qualified is not recommended. If you do so, and NIS+ queries then just hang with no error messages, check the following possibilities:

Possible Cause:

Fully qualified host names must meet the following criteria:

Solution:

Kill the NIS+ processes that are hanging and then kill rpc.nisd on that host or server. Rename the host to match the two requirements listed above. Then reinitialize the server with nisinit. (If queries still hang after you are sure that the host is correctly named, check other problem possibilities in this section.)

NIS+ System Resource Problems

This section describes problems having to do with lack of system resources such as memory, disk space, and so forth.

Resource Problem Symptoms

Error messages with operative clauses such as:

Insufficient Memory

Lack of sufficient memory or swap space on the system you are working with will cause a wide variety of NIS+ problems and error messages. As a short-term, temporary solution, try to free additional memory by killing unneeded windows and processes. If necessary, exit your windowing system and work from the terminal command line. If you still get messages indicating inadequate memory, you will have to install additional swap space or memory, or switch to a different system that has enough swap space or memory.

Under some circumstances, applications and processes may develop memory leaks and grow too large. You can check the current size of an application or process by running:


ps -el

The sz (size) column shows the current memory size of each process. If necessary, compare the sizes with comparable processes and applications on a machine that is not having memory problems to see if any have grown too large.
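
When the listing is long, it can help to filter it. The following is a minimal sketch, assuming a POSIX shell and awk and assuming that the header line of the ps -el output contains a column labeled SZ and that no column to its left is blank. The function name large_processes and the threshold value are illustrative assumptions, not recommended settings.

```shell
# Hypothetical helper: reads "ps -el" output on standard input and prints
# the SZ value and command name of every process whose SZ value exceeds
# the given threshold. The SZ column is located from the header line
# rather than being hard-coded.
large_processes() {
    awk -v limit="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "SZ") szcol = i
            next
        }
        szcol && $szcol + 0 > limit + 0 { print $szcol, $NF }
    '
}

# Usage sketch (the threshold 1000 is an arbitrary example):
#   ps -el | large_processes 1000
```

Running the same filter on a machine that is not having memory problems gives a baseline for comparison.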

Insufficient Disk Space

Lack of disk space will cause a variety of error messages. A common cause of insufficient disk space is failure to regularly remove tmp files and truncate log files. Log and tmp files grow steadily larger unless truncated. The speed at which these files grow varies from system to system and with the system state. Log files on a system that is working inefficiently or having namespace problems will grow very fast.


Note -

If you are doing a lot of troubleshooting, check your log and tmp files frequently. Truncate log files and remove tmp files before lack of disk space creates additional problems. Also check the root directory and home directories for core files and delete them.


The way to truncate log files is to regularly checkpoint your system. (Keep in mind that a checkpoint process may take some time and will slow down your system while it is being performed. Checkpointing also requires enough disk space to create a complete copy of the files before they are truncated.)

To checkpoint a system, run nisping -C.

Insufficient Processes

On a heavily loaded machine it is possible that you could reach the maximum number of simultaneous processes that the machine is configured to handle. This causes messages with clauses like "unable to fork". The recommended method of handling this problem is to kill any unnecessary processes. If the problem persists, you can reconfigure the machine to handle more processes as described in your system administration documentation.

NIS+ User Problems

This section describes NIS+ problems that a typical user might encounter.

User Problem Symptoms

User Cannot Log In

There are many possible reasons for a user being unable to log in:

(See "nsswitch.conf File Requirements" for further details.)

User Cannot Log In Using New Password

Symptoms:

Users who recently changed their password are unable to log in at all, or are able to log in on some machines but not on others.

Possible Causes:

User Cannot Remote Log In to Remote Domain

Symptoms:

User tries to rlogin to a machine in some other domain and is refused with a "Permission denied" type error message.

Possible Cause:

To rlogin to a machine in another domain, a user must have LOCAL credentials in that domain.

Diagnosis:

Run nismatch username.domainname. cred.org_dir in the other domain to see if the user has a LOCAL credential in that domain.

Solution:

Go to the remote domain and use nisaddcred to create a LOCAL credential for the user in that domain.

User Cannot Change Password

The most common cause of a user being unable to change passwords is that the user is mistyping (or has forgotten) the old password.

Other possible causes:

Other NIS+ Problems

This section describes problems that do not fit any of the previous categories.

How to Tell if NIS+ Is Running

You may need to know whether a given host is running NIS+. A script may also need to determine whether NIS+ is running.

You can assume that NIS+ is running if:

Replica Update Failure

Symptoms:

Error messages indicating that the update was not successfully completed. (Note that the message: replica_update: number updates number errors indicates a successful update.)

Possible Causes:

Any of the following error messages indicate that the server was busy and that the update should be rescheduled:

(These messages are generated by, or in conjunction with, the NIS+ error code constant NIS_DUMPLATER: one replica is already resyncing.)

These messages indicate that there was some other problem:

If rpc.nisd is being run with the -C (open diagnostic channel) option, additional information may be entered in either the master server's or the replica server's system log.

These messages indicate possible problems such as:

Diagnosis:

Check both the replica's and the master server's system logs for additional information. How much, if any, additional information is recorded in the system logs depends on your system's error reporting level, and whether or not you are running rpc.nisd with the -C option (diagnostics).

Solution:

In most cases, these messages indicate minor software problems which the system is capable of correcting. If the message was the result of a command, simply wait for a while and then try the command again. If these messages appear often, you can change the threshold level in your /etc/syslog.conf file. See the syslog.conf man page for details.

NIS Problems and Solutions

This section explains how to resolve problems encountered on networks running NIS. It covers problems seen on an NIS client and those seen on an NIS server.

Before trying to debug an NIS server or client, review Chapter 18, Network Information Service (NIS), which explains the NIS environment. Then look for the subheading in this section that best describes your problem.

Symptoms:

Common symptoms of NIS binding problems include:

NIS Problems Affecting One Client

If only one or two clients are experiencing symptoms that indicate NIS binding difficulty, the problems probably are on those clients. If many NIS clients are failing to bind properly, the problem probably exists on one or more of the NIS servers (see "NIS Problems Affecting Many Clients").

ypbind Not Running on Client

One client has problems, but other clients on the same subnet are operating normally. On the problem client, run ls -l on a directory, such as /usr, that contains files owned by many users, including some not in the client /etc/passwd file. If the resulting display lists file owners who are not in the local /etc/passwd as numbers, rather than names, this indicates that NIS service is not working on the client.

These symptoms usually mean that the client ypbind process is not running. Run ps -e and check for ypbind. If you do not find it, log in as superuser and start ypbind by typing:


client# /usr/lib/netsvc/yp/ypstart

Missing or Incorrect Domain Name

One client has problems, the other clients are operating normally, but ypbind is running on the problem client. The client may have an incorrectly set domain.

On the client, run the domainname command to see which domain name is set.


Client#7 domainname
neverland.com

Compare the output with the actual domain name in /var/yp on the NIS master server. The actual NIS domain is shown as a subdirectory in the /var/yp directory.


Client#7 ls /var/yp
...
-rwxr-xr-x 1 root Makefile
drwxr-xr-x 2 root binding
drwx------ 2 root doc.com
...

If the domain name returned by running domainname on a machine is not the same as the server domain name listed as a directory in /var/yp, the domain name specified in the machine's /etc/defaultdomain file is incorrect. Log in as superuser and correct the client's domain name in the machine's /etc/defaultdomain file. This assures that the domain name is correct every time the machine boots. Now reboot the machine.


Note -

The domain name is case-sensitive.
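
The comparison described in this section can be sketched as a small check. This is a minimal example, assuming a POSIX shell with ls and grep; the function name nis_domain_served is hypothetical. It tests whether a directory with exactly the given name exists under a maps directory such as /var/yp, and the match is case-sensitive to reflect the note above.

```shell
# Hypothetical helper: succeeds only if a directory whose name matches the
# given domain name exactly (including case) exists under the given maps
# directory.
nis_domain_served() {
    domain=$1
    ypdir=$2
    [ -n "$domain" ] || return 1
    # Compare against the directory listing so that the match stays
    # case-sensitive even on a case-insensitive file system.
    ls "$ypdir" 2>/dev/null | grep -Fx -- "$domain" >/dev/null &&
        [ -d "$ypdir/$domain" ]
}

# Usage sketch, run on the NIS master server with the name that the
# domainname command reported on the client:
#   nis_domain_served neverland.com /var/yp ||
#       echo "no map directory for that domain; check /etc/defaultdomain on the client"
```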


Client Not Bound to Server

If your domain name is set correctly, ypbind is running, and commands still hang, then make sure that the client is bound to a server by running the ypwhich command. If you have just started ypbind, then run ypwhich several times (typically, the first one reports that the domain is not bound and the second succeeds normally).

No Server Available

If your domain name is set correctly, ypbind is running, and you get messages indicating that the client cannot communicate with a server, this may indicate a number of different problems:


Note -

For reasons of security and administrative control it is preferable to specify the servers a client is to bind to in the client's ypservers file rather than have the client search for servers through broadcasting. Broadcasting ties up the network, slows the client, and prevents you from balancing server load by listing different servers for different clients.


ypwhich Displays Are Inconsistent

When you use ypwhich several times on the same client, the resulting display varies because the NIS server changes. This is normal. The binding of the NIS client to the NIS server changes over time when the network or the NIS servers are busy. Whenever possible, the network stabilizes at a point where all clients get acceptable response time from the NIS servers. As long as your client machine gets NIS service, it does not matter where the service comes from. For example, an NIS server machine may get its own NIS services from another NIS server on the network.

When Server Binding is Not Possible

In extreme cases where local server binding is not possible, use of the ypset command may temporarily allow binding to another server, if available, on another network or subnet. However, in order to use the ypset command, ypbind must be started with either the -ypset or the -ypsetme option.


Note -

For security reasons, the use of the -ypset and -ypsetme options should be limited to debugging purposes under controlled circumstances. Use of the -ypset and -ypsetme options can result in serious security breaches because while they are operative anyone can then alter server bindings causing trouble for others and permitting unauthorized access to sensitive data. If you must start ypbind with these options, once you have fixed the problem you should kill ypbind and restart it again without those options.


ypbind Crashes

If ypbind crashes almost immediately each time it is started, look for a problem in some other part of the system. Check for the presence of the rpcbind daemon by typing:


% ps -ef | grep rpcbind

If rpcbind is not present or does not stay up or behaves strangely, consult your RPC documentation.

You may be able to communicate with rpcbind on the problematic client from a machine operating normally. From the functioning machine, type:


% rpcinfo client

If rpcbind on the problematic machine is fine, rpcinfo produces the following output:


program	version	netid	address	service	owner
...
100007	2	udp	0.0.0.0.2.219	ypbind	superuser
100007	1	udp	0.0.0.0.2.219	ypbind	superuser
100007	1	tcp	0.0.0.0.2.220	ypbind	superuser
100007	2	tcp	0.0.0.0.128.4	ypbind	superuser
100007	2	ticotsord	\000\000\020H	ypbind	superuser
100007	2	ticots	\000\000\020K	ypbind	superuser
...

Your machine will have different addresses. If they are not displayed, ypbind has been unable to register its services. Reboot the machine and run rpcinfo again. If the ypbind processes are there and they change each time you try to restart /usr/lib/netsvc/yp/ypbind, reboot the system, even if the rpcbind daemon is running.
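
To check the rpcinfo output without reading it line by line, the registrations for a program number can be counted. The following is a minimal sketch, assuming a POSIX shell and awk; the function name rpc_registrations is hypothetical. It relies only on the program number being the first column, as in the sample output above, where ypbind is program 100007.

```shell
# Hypothetical helper: reads rpcinfo output on standard input and prints
# the number of lines registered for the given RPC program number
# (the first column of the output).
rpc_registrations() {
    awk -v prog="$1" '$1 == prog { n++ } END { print n + 0 }'
}

# Usage sketch, run from a machine that is operating normally
# ("client" stands for the name of the problem machine):
#   rpcinfo client | rpc_registrations 100007
```

A count of 0 corresponds to the case described above in which ypbind has been unable to register its services.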

NIS Problems Affecting Many Clients

If only one or two clients are experiencing symptoms that indicate NIS binding difficulty, the problems probably are on those clients (see "NIS Problems Affecting One Client"). If many NIS clients are failing to bind properly, the problem probably exists on one or more of the NIS servers.

Network or Servers are Overloaded

NIS can hang if the network or NIS servers are so overloaded that ypserv cannot get a response back to the client ypbind process within the time-out period.

Under these circumstances, every client on the network experiences the same or similar problems. In most cases, the condition is temporary. The messages usually go away when the NIS server reboots and restarts ypserv, or when the load on the NIS servers or network itself decreases.

Server Malfunction

Make sure the servers are up and running. If you are not physically near the servers, use the ping command.

NIS Daemons Not Running

If the servers are up and running, try to find a client machine behaving normally, and run the ypwhich command. If ypwhich does not respond, kill it. Then log in as root on the NIS server and check if the NIS ypbind process is running by entering:


# ps -e | grep yp

Note -

Do not use the -f option with ps because this option attempts to translate user IDs to names which causes more name service lookups that may not succeed.


If either the ypbind or the ypserv daemon is not running, stop NIS and then restart it by entering:


# /usr/lib/netsvc/yp/ypstop
# /usr/lib/netsvc/yp/ypstart

If both the ypserv and ypbind processes are running on the NIS server, type:


# ypwhich

If ypwhich does not respond, ypserv has probably hung and should be restarted. While logged in as root on the server, kill ypserv and restart it by typing:


# /usr/lib/netsvc/yp/ypstop
# /usr/lib/netsvc/yp/ypstart

Servers Have Different Versions of an NIS Map

Because NIS propagates maps among servers, occasionally you may find different versions of the same map on various NIS servers on the network. This version discrepancy is normal and acceptable if the differences do not last for more than a short time.

The most common cause of map discrepancy is that something is preventing normal map propagation. For example, an NIS server or router between NIS servers is down. When all NIS servers and the routers between them are running, ypxfr should succeed.

If the servers and routers are functioning properly, check the following:

Logging ypxfr Output

If a particular slave server has problems updating maps, log in to that server and run ypxfr interactively. If ypxfr fails, it tells you why it failed, and you can fix the problem. If ypxfr succeeds, but you suspect it has occasionally failed, create a log file to enable logging of messages. To create a log file, enter:


ypslave# cd /var/yp
ypslave# touch ypxfr.log

This creates a ypxfr.log file that saves all output from ypxfr.

The output resembles the output ypxfr displays when run interactively, but each line in the log file is time stamped. (You may see unusual ordering in the time-stamps. That is okay--the time-stamp tells you when ypxfr started to run. If copies of ypxfr ran simultaneously but their work took differing amounts of time, they may actually write their summary status line to the log files in an order different from that which they were invoked.) Any pattern of intermittent failure shows up in the log.


Note -

When you have fixed the problem, turn off logging by removing the log file. If you forget to remove it, it continues to grow without limit.


Check the crontab File and ypxfr Shell Script

Inspect the root crontab file, and check the ypxfr shell script it invokes. Typographical errors in these files can cause propagation problems. Failures to refer to a shell script within the /var/spool/cron/crontabs/root file, or failures to refer to a map within any shell script can also cause errors.

Check the ypservers Map

Also, make sure that the NIS slave server is listed in the ypservers map on the master server for the domain. If it is not, the slave server still operates perfectly as a server, but yppush does not propagate map changes to the slave server.

Work Around

If the NIS slave server problem is not obvious, you can work around it while you debug by using rcp or ftp to copy a recent version of the inconsistent map from any healthy NIS server. For instance, here is how you might transfer the problem map:


 ypslave# rcp ypmaster:/var/yp/mydomain/map.\* /var/yp/mydomain

Here the * character has been escaped in the command line, so that it will be expanded on ypmaster, instead of locally on ypslave.

ypserv Crashes

When the ypserv process crashes almost immediately, and does not stay up even with repeated activations, the debug process is virtually identical to that described in "ypbind Crashes". Check for the existence of the rpcbind daemon as follows:


ypserver% ps -e | grep rpcbind

Reboot the server if you do not find the daemon. Otherwise, if the daemon is running, type the following and look for similar output:


% rpcinfo -p ypserver
program 	vers 	proto 	port 	service
100000	4	tcp	111	portmapper
100000	3	tcp	111	portmapper
100068	2	udp	32813	cmsd
...
100007	1	tcp	34900	ypbind
100004	2	udp	731	ypserv
100004	1	udp	731	ypserv
100004	1	tcp	732	ypserv
100004	2	tcp	32772	ypserv

Your machine may have different port numbers. The four entries representing the ypserv process are:


100004 	2 	udp 	731 	ypserv
100004 	1 	udp 	731 	ypserv
100004 	1 	tcp 	732 	ypserv
100004 	2 	tcp 	32772 	ypserv

If they are not present, and ypserv is unable to register its services with rpcbind, reboot the machine. If they are present, deregister the service from rpcbind before restarting ypserv. To deregister the service from rpcbind, on the server type:


# rpcinfo -d number 1
# rpcinfo -d number 2

Where number is the ID number reported by rpcinfo (100004, in the example above).
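
Because the service is registered under more than one version, one rpcinfo -d command is needed per version. The following is a minimal sketch, assuming a POSIX shell and awk; the function name deregister_commands is hypothetical. It reads rpcinfo -p output and prints one rpcinfo -d command for each distinct version of the given program number. It only prints the commands so that they can be reviewed before being run as superuser.

```shell
# Hypothetical helper: reads "rpcinfo -p" output on standard input and
# prints one "rpcinfo -d program version" command for each distinct
# version registered for the given program number. Nothing is executed.
deregister_commands() {
    awk -v prog="$1" '$1 == prog && !seen[$2]++ { print "rpcinfo -d", prog, $2 }'
}

# Usage sketch ("ypserver" stands for the name of the NIS server):
#   rpcinfo -p ypserver | deregister_commands 100004
```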

DNS Problems and Solutions

This section describes some common DNS problems and how to solve them.

Clients Can Find Machine by Name but Server Cannot

Symptoms:

DNS clients can find machines by either IP address or by host name, but the server can only find machines by their IP addresses.

Probable cause and solution:

This is most likely caused by omitting dns from the hosts line of the server's nsswitch.conf file. For example, a bad hosts line might look like this: hosts: files

When using DNS you must include dns in the hosts record of every machine's nsswitch.conf file. For example:


hosts: dns nisplus [NOTFOUND=return] files

or


hosts: nisplus dns [NOTFOUND=return] files
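
A quick way to test for this condition is to scan the hosts line for the dns keyword. The following is a minimal sketch, assuming a POSIX shell and awk; the function name hosts_line_has_dns is hypothetical. It takes the path of an nsswitch.conf file as its argument and ignores anything after a # comment character.

```shell
# Hypothetical helper: succeeds if the hosts entry in the given
# nsswitch.conf file lists dns as one of its sources.
hosts_line_has_dns() {
    awk '
        /^[ \t]*hosts:/ {
            sub(/#.*/, "")               # ignore trailing comments
            sub(/^[ \t]*hosts:/, "")     # keep only the list of sources
            for (i = 1; i <= NF; i++) if ($i == "dns") found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Usage sketch:
#   hosts_line_has_dns /etc/nsswitch.conf ||
#       echo "dns is missing from the hosts line"
```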

Changes Do Not Take Effect or Are Erratic

Symptom:

You add or delete machines or servers but your changes are not recognized or do not take effect. Or the changes are recognized in some instances but not in others.

Probable cause:

The most likely cause is that you forgot to increment the SOA serial number on the primary master server after you made your change. Since there is no new SOA number, your secondary servers do not update their data to match that of the primary so they are working with the old, unchanged data files.

Another possible cause is that the SOA serial number in one or more of the primary data files was set to a value lower than the corresponding serial number on your secondary servers. This could happen, for example, if you deleted a file on the primary and then recreated it from scratch using an input file of some sort.

A third possible cause is that you forgot to send a HUP signal to the primary server after making changes to the primary's data files.

Diagnosis and solution:

First, check the SOA serial numbers in the data file that you changed and the corresponding file on the secondary server.
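
The serial number comparison can be sketched as follows. This is a minimal example, assuming a POSIX shell and awk, and assuming that the SOA record uses the common multi-line layout in which the serial number is the first purely numeric value after the opening parenthesis. The function names soa_serial and serial_is_newer are hypothetical, the file names in the usage sketch are placeholders, and the comparison is plain integer comparison.

```shell
# Hypothetical helper: prints the serial number from the SOA record of the
# given data file. Assumes the serial is the first all-digit token that
# follows the "(" opening the SOA record. Comments (after ";") are ignored.
soa_serial() {
    awk '
        { sub(/;.*/, "") }
        /SOA/ { in_soa = 1 }
        in_soa {
            for (i = 1; i <= NF; i++) {
                tok = $i
                if (!seen_paren) {
                    p = index(tok, "(")
                    if (p == 0) continue
                    seen_paren = 1
                    tok = substr(tok, p + 1)
                }
                if (tok ~ /^[0-9]+$/) { print tok; exit }
            }
        }
    ' "$1"
}

# Hypothetical helper: succeeds if the first serial number is greater
# than the second.
serial_is_newer() {
    [ "$1" -gt "$2" ]
}

# Usage sketch (both file names are placeholders):
#   p=$(soa_serial primary.data.file)
#   s=$(soa_serial secondary.backup.file)
#   serial_is_newer "$p" "$s" || echo "primary serial $p is not newer than secondary serial $s"
```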

DNS Client Cannot Look Up "Short" Names

Symptoms:

Client can look up fully qualified names but not short names.

Possible cause and solution:

Check the client's /etc/resolv.conf file for spaces at the end of the domain name. No spaces or tabs are allowed at the end of the domain name.
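
Trailing spaces and tabs are hard to see in an editor. The following is a minimal sketch, assuming a POSIX shell and grep; the function name domain_line_trailing_space is hypothetical. It prints, with line numbers, any domain line in the given file that ends in a space or tab, and succeeds only if it finds one.

```shell
# Hypothetical helper: prints any "domain" line in the given file that ends
# with a space or tab, prefixed with its line number. The exit status is 0
# if such a line is found and nonzero otherwise.
domain_line_trailing_space() {
    grep -n '^domain[[:space:]].*[[:space:]]$' "$1"
}

# Usage sketch:
#   if domain_line_trailing_space /etc/resolv.conf; then
#       echo "remove the trailing spaces or tabs from the lines shown above"
#   fi
```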

Reverse Domain Data Not Correctly Transferred to Secondary

While zone domain-named data is properly transferred from the zone primary master server to a zone secondary server, the reverse domain data is not being transferred. In other words, the host.rev file on the secondary is not being properly updated from the primary.

Possible causes:

Syntax error in the secondary server's boot file.

Diagnosis and Solution:

Check the secondary server's boot file. Make sure that the primary server's IP address is listed for the reverse zone entries just as it is for the hosts data.

For example, the following boot file is incorrect because the primary server's IP address (129.146.168.119) is missing from the secondary in-addr.arpa record:


;
; /etc/named.boot file for dnssecondary
directory /var/named
secondary   doc.com   129.146.168.119        dnshosts.bakup
secondary   168.146.129.in-addr.arpa  doc.rev.bakup

This is what the correct file should look like:


;
; /etc/named.boot file for dnssecondary
directory /var/named
secondary   doc.com   129.146.168.119        dnshosts.bakup
secondary   168.146.129.in-addr.arpa   129.146.168.119  doc.rev.bakup
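
The difference between the two files is easy to miss by eye. The following is a minimal sketch, assuming a POSIX shell and awk, and assuming the boot file layout shown above, in which the primary server's IP address is the third field of each secondary line. The function name check_secondary_lines is hypothetical.

```shell
# Hypothetical helper: prints every "secondary" line in the given boot file
# whose third field is not a dotted-decimal IP address. The exit status is
# 0 if all secondary lines pass and 1 if any line is reported.
check_secondary_lines() {
    awk '
        { line = $0; sub(/;.*/, "") }    # keep the original, strip comments
        $1 == "secondary" && $3 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            print FILENAME ": line " NR ": " line
            bad = 1
        }
        END { exit bad ? 1 : 0 }
    ' "$1"
}

# Usage sketch:
#   check_secondary_lines /etc/named.boot ||
#       echo "add the primary server's IP address to the lines listed above"
```

Run against the incorrect file above, this sketch reports the in-addr.arpa line; run against the correct file, it reports nothing.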

Server Failed and Zone Expired Problems

When a secondary server cannot obtain updates from its master, it logs a master unreachable message. If the problem is not corrected, the secondary expires the zone and stops answering requests from clients. When that happens, users start seeing server failed messages.

Symptoms:

Note that if the problem lies with a secondary server, some users could still be successfully obtaining DNS information from the master and thus operating without experiencing any difficulty.

Possible causes:

The two most likely causes for these problems are network failure and a wrong IP address for the master in the secondary's boot file.

Diagnosis and solution:

Make sure that the IP address of the master matches the master's actual IP address and the address for the master specified in the hosts file. If the IP address is wrong, correct it, and then reboot the secondary.


% ping 129.146.168.119 -n 10

rlogin, rsh, and ftp Problems

Symptoms:

Possible causes:

Diagnosis and solution:

Check the appropriate hosts.rev file and make sure there is a PTR record for the user's machine. For example, if the user is working at the machine altair.doc.com with an IP address of 129.146.168.46, the doc.com primary master server's doc.rev file should have an entry like:


46 	IN	 PTR 	altair.doc.com.

If the record is missing, add it to the hosts.rev file and then reboot the server or reload its data as explained in "Forcing in.named to Reload DNS Data".
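
The PTR check can be sketched as follows. This is a minimal example, assuming a POSIX shell and awk, and assuming that the reverse data file uses relative owner names as in the entry above (46 for 129.146.168.46). The function names reverse_name and has_ptr_record are hypothetical.

```shell
# Hypothetical helper: prints the fully qualified in-addr.arpa name that
# corresponds to a dotted-decimal IPv4 address.
reverse_name() {
    printf '%s\n' "$1" |
        awk -F. 'NF == 4 { print $4 "." $3 "." $2 "." $1 ".in-addr.arpa." }'
}

# Hypothetical helper: succeeds if the given reverse data file contains a
# PTR record with the given owner name (first field) and target host name
# (last field). Comments (after ";") are ignored.
has_ptr_record() {
    awk -v owner="$2" -v target="$3" '
        { sub(/;.*/, "") }
        NF >= 3 && $1 == owner && $(NF - 1) == "PTR" && $NF == target { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Usage sketch (doc.rev is the reverse data file named in the text):
#   reverse_name 129.146.168.46
#   has_ptr_record doc.rev 46 altair.doc.com. || echo "PTR record is missing"
```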

Check and correct the NS entries in the hosts.rev files and then reboot the server or reload its data as explained in "Forcing in.named to Reload DNS Data".

Other DNS Syntax Errors

Symptoms:

Error messages in console or syslog with operative phrases like the following are most often caused by syntax errors in DNS data and boot files:

Check the relevant files for spelling and syntax errors.

A common syntax error is misuse of the trailing dot in domain names (either using the dot when you should not, or not using it when you should). See "Trailing Dots in Domain Names".

FNS Problems and Solutions

This section presents problem scenarios with a description of probable causes, diagnoses, and solutions.

See "FNS Error Messages" for general information about FNS error messages, and Appendix B, Error Messages.

Cannot Obtain Initial Context

Symptom:

You get the message Cannot obtain initial context.

Possible Cause:

This is caused by an installation problem.

Diagnosis:

Check that FNS has been installed properly by looking for the file /usr/lib/fn/fn_ctx_initial.so.

Solution:

Install the fn_ctx_initial.so library.
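The installation check can be scripted as shown in this minimal sketch. The function name check_fns_library is hypothetical, and the optional argument exists only so that the check can be pointed at another path.

```sh
# Usage: check_fns_library [PATH]
# Reports whether the FNS initial context library is present.
check_fns_library() {
    lib=${1:-/usr/lib/fn/fn_ctx_initial.so}
    if [ -f "$lib" ]; then
        echo "found: $lib"
    else
        echo "missing: $lib"
        return 1
    fi
}
```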

Nothing in Initial Context

Symptom:

When you run fnlist to see what is in the initial context, you see nothing.

Possible Cause:

This is caused by an NIS+ configuration problem. The organization associated with the user and machine running the fn* commands does not have an associated ctx_dir directory.

Diagnosis:

Use the nisls command to see whether there is a ctx_dir directory.

Solution:

If there is no ctx_dir directory, run fncreate -t org org/nis+_domain_name/ to create the ctx_dir directory.
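The nisls check can be wrapped in a small function, sketched below. The function name has_ctx_dir is hypothetical, and the sketch assumes that nisls prints a header line followed by one child name per line, with ctx_dir shown as a short name.

```sh
# Usage: has_ctx_dir DOMAIN      (for example: has_ctx_dir doc.com.)
# Succeeds if the nisls listing for DOMAIN includes a ctx_dir entry.
# Assumes one child name per output line.
has_ctx_dir() {
    nisls "$1" 2>/dev/null | grep -x 'ctx_dir' >/dev/null
}
```

If has_ctx_dir fails for the organization's domain, the context directory has not been created yet.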

"No Permission" Messages (FNS)

Symptom:

You get "no permission" messages.

Possible Cause:

"No permission" messages mean that you do not have access to perform the command.

Diagnosis:

Check permission using the appropriate NIS+ commands, described in "Advanced FNS and NIS+ Issues". Use the nisdefaults command to determine your NIS+ principal name.

Also check that you are using the right name. For example, org// names the context of the root organization, so make sure you have permission to manipulate the root organization. You might have meant to specify myorgunit/ instead.

Solution:

If you do have permission, then the appropriate credentials probably have not been acquired.

This could be caused by the following:

Check that the /etc/nsswitch.conf file has a publickey: nisplus entry. A missing entry might manifest itself as an authentication error.
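The nsswitch.conf check can be sketched as follows. The function name check_publickey is hypothetical. The sketch only tests that nisplus appears among the sources on the publickey line and does not validate the rest of the file.

```sh
# Usage: check_publickey [NSSWITCH_FILE]
# Succeeds if the file has a "publickey:" line that lists nisplus.
check_publickey() {
    conf=${1:-/etc/nsswitch.conf}
    awk '$1 == "publickey:" {
             seen = 1
             for (i = 2; i <= NF; i++) if ($i == "nisplus") ok = 1
         }
         END { exit !(seen && ok) }' "$conf"
}
```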

fnlist Does Not List Suborganizations

Symptom:

You run fnlist with an organization name, expecting to see suborganizations, but instead see nothing.

Possible Cause:

This is caused by an NIS+ configuration problem. Suborganizations must be NIS+ domains. By definition, an NIS+ domain must have a subdirectory named org_dir.

Diagnosis:

Use the nisls command to see what subdirectories exist. Run nisls on each subdirectory to verify which subdirectories have an org_dir. The subdirectories with an org_dir are suborganizations.

Solution:

Not applicable.
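The diagnosis above, running nisls on each subdirectory and looking for org_dir, can be sketched as a loop. The function name list_suborgs is hypothetical. The sketch assumes that nisls prints a header line followed by one short child name per line, so that a child's full name is the child name, a dot, and the parent name.

```sh
# Usage: list_suborgs DOMAIN     (for example: list_suborgs doc.com.)
# Prints each subdirectory of DOMAIN that itself contains an org_dir.
list_suborgs() {
    parent=$1
    # sed 1d drops the assumed "<domain>:" header line of the listing.
    nisls "$parent" 2>/dev/null | sed 1d | while read -r child; do
        case $child in
            ''|org_dir|groups_dir|ctx_dir) continue ;;
        esac
        if nisls "$child.$parent" </dev/null 2>/dev/null |
                grep -x 'org_dir' >/dev/null; then
            echo "$child.$parent"
        fi
    done
}
```

Subdirectories that are not printed have no org_dir, which is why fnlist does not show them as suborganizations.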

Cannot Create Host- or User-related Contexts

Symptom:

When you run fncreate -t for the user, username, host, or hostname contexts, nothing happens.

Possible Cause:

You have not set the NIS_GROUP environment variable. When you create a user or host context, it is owned by the host or user, and not by the administrator who set up the namespace. Therefore, fncreate requires that the NIS_GROUP variable be set so that the administrators who are part of that group can subsequently manipulate the contexts.

Diagnosis:

Check the NIS_GROUP environment variable.

Solution:

The NIS_GROUP environment variable should be set to the group name of the administrators who will administer the contexts.
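A script that creates host or user contexts can guard against an unset NIS_GROUP before it calls fncreate. The sketch below is illustrative only: the function name require_nis_group and the group name fns_admins.doc.com. are placeholders, not values from this guide.

```sh
# Usage: require_nis_group
# Succeeds only if NIS_GROUP is set and not empty.
require_nis_group() {
    if [ -z "${NIS_GROUP:-}" ]; then
        echo "NIS_GROUP is not set" >&2
        return 1
    fi
    echo "NIS_GROUP=$NIS_GROUP"
}

# Setting the variable in Bourne or Korn shell syntax
# (the group name is a placeholder):
#   NIS_GROUP=fns_admins.doc.com.; export NIS_GROUP
```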

Cannot Remove a Context You Created

Symptom:

When you run fndestroy on the host or user context, the context is not removed.

Possible Cause:

You do not own the host or user context. When you create a user or host context, it is owned by the host or user, and not by the administrator who set up the namespace.

Diagnosis:

Check the NIS_GROUP environment variable.

Solution:

The NIS_GROUP environment variable needs to be set to the group name of the administrator who will administer the contexts.

Name in Use with fnunbind

Symptom:

You get "name in use" when trying to remove bindings. The removal works for certain names but not for others.

Possible Cause:

You cannot unbind the name of a context. This restriction is in place to avoid leaving behind contexts that have no name ("orphaned contexts").

Diagnosis:

Run the fnlist command on the name to verify that it is a context.

Solution:

If the name is a context, use the fndestroy command to destroy the context.
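The choice between fnunbind and fndestroy can be sketched as a helper that prints the command to use instead of running it. The function name removal_command is hypothetical, and the sketch assumes that fnlist exits with a nonzero status when the name is not a context. Verify that behavior on your system before relying on it.

```sh
# Usage: removal_command NAME
# Prints the command suggested for removing NAME. Nothing is removed.
# Assumes fnlist returns nonzero when NAME is not a context.
removal_command() {
    if fnlist "$1" >/dev/null 2>&1; then
        echo "fndestroy $1"    # NAME is a context
    else
        echo "fnunbind $1"     # NAME is not a context, or fnlist failed
    fi
}
```

Because a failed fnlist is treated the same as a name that is not a context, review the printed command before running it.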

Name in Use with fnbind/fncreate -s

Symptom:

You use the -s option with fnbind or fncreate, but for certain names you get "name in use."

Possible Cause:

fnbind -s and fncreate -s overwrite the existing binding if it already exists. However, if the old binding is one that must be kept to avoid orphaned contexts, the binding cannot be removed and the operation fails with a "name in use" error.

Diagnosis:

Run the fnlist command on the name to verify that it is a context.

Solution:

Run the fndestroy command to remove the context before running fnbind or fncreate on the same name.

fndestroy/fnunbind Does Not Return Operation Failed

Symptom:

When you run fndestroy or fnunbind on certain names that you know do not exist, you receive no indication that the operation failed.

Possible Cause:

The operation did not fail. The semantics of fndestroy and fnunbind are that if the terminal name is not bound, the operation returns success.

Diagnosis:

Run the fnlookup command on the name. You should receive the message, "name not found."

Solution:

Not applicable.