Solstice Backup 5.1 Administration Guide

Backup and Recover

This section explains how to troubleshoot various problems you might encounter with backup and recover operations.

Checking the Backup Daemons

If you have trouble starting Backup, the daemons may not be running properly. To determine whether the required daemons are running, enter one of the following commands at the shell prompt:


# ps -aux | grep nsr

or:


# ps -ef | grep nsr

You should receive a response similar to the following:


12217 ?        S  0:09 /usr/sbin/nsrexecd -s jupiter
12221 ?        S  2:23 /usr/sbin/nsrd
12230 ?        S  0:00 /usr/sbin/nsrmmdbd
12231 ?        S  0:01 /usr/sbin/nsrindexd
12232 ?        S  0:00 /usr/sbin/nsrmmd -n 1
12234 ?        S  0:00 /usr/sbin/nsrmmd -n 2
12235 ?        S  0:00 /usr/sbin/nsrmmd -n 3
12236 ?        S  0:00 /usr/sbin/nsrmmd -n 4
12410 pts/8    S  0:00 grep nsr

If the response indicates that the daemons are not present, start the Backup daemons:


# /etc/init.d/networker start
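
The check above can be scripted. The following is a minimal sketch and not part of Backup: the function name missing_daemons is hypothetical, and the list of daemon names is an assumption taken from the sample response shown earlier. It reads a process listing on standard input and prints each expected daemon that does not appear in it.

```shell
# Hypothetical helper: print each expected daemon that is absent from
# the process listing read on standard input.
# Assumption: the daemon list mirrors the sample response in this section.
missing_daemons() {
    listing=$(cat)
    for d in nsrd nsrexecd nsrmmdbd nsrindexd nsrmmd; do
        # Match the name only as a whole command word, so that
        # "grep nsr" or nsrmmdbd is never mistaken for nsrmmd.
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[ /])$d( |\$)"; then
            echo "$d"
        fi
    done
}
```

On a live server this could be run as `ps -ef | missing_daemons`. No output means every daemon in the list was found.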

Backup of Clients Fails to Stop

During a backup, you attempt to stop the process by clicking Stop in the Group Control window. This should stop the process for all clients in the selected group, but sometimes a client is missed. You then see messages that indicate the server is still busy.

To resolve the problem, log in to each client in the group and check whether a save process is still running by using one of the following commands:


# ps -aux | grep save

or:


# ps -ef | grep save

This command returns a process identification number (pid) for each process associated with save. Enter the following command to stop the save process for each pid:


# kill -9 pid
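
When several save processes remain, picking the pids out by hand is error-prone. The sketch below is one possible way to collect them. It is not the method Backup itself uses, the function name save_pids is hypothetical, and it assumes a ps that accepts the POSIX -o option and that the process is named exactly save (so savefs and similar names are skipped).

```shell
# Hypothetical helper: read "PID COMMAND" lines on standard input and
# print the pid of every process whose command name is exactly "save".
save_pids() {
    awk '{
        n = split($2, parts, "/")      # drop any leading directory
        if (parts[n] == "save") print $1
    }'
}

# Possible use on a client (assumes POSIX ps -o):
#   for pid in $(ps -e -o pid= -o comm= | save_pids); do
#       kill -9 "$pid"
#   done
```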

No Notification of Client File Index Size Growth

Backup does not notify you when a client file index is getting too large. You should monitor the system regularly to check the size of client file indexes. See "Index Management" for information on how to manage the Backup client file indexes.
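
Because no notification exists, a periodic size check has to come from outside Backup. The following is a minimal sketch of such a check: the function name indexes_over_limit and the threshold in the usage comment are illustrative assumptions, and /nsr/index is the index location named elsewhere in this guide.

```shell
# Hypothetical helper: read "size-in-KB path" lines, as printed by
# du -sk, and print those whose size exceeds the limit given as $1.
indexes_over_limit() {
    awk -v limit="$1" '$1 + 0 > limit + 0 { print }'
}

# Possible use (the 512000 KB threshold is only an example):
#   du -sk /nsr/index/* | indexes_over_limit 512000
```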

Media Position Errors Encountered When Auto Media Verify Is Enabled

When you enable Auto Media Verify for a pool, Backup verifies the data written to volumes from the pool while saving. This is done by reading a record of data written to the media and comparing it to the original record. Media is verified after Backup finishes writing to the volume, which might occur when a volume becomes full or when Backup no longer needs the volume for saving data.

To verify media, nsrmmd must reposition the volume to read previously written data. It does not always succeed on the first attempt. When repositioning fails, warning messages similar to the following appear in the message display in the Backup administration program (nwadmin):


media warning: /dev/rmt2.1 moving: fsr 15: I/O error
media emergency: could not position jupiter.007 to file 44, record
16

No action is required. Backup continues to attempt to find the proper position. If Backup can find the correct position, media verification succeeds and a successful completion message appears.


media info: verification of volume "jupiter.007" volid 30052
succeeded.

In this case, ignore the earlier messages because they only indicate that Backup had problems finding the desired position on the media. If the problem is serious, media verification fails and a subsequent message gives the reason for the failure.
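
To decide which position warnings can be ignored, it helps to list the volumes that later verified successfully. The sketch below assumes the messages have been saved to a text file with each message on a single line (this guide wraps them for display). The function name verified_volumes and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: read saved Backup messages on standard input and
# print the name of each volume whose verification succeeded.
# Assumption: every message occupies one line.
verified_volumes() {
    sed -n 's/.*verification of volume "\([^"]*\)" volid [0-9][0-9]* succeeded.*/\1/p'
}

# Possible use (messages.txt is a hypothetical saved copy):
#   verified_volumes < messages.txt
```

Position warnings for any volume printed by this function only indicate a repositioning retry.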

PACKET RECEIVE BUFFER and NO ECB Counters Increase

When your server is waiting for a tape to be mounted or is in the process of changing an autochanger volume, you see the PACKET RECEIVE BUFFER and NO ECB counters increase on a NetWare client.

To resolve this problem, use the nsr_shutdown command to shut down the Backup server. Then, restart Backup manually. See "Checking the Backup Daemons" for commands to restart manually.

Backup Not Found in Expected Location for Solaris Client

On Solaris, Backup executables are installed by default in /usr/sbin/nsr. If you start a group backup on a Backup server that does not have /usr/sbin/nsr in the search path for root, the backup fails on a client that has its Backup executables in /usr/sbin/nsr. This is because the savefs command is not in the search path.

The best solution is to set the hidden Executable Path attribute for a client that has this problem. To set the Executable Path, display the Clients resource in details view and enter the path of the executables, /usr/sbin/nsr, in the Executable Path attribute.

Another solution is to modify the search path for root on the Backup server to include /usr/sbin/nsr even if it does not exist locally.
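
For the second solution, the directory should be added to the search path only once. The following is a minimal sketch for Bourne-compatible shells. The function name path_with_dir is hypothetical, and where root's search path is set on your server is left as an assumption.

```shell
# Hypothetical helper: print the colon-separated search path $1 with
# directory $2 appended, unless $2 is already one of its components.
path_with_dir() {
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
        return
    fi
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present
        *)        printf '%s\n' "$1:$2" ;;   # append at the end
    esac
}

# Possible use in root's shell startup file on the Backup server:
#   PATH=$(path_with_dir "$PATH" /usr/sbin/nsr); export PATH
```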

The scanner Program Marks a Volume Read-Only

When you use the scanner program to rebuild the index of a backup volume, the scanner program marks the volume read-only.

This is a safety feature that prevents the last save set on the backup volume from being overwritten. To make the volume writable again, clear the read-only setting with the nsrmm -o command:


# nsrmm -o notreadonly volume-name

Index Recovery to a Different Location Fails

If you attempt to recover indexes to a directory other than the one where they were originally located, you receive the following error message:


WARNING: The on-line index for `client-name' was NOT fully recovered.
There may have been a media error. You can retry the recover, or
attempt to recover another version of the `client-name' index. 

Do not attempt to recover the indexes to a different directory. After the indexes have been recovered to their original location, you can move them to another directory.

Because the indexes are holey (sparse) files, copying them with the UNIX cp command creates files that consume more disk space than the originals. To move the indexes, invoke the following command as root from within the /nsr/index directory:


# uasm -s -i client-index-directory-name | (cd target-directory; uasm -r)

Potential Cause for Client Alias Problems

If you encounter any of the following situations, a client alias problem might be the cause.

A client alias change is needed for the following situations:


Caution -

Do not put aliases that are shared by other hosts on this line.


Illegal Characters to Avoid in Configurations

When you upgrade from an earlier version of Backup, the names of label templates, directives, groups, policies, and schedules can no longer include the following special characters:

/\*?[]()$!^;'"`~><&|{}

This change was made because volume labels, directives, groups, policies, and schedules are often passed as command line options to various Backup programs.

During installation of Backup, these characters in your current configuration names are replaced with an underscore (_) in the resources where they were originally created.

However, in the Clients resource where these configurations are applied, Backup automatically replaces the selected configuration with the preexisting Default configuration.

You need to reselect the configurations whose names have changed and reapply them to the individual clients.
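
To predict what a configuration name looks like after the upgrade, the replacement rule described above can be reproduced in the shell. This is only an illustration of the rule, not the code the installer runs, and the function name sanitize_name is hypothetical.

```shell
# Hypothetical helper: print the name given as $1 with every character
# from the disallowed list replaced by an underscore.
sanitize_name() {
    # The first tr argument is the disallowed set; the single quote is
    # appended separately because it cannot appear inside '...'.
    # '[_*]' repeats the underscore to match the length of the set.
    printf '%s\n' "$1" | tr '/\\*?[](){}$!^;"`~><&|'"'" '[_*]'
}
```

For example, `sanitize_name 'weekly/full(a)'` prints `weekly_full_a_`.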

The scanner Program Requests an Entry for Record Size

If you use the scanner program with the -s option but without an -i or -m option, you might receive the following message:


please enter record size for this volume ('q' to quit) [xx] 

The number in brackets, [xx], is the entry from the last query.

The scanner command always rewinds the tape and reads the volume label to determine the block size. If the volume label is corrupted or unreadable, you see a message prompting you to enter the block size (in kilobytes).

Type in the block size; it must be an integer equal to or greater than 32. If you enter an integer that is less than 32, you receive the following message:


illegal record size (must be an integer >=32)
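
The rule for an acceptable entry can be stated as a small check. This sketch only mirrors the rule given above (an integer equal to or greater than 32). It is not taken from the scanner program, and the function name valid_record_size is hypothetical.

```shell
# Hypothetical helper: succeed only if $1 is an integer of at least 32.
valid_record_size() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 32 ]
}
```

With this definition, `valid_record_size 64` succeeds, while `valid_record_size 16` and `valid_record_size abc` both fail.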

Failed Recover Operation Directly after New Installation

If you attempt to start the nwrecover program immediately after installing Backup for the first time on your system, you receive the error message nwrecover: Program not found.

To save disk space, Backup delays the creation of the client index until the first backup is completed. The nwrecover program cannot recover data until the client index has entries for browsing. To avoid the problem, back up the client at least once before you start nwrecover.

Recovering Files from an Interrupted Backup

If you terminate a backup by killing the Backup daemons, you cannot recover the files because the media database is not updated when the daemons die. Consequently, Backup does not know which volumes the requested files reside on.

Backup of a New Client Defaults to a Level Full

The first time you back up a new client, you receive the following message:


mars:/usr no cycles found in media db; doing full save.

In this example, the /usr filesystem on the mars client has no full saves listed in the media database. Therefore, regardless of the backup level selected for the client's schedule, Backup performs a full backup. This feature is important because it enables you to perform disaster recoveries for the client.

You may also receive this message if the server and client clocks are not synchronized. To avoid this, make sure that the Backup server and the client are in the same time zone and have their clocks synchronized.

Renamed Clients Cannot Recover Old Backups

Backup maintains a client file index for every client it backs up. If you change the name of the client, the index for that client is not associated with the client's new name and you cannot recover files backed up under the old client name. To recover previous backup data for a newly renamed client, see "How To Recover Previous Backup Data Under a New Client Name".

How To Recover Previous Backup Data Under a New Client Name

  1. Delete the Client resource configured for the old client name.

  2. Create a new Client resource for the new client name.

  3. Shut down the Backup daemons.


    # nsr_shutdown
    
  4. Delete the index directory that was automatically created for the new client.

    If you skip this step, the mv command in the next step moves the old client index directory inside the existing new client index directory instead of renaming it.

  5. Use the mv command to rename the old client's file index directory.


    # mv /nsr/index/old-client /nsr/index/new-client
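
Steps 4 and 5 can be combined so that the delete and the rename always happen together. The following is a minimal sketch under stated assumptions: the function name rename_client_index is hypothetical, it must run as root, and it does not run nsr_shutdown itself, so the daemons must already be stopped as in step 3.

```shell
# Hypothetical helper: under index root $1, replace the index directory
# of the new client ($3) with the index directory of the old client ($2).
# Assumption: the Backup daemons are already shut down.
rename_client_index() {
    root=$1 old=$2 new=$3
    # Refuse empty arguments so that rm can never be aimed at the root.
    if [ -z "$root" ] || [ -z "$old" ] || [ -z "$new" ]; then
        echo "usage: rename_client_index root old-client new-client" >&2
        return 2
    fi
    if [ ! -d "$root/$old" ]; then
        echo "no index directory for $old under $root" >&2
        return 1
    fi
    # Remove the automatically created directory first; otherwise mv
    # would place the old index inside it instead of renaming.
    rm -rf "$root/$new" && mv "$root/$old" "$root/$new"
}

# Possible use:
#   rename_client_index /nsr/index old-client new-client
```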
    

Disk Label Errors

If you receive the error message "No disk label," you may have incorrectly configured a nonoptical device as an optical device within Backup. Verify that the Media Type attribute in the Devices resource matches the expected media for your device, and make corrections if needed.

Errors from Unsupported Media in HP Tape Drives

Certain Hewlett-Packard tape drives can only read 4 mm tapes of a specific length. Some, for example, read only 60-meter tapes and do not support 90- or 120-meter tapes. To determine the type of tape supported by your HP drive, consult the hardware manual provided with the drive.

If you attempt to use unsupported media in an HP tape drive, you may encounter the following types of error messages:


nsrmm: error, label write, No more processes (5)

scanner: error, tape label read, No more processes (11)
scanning for valid records ...
read: 0 bytes
read: 0 bytes
read: 0 bytes

Cannot Print Bootstrap Information

If your server bootstraps are not printed, you may need to enter your printer's name as a hidden attribute in the Groups resource. Access the hidden attributes by selecting Details from the View menu in the graphical administration program (nwadmin) or by selecting the Hidden choice from the Options menu in the cursor-based administration program (nsradmin).

Enter the name of the printer where you want the bootstraps to be printed in the Printer attribute of the Groups resource.

Server Index Saved

If your Backup server belongs to a group that is not enabled or does not belong to any group, Backup automatically saves the server's bootstrap information with each group that is backed up. If this is the case, you receive the following message in the savegroup completion report:


jupiter: index Saving server index because server is not in an
active group

This is a safety measure to help avoid a long recovery process in the event of a system disaster. You should, as soon as possible, configure the Client resource for the server to include it in an active backup group.

Copy Violation

If you installed Backup on more than one server and used the same Backup enabler code for all of them, you receive messages similar to the following in your savegroup completion mail:


--- Unsuccessful Save Sets ---
* mars:/var save: error, copy violation - servers `jupiter' and `pluto'
have the same software enabler code, `a1b2c3d4f5g6h7j8' (13)
* mars:/var save: cannot start a save for /var with NSR server `jupiter'
* mars:index save: cannot start a save for /usr/nsr/index/mars with NSR
server `jupiter'
* mars:index save: cannot start a save for bootstrap with NSR server
`jupiter'
* mars:index save: bootstrap save of server's index and volume
databases failed

To successfully rerun the backup, you must issue the nsr_shutdown command on each server, remove the Backup software from the extra servers, and then restart the Backup daemons on the server where you want the backups to go.

Xview Errors

If you receive the following error message when you attempt to start the graphical administration interface with nwadmin from a client machine, it means that the client is not authorized to display Backup:


Xlib: connection to "mars:0.0" refused by server
Xlib: Client is not authorized to connect to Server
Xview error: Cannot open display on window server: mars:0.0 (Server
package)

To correct the situation, perform the following steps:

  1. From the client machine, invoke the xhost command:


    # xhost server-name
    
  2. Remotely log in to the Backup server and issue the setenv command at the shell prompt.


    # setenv DISPLAY client-name:0.0
    

    For command shells other than /usr/bin/csh, enter:


    # DISPLAY=client-name:0.0 
    # export DISPLAY