CHAPTER 4 - Remote Mirror Software

This section describes some common errors that you may encounter when using Remote Mirror software.

Safeguarding the VTOC

Forgetting to Enable the Remote Mirror Set on the Secondary

If the secondary Remote Mirror set has not been enabled, the application gives the following error:

sndradm: warning: SNDR: Could not open file host:/dev/rdsk/xxxxx on remote node

Misentering the Remote Volume or Host Names

If the remote volume and host names do not match on the two hosts, both instances of SNDR start, but they cannot communicate with each other and replication cannot begin. The primary reports the same message as when the secondary has not been enabled, yet sndradm on the remote node appears to show the set as enabled. Only a careful comparison of the volume names on both hosts reveals the difference that explains the failure.
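Because the mismatch is easy to miss by eye, a field-by-field comparison of the two set definitions can help. The following is a minimal sketch, not part of the Availability Suite software: compare_sets is a hypothetical helper that takes the argument string given to sndradm -e on each host and prints every field that differs.

```shell
#!/bin/sh
# compare_sets: hypothetical helper, not an Availability Suite command.
#   $1 = arguments given to "sndradm -e" on the primary host
#   $2 = arguments given to "sndradm -e" on the secondary host
# Prints each differing field; exit status is 0 only when all fields match.
compare_sets() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { n = NF; for (i = 1; i <= NF; i++) a[i] = $i }
        NR == 2 {
            max = (n > NF) ? n : NF
            for (i = 1; i <= max; i++)
                if (a[i] != $i) {
                    printf "field %d differs: \"%s\" vs \"%s\"\n", i, a[i], $i
                    bad = 1
                }
        }
        END { exit bad + 0 }
    '
}
```

The helper assumes each definition is given on a single line; fields are numbered in the order they appear on the sndradm -e command line.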

Accessibility Issues

The most common class of user errors with the Remote Mirror software is accessibility problems in the specification of the primary host volume and bitmap, the secondary host volume and bitmap, or the primary and secondary host names, as configured with the sndradm utility. The best way to resolve these errors is with standard Solaris utilities, specifically format(1M), prtvtoc(1M), dd(1M), and telnet(1M).

sndradm -e hostA /dev/rdsk/c0t1d0s0 /dev/rdsk/c0t2d0s0 \
hostB /dev/rdsk/c0t1d0s0 /dev/rdsk/c0t2d0s0 ip sync

A failure of this command may be due to incorrect device specifications, incorrect partition sizing, failure to access the device from this Solaris node, or incorrect Solaris host names. Running the following seven commands on each host should be the first step towards resolving accessibility problems.

# telnet hostA

{login}

# format /dev/rdsk/c0t1d0s0

# format /dev/rdsk/c0t2d0s0

# prtvtoc /dev/rdsk/c0t1d0s0

# prtvtoc /dev/rdsk/c0t2d0s0

# dd if=/dev/rdsk/c0t1d0s0 of=/dev/null count=1

# dd if=/dev/rdsk/c0t2d0s0 of=/dev/null count=1

# dsbitmap -r /dev/rdsk/c0t1d0s0

# telnet hostB

{repeat sequence above}
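Because the same sequence is repeated on each host, it can be convenient to generate it from the volume names. The following is a minimal sketch: access_checks is a hypothetical helper that only prints the seven commands listed above for a given data volume and bitmap volume, so that they can be reviewed before being run as root on each host.

```shell
#!/bin/sh
# access_checks: hypothetical helper. It prints, but does not run, the
# accessibility checks from this section.
#   $1 = data volume, $2 = bitmap volume
access_checks() {
    vol=$1
    bmp=$2
    # format and prtvtoc are run against both volumes.
    for tool in format prtvtoc; do
        for dev in "$vol" "$bmp"; do
            printf '%s %s\n' "$tool" "$dev"
        done
    done
    # A one-block read confirms the device is readable from this node.
    for dev in "$vol" "$bmp"; do
        printf 'dd if=%s of=/dev/null count=1\n' "$dev"
    done
    # dsbitmap is run against the data volume only, as in the list above.
    printf 'dsbitmap -r %s\n' "$vol"
}
```

For example, access_checks /dev/rdsk/c0t1d0s0 /dev/rdsk/c0t2d0s0 prints the sequence shown above for hostA.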

Functionality Issues

The next class of user errors when using the Remote Mirror software is perceived functionality issues. The function of the Remote Mirror software is to continuously copy all the data from the primary host's volume to the secondary host's volume until either replication is stopped or the primary or secondary host is no longer available. The sndradm command shown first and the six-line script that follows it are essentially equivalent for setting up a Remote Mirror replication set, except that the script can take hours or days to complete, as it recopies already copied data.

# sndradm -e hostA /dev/rdsk/c0t1d0s0 /dev/rdsk/c0t2d0s0 \

hostB /dev/rdsk/c0t1d0s0 /dev/rdsk/c0t2d0s0 ip sync

If the replication functionality of the sndradm command above does not work as expected, run the following script against very small volumes to confirm that replication behaves as desired for the volumes and host names in your operating environment.

#!/bin/csh

# repeat:

# rsh hostA dd if=/dev/rdsk/c0t1d0s0 of=/tmp/hostA.tmp

# rsh hostA rcp /tmp/hostA.tmp hostB:/tmp/hostA.tmp

# rsh hostB dd of=/dev/rdsk/c0t1d0s0 if=/tmp/hostA.tmp

# goto repeat

Data Integrity Issues

When a Remote Mirror set is first enabled, the secondary volume may take hours or days to complete the initial synchronization, which is highly dependent on the volume size, network bandwidth and latency, and system resources on both the primary and secondary nodes. Review the Sun StorageTek Availability Suite 4.0 Remote Mirror Software Administration Guide for various methods which incorporate the use of sndradm -E for fast enable operations.

Once the initial full synchronization has completed, the Remote Mirror secondary volume is kept in write-order consistency, an operation which may lag the Remote Mirror primary volume. If at any time the replication process stops, logging mode is enabled, the network link goes down, or there is a system failure, a replicated I/O operation may have been in progress. This state may result in a Remote Mirror secondary volume data set that appears inconsistent, meaning that utilities like fsck(1M), database recovery tools, or similar software may have to make indeterminate decisions about the validity of an incomplete I/O operation. The means by which the Remote Mirror software keeps a primary and secondary replicated set in write-order consistency results in the same I/O consistency issues as a Solaris node "panicking" while I/O is in progress.

If you are manually placing a Remote Mirror primary volume in logging mode to use the secondary volume, it is highly recommended that the primary volume be quiesced and all cached data blocks flushed to disk, so that the Remote Mirror software finishes replicating a consistent volume to the secondary host.

Configuration

Set Status

Set status can be checked with the sndradm -P command. The percentage of the primary that needs to be transmitted to the secondary to complete a sync operation can be seen with the dsstat -m sndr command.

Files

The file /var/adm/ds.log contains a record of Availability Suite activity, including which remote replication sets have been enabled, resumed, and stopped by the sndradm and sndrboot utilities.

Volume Configuration

Raw Partition

The following command creates a Remote Mirror replicated set consisting of raw partitions, where the primary is /dev/rdsk/c7t0d0s6 and the bitmap is /dev/rdsk/c7t1d0s6. Note that the exact same command must be issued on both the primary and secondary hosts to complete a single Remote Mirror replicated set.

# sndradm -e hostA /dev/rdsk/c7t0d0s6 /dev/rdsk/c7t1d0s6 hostB \
/dev/rdsk/c7t0d0s6 /dev/rdsk/c7t1d0s6 ip async

Since this is an asynchronous replicated set, the Remote Mirror software keeps the sets in synchronization with a memory queue, allowing for a small, finite lag between primary and secondary hosts.

# dsbitmap -r /dev/rdsk/c7t0d0s6

Solaris Volume Manager

The following command creates a Remote Mirror replicated set consisting of SVM volumes, where the primary is /dev/md/rdsk/d1 and the bitmap is /dev/md/rdsk/d2.

# sndradm -E hostA /dev/md/rdsk/d1 /dev/md/rdsk/d2 hostB \
/dev/md/rdsk/d1 /dev/md/rdsk/d2 ip async

Since this is a synchronous replicated set with a -E (fast enable), there is an assumption that both the primary and secondary volumes are equal. If both the primary and secondary volumes are uninitialized, meaning that there is no file system, database, or application on the volumes, then both volumes are considered the same (uninitialized equals uninitialized). When the primary volume has a file system, database, or application data placed on it, the Remote Mirror software replicates these changes to the secondary, and, by virtue of replication, both volumes will be identical.

Another way to accomplish this step is to enable the primary node as shown above, but leave the SNDR set in logging mode and then enable a Point-in-Time Copy set using the primary volume as the master volume, thereby creating an instant copy of the set. The primary volume can then be used by the system, applications, or a file system. Take a backup of the shadow volume; when the backup is complete, the Point-in-Time Copy set on the primary can be disabled. The backup of the shadow volume can be delivered to the site of the Remote Mirror secondary and restored to disk as specified above. Then a fast enable (-E) must be done on the secondary. When the Remote Mirror set is placed in replicating mode, any changes made since the Point-in-Time Copy set was created are replicated to the secondary, vastly reducing the amount of data that needs to be replicated over the network.

VERITAS Volume Manager

The following commands create a Remote Mirror set consisting of VxVM volumes, where the primary master volume is /dev/vx/rdsk/sndr-dg/d21 and the bitmap volume is /dev/vx/rdsk/sndr-dg/d22.

# sndradm -e hostA /dev/vx/rdsk/sndr-dg/d21 \
/dev/vx/rdsk/sndr-dg/d22 hostB /dev/vx/rdsk/sndr-dg/d23 \
/dev/vx/rdsk/sndr-dg/d24 ip async

# sndradm -q a /dev/vx/rdsk/sndr-dg/d30 \
hostB:/dev/vx/rdsk/sndr-dg/d30

Since this is an asynchronous replicated set with an associated disk queue, the Remote Mirror software keeps the sets in synchronization with a disk queue rather than a memory queue, allowing for a much larger lag between primary and secondary hosts.

Performance Diagnosis

This section discusses how to diagnose performance issues for the Remote Mirror software.

Remote Mirror Set Variables

sync and async

Asynchronous modes give faster local write performance than synchronous. If you find that your performance suddenly changes, then there is likely to have been some event that moved the system into the other mode. Possible events include:

queue modes

autosync

When enabled (sndradm -a on set), the Remote Mirror rdcsyncd daemon automates update resynchronization after a network link or machine failure. If a Point-in-Time Copy set was added as an ndr_ii entry (see ndr_ii), the daemon creates a dependent shadow volume of the Remote Mirror secondary, to ensure that there is always a valid replica on the secondary site. While a full or update sync is in progress, the Remote Mirror software replicates changed blocks, starting from block 1 to the end of the volume. This replication is block-order, not write-order, so the volume is inconsistent until the synchronization operation completes. Having an ndr_ii Point-in-Time Copy on the secondary ensures that there is always a consistent, write-ordered volume on the secondary host.

max q writes

max q fbas

async Threads

The number of asynchronous threads affects how quickly the queue is sent across the network. More threads may lead to better network utilization.

Server Commands

dsstat

The dsstat -m sndr command shows basic statistics on the remote replication network and bitmap volumes. Other, more detailed statistics are available with the display option -d.

iostat

The iostat command can be used to monitor I/O rates to all Remote Mirror volumes on the local machine in a manner similar to the normal usage of iostat.

Network Commands

dsstat

ifconfig

Once you have determined that the rdc service is ready, you may want to check the integrity of the link. When configuring the Remote Mirror software, use the name associated with the IP address of the interface over which the Remote Mirror software will transfer data. This is true for entries added to the /etc/hosts file as well as when using sndradm commands to enable sets.

A simple test is to verify that you can telnet or rlogin through the interfaces the Remote Mirror software will use. You may also want to use the ifconfig command to make sure the interface is plumbed, up, and at the IP address you have configured in the /etc/hosts file. The names and IP addresses of the interfaces being used for the Remote Mirror software on both systems should be in each system's /etc/hosts file.

# ifconfig -a

ba0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 9180 index 1

        inet 10.9.9.1 netmask ffffff00 broadcast 10.9.9.255

        ether 8:0:20:af:8e:d0

lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 2

        inet 127.0.0.1 netmask ff000000

hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3

        inet 10.8.11.124 netmask ffffff00 broadcast 10.8.11.255

        ether 8:0:20:8d:f7:2c

lo0: flags=2000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6> mtu 8252 index 2

        inet6 ::1/128

hme0: flags=2000841<UP,RUNNING,MULTICAST,IPv6> mtu 1500 index 3

        ether 8:0:20:8d:f7:2c

        inet6 fe80::a00:20ff:fe8d:f72c/10
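The check described above (the interface is plumbed, up, and at the configured address) can be scripted against saved ifconfig -a output. The following is a minimal sketch: iface_for_ip is a hypothetical helper that reads the output on standard input and reports which interface carries a given IPv4 address and whether its flags include UP.

```shell
#!/bin/sh
# iface_for_ip: hypothetical helper, not a Solaris command.
#   stdin = output of "ifconfig -a"
#   $1    = IPv4 address expected for the Remote Mirror interface
# Prints "<interface> up" or "<interface> down"; exit status is non-zero
# when no interface carries the address.
iface_for_ip() {
    awk -v ip="$1" '
        /^[^ \t]+: flags=/ {
            name = $1
            sub(/:$/, "", name)
            up = ($2 ~ /[<,]UP[,>]/) ? "up" : "down"
        }
        $1 == "inet" && $2 == ip { print name, up; found = 1 }
        END { exit !found }
    '
}
```

With the output shown above, ifconfig -a | iface_for_ip 10.9.9.1 would report the ba0 interface as up.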

netstat

Network socket queue states can be monitored with netstat. The send and receive socket queues are displayed by the -a option's Swind, Send-Q, Rwind, and Recv-Q columns.

# netstat -a|grep rdc

*.rdc                *.*                0      0 65535      0 LISTEN

*.rdc                *.*                0      0 65535      0 LISTEN

*.rdc                *.*                0      0 65535      0 LISTEN
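To spot a backlog without reading every line, the queue columns can be filtered. The following is a minimal sketch that assumes the column order shown above (local address, remote address, Swind, Send-Q, Rwind, Recv-Q, state); rdc_backlog is a hypothetical helper that prints only the connections whose Send-Q or Recv-Q is non-zero.

```shell
#!/bin/sh
# rdc_backlog: hypothetical helper, not a Solaris command.
#   stdin = output of "netstat -a | grep rdc"
# Assumed columns: local remote Swind Send-Q Rwind Recv-Q state
rdc_backlog() {
    awk 'NF >= 7 && ($4 > 0 || $6 > 0) {
        printf "%s -> %s send-q=%d recv-q=%d\n", $1, $2, $4, $6
    }'
}
```

The listening sockets in the example above have empty queues, so netstat -a | grep rdc | rdc_backlog would print nothing for them.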

ping

The ping command can be used to check that the interfaces can communicate and whether IPv4 or IPv6 addressing is being used.

# ping -s second.atm

PING second.atm: 56 data bytes

64 bytes from second.atm (10.9.9.2): icmp_seq=0. time=1. ms

64 bytes from second.atm (10.9.9.2): icmp_seq=1. time=0. ms

64 bytes from second.atm (10.9.9.2): icmp_seq=2. time=0. ms

64 bytes from second.atm (10.9.9.2): icmp_seq=3. time=0. ms

In the above example, packets are successfully being sent and IPv4 addressing is being used. That is confirmed by the IP address "(10.9.9.2)", which is in IPv4 dotted-decimal form (four decimal values separated by periods); an IPv6 address would instead be written as hexadecimal groups separated by colons. The ping should be run in both directions (from primary to secondary and from secondary to primary) to ensure connectivity in both directions. This is also a good way to verify that both systems are using the same protocol, IPv4 or IPv6.
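When comparing the addresses reported on the two hosts, the address family can be read from the printed form of the address. The following is a minimal sketch: addr_family is a hypothetical helper that does a rough check on the text of an address and is not a full validator.

```shell
#!/bin/sh
# addr_family: hypothetical helper. Classifies the printed form of an
# address as ipv4, ipv6, or unknown. It checks shape only, not value ranges.
addr_family() {
    case $1 in
        *:*)            echo ipv6 ;;     # colon-separated hexadecimal groups
        ''|*[!0-9.]*)   echo unknown ;;  # empty, or a host name
        *.*.*.*.*)      echo unknown ;;  # too many dotted fields
        *.*.*.*)        echo ipv4 ;;     # four dotted decimal fields
        *)              echo unknown ;;
    esac
}
```

For example, addr_family 10.9.9.2 prints ipv4, and addr_family fe80::a00:20ff:fe8d:f72c prints ipv6. Both hosts should report the same family for the replication interface.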

rpcinfo

The rpcinfo utility can be used to check a path to the remote Remote Mirror services, either primary or secondary. Two commands are used to check the rdc service:

# rpcinfo -T tcp node1 100143 7

program 100143 version 7 ready and waiting

In the prior example, the rdc service is clearly ready. In the next example, the system was booted with an incorrect entry for "services" in the /etc/nsswitch.conf file and is not ready. In both examples, node1 is the system name. The commands should be run from all systems in the Remote Mirror config.

# rpcinfo -T tcp node1 100143 7

 rpcinfo: RPC: Program not registered
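Because the check should be run from every system in the configuration, it can help to reduce the rpcinfo response to a yes or no answer. The following is a minimal sketch: rdc_ready is a hypothetical helper that inspects the text returned by rpcinfo, using only the two responses shown above.

```shell
#!/bin/sh
# rdc_ready: hypothetical helper, not part of rpcinfo.
#   $1 = text printed by "rpcinfo -T tcp <host> 100143 <version>"
# Exit status is 0 when the service reports "ready and waiting".
rdc_ready() {
    case $1 in
        *"ready and waiting"*) return 0 ;;
        *)                     return 1 ;;
    esac
}
```

A typical use would be rdc_ready "$(rpcinfo -T tcp node1 100143 7 2>&1)" || echo "rdc not ready on node1", where node1 is the system name from the examples above.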

snoop

The snoop utility can be used to see if SNDR is actually sending and receiving data during a copy or update command.

# snoop -d hme0 port rdc

Using device /dev/hme (promiscuous mode)

 node2 -> node1 RPC C XID=3565514130 PROG=100143 (?) VERS=4 PROC=8

 node1 -> node2 RPC R (#1) XID=3565514130 Success

 node2 -> node1 TCP D=121 S=1018     Ack=1980057565 Seq=2524537885 Len=0 Win=33304 Options=<nop,nop,tstamp 1057486 843038>

 node2 -> node1 RPC C XID=3565514131 PROG=100143 (?) VERS=4 PROC=8

 node1 -> node2 RPC R (#4) XID=3565514131 Success

 node2 -> node1 TCP D=121 S=1018     Ack=1980057597 Seq=2524538025 Len=0 Win=33304 Options=<nop,nop,tstamp 1057586 843138>

 node2 -> node1 RPC C XID=3565514133 PROG=100143 (?) VERS=4 PROC=8

 node1 -> node2 RPC R (#7) XID=3565514133 Success

 node2 -> node1 TCP D=121 S=1018     Ack=1980057629 Seq=2524538165 Len=0 Win=33304 Options=<nop,nop,tstamp 1057686 843238>

 node2 -> node1 RPC C XID=3565514134 PROG=100143 (?) VERS=4 PROC=8

In the example above, the snoop utility is being run from the primary side of the Remote Mirror set. The interface being used is hme0 and the port to report on is the port used by rdc. The interface that is being used by the Remote Mirror software can be determined by relating the name used when enabling with the sndradm command to the IP address in the /etc/hosts file to the interface listed in the ifconfig -a output.
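For a long capture, counting RPC calls and replies is quicker than reading each line. The following is a minimal sketch: rpc_summary is a hypothetical helper that reads saved snoop output on standard input and counts the "RPC C" (call) and "RPC R" (reply) lines in the format shown above.

```shell
#!/bin/sh
# rpc_summary: hypothetical helper, not part of snoop.
#   stdin = saved output of "snoop -d <interface> port rdc"
# Counts RPC call lines ("RPC C") and RPC reply lines ("RPC R").
rpc_summary() {
    awk '
        /(^|[ \t])RPC C / { calls++ }
        /(^|[ \t])RPC R / { replies++ }
        END { printf "calls=%d replies=%d\n", calls, replies }
    '
}
```

Calls with no matching replies over a sustained period suggest that requests are leaving one host but responses are not returning.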

If you are using an ATM interface, a special snoop command called atmsnoop must be used:

# /etc/opt/SUNWconn/atm/bin/atmsnoop -d ba0 port rdc

device ba0

Using device /dev/ba (promiscuous mode)

TRANSMIT : VC=32

TCP D=121 S=1011 Syn Seq=2333980324 Len=0 Win=36560

________________________________________________________________

RECEIVE : VC=32

TCP D=1011 S=121 Syn Ack=2333980325 Seq=2878301021 Len=0 Win=36512

________________________________________________________________

TRANSMIT : VC=32

TCP D=121 S=1011     Ack=2878301022 Seq=2333980325 Len=0 Win=41076

________________________________________________________________

TRANSMIT : VC=32

RPC C XID=1930565346 PROG=100143 (?) VERS=4 PROC=11

________________________________________________________________

RECEIVE : VC=32

TCP D=1011 S=121     Ack=2333980449 Seq=2878301022 Len=0 Win=36450

________________________________________________________________

RECEIVE : VC=32

RPC R (#4) XID=1930565346 Success

________________________________________________________________

TRANSMIT : VC=32

TCP D=121 S=1011     Ack=2878301054 Seq=2333980449 Len=0 Win=41076

InfoDoc Summary

Following is a summary of the SunSolve InfoDocs written to address common customer issues for Remote Mirror software. If you believe you are experiencing one of these issues, contact your Sun Service Representative for a swift resolution.

InfoDoc ID	Issue
45485	SNDR `wait` command (`sndradm -w` or `rdcadm -w`) may return prematurely when run in a script
70015	Unable to grow a ufs filesystem under SNDR
71559	Cannot remove SVM, Veritas volumes, or DR LUNs under Availability Suite Software
73827	"SNDR: Recovery bitmaps not allocated"
77167	Booting either host causes entire sync in Remote Mirror or Point-in-Time Copy
80100	Warning Message: "bitmap reference count maxed out"
80732	Missing Remote Mirror Sets After a Host Boot
