Sun Cluster Geographic Edition 3.1 8/05 Release Notes

Hitachi TrueCopy CCI Commands and Hitachi TrueCopy Resources Report that Remote horcmd Is Not Alive Even When It Is Alive and Responding (6297384)

Problem Summary: When a cluster node has two or more network addresses for communication, the IP_address field in the /etc/horcm.conf file must be set to NONE. This requirement applies whether the network addresses are on different subnets or belong to the same subnet.

If the IP_address field is not set to NONE, Hitachi TrueCopy commands could respond unpredictably with the timeout error ENORMT, even though the remote process horcmd is alive and responding.

Workaround: Update the SUNW.GeoCtlTC resource time out values if the default Hitachi TrueCopy time out value has been changed in the /etc/horcm.conf file. The default Hitachi TrueCopy time out value in /etc/horcm.conf is 3000 in units of 10 ms, which is 30 seconds.

The SUNW.GeoCtlTC resources that have been created by the Sun Cluster Geographic Edition environment also have the default time out set at 3000(10ms).

If the default Hitachi TrueCopy time out value has been changed in /etc/horcm.conf, the resource time out values must be updated according to the algorithm described below. Do not change the default time out values for /etc/horcm.conf and the Hitachi TrueCopy resources unless the situation requires it.

The following equations establish an upper limit on the time it takes for a Hitachi TrueCopy command to time out based on various factors:


Note –

Units appear in seconds in the following equation.


For example, if horctimeout is set to 30, numhosts is set to 2, and numretries is set to 2, then Upper-limit-on-timeout is 120.
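The arithmetic in this example can be sketched in shell. The equation itself does not appear in this text, so the sketch assumes that Upper-limit-on-timeout is the product of horctimeout, numhosts, and numretries. That assumption reproduces the example result of 120 but is not confirmed here. The function names are hypothetical.

```shell
#!/bin/sh
# Convert a horcm.conf time out (units of 10 ms) to seconds.
horcm_units_to_seconds() {
    echo $(( $1 / 100 ))
}

# ASSUMPTION: the upper limit is the product of the three factors.
# This matches the worked example (30, 2, 2 -> 120) but the exact
# equation is not shown in these notes.
# $1 = horctimeout in seconds, $2 = numhosts, $3 = numretries
upper_limit() {
    echo $(( $1 * $2 * $3 ))
}

# Default horcm.conf value of 3000 (10 ms) with 2 hosts and 2 retries.
upper_limit "$(horcm_units_to_seconds 3000)" 2 2   # prints 120
```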

Based on the value of Upper-limit-on-timeout, set the following resource time out values. Add a buffer of at least 60 seconds to allow for the processing of other commands.


Validate_timeout = Upper-limit-on-timeout + 60
Update_timeout = Upper-limit-on-timeout + 60
Monitor_Check_timeout = Upper-limit-on-timeout + 60
Probe_timeout = Upper-limit-on-timeout + 60
Retry_Interval = (Probe_timeout + Thorough_probe_interval) + 60

The other time out parameters in the resource should retain their default values.
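As a rough illustration, the resource time out values listed above can be derived with a small shell function. The function name is hypothetical, and the Thorough_probe_interval value of 60 used in the call is only an example input, not a documented default.

```shell
#!/bin/sh
# Print the resource time out values for a given upper limit.
# $1 = Upper-limit-on-timeout in seconds
# $2 = Thorough_probe_interval in seconds (example input, assumed)
resource_timeouts() {
    upper=$1
    thorough=$2
    buffer=60                       # minimum buffer from the notes
    t=$(( upper + buffer ))
    echo "Validate_timeout=$t"
    echo "Update_timeout=$t"
    echo "Monitor_Check_timeout=$t"
    echo "Probe_timeout=$t"
    # Retry_Interval = (Probe_timeout + Thorough_probe_interval) + 60
    echo "Retry_Interval=$(( t + thorough + buffer ))"
}

resource_timeouts 120 60
```

With an upper limit of 120 and a Thorough_probe_interval of 60, the four time out properties come to 180 and Retry_Interval comes to 300.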

To change the time out values, complete the following steps:

  1. Bring the resource group offline by using the scswitch command.

  2. Update the required timeout properties by using the scrgadm command.

  3. Bring the resource group online by using the scswitch command.
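The three steps above might look like the following dry-run sketch, which only prints the commands. The resource group name tc-rg, the resource name tc-rs, and the values 180 and 300 are hypothetical. The sketch assumes the method time outs and Retry_Interval are standard properties (set with -y) and that Probe_timeout is an extension property (set with -x). Confirm how the SUNW.GeoCtlTC resource type declares each property before running anything.

```shell
#!/bin/sh
# Dry run by default: RUN=echo prints each command instead of running it.
# Set RUN to an empty string to execute the commands for real.
RUN=${RUN-echo}

# $1 = resource group, $2 = resource, $3 = time out, $4 = retry interval
change_timeouts() {
    rg=$1; rs=$2; timeout=$3; retry=$4

    # Step 1: bring the resource group offline.
    $RUN scswitch -F -g "$rg"

    # Step 2: update the time out properties, one property per call.
    # ASSUMPTION: these are standard properties (-y).
    for prop in Validate_timeout Update_timeout Monitor_Check_timeout; do
        $RUN scrgadm -c -j "$rs" -y "$prop=$timeout"
    done
    # ASSUMPTION: Probe_timeout is an extension property (-x).
    $RUN scrgadm -c -j "$rs" -x "Probe_timeout=$timeout"
    $RUN scrgadm -c -j "$rs" -y "Retry_Interval=$retry"

    # Step 3: bring the resource group back online.
    $RUN scswitch -Z -g "$rg"
}

change_timeouts tc-rg tc-rs 180 300
```

Printing the commands first makes it possible to review the property names and values before the resource group is taken offline.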