B Redundant Electronics Overview

The optional redundant electronics (RE) feature provides failover protection for the library controller. If the library controller or drive controller experiences errors, operations can switch to the standby controller. The library controller and drive controller installed on the same side of the card cage are always switched as a pair.

RE allows an Oracle support representative to replace a faulty card while the library is online, and it minimizes disruption during firmware upgrades.

Note:

Any reference to the HBC card also refers to the HBCR card.

Requirements for Redundant Electronics

  • Two library controller (HBC) cards

  • Two drive controller (HBT) cards

    Note:

    To enable ADI mode, both cards must be high-memory HBTs.

    If you use Media Validation, Oracle recommends that both HBT cards be high-memory.

  • Minimum SL3000 firmware version FRS_3.0 and SLC version 5.00

  • Hardware activation file (see "Activating Optional Features")

  • HLI host with TCP/IP or FC-SCSI host using ACSLS (see "Updating HLI Host Management Software for RE"). RE is not available to hosts using a native FC-SCSI interface.

Redundant Electronics Configuration Examples

Each library controller card requires its own unique IP address. If the dual TCP/IP feature is active on the library, each card requires two unique IP addresses: one for the primary port 2B and one for the secondary port 2A. Therefore, a library with RE and dual TCP/IP requires four unique IP addresses.

On each controller card, ports 2B and 2A must be on different broadcast domains. However, port 2B on the active card and port 2B on the standby card can be on the same broadcast domain. The same is true for the two 2A ports.
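
The addressing rules above can be checked with a short script before the library is cabled. The following is a minimal sketch, not an Oracle tool: the function name `validate_re_address_plan`, the card labels, and the example addresses are hypothetical. It also treats "different IP subnet" as a stand-in for "different broadcast domain", which holds only if your network maps one subnet to each broadcast domain.

```python
import ipaddress


def validate_re_address_plan(plan):
    """Check a planned RE + dual TCP/IP address layout.

    plan maps a card label to {"2B": "address/prefix", "2A": "address/prefix"}.
    Returns a list of problem descriptions; an empty list means none were found.
    """
    problems = []
    seen = {}  # IP address -> label of the port that first used it

    for card, ports in plan.items():
        parsed = {port: ipaddress.ip_interface(addr) for port, addr in ports.items()}

        # Every port on every card needs its own unique IP address.
        for port, iface in parsed.items():
            label = f"{card} port {port}"
            if iface.ip in seen:
                problems.append(f"{label} reuses {iface.ip} from {seen[iface.ip]}")
            else:
                seen[iface.ip] = label

        # Assumption: overlapping subnets imply the same broadcast domain.
        if "2A" in parsed and "2B" in parsed:
            if parsed["2A"].network.overlaps(parsed["2B"].network):
                problems.append(f"{card}: ports 2A and 2B are on the same subnet")

    return problems


# Hypothetical plan: four unique addresses, two subnets.
plan = {
    "card_A": {"2B": "192.0.2.10/24", "2A": "198.51.100.10/24"},
    "card_B": {"2B": "192.0.2.11/24", "2A": "198.51.100.11/24"},
}
print(validate_re_address_plan(plan))  # prints []
```

Port 2B on both cards shares a subnet in this plan, which the rules allow; only the two ports on the same card must differ.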

Figure B-1 Redundant Electronics Configuration Examples


For more information about dual TCP/IP, see "Dual TCP/IP Overview".

What Occurs During a Failover

In a failover, the active library controller attempts to complete all in-process jobs and copies the cartridge database to the standby controller card. If the database cannot be copied (usually only in a sudden failure), you must perform an audit after failover (see "Auditing the Library"). The library returns any in-transit cartridges to their home slots. The library places any cartridges it cannot return to a home slot in a system slot for host recovery (see the host software documentation).

After all in-process jobs complete or time out, the cards switch roles. The standby controller becomes active and the previously active controller becomes the standby. If the previously active controller cannot bring up the standby software, the controller enters a fault state.
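
The role switch described above can be modeled as a small state change. This is only an illustrative sketch of the documented sequence, not the library firmware's logic: the `Controller` class, the `fail_over` function, and the slot and volume names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    name: str
    role: str  # "active", "standby", or "fault"
    database: dict = field(default_factory=dict)  # cartridge database


def fail_over(active, standby, database_copy_ok=True, standby_software_ok=True):
    """Switch roles and return True if an audit is needed afterward."""
    audit_required = True
    if database_copy_ok:
        # The active controller copies the cartridge database to the standby.
        standby.database = dict(active.database)
        audit_required = False

    # The cards switch roles after in-process jobs complete or time out.
    standby.role = "active"
    # The old active card becomes standby, or faults if it cannot
    # bring up the standby software.
    active.role = "standby" if standby_software_ok else "fault"
    return audit_required


card_a = Controller("A", "active", {"slot_1": "VOL001"})
card_b = Controller("B", "standby")
needs_audit = fail_over(card_a, card_b)
print(card_a.role, card_b.role, needs_audit)  # prints: standby active False
```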

Effect of a Failover on Users

  • Users of tape management software (Symantec or Virtual Storage Manager) do not see an interruption.

  • HLI host applications queue requests during the failover process for completion after the failover switch. For ACSLS, only mount and dismount requests are affected (see the host software documentation).

  • SLC and CLI connections are terminated. You must re-establish connections to the library using the IP address or DNS alias of the new active library controller (the former standby controller).

Factors that Prevent an RE Switch

  • The standby library or drive controller is in a fault or eject state.

  • The standby code is not running on the standby library or drive controller cards.

  • A firmware download or card initialization is in progress.
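
The conditions in this list can be expressed as a pre-check that reports why a switch would be refused. This is a sketch under stated assumptions: the function name, its parameters, and the state strings are hypothetical and do not correspond to any SLC or CLI interface.

```python
def re_switch_blockers(standby_library_state, standby_drive_state,
                       standby_code_running, download_or_init_in_progress):
    """Return the reasons an RE switch cannot proceed (empty list if none)."""
    blockers = []

    # A standby card in a fault or eject state cannot take over.
    for name, state in (("library", standby_library_state),
                        ("drive", standby_drive_state)):
        if state in ("fault", "eject"):
            blockers.append(f"standby {name} controller is in {state} state")

    if not standby_code_running:
        blockers.append("standby code is not running on the standby cards")

    if download_or_init_in_progress:
        blockers.append("firmware download or card initialization in progress")

    return blockers


print(re_switch_blockers("ok", "ok", True, False))  # prints []
```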

Factors that Initiate an Automatic Failover

An automatic failover can be initiated by either the active or standby library controller.

The active library controller initiates an automatic failover when:

  • Its partner drive controller card is not installed or is not communicating.

  • It detects a catastrophic internal software error.

The standby library controller initiates an automatic failover if the active controller is not functioning normally.
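
As a rough illustration of the triggers above, the following sketch returns which controller would initiate an automatic failover. The function and parameter names are hypothetical, and the order in which the conditions are checked is an assumption; the source does not say which controller acts first when several conditions hold.

```python
def automatic_failover_initiator(partner_drive_installed,
                                 partner_drive_communicating,
                                 active_internal_error,
                                 active_functioning_normally):
    """Return "active", "standby", or None if no failover is triggered."""
    # Conditions detected by the active library controller itself.
    if (not partner_drive_installed
            or not partner_drive_communicating
            or active_internal_error):
        return "active"

    # Condition detected by the standby library controller.
    if not active_functioning_normally:
        return "standby"

    return None


print(automatic_failover_initiator(True, False, False, True))  # prints active
```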

Ways to Initiate a Manual Failover

Note:

Before initiating a manual switch, you should verify that the standby library and drive controllers are running normally (see "Viewing Device Status and Properties").

  • Host tape management (ACSLS or ELS) — Failover can be initiated from either the active or standby library controller. The standby library controller accepts only set host path group and force switchover HLI requests.

  • SLC — Failover is initiated from the active library controller only (see "Initiating a Manual RE Switch Using SLC").

  • CLI — Failover can be initiated from either the active or standby library controller. This function is only available to Oracle support representatives.

You may want to perform a manual switch after initial installation of the standby cards, after a firmware upgrade, or periodically to check that the failover function is working properly. It is not possible to manually switch the library controllers without the drive controllers — the controllers are always switched as a pair.

Firmware Upgrades when Using RE

Firmware upgrades for libraries with RE are minimally disruptive to library operations. You can load and unpack new code simultaneously on the active and standby controller cards and on all devices. The code is then activated, and the active and standby controllers and most devices are re-initialized. Under most circumstances, robot initialization is bypassed.

The loading, unpacking, and activation of code are not disruptive to library operations until the library is rebooted. During the reboot process (which takes approximately 10 minutes), the HLI host applications (ACSLS and ELS) queue all mount and dismount requests. After the reboot is complete, the queued requests are submitted to the library controller.
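
The queuing behavior during the reboot can be pictured with a simple first-in, first-out model. This is a conceptual sketch only, not how ACSLS or ELS is implemented: the `HostRequestQueue` class, its method names, and the volume names are hypothetical.

```python
from collections import deque


class HostRequestQueue:
    """Model of a host holding mount and dismount requests during a reboot."""

    def __init__(self):
        self.library_available = True
        self._pending = deque()  # requests held while the library reboots
        self.submitted = []      # requests sent to the library controller

    def request(self, kind, volume):
        if kind not in ("mount", "dismount"):
            raise ValueError("this sketch models only mount and dismount")
        if self.library_available:
            self.submitted.append((kind, volume))
        else:
            self._pending.append((kind, volume))

    def reboot_started(self):
        self.library_available = False

    def reboot_finished(self):
        # Submit queued requests in the order they arrived.
        self.library_available = True
        while self._pending:
            self.submitted.append(self._pending.popleft())


queue = HostRequestQueue()
queue.reboot_started()
queue.request("mount", "VOL001")
queue.request("dismount", "VOL002")
queue.reboot_finished()
print(queue.submitted)  # prints [('mount', 'VOL001'), ('dismount', 'VOL002')]
```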

See "Upgrading Library Firmware" for firmware download and activation information.