6 Backup and Recovery

Oracle Enterprise Manager Ops Center has several capabilities that can be used to recover data and resume functions if the Enterprise Controller system or a Proxy Controller system fails.

The satadm command can be used to back up the data on the Enterprise Controller. The backup file can then be used to restore the Enterprise Controller on the same system or on a system of the same type.

If a Proxy Controller must be reinstalled, you can back up its data and restore it after reinstallation.

If you set up a High Availability configuration during the installation and configuration process, you can fail over to the secondary Enterprise Controller if the primary Enterprise Controller fails.

About Backup and Recovery

Oracle Enterprise Manager Ops Center has several tools that can be used for disaster recovery. These tools let you preserve Oracle Enterprise Manager Ops Center data and functionality if the Enterprise Controller or Proxy Controller systems fail.

Backup and Restore Commands

You can back up the Enterprise Controller using the satadm backup command. This creates a backup file that contains all of the Oracle Enterprise Manager Ops Center information stored by the Enterprise Controller, including asset data, administration data, and job history. Oracle Enterprise Manager Ops Center data that takes up large amounts of space, such as operating system images, is not included.

If the Enterprise Controller system fails, you can use the satadm restore command and the backup file to restore the Enterprise Controller to its previous state. If you use a new system, it must have the same specifications, operating system, version of Oracle Enterprise Manager Ops Center, and IP address as the old system. The new Enterprise Controller system must have Oracle Enterprise Manager Ops Center installed but not configured. The satadm restore command uses the backup file to configure the Enterprise Controller and restore the data.

Restoring Proxy Controller Data

Proxy Controllers store a variety of data, including data about the Agent Controllers that they manage. If you uninstall and reinstall a Proxy Controller without backing up and restoring this data, the Proxy Controller will not be able to communicate with the Agent Controllers, and the Agent Controllers' systems will become unmanageable.

You can back up the directories containing the Agent Controller data and restore them when you reinstall a Proxy Controller on the same system.

High Availability Failover

High Availability is a setup involving multiple Enterprise Controllers using shared storage. The primary Enterprise Controller is active, and is used for all Oracle Enterprise Manager Ops Center operations. The secondary Enterprise Controller is installed but not configured, and is kept as a backup.

If the primary Enterprise Controller fails or must be taken offline, you can transfer the data from the primary Enterprise Controller to the secondary Enterprise Controller, transfer the shared storage, and bring the secondary Enterprise Controller online.

Backing Up an Enterprise Controller

The satadm backup and restore subcommands enable you to back up and restore the Enterprise Controller, but they do not back up or restore the co-located Proxy Controller. The backup subcommand temporarily shuts down the Enterprise Controller functionality before backing up the data, saves the backup data in a tar file, and then restarts the Enterprise Controller. You can specify the name and location of the backup file and the log file. For information about the supported flags and other information about the satadm command, see the SATADM man page.

Note:

The satadm backup command does not back up the /var/opt/sun/xvm/images/os directory. This is intentional, because the size of some of the files can be prohibitively large, such as those found under the /iso sub-directory.

In addition to running the satadm backup command, you should also back up the /var/opt/sun/xvm/images/os directory and manually archive the files to another server, file-share facility, or a location outside of the /var/opt/sun directory.

If you have already stored exact copies of the ISO images elsewhere, such as the location from which the ISOs were originally imported into Oracle Enterprise Manager Ops Center, you can exclude the ISO images matched by /var/opt/sun/xvm/images/os/iso/DISTRO*.iso when making the manual archive. The ISO signature files, which are matched by /var/opt/sun/xvm/images/os/iso/DISTRO*.sig, should be archived along with all other files from the /var/opt/sun/xvm/images/os hierarchy (possibly excluding the os/iso/DISTRO*.iso files).
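The manual archive described in this note could be created along the following lines. This is a minimal sketch rather than a supported procedure: the function name archive_os_images and the destination path are hypothetical, and the --exclude option assumes GNU tar, so it might not work with the tar command on every Oracle Solaris release.

```shell
# Hypothetical helper: archive the OS image hierarchy without the DISTRO*.iso files.
# Assumption: GNU tar, which provides the --exclude option.
archive_os_images() {
    images_dir=$1   # for example, /var/opt/sun/xvm/images/os
    archive=$2      # a location outside /var/opt/sun
    parent=$(dirname "$images_dir")
    name=$(basename "$images_dir")
    # Store paths relative to the parent directory. The DISTRO*.sig files
    # are not excluded, so they stay in the archive.
    tar -cf "$archive" --exclude="$name/iso/DISTRO*.iso" -C "$parent" "$name"
}
```

For example, archive_os_images /var/opt/sun/xvm/images/os /export/archive/os-images.tar, where /export/archive stands for any location outside the /var/opt/sun directory.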

The satadm restore command is used to restore the Enterprise Controller from the backup file. You can restore the Enterprise Controller on a different system, but only if the following characteristics are identical for both systems:

  • Processor class (SPARC or x86)

  • Operating system (Oracle Solaris or Oracle Linux)

  • Oracle Enterprise Manager Ops Center software version, including updates

  • Set of network interfaces that are cabled identically to the same subnets

To Back Up an Enterprise Controller

This procedure describes the steps required to back up the Enterprise Controller. The backup cannot be used to migrate an Enterprise Controller to a new configuration.

By default, the server data is saved in a backup file in the /var/tmp directory with a file name that includes a date and time stamp. You can define the file name and location during the backup, as shown in the example below. To use additional options, see the SATADM man page.

  1. From the command line, log in to the Enterprise Controller system.

  2. Use the proxyadm command to shut down the co-located Proxy Controller. The proxyadm command resides in the /opt/SUNWxvmoc/bin directory on an Oracle Solaris system, and in the /opt/sun/xvmoc/bin directory on a Linux system. For example:

    ./proxyadm stop
    
  3. Use the satadm command with the backup subcommand to back up the Enterprise Controller. The satadm command resides in the /opt/SUNWxvmoc/bin directory on an Oracle Solaris system, and in the /opt/sun/xvmoc/bin directory on a Linux system. The example below uses the -o flag, which specifies the name and location of the backup file.

    ./satadm backup -o /var/tmp/backup-file-name.tar
    
  4. Use the proxyadm command to restart the co-located Proxy Controller. For example:

    ./proxyadm start
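
The steps of this procedure can be combined into one script. The following is a sketch under stated assumptions, not a supported tool: the function names and the ec-backup file naming scheme are hypothetical, and only the proxyadm stop, satadm backup -o, and proxyadm start invocations shown in the procedure are used.

```shell
# Sketch only. Function names and the file naming scheme are hypothetical.

# Print a backup file name that includes a date and time stamp,
# for example /var/tmp/ec-backup-20100528-153000.tar.
backup_file_name() {
    echo "${1:-/var/tmp}/ec-backup-$(date +%Y%m%d-%H%M%S).tar"
}

backup_enterprise_controller() {
    bin_dir=$1       # /opt/SUNWxvmoc/bin (Oracle Solaris) or /opt/sun/xvmoc/bin (Linux)
    backup_file=$2
    # Stop the co-located Proxy Controller first.
    "$bin_dir/proxyadm" stop || return 1
    "$bin_dir/satadm" backup -o "$backup_file"
    rc=$?
    # Restart the co-located Proxy Controller even if the backup failed.
    "$bin_dir/proxyadm" start
    return "$rc"
}
```

On an Oracle Solaris system, this could be called as backup_enterprise_controller /opt/SUNWxvmoc/bin "$(backup_file_name /var/tmp)".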
    

Restoring an Enterprise Controller

Once you have created a backup of an Enterprise Controller, you can use that backup to restore it.

Note:

The satadm commands do not back up or restore the /var/opt/sun/xvm/images/os directory. This is intentional, because the size of some of the files can be prohibitively large, such as those found under the /iso sub-directory.

If you must completely rebuild the Enterprise Controller, manually restore the /var/opt/sun/xvm/images/os hierarchy from the user-created archive after the satadm restore job has finished.

If the files matched by /var/opt/sun/xvm/images/os/iso/DISTRO*.iso were excluded from the user-created archive, copy the exact ISO images from elsewhere into the /var/opt/sun/xvm/images/os/iso directory, and name them according to the saved /var/opt/sun/xvm/images/os/iso/DISTRO*.sig files, which must also be restored to that directory.
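After the signature files are back in place, a quick check can list the ISO images that still need to be copied in. The sketch below rests on an assumption that this guide does not confirm: that each DISTRO*.sig file corresponds to an ISO image with the same base name and an .iso extension. The function name missing_isos is hypothetical.

```shell
# Hypothetical check. Assumption: every DISTRO*.sig file has a matching
# ISO image with the same base name and an .iso extension.
missing_isos() {
    iso_dir=${1:-/var/opt/sun/xvm/images/os/iso}
    for sig in "$iso_dir"/DISTRO*.sig; do
        [ -e "$sig" ] || continue      # the pattern matched no signature files
        iso="${sig%.sig}.iso"
        # Print the name of each ISO image that is not present yet.
        [ -e "$iso" ] || echo "$iso"
    done
}
```

If the function prints nothing, every signature file in the directory has an ISO image with the expected name.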

You can restore the Enterprise Controller on a different system, but only if the following characteristics are identical for both systems:

  • Processor class (SPARC or x86)

  • Operating system (Oracle Solaris or Oracle Linux)

  • Oracle Enterprise Manager Ops Center software version, including updates

  • Set of network interfaces that are cabled identically to the same subnets
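
Some of these characteristics can be compared before you attempt a restore. The following sketch prints the operating system, processor type, and product version of the system it runs on. The function name system_fingerprint is hypothetical, the sketch assumes that the /n1gc-setup/.version.properties file described later in this chapter is present, and it does not check the network interfaces.

```shell
# Hypothetical helper: print the characteristics that must match on both systems.
# The network interface layout is not checked here.
system_fingerprint() {
    props=${1:-/n1gc-setup/.version.properties}
    echo "os=$(uname -s)"            # SunOS or Linux
    echo "processor=$(uname -p)"     # for example, sparc or i386 on Oracle Solaris
    echo "version=$(sed -n 's/^product\.version=//p' "$props")"
}
```

Run the function on both systems and compare the two outputs, for example with the diff command.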

To Restore an Enterprise Controller

This procedure describes the steps required to restore the data from the backup file, which is the archive created by the satadm backup operation.

  1. Prepare the Enterprise Controller system for the restore.

    • If you are restoring the backup on a new system, then the IP address, host name, and Enterprise Controller software version of the restored system must match those of the backed-up system.

    • If you are restoring the backup on the same system, but the software has become corrupt or an upgrade failed, uninstall the Enterprise Controller software.

      Run the install script with the -e and -k options. The -e option uninstalls the Enterprise Controller and co-located Proxy Controller, and the -k option preserves the Oracle Configuration Manager software. For example:

      # cd /var/tmp/OC/xvmoc_full_bundle
      # ./install -e -k
      
    • If you are restoring the backup on the same system, and the software is functioning normally, unconfigure the Enterprise Controller.

  2. Install the Enterprise Controller if it has not been installed.

    1. Oracle Solaris OS: See the Oracle Enterprise Manager Ops Center Installation Guide for Oracle Solaris Operating System.

    2. Linux OS: See the Oracle Enterprise Manager Ops Center Installation Guide for Linux Operating Systems.

      Do not configure the Enterprise Controller. The restore command will restore your configuration settings.

      Note:

      The Enterprise Controller version must match the version that was backed up.

  3. Change to the directory where the satadm file is located on your system.

    • On Oracle Solaris systems, go to the /opt/SUNWxvmoc/bin/ directory.

    • On Linux systems, go to the /opt/sun/xvmoc/bin/ directory.

  4. Enter the satadm command with the restore subcommand. The -i flag is required to restore the data from the backup file. See the Oracle Enterprise Manager Ops Center Reference Guide for information about additional options.

    ./satadm restore -i backup-directory-location/file-name
    
  5. For an Enterprise Controller with a co-located Proxy Controller, restart the co-located Proxy Controller using the proxyadm command. The proxyadm command is in the same directory as the satadm command.

    ./proxyadm start -w
    
  6. For an Enterprise Controller with a co-located Proxy Controller, use the Custom Discovery method to rediscover the system after running the restore command. See the Oracle Enterprise Manager Ops Center Advanced User's Guide for more information about the Custom Discovery procedure. You do not need to re-register the assets.

    Note:

    After restoring the Enterprise Controller, the asset details might take several minutes to display completely in the user interface.

Example – Restoring a Backup File for an Enterprise Controller Running on an Oracle Solaris OS

In this example, the restore command includes flags to set the restore in verbose mode (-v), and to create a restore log (-l) for debugging purposes. The input (-i) specifies the backup file location.

# /opt/SUNWxvmoc/bin/satadm restore -v -i /var/tmp/OC/server1/backup-May28-2010.tar -l /var/tmp/OC/server1/logfile-restore-May28-2010.log
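
The restore steps can also be scripted. The sketch below is illustrative only: ops_center_bin_dir and restore_enterprise_controller are hypothetical names, while the directories and the -v, -i, and -l flags are the ones shown in this section. It assumes that uname -s reports SunOS on Oracle Solaris and Linux on Linux.

```shell
# Sketch only; the function names are hypothetical.

# Print the directory that holds satadm and proxyadm for a kernel name.
# Without an argument, the kernel name of the current system is used.
ops_center_bin_dir() {
    case "${1:-$(uname -s)}" in
        SunOS) echo /opt/SUNWxvmoc/bin ;;
        Linux) echo /opt/sun/xvmoc/bin ;;
        *)     echo "unsupported operating system" >&2; return 1 ;;
    esac
}

restore_enterprise_controller() {
    bin_dir=$1
    backup_file=$2
    log_file=$3
    # Refuse to start if the backup file is missing or unreadable.
    [ -r "$backup_file" ] || { echo "cannot read $backup_file" >&2; return 1; }
    "$bin_dir/satadm" restore -v -i "$backup_file" -l "$log_file" || return 1
    # Needed only when a Proxy Controller is co-located with the Enterprise Controller.
    "$bin_dir/proxyadm" start -w
}
```

Checking the backup file first avoids starting a restore that cannot complete.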

Restoring Agent Data During a Proxy Controller Reinstall

When you reinstall a Proxy Controller, the data stored on the older version of the Proxy Controller is lost. This data includes Agent data. If this data is not restored, the assets managed by that Proxy Controller will not be manageable.

To Restore Agent Data During a Proxy Controller Reinstall

This procedure describes the steps required to retain Agent Controller data while reinstalling a Proxy Controller.

  1. As root, log in to the Enterprise Controller.

  2. Use the satadm backup command to back up the Enterprise Controller. The satadm command resides in the /opt/SUNWxvmoc/bin directory on an Oracle Solaris system, and in the /opt/sun/xvmoc/bin directory on a Linux system.

    # /opt/SUNWxvmoc/bin/satadm backup -o /var/tmp/sat-backup-proxy-reinstall.tar
    
  3. Copy the Proxy Controller install bundle from the Enterprise Controller to the Proxy Controller.

    # scp /var/opt/sun/xvm/images/proxy/ProxyBundle.SunOS.sparc.2.5.0.3057.zip root@<proxy>:/var/tmp
    
  4. As root, log in to the Proxy Controller system.

  5. Archive the contents of the /var/opt/sun/xvm and /etc/cacao directories.

    # tar cvf /var/tmp/scn-proxy.tar /var/opt/sun/xvm /etc/cacao
    
  6. Uninstall the Proxy Controller.

    # /n1gc-setup/installer/install -e
    
  7. Create a new directory and change to it.

    # mkdir /var/tmp/proxy
    # cd /var/tmp/proxy
    
  8. Unzip the Proxy Controller install bundle.

    # unzip ../ProxyBundle.SunOS.sparc.2.5.0.3057.zip
    
  9. Install the Proxy Controller.

    # ./install -p
    
  10. Change to the top-level directory and extract the archived /var/opt/sun/xvm and /etc/cacao directories.

    # cd /
    # tar xvf /var/tmp/scn-proxy.tar
    
  11. On the Enterprise Controller, use the satadm restore command to restore the previously saved Enterprise Controller data.

    # /opt/SUNWxvmoc/bin/satadm restore -i /var/tmp/sat-backup-proxy-reinstall.tar
    
  12. On the Proxy Controller, use the proxyadm start command to start the Proxy Controller.

    # /opt/SUNWxvmoc/bin/proxyadm start -w
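
Because the uninstall in step 6 removes the Proxy Controller, it can be worth confirming that the archive from step 5 really contains both directories before you continue. The following check is a sketch: verify_proxy_archive is a hypothetical name, and the sketch assumes a grep command that supports the -E option.

```shell
# Hypothetical check: confirm that the archive holds both directories
# before the Proxy Controller is uninstalled.
verify_proxy_archive() {
    archive=$1
    listing=$(tar -tf "$archive") || return 1
    for dir in var/opt/sun/xvm etc/cacao; do
        # Accept entries with or without a leading slash, because tar
        # implementations differ in how they store absolute paths.
        printf '%s\n' "$listing" | grep -Eq "^/?$dir(/|\$)" || {
            echo "missing $dir in $archive" >&2
            return 1
        }
    done
}
```

For example, run verify_proxy_archive /var/tmp/scn-proxy.tar after step 5 and continue only if it succeeds.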
    

About High Availability

The design for a High Availability (HA) architecture must consider all single points of failure, such as power, storage, and network connectivity in addition to the product software.

For the Enterprise Manager Ops Center environment, high availability applies only to the Enterprise Controller and its co-located Proxy Controller. To avoid a single point of failure in the Enterprise Manager Ops Center software, transfer the /var/opt/sun/xvm directory structure manually from the primary Enterprise Controller to a secondary Enterprise Controller. The secondary Enterprise Controller duplicates the primary Enterprise Controller's configuration and takes over much of the primary Enterprise Controller's identity, including its host name, its IP addresses, its SSH keys, and its role. Only one Enterprise Controller, either primary or secondary, can be operational at any time.

The primary Enterprise Controller is configured and operational. The secondary Enterprise Controller is not configured and not operational. On the primary Enterprise Controller, the habackup program saves the data in the /var/opt/sun/xvm directory structure. The data that is saved by the habackup program on the primary Enterprise Controller is used by the harestore program on the secondary Enterprise Controller to duplicate how the primary Enterprise Controller is configured. The habackup program also backs up the local /etc/passwd file. The harestore program uses that information to change the ownership of the files to match the secondary Enterprise Controller's /etc/passwd file. However, root user passwords on Enterprise Controllers are not changed.

Requirements for High Availability

  • Use two systems of the same model and configured identically:

    • Processor class

    • Operating system

    • Oracle Enterprise Manager Ops Center software version, including updates

    • Network interfaces that are cabled identically to the same subnets

  • Add an asset tag to identify the primary Enterprise Controller and to distinguish it from the secondary Enterprise Controller. You can add a tag by using the Edit Asset action.

  • Maintain the secondary Enterprise Controller's system in the same way as the primary Enterprise Controller. The primary and secondary Enterprise Controllers must use the same version of Enterprise Manager Ops Center software. If you cannot use the Enterprise Manager Ops Center user interface to verify the installed software versions when you need to transfer functions to the secondary system, view the content of the /n1gc-setup/.version.properties file. The product.version property lists the specific revision level of the installed software. For example:

    # cat /n1gc-setup/.version.properties
    #Note: This file is created at build time.
    #Wed Jun 30 15:28:45 PDT 2010
    version=dev-ga
    date=2010/06/30 15\:28
    build.variation=xvmopscenter
    product.version=2.6.0.1395
    product.installLocation=/var/opt/sun/xvm/EnterpriseController_installer_2.6.0.1395
    #
    

    Verify that the product.version property lists the same version on the primary and secondary Enterprise Controllers before you perform a failover procedure.
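
The comparison can be scripted once a copy of each system's .version.properties file is available in one place. This is a minimal sketch: the function names product_version and versions_match are hypothetical, and how the two files are collected is left open.

```shell
# Hypothetical helpers for comparing the product.version property.

# Print the value of product.version from a properties file.
product_version() {
    sed -n 's/^product\.version=//p' "$1"
}

# Succeed only if both files report the same, non-empty product.version.
versions_match() {
    primary=$(product_version "$1")
    secondary=$(product_version "$2")
    [ -n "$primary" ] && [ "$primary" = "$secondary" ]
}
```

With the sample file shown above, product_version prints 2.6.0.1395.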

Limitations

  • User accounts and data that are not associated with Oracle Enterprise Manager Ops Center are not part of the failover process. Only Oracle Enterprise Manager Ops Center data is moved between the primary and secondary Enterprise Controllers.

  • UI sessions are lost on failover.

  • The HA configuration applies only to the Enterprise Controller and its co-located Proxy Controller and not to other standalone Proxy Controllers.

High Availability Failover on Oracle Solaris

Oracle Enterprise Manager Ops Center provides high availability capability that enables you to manually transfer Enterprise Controller functions from one system to another. If you configured high availability during the installation and configuration process, you can switch to the alternate Enterprise Controller if the primary Enterprise Controller fails.

HA Failover Overview on Oracle Solaris

The High Availability configuration uses manual failover procedures to transfer product functions from the primary Enterprise Controller to the secondary Enterprise Controller. Depending on the nature of the failure, different or additional procedures might be required. The procedures follow these general steps:

  1. Shut down the primary Enterprise Controller, if possible.

  2. Prepare the secondary Enterprise Controller for failover.

  3. Transfer the storage asset that holds the /var/opt/sun/xvm directory structure from the primary Enterprise Controller to the secondary Enterprise Controller.

  4. Run the harestore program to configure the Enterprise Manager Ops Center software on the secondary Enterprise Controller.

  5. Reboot the secondary Enterprise Controller and start the Enterprise Manager Ops Center operations.

The harestore command configures the secondary Enterprise Controller to use the IP addresses of the primary Enterprise Controller. As you repair the primary Enterprise Controller, prevent it from accessing the networks where the secondary Enterprise Controller is operational.

Shutting Down the Primary Enterprise Controller on Oracle Solaris

Depending on the nature of the failure that you encounter, the primary Enterprise Controller might function enough to allow you to shut down the system in an orderly manner. If this is true, take the steps necessary to complete the following:

  • Run the habackup program if possible

  • Stop Oracle Enterprise Manager Ops Center services

  • Unmount LOFS mounts

  • Unshare the /var/opt/sun/xvm/osp/share/allstart directory

  • Release control of the storage asset that stores the /var/opt/sun/xvm directory structure

  • Shut down the operating system of the failing Enterprise Controller

The steps to shut down the primary Enterprise Controller are intended to minimize damage to the primary Enterprise Controller and to the data that will transfer to the secondary Enterprise Controller. Some failures, however, might prevent you from shutting down the primary Enterprise Controller in an orderly manner. If this is true, additional procedures might be required to transfer the storage asset to the secondary Enterprise Controller. Even if it is not possible to perform these procedures, the HA failover procedure is designed to succeed.

The cron utility runs the habackup program every hour on the Enterprise Controller. To obtain a current copy of the data that it collects, run habackup before you unmount directories and release the storage asset that holds the /var/opt/sun/xvm directory structure, if possible.

If Oracle Enterprise Manager Ops Center services are running on the primary Enterprise Controller, stop those services before you transfer a storage asset between the two systems.

The /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce directories are used as LOFS mount points for a set of directories within the /var/opt/sun/xvm directory structure. These LOFS mounts are created automatically when you install Oracle Enterprise Manager Ops Center. Before you release or transfer control of the storage asset that holds the /var/opt/sun/xvm directory structure, unmount these LOFS mounts.

The /var/opt/sun/xvm/osp/share/allstart directory is automatically shared when you install Oracle Enterprise Manager Ops Center software on an Enterprise Controller. Before you release or transfer control of the storage asset that holds the /var/opt/sun/xvm directory structure, unshare the /var/opt/sun/xvm/osp/share/allstart directory.

Many storage solutions support HA configurations in Oracle Enterprise Manager Ops Center. Some storage solutions allow you to release control of a storage asset, to allow a different system to use the storage asset. For example, the ZFS file system allows you to export a storage pool so that a different system can import it. The following example procedures use a Fibre Channel attached array that contains a ZFS storage pool. In this example, the primary and secondary Enterprise Controllers are both attached to the array.

Determine if the storage solution for your HA configuration allows you to release control of the storage asset that stores the /var/opt/sun/xvm directory structure. If this is true, release the storage asset before you shut down the primary Enterprise Controller.

Perform the following procedure on the primary Enterprise Controller.

To Shut Down the Primary Enterprise Controller

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. Run the habackup program to create a current copy of the data that it collects. For example:

    # /opt/sun/xvm/bin/habackup
    Backup complete.
    #
    

    The habackup program successfully runs only if the Enterprise Controller services are running.

  2. Use the proxyadm and satadm commands to check for and shut down Oracle Enterprise Manager Ops Center services on the primary Enterprise Controller. These commands are located in the /opt/SUNWxvmoc/bin/ directory. Shut down Proxy Controller services before you shut down Enterprise Controller services. For example:

    # /opt/SUNWxvmoc/bin/proxyadm status
    online
    # /opt/SUNWxvmoc/bin/proxyadm stop -w
    proxyadm: Shutting down proxy using SMF...
    proxyadm: Proxy services have stopped
    #
    # /opt/SUNWxvmoc/bin/satadm status
    online
    # /opt/SUNWxvmoc/bin/satadm stop -w
    satadm: Shutting down satellite using SMF...
    satadm: Satellite services have stopped
    # 
    

    In this example, both Proxy Controller and Enterprise Controller services are online. The services that the proxyadm command reports as online are associated with the Proxy Controller that is co-located with the Enterprise Controller.

  3. Unmount the /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts. For example:

    # mount -v | grep xvm
    OpsCenter/xvm on /var/opt/sun/xvm type zfs read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 07:59:22 2009
    /var/opt/sun/xvm/uce/etc.opt on /etc/opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    /var/opt/sun/xvm/uce/var.opt on /var/opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    /var/opt/sun/xvm/uce/opt on /opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    # 
    # umount /etc/opt/SUNWuce
    # umount /var/opt/SUNWuce
    # umount /opt/SUNWuce
    # mount -v | grep xvm
    OpsCenter/xvm on /var/opt/sun/xvm type zfs read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 07:59:22 2009
    #
    

    In this example, the ZFS file system named OpsCenter/xvm is mounted as /var/opt/sun/xvm. The /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts use resources within /var/opt/sun/xvm.

  4. Unshare the /var/opt/sun/xvm/osp/share/allstart directory. For example:

    # share
    -               /var/opt/sun/xvm/osp/share/allstart   ro   "Allstart Share" 
    -               /var/js   ro,anon=0   "Allstart Share" 
    -               /opt/SUNWjet   ro,anon=0   "JET Framework" 
    # unshare /var/opt/sun/xvm/osp/share/allstart
    # share
    -               /var/js   ro,anon=0   "Allstart Share" 
    -               /opt/SUNWjet   ro,anon=0   "JET Framework" 
    # 
    
  5. Release control of the storage asset that stores the /var/opt/sun/xvm directory structure. For example, if you use ZFS to store the /var/opt/sun/xvm directory structure, you can export the ZFS pool that contains /var/opt/sun/xvm:

    # zfs list
    NAME            USED  AVAIL  REFER  MOUNTPOINT
    OpsCenter       382M   134G  1.50K  none
    OpsCenter/xvm   382M   134G   382M  legacy
    # zpool export OpsCenter
    # zpool list
    no pools available
    # zfs list
    no datasets available
    # 
    

    The /var/opt/sun/xvm directory unmounts automatically when you export the OpsCenter pool.

  6. If required for your particular storage solution, physically detach the storage asset from the primary Enterprise Controller.

  7. Shut down the operating system on the primary Enterprise Controller. For example, on Oracle Solaris systems, you can use the init 5 command to shut down and power off the system:

    # init 5
    
  8. Disconnect the primary Enterprise Controller from the networks to which it is connected. If you boot the primary Enterprise Controller to make repairs, this prevents the primary Enterprise Controller from using the same IP addresses as the secondary Enterprise Controller. You can temporarily assign new IP addresses to the primary Enterprise Controller if you need to attach it to your network.
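
For the sample ZFS storage solution, steps 1 through 5 can be rehearsed as a script before a real failure occurs. The sketch below is an assumption-laden illustration: DRY_RUN, run, and shutdown_primary are hypothetical names, the commands are the ones shown in the procedure, and setting DRY_RUN=1 prints each command instead of running it.

```shell
# Sketch for the sample ZFS storage solution on Oracle Solaris only.
# DRY_RUN, run, and shutdown_primary are hypothetical names.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"      # print the command instead of running it
    else
        "$@"
    fi
}

shutdown_primary() {
    pool=$1              # ZFS pool that holds /var/opt/sun/xvm, for example OpsCenter
    # habackup needs running services, so a failure here is not fatal.
    run /opt/sun/xvm/bin/habackup || echo "habackup failed, continuing" >&2
    # Stop Proxy Controller services before Enterprise Controller services.
    run /opt/SUNWxvmoc/bin/proxyadm stop -w || return 1
    run /opt/SUNWxvmoc/bin/satadm stop -w || return 1
    for mnt in /etc/opt/SUNWuce /var/opt/SUNWuce /opt/SUNWuce; do
        run umount "$mnt" || return 1
    done
    run unshare /var/opt/sun/xvm/osp/share/allstart || return 1
    run zpool export "$pool"
    # Powering off (init 5) and disconnecting the networks remain manual steps.
}
```

Running DRY_RUN=1 and then shutdown_primary OpsCenter prints the eight commands in order without changing the system.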

Preparing the Secondary Enterprise Controller for Failover on Oracle Solaris

Preparing the secondary Enterprise Controller before you transfer storage requires the following tasks:

  • Stopping Oracle Enterprise Manager Ops Center services

  • Unmounting LOFS mounts

  • Unsharing the /var/opt/sun/xvm/osp/share/allstart directory

  • Renaming the /var/opt/sun/xvm directory, and creating an empty /var/opt/sun/xvm directory

Perform the following procedure on the secondary Enterprise Controller.

To Prepare the Secondary Enterprise Controller for Failover

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. Use the satadm command to check for and shut down Oracle Enterprise Manager Ops Center services on the secondary Enterprise Controller. The satadm command is located in the /opt/SUNWxvmoc/bin/ directory. For example:

    # /opt/SUNWxvmoc/bin/satadm status
    online
    # /opt/SUNWxvmoc/bin/satadm stop -w
    satadm: Shutting down satellite using SMF...
    satadm: Satellite services have stopped
    # 
    

    In this example, Enterprise Controller services are online. On a secondary Enterprise Controller you typically install, but do not configure, Oracle Enterprise Manager Ops Center software. No Proxy Controller services run on a system that has Oracle Enterprise Manager Ops Center software installed but not configured. You can use the proxyadm command to check for and stop Proxy Controller services if required. The proxyadm command is located in the /opt/SUNWxvmoc/bin/ directory.

  2. Unmount the /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts. For example:

    # mount -v | grep xvm
    /var/opt/sun/xvm/uce/etc.opt on /etc/opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    /var/opt/sun/xvm/uce/var.opt on /var/opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    /var/opt/sun/xvm/uce/opt on /opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    # 
    # umount /etc/opt/SUNWuce
    # umount /var/opt/SUNWuce
    # umount /opt/SUNWuce
    # mount -v | grep xvm
    #
    

    In this example, the system was installed using one file system to hold all Oracle Enterprise Manager Ops Center software. No separate file system yet exists for the /var/opt/sun/xvm directory.

  3. Unshare the /var/opt/sun/xvm/osp/share/allstart directory. For example:

    # share
    -               /var/opt/sun/xvm/osp/share/allstart   ro   "Allstart Share" 
    -               /var/js   ro,anon=0   "Allstart Share" 
    -               /opt/SUNWjet   ro,anon=0   "JET Framework" 
    # unshare /var/opt/sun/xvm/osp/share/allstart
    # share
    -               /var/js   ro,anon=0   "Allstart Share" 
    -               /opt/SUNWjet   ro,anon=0   "JET Framework" 
    # 
    
  4. Rename the /var/opt/sun/xvm directory to /var/opt/sun/xvm_secondary. For example:

    # mv /var/opt/sun/xvm /var/opt/sun/xvm_secondary
    #
    
  5. Create a new directory named /var/opt/sun/xvm. For example:

    # mkdir /var/opt/sun/xvm
    # 
    

    This directory is used to mount the transferable storage that contains the /var/opt/sun/xvm directory structure from the primary Enterprise Controller.
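
Two parts of this procedure lend themselves to small helper functions: confirming that no LOFS mounts remain, and setting the existing directory aside safely. Both functions below are hypothetical sketches. The first assumes the Oracle Solaris mount -v output format shown in step 2, and the second takes the parent directory as an argument so that it can be tried in a scratch location first.

```shell
# Sketch only; both function names are hypothetical.

# Read `mount -v` output on standard input and print the LOFS mount points
# that are backed by directories under /var/opt/sun/xvm.
xvm_lofs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "lofs" &&
         index($1, "/var/opt/sun/xvm/") == 1 { print $3 }'
}

# Rename <base>/xvm to <base>/xvm_secondary and create an empty <base>/xvm.
set_aside_xvm() {
    base=${1:-/var/opt/sun}
    # Do not overwrite a directory that was set aside earlier.
    [ ! -e "$base/xvm_secondary" ] || {
        echo "$base/xvm_secondary already exists" >&2
        return 1
    }
    mv "$base/xvm" "$base/xvm_secondary" && mkdir "$base/xvm"
}
```

For example, mount -v | xvm_lofs_mounts prints nothing once all three LOFS mounts have been unmounted.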

Transferring a Storage Asset on Oracle Solaris

Many storage solutions support HA configurations in Oracle Enterprise Manager Ops Center. Different storage solutions require different procedures. You must determine the procedures that are required to transfer the storage asset that stores the /var/opt/sun/xvm directory structure in your particular storage solution.

This example procedure assumes that the mountpoint property of the example ZFS file system is set to legacy, to resolve an issue regarding when ZFS and LOFS mounts take place in the system boot process. The legacy value indicates that the legacy mount and umount commands, and the /etc/vfstab file, will control mounting and unmounting this ZFS file system. Other storage solutions typically use these legacy commands and the /etc/vfstab file to control mounting and unmounting operations. See the Oracle Enterprise Manager Ops Center Release Notes for information about the LOFS race condition issue.

Perform the following procedure on the secondary Enterprise Controller.

To Transfer a Storage Asset to the Secondary Enterprise Controller

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. If required for your particular storage solution, physically attach the storage asset to the secondary Enterprise Controller.

  2. Transfer the storage asset that stores the /var/opt/sun/xvm directory structure to the secondary Enterprise Controller. For example, if you use ZFS to store the /var/opt/sun/xvm directory structure, you can import the ZFS pool that contains /var/opt/sun/xvm:

    # zpool import
      pool: OpsCenter
        id: 16117101211372343022
     state: ONLINE
    action: The pool can be imported using its name or numeric identifier.
    config:
    
            OpsCenter    ONLINE
              mirror     ONLINE
                c1t65d0  ONLINE
                c1t66d0  ONLINE
    
    # zpool import OpsCenter
    #
    

    The zpool import command with no argument shows the list of ZFS pools that you can import. The zpool import OpsCenter command imports the OpsCenter pool.

  3. Verify that the ZFS file system exists and that its mount point is set to legacy. For example:

    # zfs list
    NAME            USED  AVAIL  REFER  MOUNTPOINT
    OpsCenter       382M   134G  1.50K  none
    OpsCenter/xvm   382M   134G   382M  legacy
    #
    

    In this example, the OpsCenter/xvm file system contains the Oracle Enterprise Manager Ops Center data from the primary Enterprise Controller, and its mount point is set to legacy.

  4. Edit the /etc/vfstab file and add an entry for the ZFS file system above the three lofs entries for /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce. For example:

    (output omitted)
    OpsCenter/xvm   -       /var/opt/sun/xvm        zfs     -       yes     -
    /var/opt/sun/xvm/uce/etc.opt    -       /etc/opt/SUNWuce        lofs    -       yes     -
    /var/opt/sun/xvm/uce/var.opt    -       /var/opt/SUNWuce        lofs    -       yes     -
    /var/opt/sun/xvm/uce/opt        -       /opt/SUNWuce    lofs    -       yes     -
    

    In this example, the device to mount is OpsCenter/xvm, the mount point is /var/opt/sun/xvm, the file system type is zfs, and the mount at boot option is set to yes.

  5. Mount the new file system, and then verify that it is mounted. For example:

    # mount /var/opt/sun/xvm
    # mount
    (output omitted)
    /var/opt/sun/xvm on OpsCenter/xvm read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 09:51:55 2009
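
The ordering requirement in step 4 can be checked mechanically. The following is a minimal sketch, not part of the product: the function name check_vfstab_order is hypothetical, and it assumes the whitespace-separated field layout shown in the example (device to mount, device to fsck, mount point, file system type, and so on). It succeeds only if the entry whose mount point is /var/opt/sun/xvm appears before every lofs entry whose source lies under that directory.

```shell
#!/bin/sh
# Sketch: confirm that the /var/opt/sun/xvm entry precedes the lofs entries
# that depend on it. check_vfstab_order is a hypothetical helper name.
# Usage: check_vfstab_order FILE
check_vfstab_order() {
    awk '
        $1 ~ /^#/ || NF < 4 { next }              # skip comments and short lines
        $3 == "/var/opt/sun/xvm" { xvm = NR }     # entry for the transferred file system
        $4 == "lofs" && index($1, "/var/opt/sun/xvm/") == 1 {
            if (!xvm) bad = 1                     # lofs entry listed before the xvm entry
        }
        END { exit (xvm && !bad) ? 0 : 1 }
    ' "$1"
}

# Example (not run here):
#   check_vfstab_order /etc/vfstab && echo "entry order looks correct"
```

The check reads the file only; it does not edit /etc/vfstab or mount anything.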
    

Completing HA Failover on Oracle Solaris

After the storage asset that holds the /var/opt/sun/xvm directory structure has been transferred to the secondary Enterprise Controller, you must run the harestore program, reboot the secondary Enterprise Controller, and start Oracle Enterprise Manager Ops Center services.

After you complete the HA failover procedure, the secondary Enterprise Controller is listed in the Assets pane as controller_0, where controller is the host name of the primary Enterprise Controller.

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

To Run the harestore Program

Run the harestore program to restore the Oracle Enterprise Manager Ops Center configuration from the primary Enterprise Controller onto the secondary Enterprise Controller. The harestore program is located in the /opt/sun/xvm/bin/ directory. For example:

# /opt/sun/xvm/bin/harestore
./osp/clientPkgs/Sun_Microsystems,_Inc.-allstart-3.3-1/base-mgmt-allstart-1.0-15.noarch.rpm:  No mapping for uid 63333
Restore complete.  You need to reboot now.
# 
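
If you script this step, one option is to test the harestore output for the completion message before you reboot. The following is a minimal sketch: the function name harestore_completed is hypothetical, and it assumes that the message text matches the example above, which might differ between releases.

```shell
#!/bin/sh
# Sketch: succeed if harestore output read from standard input contains the
# completion message shown in the example. harestore_completed is a
# hypothetical helper name.
harestore_completed() {
    grep 'Restore complete' >/dev/null
}

# Example (not run here):
#   if /opt/sun/xvm/bin/harestore 2>&1 | harestore_completed; then
#       echo "harestore finished; reboot the secondary Enterprise Controller"
#   fi
```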

To Reboot the Secondary Enterprise Controller and Start Services

After you run the harestore program, you must reboot the secondary Enterprise Controller and start Oracle Enterprise Manager Ops Center services on that system.

  1. Reboot the operating system on the secondary Enterprise Controller. For example, on Solaris systems, you can use the init 6 command to reboot the system:

    # init 6
    
  2. Use the satadm command to start Enterprise Controller services. The satadm command is located in the /opt/SUNWxvmoc/bin/ directory. For example:

    # /opt/SUNWxvmoc/bin/satadm start -w
    satadm: Starting satellite with SMF...
    satadm: Satellite services have started
    #
    
  3. If the configuration from the primary Enterprise Controller included an enabled co-located Proxy Controller, use the proxyadm command to start Proxy Controller services. The proxyadm command is located in the /opt/SUNWxvmoc/bin/ directory. For example:

    # /opt/SUNWxvmoc/bin/proxyadm start -w
    proxyadm: Starting proxy with SMF...
    proxyadm: Proxy services have started
    #
    

Restoring the Primary Enterprise Controller

After the primary Enterprise Controller has been repaired, you can either establish it as a new secondary Enterprise Controller, or return operation to it by taking the following actions:

  1. Shutting down the secondary Enterprise Controller

  2. Preparing the primary Enterprise Controller for failover

  3. Transferring the storage asset that holds the /var/opt/sun/xvm directory structure from the secondary Enterprise Controller to the primary Enterprise Controller

  4. Running the harestore program to restore the Oracle Enterprise Manager Ops Center configuration on the primary Enterprise Controller

  5. Rebooting the primary Enterprise Controller and starting Oracle Enterprise Manager Ops Center operations

Caution:

You must prevent the primary and secondary Enterprise Controllers from using the same IP addresses at the same time.

High Availability Failover on Linux

Oracle Enterprise Manager Ops Center provides a high availability (HA) capability that enables you to manually transfer Enterprise Controller functions from one system to another. If you configured high availability during the installation and configuration process, you can switch to the secondary Enterprise Controller if the primary Enterprise Controller fails.

HA Failover Overview on Linux

Oracle Enterprise Manager Ops Center HA configurations use manual failover procedures to transfer Oracle Enterprise Manager Ops Center functions from the primary Enterprise Controller to the secondary Enterprise Controller. Depending on the nature of the failure that takes place on an Enterprise Controller, different or additional procedures might be required. Failover generally follows these main steps:

  1. Shutting down the primary Enterprise Controller, if possible

  2. Preparing the secondary Enterprise Controller for failover

  3. Transferring the storage asset that holds the /var/opt/sun/xvm directory structure from the primary Enterprise Controller to the secondary Enterprise Controller

  4. Running the harestore program to restore the Oracle Enterprise Manager Ops Center configuration on the secondary Enterprise Controller

  5. Rebooting the secondary Enterprise Controller and starting Oracle Enterprise Manager Ops Center operations

You must prevent the primary Enterprise Controller from using the same IP addresses that the secondary Enterprise Controller uses. The harestore command configures the secondary Enterprise Controller to use the IP addresses of the primary Enterprise Controller. As you repair the primary Enterprise Controller, prevent it from accessing the networks where the secondary Enterprise Controller is operational.

Preparing the Secondary Enterprise Controller for Failover on Linux

Preparing the secondary Enterprise Controller before you transfer storage requires the following tasks:

  • Stopping Oracle Enterprise Manager Ops Center services

  • Unmounting LOFS mounts

  • Unsharing the /var/opt/sun/xvm/osp/share/allstart directory

  • Renaming the /var/opt/sun/xvm directory, and creating an empty /var/opt/sun/xvm directory

Perform the procedure To Prepare the Secondary Enterprise Controller for Failover, later in this section, on the secondary Enterprise Controller.

Shutting Down the Primary Enterprise Controller on Linux

Depending on the nature of the failure that you encounter, the primary Enterprise Controller might function well enough to allow you to shut down the system in an orderly manner. If it does, complete the following tasks:

  • Run the habackup program if possible

  • Stop Oracle Enterprise Manager Ops Center services

  • Unmount LOFS mounts

  • Unshare the /var/opt/sun/xvm/osp/share/allstart directory

  • Release control of the storage asset that stores the /var/opt/sun/xvm directory structure

  • Shut down the operating system of the failing Enterprise Controller

The steps to shut down the primary Enterprise Controller are intended to minimize damage to the primary Enterprise Controller and to the data that will transfer to the secondary Enterprise Controller. Some failures, however, might prevent you from shutting down the primary Enterprise Controller in an orderly manner. In that case, additional procedures might be required to transfer the storage asset to the secondary Enterprise Controller. The HA failover procedure is designed to succeed even if you cannot perform these procedures.

The cron utility runs the habackup program every hour on the Enterprise Controller. To obtain a current copy of the data that it collects, run habackup before you unmount directories and release the storage asset that holds the /var/opt/sun/xvm directory structure, if possible.

If Oracle Enterprise Manager Ops Center services are running on the primary Enterprise Controller, stop those services before you transfer a storage asset between the two systems.

The /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce directories are used as LOFS mount points for a set of directories within the /var/opt/sun/xvm directory structure. These LOFS mounts are created automatically when you install Oracle Enterprise Manager Ops Center. Before you release or transfer control of the storage asset that holds the /var/opt/sun/xvm directory structure, unmount these LOFS mounts.

The /var/opt/sun/xvm/osp/share/allstart directory is automatically shared when you install the Enterprise Controller. Before you release or transfer control of the storage asset that holds the /var/opt/sun/xvm directory structure, unshare the /var/opt/sun/xvm/osp/share/allstart directory.

Many storage solutions support HA configurations in Oracle Enterprise Manager Ops Center. Some storage solutions allow you to release control of a storage asset, to allow a different system to use the storage asset. The following example procedures use a Fibre Channel attached array. In this example, the primary and secondary Enterprise Controllers are both attached to the array.

Determine whether the storage solution for your HA configuration allows you to release control of the storage asset that stores the /var/opt/sun/xvm directory structure. If it does, release the storage asset before you shut down the primary Enterprise Controller.

Perform the procedure To Shut Down the Primary Enterprise Controller, later in this section, on the primary Enterprise Controller.

To Prepare the Secondary Enterprise Controller for Failover

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. Use the /opt/sun/xvmoc/bin/satadm command to check for and shut down Oracle Enterprise Manager Ops Center services on the secondary Enterprise Controller. For example:

    # /opt/sun/xvmoc/bin/satadm status
    online
    # /opt/sun/xvmoc/bin/satadm stop -w
    satadm: Shutting down satellite using SMF...
    satadm: Satellite services have stopped
    #
    

    In this example, Enterprise Controller services are online. On a secondary Enterprise Controller you typically install, but do not configure, Oracle Enterprise Manager Ops Center software. No Proxy Controller services run on a system that has Oracle Enterprise Manager Ops Center software installed but not configured. You can use the /opt/sun/xvmoc/bin/proxyadm command to check for and stop Proxy Controller services if required.

  2. Unmount the /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts. For example:

    # mount -v | grep xvm
    /var/opt/sun/xvm/uce/etc.opt on /etc/opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    /var/opt/sun/xvm/uce/var.opt on /var/opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    /var/opt/sun/xvm/uce/opt on /opt/SUNWuce type lofs read/write/setuid/devices/dev=1d80018 on Mon Feb 23 08:16:30 2009
    #
    # umount /etc/opt/SUNWuce
    # umount /var/opt/SUNWuce
    # umount /opt/SUNWuce
    # mount -v | grep xvm
    #
    

    In this example, the system was installed using one file system to hold all Oracle Enterprise Manager Ops Center software. No separate file system yet exists for the /var/opt/sun/xvm directory.

  3. Unshare the /var/opt/sun/xvm/osp/share/allstart directory. For example:

    # share
    - /var/opt/sun/xvm/osp/share/allstart ro "Allstart Share"
    - /var/js ro,anon=0 "Allstart Share"
    - /opt/SUNWjet ro,anon=0 "JET Framework"
    # unshare /var/opt/sun/xvm/osp/share/allstart
    # share
    - /var/js ro,anon=0 "Allstart Share"
    - /opt/SUNWjet ro,anon=0 "JET Framework"
    #
    
  4. Rename the /var/opt/sun/xvm directory to /var/opt/sun/xvm_secondary. For example:

    # mv /var/opt/sun/xvm /var/opt/sun/xvm_secondary
    #
    
  5. Create a new directory named /var/opt/sun/xvm. For example:

    # mkdir /var/opt/sun/xvm
    #
    

    This directory is used to mount the transferable storage that contains the /var/opt/sun/xvm directory structure from the primary Enterprise Controller.
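
Step 2 above unmounts the three LOFS mounts by name. As a hedged sketch, the mount points can instead be derived from the mount -v output, which helps when you want to confirm that nothing under /var/opt/sun/xvm is still mounted elsewhere. The function name xvm_lofs_mount_points is hypothetical, and the parsing assumes the line layout shown in the example output (source, the word on, mount point, the word type, file system type).

```shell
#!/bin/sh
# Sketch: print the mount points of LOFS mounts whose source lies under
# /var/opt/sun/xvm. Reads "mount -v" style output on standard input:
#   <source> on <mount point> type lofs <options> ...
# xvm_lofs_mount_points is a hypothetical helper name.
xvm_lofs_mount_points() {
    awk '$2 == "on" && $4 == "type" && $5 == "lofs" &&
         index($1, "/var/opt/sun/xvm/") == 1 { print $3 }'
}

# Example (not run here): unmount each LOFS mount in turn.
#   mount -v | xvm_lofs_mount_points | while read -r mp; do
#       umount "$mp"
#   done
```

The function only filters text, so you can review its output before you unmount anything.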

To Shut Down the Primary Enterprise Controller

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. Run the habackup program to create a current copy of the data that it collects. For example:

    # /opt/sun/xvm/bin/habackup
    Backup complete.
    #
    

    The habackup program runs successfully only if the Enterprise Controller services are running.

  2. Use the proxyadm and satadm commands to check for and shut down Oracle Enterprise Manager Ops Center services on the primary Enterprise Controller. Shut down Proxy Controller services before you shut down Enterprise Controller services. For example:

    # /opt/sun/xvmoc/bin/proxyadm status
    online
    # /opt/sun/xvmoc/bin/proxyadm stop -w
    proxyadm: Shutting down proxy using SMF...
    proxyadm: Proxy services have stopped
    #
    # /opt/sun/xvmoc/bin/satadm status
    online
    # /opt/sun/xvmoc/bin/satadm stop -w
    satadm: Shutting down satellite using SMF...
    satadm: Satellite services have stopped
    #
    

    In this example, both Proxy Controller and Enterprise Controller services are online. The services that the proxyadm command reports as online are associated with the Proxy Controller that is co-located with the Enterprise Controller.

  3. Unmount the /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts. For example:

    # mount -v | grep xvm
    OpsCenter/xvm on /var/opt/sun/xvm type zfs read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 07:59:22 2009
    /var/opt/sun/xvm/uce/etc.opt on /etc/opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    /var/opt/sun/xvm/uce/var.opt on /var/opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    /var/opt/sun/xvm/uce/opt on /opt/SUNWuce type lofs read/write/setuid/devices/dev=4010002 on Mon Feb 23 08:15:15 2009
    #
    # umount /etc/opt/SUNWuce
    # umount /var/opt/SUNWuce
    # umount /opt/SUNWuce
    # mount -v | grep xvm
    OpsCenter/xvm on /var/opt/sun/xvm type zfs read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 07:59:22 2009
    #
    

    In this example, the ZFS file system named OpsCenter/xvm is mounted as /var/opt/sun/xvm. The /etc/opt/SUNWuce, /var/opt/SUNWuce, and /opt/SUNWuce LOFS mounts use resources within /var/opt/sun/xvm.

  4. Unshare the /var/opt/sun/xvm/osp/share/allstart directory. For example:

    # share
    - /var/opt/sun/xvm/osp/share/allstart ro "Allstart Share"
    - /var/js ro,anon=0 "Allstart Share"
    - /opt/SUNWjet ro,anon=0 "JET Framework"
    # unshare /var/opt/sun/xvm/osp/share/allstart
    # share
    - /var/js ro,anon=0 "Allstart Share"
    - /opt/SUNWjet ro,anon=0 "JET Framework"
    #
    
  5. Release control of the storage asset that stores the /var/opt/sun/xvm directory structure. For example, if you use ZFS to store the /var/opt/sun/xvm directory structure, you can export the ZFS pool that contains /var/opt/sun/xvm:

    # zfs list
    NAME            USED  AVAIL  REFER  MOUNTPOINT
    OpsCenter       382M   134G  1.50K  none
    OpsCenter/xvm   382M   134G   382M  legacy
    # zpool export OpsCenter
    # zpool list
    no pools available
    # zfs list
    no datasets available
    #
    

    The /var/opt/sun/xvm directory unmounts automatically when you export the OpsCenter pool.

  6. If it is required for your particular storage solution, physically detach the storage asset from the primary Enterprise Controller.

  7. Shut down the operating system on the primary Enterprise Controller. For example, on Linux systems, you can use the poweroff command to shut down and power off the system:

    # poweroff
    
  8. Disconnect the primary Enterprise Controller from the networks to which it is connected. If you boot the primary Enterprise Controller to make repairs, this prevents the primary Enterprise Controller from using the same IP addresses as the secondary Enterprise Controller. You can temporarily assign new IP addresses to the primary Enterprise Controller if you need to attach it to your network.
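
Steps 1 and 2 of this procedure have an ordering constraint: habackup needs running Enterprise Controller services, and Proxy Controller services stop before Enterprise Controller services. The following is a minimal sketch of that order. DRY_RUN, run, and stop_primary_services are hypothetical names introduced for illustration; the command paths and the stop -w arguments are taken from the examples above. With DRY_RUN=1 the function only prints the commands.

```shell
#!/bin/sh
# Sketch: back up, then stop Proxy Controller services before Enterprise
# Controller services. DRY_RUN, run, and stop_primary_services are
# hypothetical names; command paths come from the examples above.
OC_BIN=/opt/sun/xvmoc/bin

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # print the command instead of running it
    else
        "$@"           # run the command
    fi
}

stop_primary_services() {
    # habackup needs running Enterprise Controller services, so it goes first.
    # The procedure says to run it "if possible", so a failure is not fatal here.
    run /opt/sun/xvm/bin/habackup || echo "habackup failed; continuing" >&2
    run "$OC_BIN/proxyadm" stop -w || return 1
    run "$OC_BIN/satadm" stop -w || return 1
}

# Example (not run here):
#   DRY_RUN=1; stop_primary_services
```

The sketch stops at the services; unmounting, unsharing, and releasing the storage asset depend on your storage solution and are left to the procedure above.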

Transferring a Storage Asset on Linux

Many storage solutions support HA configurations in Oracle Enterprise Manager Ops Center. Different storage solutions require different procedures. You must determine the procedures that are required to transfer the storage asset that stores the /var/opt/sun/xvm directory structure in your particular storage solution.

This example procedure assumes that the mountpoint property of the example ZFS file system is set to legacy, to resolve an issue regarding when ZFS and LOFS mounts take place in the system boot process. The legacy value indicates that the legacy mount and umount commands, and the /etc/fstab file, will control mounting and unmounting this ZFS file system. Other storage solutions typically use these legacy commands and the /etc/fstab file to control mounting and unmounting operations. See the Oracle Enterprise Manager Ops Center Release Notes for information about the LOFS race condition issue.

Perform the following procedure on the secondary Enterprise Controller.

To Transfer a Storage Asset to the Secondary Enterprise Controller

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

  1. If required for your particular storage solution, physically attach the storage asset to the secondary Enterprise Controller.

  2. Transfer the storage asset that stores the /var/opt/sun/xvm directory structure to the secondary Enterprise Controller.

  3. Mount the Oracle Enterprise Manager Ops Center directory, and then verify that it is mounted. For example:

    # mount /var/opt/sun/xvm
    # mount
    (output omitted)
    /var/opt/sun/xvm on OpsCenter/xvm read/write/setuid/devices/exec/xattr/atime/dev=4010002 on Mon Feb 23 09:51:55 2009
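
To verify the mount in a script rather than by reading the mount output, a check such as the following can be used. It is a sketch under stated assumptions: the function name is_mounted is hypothetical, and it accepts either of the two line layouts that appear in this chapter, mount point on device and device on mount point type.

```shell
#!/bin/sh
# Sketch: succeed if the given directory appears as a mount point in
# mount-style output read from standard input. is_mounted is a hypothetical
# helper name. Accepts "<mount point> on <device> ..." and
# "<device> on <mount point> type ..." lines; a line whose device field
# equals the directory would also match.
is_mounted() {
    awk -v mp="$1" '
        $2 == "on" && ($1 == mp || $3 == mp) { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example (not run here):
#   mount | is_mounted /var/opt/sun/xvm && echo "mounted"
```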
    

Completing HA Failover on Linux

After the storage asset that holds the /var/opt/sun/xvm directory structure has been transferred to the secondary Enterprise Controller, you must run the harestore program, reboot the secondary Enterprise Controller, and start Oracle Enterprise Manager Ops Center services.

After you complete the HA failover procedure, the secondary Enterprise Controller is listed in the Assets pane as controller_0, where controller is the host name of the primary Enterprise Controller.

Note:

The following procedure is for a sample storage solution. You must determine the specific procedures to use, depending on the storage solution that you have implemented.

To Run the harestore Program

Run the harestore program to restore the Oracle Enterprise Manager Ops Center configuration from the primary Enterprise Controller onto the secondary Enterprise Controller. The harestore program is located in the /opt/sun/xvm/bin/ directory. For example:

# /opt/sun/xvm/bin/harestore
./osp/clientPkgs/Sun_Microsystems,_Inc.-allstart-3.3-1/base-mgmt-allstart-1.0-15.noarch.rpm: No mapping for uid 63333
Restore complete. You need to reboot now.
#

To Reboot the Secondary Enterprise Controller and Start Services

After you run the harestore program, you must reboot the secondary Enterprise Controller and start Oracle Enterprise Manager Ops Center services on that system.

  1. Reboot the operating system on the secondary Enterprise Controller. For example, on Linux systems, you can use the reboot command to reboot the system:

    # reboot
    
  2. Use the satadm command to start Oracle Enterprise Manager Ops Center Enterprise Controller services. For example:

    # /opt/sun/xvmoc/bin/satadm start -w
    satadm: Starting satellite with SMF...
    satadm: Satellite services have started
    #
    
  3. If the configuration from the primary Enterprise Controller included an enabled co-located Proxy Controller, use the proxyadm command to start Proxy Controller services. For example:

    # /opt/sun/xvmoc/bin/proxyadm start -w
    proxyadm: Starting proxy with SMF...
    proxyadm: Proxy services have started
    #
    

Restoring the Primary Enterprise Controller

After the primary Enterprise Controller has been repaired, you can either establish it as a new secondary Enterprise Controller, or return operation to it by taking the following actions:

  1. Shutting down the secondary Enterprise Controller

  2. Preparing the primary Enterprise Controller for failover

  3. Transferring the storage asset that holds the /var/opt/sun/xvm directory structure from the secondary Enterprise Controller to the primary Enterprise Controller

  4. Running the harestore program to restore the Oracle Enterprise Manager Ops Center configuration on the primary Enterprise Controller

  5. Rebooting the primary Enterprise Controller and starting Oracle Enterprise Manager Ops Center operations

Caution:

You must prevent the primary and secondary Enterprise Controllers from using the same IP addresses at the same time.