Oracle® Enterprise Manager Ops Center

Recovering Logical Domains from a Failed Server

Release 12.1.1.0.0

E36058-01

August 2012

This guide provides an end-to-end example of how to use Oracle Enterprise Manager Ops Center to recover logical domains from a failed server.

Introduction

Oracle Enterprise Manager Ops Center provides options to install and configure Oracle VM Server for SPARC systems, create logical domains, and provision an operating system on the logical domains. You can place the Oracle VM Server for SPARC systems in a server pool, which provides load balancing, high availability, and resource sharing among all members of the pool.

The high availability capability for an Oracle VM Server for SPARC server pool is enhanced by allowing the automatic recovery of the logical domains on a failed server.

In Oracle Enterprise Manager Ops Center, you can recover the logical domains from failed and unreachable Oracle VM Server for SPARC systems. You can enable automatic recovery for the logical domains and set the priority of recovery. The automatic recovery priority decides the order of recovery of the logical domains. Zero (0) is the lowest automatic recovery priority while 100 is the highest. When an Oracle VM Server Control Domain fails, the logical domains in it are recovered and started on another Control Domain in the server pool.
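The priority rule described above can be illustrated with a minimal Python sketch. This is not an Ops Center API: the `LogicalDomain` class, the `recovery_order` function, and the domain names `db`, `web`, and `test` are hypothetical and exist only for illustration. The guide does not state how domains with equal priority are ordered, so the sketch simply keeps their input order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDomain:
    name: str
    auto_recovery: bool  # True when automatic recovery is enabled
    priority: int        # 0 is the lowest priority, 100 is the highest


def recovery_order(domains):
    """Return the domains enabled for automatic recovery, highest priority first."""
    eligible = [d for d in domains if d.auto_recovery]
    # sorted() is stable, so domains with equal priority keep their input order.
    # Tie-breaking in the product itself is not described in this guide.
    return sorted(eligible, key=lambda d: d.priority, reverse=True)


# Hypothetical domains, used only to show the ordering.
domains = [
    LogicalDomain("web", auto_recovery=True, priority=40),
    LogicalDomain("test", auto_recovery=False, priority=0),
    LogicalDomain("db", auto_recovery=True, priority=100),
]
print([d.name for d in recovery_order(domains)])  # prints ['db', 'web']
```

The domain that is not enabled for automatic recovery is left out of the order entirely, which matches the manual recovery case covered later in this guide.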

There are many scenarios and conditions that determine the recovery of the logical domains. In this example, two such scenarios are described:

  • Automatic recovery of logical domains

  • Manual recovery of logical domains

For more information, refer to the Oracle Enterprise Manager Ops Center Feature Reference Guide.

In this example, the Control Domain is placed in a server pool and has logical domains running in it. When the Control Domain becomes unreachable, the logical domains that have been enabled for automatic recovery are recovered and started in another Control Domain in the server pool automatically. When the logical domains are not enabled for automatic recovery, the logical domains can be recovered from the failed Control Domain using the manual procedure described in this guide.

What You Will Need

You need the following to demonstrate the recovery of the logical domains:

  • Two Oracle VM Server for SPARC servers installed and configured using Oracle Enterprise Manager Ops Center.

  • A server pool that contains both Oracle VM Server for SPARC servers.

  • Two logical domains installed and configured on one of the Oracle VM Server for SPARC systems using Oracle Enterprise Manager Ops Center.

Hardware and Software Configuration

The Oracle VM Server for SPARC systems have the following configuration:

  • In this example, the servers named smt4-14 and smt4-15 are installed and configured with Oracle VM Server for SPARC 2.2 using Oracle Enterprise Manager Ops Center.

    [Figure: ldom_server_version.png]

  • The Control Domains are placed in a server pool with the following policies:

    • Place guest in Oracle VM Server with lowest relative load.

    • Do not automatically balance the server pool.

    • Power off a failed server from Service Processor, given capabilities, before automatic recovery of attached logical domains.

  • Two logical domains, guest1 and guest2, are created in the Control Domain smt4-15.

    [Figure: logical_domain_view.png]
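The placement policy listed above, which places a guest on the Oracle VM Server with the lowest relative load, can be sketched as follows. This is a simplified illustration rather than the product's implementation: the function `pick_placement` and the load figures are assumptions, and the guide does not say how Ops Center measures relative load.

```python
def pick_placement(relative_load, exclude=()):
    """Return the server with the lowest relative load, skipping excluded servers."""
    candidates = {
        host: load for host, load in relative_load.items() if host not in exclude
    }
    if not candidates:
        raise ValueError("no eligible server in the pool")
    # min() over the keys, compared by each server's load value.
    return min(candidates, key=candidates.get)


# Hypothetical load figures for the two servers in this example.
relative_load = {"smt4-14": 0.20, "smt4-15": 0.35}

print(pick_placement(relative_load))                       # prints smt4-14
print(pick_placement(relative_load, exclude={"smt4-14"}))  # prints smt4-15
```

The `exclude` argument stands in for servers that cannot host the guest, for example a server that has failed.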

Recovering Logical Domains

In this example, the following two scenarios are described:

  • Automatic recovery

  • Manual recovery

There are two logical domains, guest1 and guest2, in this example. The logical domain guest1 is designated for manual recovery and the logical domain guest2 for automatic recovery. The Control Domain smt4-15, in which the logical domains reside, becomes unreachable. The following sections show how each recovery procedure is executed.

Automatic Recovery of Logical Domains

To recover the logical domains automatically, you must enable the automatic recovery of the logical domains. You can enable the automatic recovery of logical domains in the following ways:

  • Set the automatic recovery option when you create the logical domain profile. Select automatic recovery and then provide the priority value in the logical domain profile.

  • Select the logical domain and use the option Enable Automatic Recovery in the Actions pane so that the logical domain is recovered automatically when a server fails. Edit the Automatic Recovery Priority using the Edit Attributes option for a logical domain. The Enable Automatic Recovery option is shown in the figure below.

    [Figure: disabled_auto.png]

In this example, the logical domain guest2 is enabled for automatic recovery with an Automatic Recovery Priority of 100, which is the highest priority for recovery.

[Figure: guest2_recovery.png]

When an Oracle VM Server for SPARC system in the server pool fails, the logical domains that have been enabled for automatic recovery are recovered and started on another Oracle VM Server in the server pool without any user intervention.

When the Control Domain smt4-15 fails and becomes unreachable, the automatic recovery of the logical domain guest2 is triggered. The status of smt4-15 is unreachable as shown in the figure below.

[Figure: unreach_status_view.png]

You can view the job running in the job pane.

[Figure: auto_recov_job.png]

Select the job and view the job details such as the task flow execution.

[Figure: recovery_job_details.png]

The job details show that the server smt4-15 is powered off according to the server pool policy. The recovery of the logical domain guest2 is then initiated, and the domain is created successfully on the Control Domain smt4-14 in the server pool. When the logical domain guest2 is recovered, the server pool status is as in the following figure:

[Figure: after_recovery1.png]

You can see that the logical domain guest2 is recovered and running on the Control Domain smt4-14. The Control Domain smt4-15 is in unreachable status and the logical domain guest1 has disappeared from the list.
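The task flow seen in the job details can be modelled with a short Python sketch. It only illustrates the order of events described in this example: the function `plan_recovery` and the task strings are hypothetical, and the real job is run by Oracle Enterprise Manager Ops Center, not by user code.

```python
def plan_recovery(failed, target, guests, power_off_first=True):
    """Build an ordered task list for recovering guests from a failed server.

    guests maps each guest name to its automatic recovery flag.
    """
    tasks = []
    if power_off_first:
        # Server pool policy in this example: power off the failed server
        # from the Service Processor before automatic recovery.
        tasks.append(f"power off {failed}")
    for name, auto_recovery in guests.items():
        if auto_recovery:
            tasks.append(f"recover {name} on {target}")
    return tasks


tasks = plan_recovery(
    failed="smt4-15",
    target="smt4-14",
    guests={"guest1": False, "guest2": True},
)
for task in tasks:
    print(task)
# prints:
# power off smt4-15
# recover guest2 on smt4-14
```

Because guest1 is not enabled for automatic recovery, no task is created for it, and it stays attached to the unreachable Control Domain until it is recovered manually.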

When the logical domain is recovered on the other host in the server pool, Oracle Enterprise Manager Ops Center automatically boots the operating system of the logical domain. Allow some time for the logical domain to start on the new virtualization host while its operating system boots.

When the failed server is repaired and restarted, the logical domains that were not recovered are started in the Control Domain. For the logical domains that are recovered and running on other servers, Oracle Enterprise Manager Ops Center cleans up the repaired server and removes those logical domains.
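The behavior after a repair can be summarized in a small Python sketch. The function `reconcile_repaired_server` is a hypothetical illustration of the rule stated above, not part of Ops Center.

```python
def reconcile_repaired_server(domains_on_server, recovered_elsewhere):
    """Split the domains of a repaired server into those to start and those to remove.

    Domains that were recovered on another server are removed from the
    repaired server; the remaining domains are started in place.
    """
    to_start = [d for d in domains_on_server if d not in recovered_elsewhere]
    to_remove = [d for d in domains_on_server if d in recovered_elsewhere]
    return to_start, to_remove


# In this example, guest2 was recovered on smt4-14 and guest1 was not recovered.
to_start, to_remove = reconcile_repaired_server(["guest1", "guest2"], {"guest2"})
print(to_start)   # prints ['guest1']
print(to_remove)  # prints ['guest2']
```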

In a scenario where you cannot repair the failed server, you must manually recover the logical domains.

Manual Recovery of Logical Domains

When you have not enabled automatic recovery of the logical domains, or the server pool does not have enough resources to recover them, use the manual procedure to recover the logical domains.

When the Control Domain smt4-15 becomes unreachable, do not try to remove it from the server pool using the option Remove from Server Pool. You cannot remove a Control Domain with running guests from a server pool.

[Figure: cannot_remove_frm_pool.png]

As described in the previous section, the logical domain guest1 was not enabled for automatic recovery. Use the following procedure to manually recover the logical domain.

  1. Verify that the Control Domain is unreachable and that the server is already powered off according to the server pool policy. If the server is still powered on, power off the Control Domain.

    [Figure: power_off.png]

  2. Select All Assets in the System Group filter in the Navigation pane.

  3. Select the Managed Assets tab in the center pane.

  4. Select the unreachable Control Domain from the list. Ensure that you select the Control Domain and not the operating system of the Control Domain.

  5. Click the Delete Assets icon to delete the asset.

    [Figure: delete_asset.png]

  6. Click Delete to confirm the delete action.

    [Figure: delete_the_asset.png]

    The Unmanage Asset wizard is displayed.

  7. Oracle Enterprise Manager Ops Center requires the credentials to delete the Agent Controller installed on the asset. Though the asset is unreachable, you must provide the credentials to continue the wizard. Select the credential and then click Next.

    [Figure: unmanage_asset.png]

  8. Click Finish to unmanage the asset.

    [Figure: finish_unmanage_asset.png]

The delete asset job is carried out, and the service processor and the Control Domain disappear from the assets tree. Select the server pool in which the Control Domain was originally placed. The logical domain guest1 appears in the server pool under the Shutdown Guests list. You can start the logical domain in the required virtualization host in the server pool.

[Figure: result_guest1.png]

From the figure, you can see that the logical domain guest2, which was enabled for automatic recovery, was recovered and is running in another Control Domain in the server pool. The logical domain guest1 is also recovered and is available as a shut down guest in the server pool.

What's Next?

Use the option Start Guest to start the shut down logical domain on an Oracle VM Server in the server pool.

Related Articles and Resources

The Oracle Enterprise Manager Ops Center 12c documentation is located at http://www.oracle.com/pls/topic/lookup?ctx=oc121.

See the following guides for more information:

Other examples are available at http://docs.oracle.com/cd/E27363_01/nav/howto.htm.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.

Access to Oracle Support

Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.


Copyright © 2007, 2012, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.