2 Introduction to Oracle Site Guard

In this section, you learn about Oracle Site Guard and its benefits.

Representation of a Site in Enterprise Manager Cloud Control Console

A site is a logical grouping of software components and associated hardware that run one or more user applications.

A site could consist, for example, of a collection of servers (hosts) that are used to deploy Oracle Fusion Middleware instances, Oracle Fusion Application instances, and Oracle databases, along with the associated storage for these software components. Oracle Site Guard uses the Enterprise Manager Cloud Control generic system target to represent a site. Every site, whether primary or standby, is represented as a Generic System, which is a collection of other target types, such as Oracle Database and Oracle Fusion Middleware Domain. Oracle Site Guard only supports Enterprise Manager deployments where both primary and standby sites are managed by the same Enterprise Manager Cloud Control deployment.

The following picture illustrates the main portions of an Oracle Fusion Middleware Disaster Recovery topology managed by the same Enterprise Manager Cloud Control deployment.

Figure 2-1 Primary (Production) and Standby Site for Oracle Fusion Middleware Disaster Recovery Topology Managed by Enterprise Manager Cloud Control

Primary and Standby sites

The main aspects of an Oracle Fusion Middleware Disaster Recovery topology are as follows:

  • A single Enterprise Manager Cloud Control monitors the primary site and the standby site.

  • Oracle Management Agent (EM Agent) is installed on local (non-replicated) storage on all hosts on the primary site and the standby site.

    For example:

    • Web Tier managed system components (WEBHOST1 and WEBHOST2)

    • Oracle Fusion Middleware Applications (APPHOST1 and APPHOST2)

    • Oracle RAC Database (RAC DBHOST1 and RAC DBHOST2)

    Oracle Management Agent (EM Agent) is one of the core components of Enterprise Manager Cloud Control that enables you to convert an unmanaged host to a managed host in the Enterprise Manager system. The Management Agent works in conjunction with Enterprise Manager plug-ins to manage the targets running on that managed host.

Oracle Site Guard Features

Oracle Site Guard features include extensibility, storage integration, execution monitoring, and credential management.

Oracle Site Guard main features include the following:

Extensibility

Oracle Site Guard allows extending the built-in disaster recovery functionality by allowing you to insert custom scripts at specific points in the operation workflow.

Extensibility provides a mechanism to perform customized, site-specific, and operation-specific activities during a disaster recovery operation.

Any number of scripts can be configured for extensibility. The time and manner in which these user-defined scripts are executed and the sequence in which they are executed can be configured by selecting the script type.

Types of Scripts for Extensibility

Oracle Site Guard offers several kinds of scripts with which you can extend its functionality.

To customize and extend Oracle Site Guard functionality, use any of the following scripts:

Custom Precheck Scripts

These scripts are provided by the user. They are used to perform user-defined activities during the Precheck or Health Check phase that occurs before an operation plan executes. Custom Precheck Scripts are executed as part of a Precheck or Health Check.

Pre Scripts

These scripts are provided by the user. They are used to perform user-defined activities at the beginning of site-specific operations in an operation plan. Pre Scripts are executed before Oracle Site Guard performs any target-related operations at a site.

Post Scripts

These scripts are provided by the user. They are used to perform user-defined activities at the end of site-specific operations in an operation plan. Post Scripts are executed after Oracle Site Guard performs any target-related operations at a site.

Global Pre Scripts

These scripts are provided by the user. They are used to perform user-defined operation-specific activities at the beginning of an operation plan. Global Pre Scripts are executed before Oracle Site Guard begins any operation at the first site (usually the primary site).

Global Post Scripts

These scripts are provided by the user. They are used to perform user-defined operation-specific activities at the end of an operation plan. Global Post Scripts are executed after Oracle Site Guard has completed performing operations on the last site (usually a standby site).

Mount/Unmount Scripts

These scripts are bundled with Oracle Site Guard, but you can also define your own scripts. They are used to perform mount and unmount operations on file systems during an operation. Unmount scripts are executed after all services and applications have been stopped at the primary site. Mount scripts are executed before any services or applications are started at the standby site.

Storage Scripts

These scripts are bundled with Oracle Site Guard, but you can also define your own storage scripts. They are used to perform storage role-reversal activities for Oracle Sun ZFS Appliance during a disaster-recovery operation. Storage Switchover scripts are executed during a switchover operation and they execute at the standby site before any mount scripts are executed. Storage Failover scripts are executed during a failover operation and they execute at the standby site before any mount scripts are executed.

Table 2-1 provides an overview of the various types of scripts used when you set up sites with Oracle Site Guard.

Figure 2-2 and Figure 2-3 provide a visual representation of the source of the scripts and their functions.

Table 2-1 Types of Scripts Used by Oracle Site Guard

Type of Script                                                  Provided by the User? (Custom Scripts)  Provided with Oracle Site Guard? (Bundled Scripts)
Custom Precheck Script                                          Yes (optional)                          No
Pre Script, Post Script, Global Pre Script, Global Post Script  Yes (optional)                          No
Mount and Unmount Scripts                                       Yes (optional)                          Yes; must be configured by the user
Storage Switchover and Storage Failover Scripts                 Yes (optional)                          Yes; only for Oracle Sun ZFS and NetApp MetroCluster; must be configured by the user

Figure 2-2 Oracle Site Guard Scripts: What They Do

Oracle Site Guard scripts

Figure 2-3 Oracle Site Guard Scripts: Who Provides Them

Oracle Site Guard scripts

Sequence of Script Execution

The execution workflow of an Oracle Site Guard operation varies according to the operation carried out: switchover, failover, or start and stop.

Figure 2-4, Figure 2-5, and Figure 2-6 show the sequence in which Oracle Site Guard executes user-defined scripts in different kinds of operations.

Figure 2-4 Execution Sequence of Scripts for Switchover Operation

Switchover operations

Figure 2-5 Execution Sequence of Scripts for Failover Operation

Failover operations

Note:

The optional scripts that are executed at the Primary site during a failover are the same as those executed at the Primary site during a switchover operation. The scripts at the Primary site are only executed as part of the failover operation if the user chooses to stop the Primary site during the failover.

Figure 2-6 Execution Sequence of Scripts for Start or Stop Operation

Start or stop operations

Note:

Custom Precheck scripts are scheduled to run on the Primary site for a Failover operation. However, because the Primary site might be inaccessible or non-operational, these scripts are set to run in Continue on Error mode.

Configuring Script Paths

Configure the path of your user-defined scripts with the appropriate format according to the script type and runtime behavior.

Oracle Site Guard determines the location (path) of the script using the configuration path and type of script you provide. Table 2-2 illustrates how to configure various types of scripts, the corresponding script path that the user needs to specify, and the component that is extracted and used by Oracle Site Guard. Only the script path formats listed in the following tables are supported.

Table 2-2 Script Paths in Enterprise Manager Software Library

Script Type    User Configured Path                              Script Path Extracted by Oracle Site Guard

Shell script   sh swlib_script.sh                                swlib_script.sh
               sh ./swlib_script.sh
               sh ./swlib_script.sh -
               sh ./swlib_script.sh -option1 -option2
               /home/bash swlib_script.sh
               /home/bash swlib_script.sh -a param1 -b param2

Perl script    perl swlib_script.pl                              swlib_script.pl
               perl swlib_script.pl -a param1 -b param2
               /home/perl swlib_script.pl
               /home/perl swlib_script.pl -a param1 -b param2

Python script  python swlib_script.py                            swlib_script.py
               python swlib_script.py -a param1 -b param2
               /home/python swlib_script.py
               /home/python swlib_script.py -a param1 -b param2

Table 2-3 Script Paths in Custom Scripts

Script Type    User Configured Path                                            Script Path Extracted by Oracle Site Guard

Shell script   sh /home/oracle/custom_script.sh                                /home/oracle/custom_script.sh
               /home/oracle/custom_script.sh
               /home/bash /home/oracle/custom_script.sh
               /home/bash /home/oracle/custom_script.sh -a param1 -b param2
               /home/bash /home/oracle/custom_script                           /home/oracle/custom_script
               /home/oracle/custom_script

Perl script    perl /home/oracle/custom_script.pl                              /home/oracle/custom_script.pl
               /home/perl /home/oracle/custom_script.pl -a param1 -b param2

Python script  /home/python /home/oracle/custom_script.py                      /home/oracle/custom_script.py
               /home/python /home/oracle/custom_script.py -a param1 -b param2
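The relationship between the user-configured path and the extracted script path in Table 2-2 and Table 2-3 can be approximated with a short Python sketch. This is an illustration only, not the parsing logic that Oracle Site Guard actually uses: the function name extract_script_path and the list of interpreter names are assumptions made for this example, chosen so that the sample rows from both tables produce the paths shown.

```python
import shlex
from pathlib import PurePosixPath

# Assumption for this sketch: a leading token whose base name is one of
# these is treated as the interpreter, not as the script itself.
INTERPRETERS = {"sh", "bash", "perl", "python"}


def extract_script_path(configured_path: str) -> str:
    """Return the script portion of a user-configured script path."""
    tokens = shlex.split(configured_path)
    if not tokens:
        raise ValueError("empty script path")

    # "sh script.sh ..." and "/home/bash script.sh ..." both start with an
    # interpreter; "/home/oracle/custom_script.sh" does not.
    first_name = PurePosixPath(tokens[0]).name
    if first_name in INTERPRETERS and len(tokens) > 1:
        script = tokens[1]
    else:
        script = tokens[0]

    # Table 2-2 shows "./swlib_script.sh" reduced to "swlib_script.sh".
    if script.startswith("./"):
        script = script[2:]
    return script
```

Anything after the script token (options such as -a param1) is ignored by the sketch, which matches the extracted paths listed in the tables.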

Prechecks and Health Checks

Ensure that the standby site is ready to perform the production role before initiating any disaster recovery operation by running prechecks and health checks.

The success of a disaster recovery plan depends on how accurately the plan represents the environment it is supposed to protect. Topology changes and configuration drift in the protected site can cause the disaster-recovery operation plan to lose synchronization with the environment, and can render the plan partially or completely ineffective. Frequently, this divergence, between the disaster-recovery plan and the environment being protected, is not discovered until an actual disaster-recovery attempt is in progress.

Oracle Site Guard addresses this problem by offering Prechecks and Health Checks:

Prechecks

A Precheck is an on-demand, automated procedure that assesses the disaster recovery readiness of a site.

A Precheck can be executed by itself (stand-alone mode) to check if a selected operation plan will succeed. It can also be invoked before an operation plan is executed. In the latter case, if the Precheck fails, the operation plan is not executed. Prechecks invoked before an operation plan are optional and can be skipped if desired.

Health Checks

A Health Check is a precheck that is scheduled to run periodically to provide an ongoing assessment of disaster recovery readiness.

A health check must be configured for a specified operation plan and must have a user-specified schedule associated with it. For example, you might set up a health check associated with the Switchover to Standby Site plan to run every Wednesday and Saturday at 12:30 am to monitor the fidelity of that operation plan on an ongoing basis. You can also choose to be notified of health check results through e-mail.

Each configured operation plan can have an associated health check, and health checks for different plans execute independently of each other. You can stop health checks for an operation plan at any time.

Oracle Site Guard performs the following checks during Prechecks and Health Checks:

  • Checks whether all the hosts involved in the planned disaster recovery operation are reachable. During this check, Oracle Site Guard logs into each host using the credentials configured for that host. This ensures that the host is reachable and can be accessed for executing directives and scripts.

  • Checks whether the primary and standby databases are configured correctly and Data Guard protection is functioning correctly. This check verifies the following:

    • The primary and standby database names are correct.

    • The database login credentials are correct.

    • Data Guard broker is ready to switch over the database.

    • Database Flashback status is set to ON.

    • Data Guard Redo and Transport Lags are within the limits specified by the user.

  • Checks whether the ZFS storage replication is functioning correctly. This check verifies the following:

    • The replication lags are within the limits specified by the user.

    • The source and destination ZFS appliances are reachable.

    • The login credentials are valid.

    • The replication action is configured correctly.

  • Checks whether user scripts are configured correctly by verifying whether each configured user script is found at the correct location.

  • Checks whether replicated file systems can be mounted during a switchover or failover. To confirm this, the check verifies that the file system mount points exist and can be accessed for mount operations.

  • Checks whether the Data Guard and ZFS replication lags are within the bounds specified by the user.

Note:

An associated Precheck is automatically created for every operation plan that is created. However, a health check must be explicitly scheduled for an operation plan.

Customizing Prechecks and Health Checks

You can customize built-in Prechecks and Health Checks by adding custom (user-defined) scripts that execute in any of those operations.

This allows you to enhance Oracle Site Guard Prechecks and Health Checks by inserting prechecks for third-party components that need to be included in the disaster recovery workflow. Custom precheck scripts function the same way as built-in Prechecks. If a custom precheck script detects an anomaly and returns an error, that precheck step is regarded as failed and, depending on how the script is configured (for example, if the script execution step is configured with the attribute Stop on Error), the disaster recovery operation may be aborted.
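As an illustration of what a custom precheck script might look like, the following Python sketch verifies that a set of directories exists on the host. It is a minimal sketch under stated assumptions: the function name precheck and the choice of check are hypothetical, and the example assumes that a non-zero exit status is how the script reports an error. Confirm the expected error convention for your Oracle Site Guard release before relying on it.

```python
import os
import sys


def precheck(mount_points):
    """Return 0 if every path is an existing directory, otherwise 1."""
    missing = [path for path in mount_points if not os.path.isdir(path)]
    for path in missing:
        # Write the reason to stderr so that it appears in the step output.
        print(f"ERROR: mount point not found: {path}", file=sys.stderr)
    # Assumption: a non-zero status marks the precheck step as failed.
    return 1 if missing else 0


# A real script would end by passing its arguments to the check, for example:
#   sys.exit(precheck(sys.argv[1:]))
```

Whether a failure of this script aborts the operation then depends on the error mode configured for the script execution step.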

Lag Checks

The efficiency and timeliness of the data replication between the primary and standby sites depend on many factors, including network bandwidth, congestion, latency, storage appliance load, amount of replicated data, and so on.

Disaster Recovery configurations typically include one or more storage appliances and data stores that are used for data storage by the application and database tiers. To make this data available at the standby site in the event of disaster recovery, these data stores are replicated from the primary to standby site using either continuous or periodic replication. To perform a successful site switchover or failover, Oracle Site Guard must also perform storage role reversal as part of the disaster recovery process.

It is not uncommon for a certain amount of lag to be present between the source data at the primary site and the replicated data at the standby site. Oracle Site Guard provides a mechanism to configure the amount of replication lag that is permissible before a disaster recovery operation can begin execution. During the Precheck phase of a disaster recovery operation, Oracle Site Guard checks the current replication lag. If the lag exceeds the user-specified threshold, Oracle Site Guard does not execute the disaster recovery operation.

You can configure the following lag check parameters:

Database Lag Check

This parameter specifies the permissible lag for Redo Apply and Redo Transport, which are managed by Oracle Data Guard.

ZFS Lag Check

By default, Oracle Site Guard determines the proper lag value for application-tier storage replication, which is managed by ZFS. Alternatively, you can use the ZFS lag check parameter to specify a permissible lag value.
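The lag check described in this section is a threshold comparison: the operation proceeds only if every measured lag is within its permitted limit. The following Python sketch shows that decision in isolation. The class LagReading, the function names, and the use of seconds as the unit are assumptions for this example and do not correspond to any Oracle Site Guard API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LagReading:
    name: str                  # for example "Redo Apply" or "ZFS replication"
    current_seconds: float     # lag measured during the Precheck (assumed unit)
    threshold_seconds: float   # user-specified permissible lag (assumed unit)


def lag_violations(readings: List[LagReading]) -> List[str]:
    """Return the names of all readings whose lag exceeds its threshold."""
    return [r.name for r in readings if r.current_seconds > r.threshold_seconds]


def may_execute(readings: List[LagReading]) -> bool:
    """The operation may proceed only if no lag exceeds its threshold."""
    return not lag_violations(readings)
```

For example, a Redo Transport lag of 45 seconds against a 30 second threshold would be reported as a violation, and may_execute would return False.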

Storage Integration

During a disaster recovery, the storage replication direction must be reversed and storage appliances must be reconfigured before applications can be migrated to a standby site.

Managing storage operations is an essential part of a disaster recovery. Oracle Site Guard offers storage management and integration options for various storage technologies.

The following sections describe the Oracle Site Guard storage integration options:

Oracle Sun ZFS

Oracle Site Guard provides a built-in script to orchestrate Oracle Sun ZFS storage role reversals.

If you are deploying Oracle Sun ZFS storage appliances, you can use the bundled storage management zfs_storage_role_reversal.sh script to orchestrate Oracle Sun ZFS storage role reversal as part of Oracle Site Guard disaster recovery operations.

NetApp MetroCluster

Oracle Site Guard provides a built-in script to orchestrate NetApp MetroCluster storage role reversals.

If you have deployed a NetApp MetroCluster Disaster Recovery configuration, you can use the bundled NetApp storage management siteguard_netapp_control.sh script to orchestrate NetApp MetroCluster storage role reversal as part of Oracle Site Guard disaster recovery operations. For details, see MOS note titled Oracle Site Guard Feature For NetApp MetroCluster (Doc ID 1964220) at https://support.oracle.com.

Integrating Other Storage Types

Oracle Site Guard offers integration with storage technologies by allowing you to incorporate your own custom storage management scripts into Oracle Site Guard operation plans.

You can implement storage role reversal for third-party storage technologies by invoking your own custom storage management scripts during the storage script execution phase of the operation plan execution.

Mount and Unmount Scripts

Oracle Site Guard provides a built-in script to mount and unmount file systems and allows you to use custom scripts to manage file systems.

In addition to integrating with storage technologies, Oracle Site Guard allows you to incorporate your own scripts to manage file systems. For example, during a switchover operation, file systems that are used by a multi-tier application are unmounted at the primary site after the application is stopped, and replicated versions of those file systems are then mounted at the standby site before the application is started. These unmount and mount operations for application servers at the primary and standby sites can be orchestrated using the built-in mechanism for integrating scripts. Oracle Site Guard provides the mount_umount.sh script for file system mount and unmount operations. Alternatively, you can define your own custom script to be invoked at appropriate points in the operation plan.

Standby Site Validation

Standby Site Validation allows you to convert your standby site into a fully functional site, so you can test and validate standby sites.

In a normal Site Guard disaster recovery configuration, the standby site is offline and unavailable for business operations.

To open a standby site for validation, configure and execute an Open for Validation type of operation plan for the site. After testing and validation are complete, you can revert the site to a standby role by configuring and executing a Revert to Standby type of operation plan.

When opening a standby site for validation, Oracle Site Guard:

  • Converts the standby database from a physical standby database to a snapshot standby database. In this mode, the Data Guard redo logs are still shipped from the primary to the standby, but the logs are not applied to the standby database. The accumulated redo logs are applied after the database converts back to a physical standby database (after executing a Revert to Standby operation in Oracle Site Guard).

  • Clones ZFS replicated projects that are part of this Site Guard configuration to create a readable and writable copy of the replicated project. The file systems in this cloned project are then mounted for use by applications at the standby site. While applications at the standby site read from or write to the cloned project, the ZFS replication from the primary site to the standby site (that was originally configured) continues with no interruptions. When the standby site that was opened for validation is closed with a Revert to Standby operation, the cloned file systems are unmounted and the ZFS clones that were created as part of the Open for Validation operation are destroyed.

  • Executes all configured Global Pre Scripts, Global Post Scripts, Pre Scripts, Post Scripts, and Custom Precheck Scripts as it would in any other operation plan.

When a standby site is opened for validation, the Recovery Point Objective (RPO) remains unaffected because database redo transport and ZFS storage replication continue uninterrupted as configured. No transaction data at the primary site is lost. However, the Recovery Time Objective (RTO) is affected because the standby site is not immediately available to accept an incoming switchover or failover. The standby site must first be reverted to a (normal) Standby mode before the primary site can switch over or fail over to the standby site.

The ability to open a standby site in validation mode offers the following benefits:

  • It increases your confidence that the disaster recovery configuration is correct and provides a way to verify that the standby site can become operational and meets your expectations.

  • It increases resource utilization by using standby sites for testing patches, validating new configurations, and generating analytics and reports.

Caution:

Note the following important points regarding a standby site opened for validation:

  • A standby site that is opened for validation is not available as a disaster recovery site. It must be reverted back to a standby role (with Revert to Standby) before it can accept an incoming switchover or failover from the primary site.

  • A standby site that is opened for validation must not be used for production activities (customer traffic) because any transactions that occur in the site will be discarded when the site reverts to a standby site.

  • When a physical standby has a RedoRoutes property assigned to the primary database, it must be specified as (LOCAL : ... ) in the rule. If not, Data Guard broker will not allow the conversion to a snapshot standby and the operation will fail with the ORA-16692 error. Refer to Oracle Database documentation for details on configuring RedoRoutes with the LOCAL primary database value.

Creating Execution Groups

An Execution Group allows you to customize the step sequence of executions (within common functional areas) when those executions run in parallel.

Site Guard operation plans consist of separate buckets for handling common functional areas or target types when the plan executes; for example, all database instances for a site will be in a single bucket. Each of these buckets typically consists of one or more steps that process the target type or functionality for which that bucket is intended. Additionally, the Execution Mode of a bucket specifies whether the steps in a bucket should be executed in Serial or Parallel.

For example, a typical operation plan will contain separate buckets that contain all the steps for each of the following functional areas in a site:

  • Shutting down all Oracle WebLogic Server domains

  • Switching over all databases

  • Executing all the Pre Scripts

  • Executing all the Mount or Unmount scripts

Execution Groups allow you to define a precise orchestration sequence within a bucket. For example, operation plan steps that are in Execution Group 3 will all execute in parallel only after all the steps in Execution Group 2 have finished execution. Similarly, Site Guard will ensure that all the operation plan steps in Execution Group 3 finish executing before any steps in Execution Group 4 are started. This allows you to place each operation plan step in a given bucket in a specific group in order to determine when that operation plan step will be executed.

When you create an operation plan, Site Guard initially marks the Execution Mode for each bucket as Parallel, and will place all the steps in the bucket in Execution Group 1. However, you can edit the operation plan and customize the Execution Group for each step to determine its execution sequence.

Note:

If a bucket has an Execution Mode of Serial, then Execution Groups become irrelevant because all the steps in that bucket will be executed sequentially. This is the equivalent of putting each step in its own execution group. Site Guard allows you to edit the operation plan and re-order the sequence of steps in a Serial execution bucket.

When viewing or editing plans in the Site Guard UI, the Execution Group column is hidden by default.

Custom Prechecks can be placed into Execution Groups; however, regular Prechecks cannot and always execute in parallel.
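The ordering rules for Execution Groups can be modeled with a short Python sketch. It is a simplified illustration and not how Oracle Site Guard is implemented: the function run_bucket, the tuple layout (step name, Execution Group number, action), and the use of a thread pool are all assumptions made for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_bucket(steps, execution_mode="Parallel"):
    """Run the steps of one bucket and return the batches that were executed.

    steps is a list of (name, execution_group, action) tuples.
    Each returned batch lists the step names that were allowed to run together.
    """
    batches = []

    if execution_mode == "Serial":
        # Groups are irrelevant: every step runs alone, in list order.
        for name, _group, action in steps:
            action()
            batches.append([name])
        return batches

    by_group = defaultdict(list)
    for name, group, action in steps:
        by_group[group].append((name, action))

    # Groups run in ascending order; steps inside a group run in parallel.
    for group in sorted(by_group):
        members = by_group[group]
        with ThreadPoolExecutor(max_workers=len(members)) as pool:
            futures = [pool.submit(action) for _name, action in members]
            for future in futures:
                future.result()  # wait for the whole group before moving on
        batches.append([name for name, _action in members])
    return batches
```

With two steps in Execution Group 1 and one step in Execution Group 2, the sketch returns two batches, and the step in group 2 starts only after both steps in group 1 have finished.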

Monitoring Executions and Managing Errors

In this section, you learn how to customize, execute, and monitor operation plans with Oracle Site Guard.

When you execute an Oracle Site Guard operation plan, you can customize the plan before you execute it, monitor the execution of the plan, manage any errors you encounter during plan execution, and retry plan execution after making changes.

Customizing Operations

Learn how to customize Oracle Site Guard operations according to your environment.

Oracle Site Guard operation plans can be customized according to the characteristics of your environment. Specifically, you can customize any operation step by:

  • Specifying whether the step should be enabled or disabled for execution (disabled steps are skipped during execution).

  • Moving the step to another point in the execution sequence (for example, changing the order of managed servers to be brought up within a domain group).

  • Specifying how errors for the step are to be handled (that is, stop or continue operation execution when an error is encountered).

  • Specifying whether the steps of a given group are to be executed serially or in parallel (for example, start up all the managed servers at the same time, or start one managed server at a time).

Monitoring Executions

You can monitor operation results in the Procedure Activity page of Oracle Enterprise Manager Cloud Control Console.

Oracle Site Guard disaster recovery operations execute as Oracle Enterprise Manager Deployment Procedures. The procedure activity screen for an Oracle Site Guard operation displays each operation plan as a hierarchy of steps with a graphical icon showing the result of each step as it is executed. A check mark is displayed if the step succeeds, or a cross is displayed if the step fails. A separate icon indicates that the step was skipped and not configured for execution. This mechanism provides a visual summary of the operation plan progress.

When viewed in the Operation Activity page, the execution details for each operation plan or precheck are organized as a hierarchy of top-level steps with consequent sub-steps. Initially, only the top-level steps are visible to the user. The consequent sub-steps are collapsed and hidden within each top-level step. However, each top-level step in the operation activity can be further inspected in detail by clicking on the step to expand it, and navigating down into the hierarchy to select a constituent sub-step. The execution log for each sub-step can also be examined for additional details. This hierarchical organization of operation activity allows you to examine the results of the operation plan at any desired level of detail.

Operation Error Modes

Each step in an Oracle Site Guard operation plan has an error mode associated with it, which you can configure.

This error mode defines how Oracle Site Guard handles any error that is encountered during the execution of that step.

The following error modes are available:

Stop on Error

This mode specifies that Oracle Site Guard should stop executing the operation plan if it encounters an error while executing the current step.

Continue on Error

This mode specifies that Oracle Site Guard should continue with the execution of the next step if it encounters an error while executing the current step.

Retrying Failed Operations

If Oracle Site Guard stops execution because of an error encountered during an operation, you can resolve the issue that caused the error and retry the operation.

Oracle Site Guard resumes execution of the failed operation at the step where the failure occurred. You can also ignore the failed step, by clicking remove, and retry the operation. In this case, Oracle Site Guard will ignore the failed step, and resume execution of the operation plan starting with the step immediately following the failed step.
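The interaction between error modes and retries can be sketched as a small plan runner in Python. This is a conceptual model only, not Oracle Site Guard code: the Step class, the run_plan function, and the start parameter are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

STOP_ON_ERROR = "Stop on Error"
CONTINUE_ON_ERROR = "Continue on Error"


@dataclass
class Step:
    name: str
    action: Callable[[], object]
    error_mode: str = STOP_ON_ERROR
    enabled: bool = True


def run_plan(steps: List[Step], start: int = 0) -> Optional[int]:
    """Run steps from index start.

    Returns the index of the step that stopped the plan, or None if the
    plan ran to completion.
    """
    for index in range(start, len(steps)):
        step = steps[index]
        if not step.enabled:
            print(f"{step.name}: skipped")
            continue
        try:
            step.action()
            print(f"{step.name}: succeeded")
        except Exception as exc:
            print(f"{step.name}: failed ({exc!r})")
            if step.error_mode == STOP_ON_ERROR:
                return index  # stop; the caller may fix the issue and retry
            # Continue on Error: fall through to the next step.
    return None
```

If run_plan returns a failed index, calling run_plan(steps, start=failed) models retrying from the failed step after the problem is fixed, and run_plan(steps, start=failed + 1) models ignoring the failed step and resuming with the step that follows it.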

Suspending and Resuming Operations

You can suspend an in-progress Oracle Site Guard operation or resume a suspended operation, at any point in time.

When resuming a suspended operation, Oracle Site Guard resumes execution of the operation at the point where it was suspended. You can also stop an operation that is currently in progress.

Note:

Stopped operations cannot be resumed.

Credential Management

Learn how to manage credentials used in Oracle Site Guard.

Oracle Enterprise Manager Credential Management Framework

Oracle Enterprise Manager provides the Credential Management Framework that you can use to manage identities and to ensure that the access to Oracle Enterprise Manager targets is authorized and authenticated.

Typically, you can set up Named Credentials in Enterprise Manager before configuring Oracle Site Guard to use these credentials. After the credentials are configured, Oracle Site Guard uses them to access all managed targets at protected sites.

Depending on the topology of the site, Oracle Site Guard may need to use Named Credentials for different targets such as hosts, Oracle Database instances, WebLogic Servers, and other target types. For information about setting up credentials in Enterprise Manager, see Setting Up Credentials.

Oracle Site Guard Credential Configuration

Learn how to use credentials in Oracle Site Guard operations.

After the required target credentials have been configured in Enterprise Manager's Credential Management framework, you can utilize them during Oracle Site Guard's credential configuration process. Oracle Site Guard credential configuration requires that the targets that are accessed and controlled by Oracle Site Guard for disaster recovery operations have valid credentials associated with the target. For information about setting up credentials and associating them with targets, see Creating Credential Associations.

Role-Based Access Control

Oracle Site Guard offers Role-Based Access Control (RBAC) using the User Accounts framework provided by Oracle Enterprise Manager.

Oracle Enterprise Manager provides preconfigured roles for different areas or functions within Enterprise Manager. One of these administrator roles, EM_SG_ADMINISTRATOR, is customized for Oracle Site Guard-focused activities within Enterprise Manager. You can use this built-in role to create users focused on Oracle Site Guard administration tasks. Alternatively, you can create your own customized roles and users that allow for greater flexibility in tuning role-based access to Oracle Site Guard functionality.

For information about setting up role-based access control, see Creating Oracle Site Guard Administrator Users.

Software Library Integration

Oracle Site Guard includes ready-to-use scripts to perform typical activities during a disaster recovery operation, such as switching over an Oracle Database, or starting or stopping an Oracle WebLogic Server.

These scripts are included as part of the Enterprise Manager Software Library, and all required scripts are automatically deployed to the applicable hosts during operation execution. However, in addition to the bundled scripts, you may require other custom scripts to be automatically deployed and executed as part of an operation. Oracle Site Guard provides a mechanism for you to upload your own custom scripts to the Enterprise Manager Software Library and to add these scripts to your operation plan when you create the plan.

An additional advantage of using scripts that are part of the Enterprise Manager Software Library is that these scripts are automatically deployed to all configured script hosts at runtime. On the other hand, user scripts that are not part of the Enterprise Manager Software Library must be manually deployed on each configured script host before the operation plan begins execution.
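
The deployment difference described above can be illustrated with a small sketch. The following Python fragment is illustrative only and is not part of Oracle Site Guard: the Script record, the manual_deployments helper, the host names, and the script paths are all hypothetical. It lists which scripts would have to be copied by hand because they are not part of the Software Library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Script:
    """Hypothetical record describing one script in an operation plan."""
    path: str
    in_software_library: bool   # True if uploaded to the Software Library
    hosts: Tuple[str, ...]      # script hosts configured for this script


def manual_deployments(scripts):
    """Return sorted (host, path) pairs that must be copied by hand."""
    pending = []
    for script in scripts:
        if script.in_software_library:
            continue  # deployed automatically to its script hosts at runtime
        for host in script.hosts:
            pending.append((host, script.path))
    return sorted(pending)


# Hypothetical plan: one Software Library script and one external script.
plan_scripts = [
    Script("/u01/scripts/pre_check.sh", True, ("apphost1", "apphost2")),
    Script("/u01/scripts/notify_ops.sh", False, ("apphost1", "apphost2")),
]

for host, path in manual_deployments(plan_scripts):
    print(f"copy {path} to {host} before the operation plan starts")
```

In this sketch only notify_ops.sh is reported, once for each configured script host, because pre_check.sh is assumed to be in the Software Library.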

For more information about the various types of scripts that a user can add to the Enterprise Manager Software Library, see Extensibility.

Custom Credentials for Script Execution

You can add a set of credentials to the credential repository and configure a script to execute with these credentials.

User-defined scripts that are either externally deployed or deployed through the Software Library are typically executed using the credentials configured for the host on which the script will execute. These credentials are configured and maintained in the Enterprise Manager credential management framework, and are referred to as the Host Normal Credentials or Host Privileged Credentials.

You can also add other sets of credentials to the credential repository and configure a script to execute with one of these sets. This is useful when the script requires privileges that differ from the standard (Host Normal) or privileged (Host Privileged) credentials configured for the script host. For example, a script might have to run as a specific user ID in order to shut down a server process on that host.
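
The choice between host credentials and a custom credential set can be sketched as a simple lookup. The following Python fragment is a minimal illustration, not the Enterprise Manager API: the Credential and ScriptConfig records, the resolve_credential helper, and every credential and user name are hypothetical. It assumes that a custom credential, when configured, takes precedence over the host credentials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Hypothetical stand-in for a stored credential (no secret shown)."""
    name: str
    username: str


@dataclass(frozen=True)
class ScriptConfig:
    """Hypothetical per-script settings."""
    path: str
    host: str
    run_privileged: bool = False
    custom_credential: Optional[str] = None  # name of an extra credential set


def resolve_credential(script, host_normal, host_privileged, named):
    """Pick the credential a script runs with.

    Assumption: a configured custom credential overrides the host defaults.
    """
    if script.custom_credential is not None:
        return named[script.custom_credential]  # KeyError if it is unknown
    table = host_privileged if script.run_privileged else host_normal
    return table[script.host]


# Hypothetical repository contents.
host_normal = {"apphost1": Credential("HOST_NORMAL", "oracle")}
host_privileged = {"apphost1": Credential("HOST_PRIV", "root")}
named = {"WLS_STOP_CRED": Credential("WLS_STOP_CRED", "wlsadmin")}

default_script = ScriptConfig("/u01/scripts/check.sh", "apphost1")
stop_script = ScriptConfig("/u01/scripts/stop_wls.sh", "apphost1",
                           custom_credential="WLS_STOP_CRED")

print(resolve_credential(default_script, host_normal, host_privileged, named).username)
print(resolve_credential(stop_script, host_normal, host_privileged, named).username)
```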

Passing Credentials as Script Parameters

Oracle Site Guard provides a mechanism to pass credentials to a configured script.

User-defined scripts frequently perform actions that require them to authenticate with another entity first, and they need one or more sets of credentials to do so. To avoid hard-coding credentials in the script or passing them insecurely as clear-text parameters, Oracle Site Guard provides a mechanism to securely pass one or more sets of credentials to a configured script. These credentials are stored and maintained securely in the Oracle Enterprise Manager credential management framework. After the credentials are configured and associated as parameters for the user script, Oracle Site Guard encrypts them and passes them to the user script at execution time. The user script can then extract the credentials and use them for authentication.

For details about extracting encrypted credentials inside a user script, see Passing Credentials as Parameters.

Oracle Site Guard Workflows

Oracle Site Guard workflows (or operations) are modeled as Enterprise Manager deployment procedures.

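Oracle Site Guard provides the following distinct types of workflows for disaster-recovery operations: Switchover, Failover, Start, Stop, Open for Validation, and Revert to Standby. Each workflow is described later in this section.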

When there is a failure or planned outage of the primary site, Oracle Site Guard automates the following steps to enable the standby site to assume the production role in the topology:

  1. Stops the services and applications running on the primary site, and unmounts the storage on the primary site.

  2. Disables ongoing replication from primary site to standby site and performs role reversal.

  3. Performs a failover or switchover of the Oracle Databases with Oracle Data Guard Broker.

  4. Mounts the replicated storage (file systems) on the standby site.

  5. Starts the services and applications on the standby site. At this point, the standby site assumes the production role.

Note:

If continuous storage replication is not configured, Oracle recommends that you perform a final storage replication from the primary site to the standby site, before you initiate the Site Guard operation. However, if the primary site has failed, it may not be possible to perform this final replication.

Oracle Site Guard workflows can be monitored, suspended, resumed, and stopped with Enterprise Manager's Procedure Management framework.
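
The ordered steps listed earlier, together with the suspend, resume, and stop controls, can be sketched as a small state machine. The following Python fragment is a conceptual illustration only and does not use or reproduce the Enterprise Manager Procedure Management framework: the State enumeration, the Procedure class, and the step callbacks are hypothetical. It assumes that steps run strictly in order and that a failed step halts the procedure.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUSPENDED = "suspended"
    STOPPED = "stopped"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class Procedure:
    """Hypothetical ordered procedure that can be suspended and resumed."""

    def __init__(self, steps):
        self.steps = list(steps)  # (name, callable taking the procedure)
        self.next_index = 0       # index of the next step to run
        self.state = State.PENDING
        self.log = []             # (step name, succeeded) pairs, for monitoring

    def run(self):
        """Run steps in order until done, a step fails, or the state changes."""
        if self.state in (State.PENDING, State.SUSPENDED):
            self.state = State.RUNNING
        while self.state is State.RUNNING and self.next_index < len(self.steps):
            name, action = self.steps[self.next_index]
            succeeded = action(self)
            self.log.append((name, succeeded))
            if not succeeded:
                self.state = State.FAILED
                return
            self.next_index += 1
        if self.state is State.RUNNING:
            self.state = State.SUCCEEDED

    def suspend(self):
        if self.state is State.RUNNING:
            self.state = State.SUSPENDED

    def resume(self):
        if self.state is State.SUSPENDED:
            self.run()  # continues from next_index

    def stop(self):
        if self.state in (State.RUNNING, State.SUSPENDED):
            self.state = State.STOPPED


def succeed(_proc):
    return True


def succeed_then_suspend(proc):
    proc.suspend()  # simulates an operator suspending after this step
    return True


def make_steps():
    return [
        ("stop services and unmount storage on primary", succeed),
        ("disable replication and reverse roles", succeed_then_suspend),
        ("switch over databases", succeed),
        ("mount replicated storage on standby", succeed),
        ("start services on standby", succeed),
    ]


proc = Procedure(make_steps())
proc.run()
print(proc.state.value, proc.next_index)
proc.resume()
print(proc.state.value, len(proc.log))
```

In this sketch the procedure pauses after the second step and completes the remaining three steps when it is resumed.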

Switchover Workflow

This workflow transitions production activities from the primary site to a standby site.

The Switchover workflow provides the ability to perform a controlled transition of the production activity from the primary site to a standby site. Figure 2-7 shows the steps executed during a typical Switchover workflow.

Note:

A disaster recovery operation comprises steps that depend on the topology and site configuration.

Figure 2-7 Switchover Workflow

Failover Workflow

This workflow transitions production activities to a standby site.

The Failover workflow provides the ability to perform a forced transition of production activity to a standby site. When a failover operation is launched, Oracle Site Guard assumes that the primary site is unavailable, and starts all protected applications at the standby site. Figure 2-8 shows the steps executed during a typical Failover workflow.

Figure 2-8 Failover Workflow
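
The difference between a controlled switchover and a forced failover can be sketched as two step lists. The following Python fragment is a simplified illustration under stated assumptions, not the plan that Oracle Site Guard generates: the build_steps helper and the site names are hypothetical, and the sketch assumes that a failover omits the step that runs on the primary site because that site is treated as unavailable. Actual plans depend on the topology and site configuration.

```python
def build_steps(operation, primary, standby):
    """Return an ordered list of step descriptions for the given operation.

    Assumption: a failover skips the primary-site stop step, because the
    primary site is treated as unavailable.
    """
    if operation not in ("switchover", "failover"):
        raise ValueError(f"unsupported operation: {operation}")
    steps = []
    if operation == "switchover":
        steps.append(f"stop services and unmount storage on {primary}")
    steps.append(f"disable replication from {primary} to {standby} and reverse roles")
    steps.append(f"{operation} databases to {standby}")
    steps.append(f"mount replicated storage on {standby}")
    steps.append(f"start services on {standby}")
    return steps


# Hypothetical site names.
for operation in ("switchover", "failover"):
    print(operation)
    for number, step in enumerate(build_steps(operation, "siteA", "siteB"), 1):
        print(f"  {number}. {step}")
```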

Start Workflow

This workflow starts activities at a production site.

The Start workflow provides the ability to start activities at a production site. This workflow is typically used to bring up a site after maintenance, or to test whether the site can be started as part of testing a larger workflow such as a switchover. Figure 2-9 shows the steps executed during a typical Start workflow.

Figure 2-9 Start Workflow

Stop Workflow

This workflow stops activities at a production site.

The Stop workflow provides the ability to stop activities at a production site. This workflow is typically used to bring down a site for maintenance, or to test whether the site can be stopped as part of testing a larger workflow such as a switchover. Figure 2-10 shows the steps executed during a typical Stop workflow.

Figure 2-10 Stop Workflow

Open for Validation Workflow

This workflow converts a standby site to an operational site so that it can be tested and validated.

The Open for Validation workflow provides the ability to convert a standby site to an operational site. This workflow is typically used to convert a standby site to a functional site so that it can be tested and validated. Figure 2-11 shows the steps executed during a typical Open for Validation workflow.

Figure 2-11 Open for Validation Workflow

Revert to Standby Workflow

This workflow converts a site that has been opened for validation back to a standby site.

The Revert to Standby workflow provides the ability to convert a site back to a standby site after you have opened the site for validation. This workflow is typically used to convert a standby site that has been opened for validation back to a standby site, so that it can be used for disaster recovery operations such as switchover or failover. Figure 2-12 shows an example of the steps executed during a typical Revert to Standby workflow.

Figure 2-12 Revert to Standby Workflow
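
The relationship between the Open for Validation and Revert to Standby workflows can be sketched as a pair of role transitions. The following Python fragment is a conceptual model only, not Oracle Site Guard code: the Role enumeration, the TRANSITIONS table, and the helper functions are hypothetical. It assumes that a site opened for validation must be reverted before it can again be the target of a switchover or failover.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    STANDBY = "standby"
    VALIDATION = "open for validation"


# Hypothetical transition table: (workflow, current role) -> new role.
TRANSITIONS = {
    ("open_for_validation", Role.STANDBY): Role.VALIDATION,
    ("revert_to_standby", Role.VALIDATION): Role.STANDBY,
}


def apply_workflow(workflow, role):
    """Return the role a site has after the workflow, or raise ValueError."""
    try:
        return TRANSITIONS[(workflow, role)]
    except KeyError:
        raise ValueError(
            f"{workflow} is not valid for a site in role '{role.value}'"
        ) from None


def can_be_recovery_target(role):
    """Assumption: only a site in the standby role can take over production."""
    return role is Role.STANDBY


role = Role.STANDBY
role = apply_workflow("open_for_validation", role)
print(role.value, can_be_recovery_target(role))
role = apply_workflow("revert_to_standby", role)
print(role.value, can_be_recovery_target(role))
```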
