Oracle® Clusterware Installation Guide
11g Release 1 (11.1) for Linux
This chapter describes how to complete the postinstallation tasks after you have installed the Oracle Clusterware software.
This chapter contains the following topics:
You must perform the following tasks after completing your installation:
After your Oracle Clusterware installation is complete and after you are sure that your system is functioning properly, make a backup of the contents of the voting disk. Use the dd utility. For example:
# dd if=/dev/sda1 of=/dev/myvdisk1.bak
Also, make a backup copy of the voting disk contents after you complete any node additions or node deletions, and after running any deinstallation procedures.
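The backup step can be wrapped in a small function so that the copy is verified before you rely on it. The following is a minimal sketch, not an Oracle-supplied tool: the function name backup_voting_disk and the cmp verification step are assumptions, and it assumes that the backup target is a regular file. The device and backup paths in the usage comment are the ones from the example above.

```shell
#!/bin/sh
# Sketch only: copy a voting disk to a backup file with dd, then verify it.
# backup_voting_disk is a hypothetical helper name, not an Oracle utility.
backup_voting_disk() {
    src=$1      # voting disk device, for example /dev/sda1
    dest=$2     # backup target (assumed to be a regular file)

    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi

    # Same dd invocation as in the example above.
    dd if="$src" of="$dest" 2>/dev/null || return 1

    # cmp exits nonzero if the copy differs from the source.
    cmp -s "$src" "$dest"
}

# Example (run as root): backup_voting_disk /dev/sda1 /dev/myvdisk1.bak
```

A nonzero return status means that the source could not be read, that dd failed, or that the copy does not match the source.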
Review the following sections to configure input/output fencing:
Input/output fencing (IO Fencing) is a required mechanism to ensure that a node that is evicted from the cluster is prevented from writing to the cluster shared storage. The mechanisms that provide fencing are the hangcheck-timer kernel module, and the Oracle Clusterware Process Monitor Daemon (oprocd). The oprocd process is installed during Oracle Clusterware installation. Both mechanisms are required.
The hangcheck-timer and oprocd are independent mechanisms. The oprocd process provides additional hang check capability, and it can catch hang conditions that the hangcheck-timer misses.
Load the hangcheck-timer kernel module as root, using modprobe. The examples in this section show
Check the following hangcheck timer settings:
hangcheck_tick parameter: This parameter defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds. Oracle recommends that you change the value of hangcheck_tick to 1.
hangcheck_margin parameter: This parameter defines how long the timer waits, in seconds, for a response from the kernel. The default value is 180 seconds. Oracle recommends that you change the value of hangcheck_margin to 10.
hangcheck_reboot parameter: This parameter determines whether the hangcheck-timer restarts the node if the kernel fails to respond within the sum of the hangcheck_tick and hangcheck_margin parameter values. If the value of hangcheck_reboot is equal to or greater than 1, then the hangcheck-timer module restarts the system when a hang is detected. If the hangcheck_reboot parameter is set to zero, then the hangcheck-timer module does not restart the node when a hang is detected.
Note: On Linux 2.6 kernels, by default hangcheck_reboot is set to 0. The value for hangcheck_reboot must always be set to 1 to restart the system if a hang is detected.
These settings assume that the CSS misscount value is set to 30 or 60 seconds, which are the defaults for release 11g and release 10g respectively. The value for CSS misscount should always be greater than the sum of hangcheck_tick and hangcheck_margin.
For optimal cluster performance, test applications with the hangcheck parameter values that Oracle recommends. If you find that the cluster produces false node evictions with these values, then increase the hangcheck_margin parameter value, with the help of Oracle Support.
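Before you change the hangcheck_margin value, it can help to confirm that the CSS misscount value still exceeds the hangcheck interval. The following is a minimal sketch of that arithmetic, assuming that the sum in question is hangcheck_tick plus hangcheck_margin. The function name check_hangcheck_settings is hypothetical, and the values are passed in by hand rather than read from the cluster.

```shell
#!/bin/sh
# Sketch only: check that CSS misscount exceeds hangcheck_tick + hangcheck_margin.
# check_hangcheck_settings is a hypothetical helper; all values are in seconds.
check_hangcheck_settings() {
    tick=$1
    margin=$2
    misscount=$3
    sum=$((tick + margin))

    if [ "$misscount" -gt "$sum" ]; then
        echo "ok: misscount $misscount > tick + margin ($sum)"
    else
        echo "warning: misscount $misscount <= tick + margin ($sum)"
        return 1
    fi
}

# Values from this section and the release 11g default misscount.
check_hangcheck_settings 1 10 30
# prints "ok: misscount 30 > tick + margin (11)"
```

With the kernel default values (60 and 180), the same check fails for either default misscount value, which illustrates why the default values need to be changed.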
Use the following procedure to configure the hangcheck timer:
Log in as root.
Check to see if settings for the hangcheck timer are listed in the module configuration file. On Red Hat and on Oracle Linux, that file is /etc/modprobe.conf. On SUSE, it is /etc/modprobe.conf.local. For example:
# more /etc/modprobe.conf |grep hang
You should see something similar to the following:
options hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
If the hangcheck configuration does not exist, or if it exists but the values are set to different values than those recommended, then enter a command similar to the following to load them into the configuration file. The following example is for Red Hat and Oracle Linux:
# echo "options hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1" >>/etc/modprobe.conf
If necessary, enter the following command to remove the existing hangcheck-timer values:
# /sbin/modprobe -r hangcheck-timer
Enter the following command to load the new hangcheck-timer values:
# /sbin/modprobe -v hangcheck-timer
To confirm that the hangcheck module is loaded, enter the following command:
# /sbin/lsmod | grep hang
The output should be similar to the following:
hangcheck_timer 3289 0
To ensure that the module is loaded every time the system restarts, verify that the local system startup file contains the command /sbin/modprobe -v hangcheck-timer, or add it if necessary:
Add the command to the
Add the command to the
Repeat this process on each node that you intend to make a member of the cluster.
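Because the echo command in the procedure above appends a line each time it is run, repeating it can leave several conflicting options lines in the configuration file. The following is a minimal sketch of one way to keep exactly one hangcheck-timer options line with the values shown in the example above. The function name set_hangcheck_options is hypothetical, and the file path is passed as an argument so that you can try the function on a copy of the file first.

```shell
#!/bin/sh
# Sketch only: make sure a modprobe configuration file carries the
# hangcheck-timer options line exactly once.
# set_hangcheck_options is a hypothetical helper name, not an Oracle utility.
set_hangcheck_options() {
    conf=$1
    line="options hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1"

    # Nothing to do if the exact line is already present.
    if grep -qxF -e "$line" "$conf" 2>/dev/null; then
        return 0
    fi

    # Drop any older hangcheck-timer options line, then append the new one.
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        grep -v '^options hangcheck-timer' "$conf" > "$tmp" || true
    fi
    echo "$line" >> "$tmp"

    # Write back in place so that the file keeps its owner and mode.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root, Red Hat or Oracle Linux):
#   set_hangcheck_options /etc/modprobe.conf
```

After the file is updated, you still need to reload the module with the modprobe commands shown in the procedure.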
The Oracle Clusterware Process Monitor Daemon (oprocd) process is part of the Oracle Clusterware software installation. It is started automatically by Oracle Clusterware to detect system hangs. When it detects a system hang, it restarts the hung node.
Oracle has found wide variations in scheduling latencies observed across operating systems and versions of operating systems. Because of these scheduling latencies, the default values for oprocd can be overly sensitive, particularly under heavy system load, resulting in unnecessary oprocd-initiated restarts (false restarts).
Oracle recommends that you address scheduling latencies with your operating system vendor to reduce or eliminate them as much as possible, as they can cause other problems.
To overcome these scheduling latencies, Oracle recommends that you set the Oracle Clusterware parameter diagwait to the value 13. This setting increases the time for failed nodes to flush final trace files, which helps to debug the cause of a node failure. You must shut down the cluster to change the diagwait setting.
If you require more aggressive failover times to meet more stringent service level requirements, then you should open a service request with Oracle Support to receive advice about how to tune for lower failover settings.
Note: Changing the diagwait parameter requires a clusterwide shutdown. Oracle recommends that you change the diagwait setting either immediately after the initial installation, or during a scheduled outage.
To change the diagwait setting:
Log in as root, and run the following command on all nodes, where CRS_home is the home directory of the Oracle Clusterware installation:
# CRS_home/bin/crsctl stop crs
Enter the following command, where CRS_home is the Oracle Clusterware home:
# CRS_home/bin/oprocd stop
Repeat this command on all nodes.
From one node of the cluster, change the value of the diagwait parameter to 13 seconds by issuing the following command as root:
# CRS_home/bin/crsctl set css diagwait 13 -force
Restart the Oracle Clusterware by running the following command on all nodes:
# CRS_home/bin/crsctl start crs
Run the following command to ensure that Oracle Clusterware is functioning properly:
# CRS_home/bin/crsctl check crs
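The ordering in this procedure matters: Oracle Clusterware and oprocd are stopped on every node before diagwait is changed from a single node. The following is a minimal sketch that prints the steps above as an ordered, per-node plan, so that you can review it before running anything. It executes nothing. The function name print_diagwait_plan, the CRS_HOME variable, its placeholder path, and the node names are all assumptions, and running the final check on every node is a choice made for this sketch.

```shell
#!/bin/sh
# Sketch only: print the diagwait change as an ordered, per-node command plan.
# Nothing is executed. CRS_HOME and the node names are placeholders, and
# print_diagwait_plan is a hypothetical helper name.
CRS_HOME=${CRS_HOME:-/path/to/CRS_home}

print_diagwait_plan() {
    first=$1    # the one node from which diagwait is set

    # Stop Oracle Clusterware, then oprocd, on every node.
    for node in "$@"; do
        echo "$node: $CRS_HOME/bin/crsctl stop crs"
    done
    for node in "$@"; do
        echo "$node: $CRS_HOME/bin/oprocd stop"
    done

    # Change diagwait from one node only.
    echo "$first: $CRS_HOME/bin/crsctl set css diagwait 13 -force"

    # Restart on every node, then check (checking every node is an assumption).
    for node in "$@"; do
        echo "$node: $CRS_HOME/bin/crsctl start crs"
    done
    for node in "$@"; do
        echo "$node: $CRS_HOME/bin/crsctl check crs"
    done
}

print_diagwait_plan node1 node2
```

Each output line has the form "node: command", and every command is one that you run as root on the named node.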
Use a Web browser to view the OracleMetaLink Web site:
Log in to OracleMetaLink.
Note: If you are not an OracleMetaLink registered user, then click Register for MetaLink and register.
On the main OracleMetaLink page, click Patches & Updates.
On the Patches & Updates page, click Advanced Search.
On the Advanced Search page, click the search icon next to the Product or Product Family field.
In the Search and Select: Product Family field, select Database and Tools in the Search list field, enter RDBMS Server in the text field, and click Go.
RDBMS Server appears in the Product or Product Family field. The current release appears in the Release field.
Select your platform from the list in the Platform field, and at the bottom of the selection list, click Go.
Any available patch updates appear under the Results heading.
Click the number of the patch that you want to download.
On the Patch Set page, click View README and read the page that appears. The README page contains information about the patch set and how to apply the patches to your installation.
Return to the Patch Set page, click Download, and save the file on your system.
Use the unzip utility provided with Oracle Database 10g to uncompress the Oracle patch updates that you downloaded from OracleMetaLink. The unzip utility is located in the
Refer to Appendix B for information about how to stop database processes in preparation for installing patches.
Oracle recommends that you complete the following tasks after installing Oracle Clusterware.
Oracle recommends that you back up the root.sh script after you complete an installation. If you install other products in the same Oracle home directory, then the Oracle Universal Installer (OUI) updates the contents of the existing root.sh script during the installation. If you require information contained in the original root.sh script, then you can recover it from the root.sh file copy.
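One way to make the copy is shown in the following minimal sketch, which keeps a dated copy of root.sh next to the original. The function name backup_root_sh and the dated file name suffix are assumptions. Pass the Oracle home directory that contains root.sh as the argument.

```shell
#!/bin/sh
# Sketch only: keep a dated copy of root.sh next to the original.
# backup_root_sh is a hypothetical helper name, not an Oracle utility.
backup_root_sh() {
    home=$1                     # Oracle home directory that contains root.sh
    stamp=$(date +%Y%m%d)       # for example 20080131

    # -p preserves the mode, ownership, and timestamps of the script.
    cp -p "$home/root.sh" "$home/root.sh.$stamp"
}

# Example: backup_root_sh /path/to/oracle_home
```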