Oracle® Clusterware Administration and Deployment Guide
12c Release 1 (12.1)
Oracle Clusterware enables servers to communicate with each other, so that they appear to function as a collective unit. This combination of servers is commonly known as a cluster. Although the servers are standalone servers, each server has additional processes that communicate with other servers. In this way the separate servers appear as if they are one system to applications and end users.
Oracle Clusterware provides the infrastructure necessary to run Oracle Real Application Clusters (Oracle RAC). Oracle Clusterware also manages resources, such as virtual IP (VIP) addresses, databases, listeners, services, and so on. These resources are generally named ora.entity_name.resource_type_abbreviation, such as ora.mydb.db, which is the name of a resource that is a database. (Some examples of the abbreviation are db for database, lsnr for listener, and vip for VIP.) Oracle does not support editing these resources except under the explicit direction of My Oracle Support.
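The naming convention can be illustrated with a small parser. This is a minimal sketch, not an Oracle utility: it assumes the `ora.` prefix seen in the `ora.mydb.db` example, and the function and class names are hypothetical.

```python
from typing import NamedTuple

# Abbreviations named in the text; other resource types exist.
RESOURCE_TYPES = {"db": "database", "lsnr": "listener", "vip": "VIP"}


class ResourceName(NamedTuple):
    entity: str
    type_abbreviation: str


def parse_resource_name(name: str) -> ResourceName:
    """Split 'ora.<entity>.<abbreviation>' into entity and abbreviation."""
    prefix, _, rest = name.partition(".")
    # The abbreviation is whatever follows the last dot.
    entity, _, abbreviation = rest.rpartition(".")
    if prefix != "ora" or not entity or not abbreviation:
        raise ValueError(f"not an ora.<entity>.<type> name: {name!r}")
    return ResourceName(entity, abbreviation)


parsed = parse_resource_name("ora.mydb.db")
print(parsed.entity, RESOURCE_TYPES.get(parsed.type_abbreviation, "unknown"))
```

The sketch only reads names. It does not touch the resources themselves, which should not be edited outside supported tools.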
See Also: Chapter 2, "Administering Oracle Clusterware" and Chapter 8, "Making Applications Highly Available Using Oracle Clusterware" for more information
Figure 1-1 shows a configuration that uses Oracle Clusterware to extend the basic single-instance Oracle Database architecture. In Figure 1-1, the cluster is running Oracle Database and is actively servicing applications and users. Using Oracle Clusterware, you can use the same high availability mechanisms to make your Oracle database and your custom applications highly available.
The benefits of using a cluster include:
Scalability of applications
Reduce total cost of ownership for the infrastructure by providing a scalable system with low-cost commodity hardware
Ability to fail over
Increase throughput on demand for cluster-aware applications, by adding servers to a cluster to increase cluster resources
Increase throughput for cluster-aware applications by enabling the applications to run on all of the nodes in a cluster
Ability to program the startup of applications in a planned order that ensures dependent processes are started in the correct sequence
Ability to monitor processes and restart them if they stop
Eliminate unplanned downtime due to hardware or software malfunctions
Reduce or eliminate planned downtime for software maintenance
You can program Oracle Clusterware to manage the availability of user applications and Oracle databases. In an Oracle RAC environment, Oracle Clusterware manages all of the resources automatically. All of the applications and processes that Oracle Clusterware manages are either cluster resources or local resources.
Oracle Clusterware is required for using Oracle RAC; it is the only clusterware that you need for platforms on which Oracle RAC operates. Although Oracle RAC continues to support many third-party clusterware products on specific platforms, you must also install and use Oracle Clusterware. Note that the servers on which you want to install and run Oracle Clusterware must use the same operating system.
Using Oracle Clusterware eliminates the need for proprietary vendor clusterware and provides the benefit of using only Oracle software. Oracle provides an entire software solution, including everything from disk management with Oracle Automatic Storage Management (Oracle ASM) to data management with Oracle Database and Oracle RAC. In addition, Oracle Database features, such as Oracle Services, provide advanced functionality when used with the underlying Oracle Clusterware high availability framework.
Oracle Clusterware has two stored components, besides the binaries: The voting files, which record node membership information, and the Oracle Cluster Registry (OCR), which records cluster configuration information. Voting files and OCRs must reside on shared storage available to all cluster member nodes.
To use Oracle Clusterware, you must understand the hardware and software concepts and requirements as described in the following sections:
Note: Many hardware providers have validated cluster configurations that provide a single part number for a cluster. If you are new to clustering, then use the information in this section to simplify your hardware procurement efforts when you purchase hardware to create a cluster.
A cluster consists of one or more servers. Access to an external network is the same for a server in a cluster (also known as a cluster member or node) as for a standalone server. However, a server that is part of a cluster requires a second network, which is referred to as the interconnect. For this reason, cluster member nodes require at least two network interface cards: one for a public network and one for a private network. The interconnect network is a private network using a switch (or multiple switches) that only the nodes in the cluster can access (Footnote 1).
Note: Oracle does not support using crossover cables as Oracle Clusterware interconnects.
Cluster size is determined by the requirements of the workload running on the cluster and the number of nodes that you have configured in the cluster. If you are implementing a cluster for high availability, then configure redundancy for all of the components of the infrastructure as follows:
At least two network interfaces for the public network, bonded to provide one address
At least two network interfaces for the private interconnect network
The cluster requires cluster-aware storage (Footnote 2) that is connected to each server in the cluster. This may also be referred to as a multihost device. Oracle Clusterware supports Network File Systems (NFSs), iSCSI, Direct Attached Storage (DAS), Storage Area Network (SAN) storage, and Network Attached Storage (NAS).
To provide redundancy for storage, generally provide at least two connections from each server to the cluster-aware storage. There may be more connections depending on your I/O requirements. It is important to consider the I/O requirements of the entire cluster when choosing your storage subsystem.
Most servers have at least one local disk that is internal to the server. Often, this disk is used for the operating system binaries; you can also use this disk for the Oracle software binaries. The benefit of each server having its own copy of the Oracle binaries is that it increases high availability, so that corruption of one binary does not affect all of the nodes in the cluster simultaneously. It also allows rolling upgrades, which reduce downtime.
Each server must have an operating system that is certified with the Oracle Clusterware version you are installing. Refer to the certification matrices available in the Oracle Grid Infrastructure Installation Guide for your platform or on My Oracle Support (formerly OracleMetaLink) for details, which are available from the following URL:
When the operating system is installed and working, you can then install Oracle Clusterware to create the cluster. Oracle Clusterware is installed independently of Oracle Database. After you install Oracle Clusterware, you can then install Oracle Database or Oracle RAC on any of the nodes in the cluster.
Oracle Clusterware uses voting files to provide fencing and cluster node membership determination. OCR provides cluster configuration information. You can place the Oracle Clusterware files on either Oracle ASM or on shared common disk storage. If you configure Oracle Clusterware on storage that does not provide file redundancy, then Oracle recommends that you configure multiple locations for OCR and voting files. The voting files and OCR are described as follows:
Oracle Clusterware uses voting files to determine which nodes are members of a cluster. You can configure voting files on Oracle ASM, or you can configure voting files on shared storage.
If you configure voting files on Oracle ASM, then you do not need to manually configure the voting files. Depending on the redundancy of your disk group, an appropriate number of voting files are created.
If you do not configure voting files on Oracle ASM, then for high availability, Oracle recommends that you have a minimum of three voting files on physically separate storage. This avoids having a single point of failure. If you configure a single voting file, then you must use external mirroring to provide redundancy.
Oracle recommends that you do not use more than five voting files, even though Oracle supports a maximum number of 15 voting files.
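The recommendation of at least three voting files can be made concrete with a little arithmetic. The sketch below assumes the commonly described rule that a node must be able to reach a strict majority of the voting files to remain in the cluster. That rule is not stated in this section, and the function name is hypothetical.

```python
def tolerated_voting_file_failures(total_files: int) -> int:
    """Voting files that can be lost while a strict majority stays reachable.

    Assumption: a node needs access to more than half of the voting files.
    """
    if total_files < 1:
        raise ValueError("need at least one voting file")
    return (total_files - 1) // 2


for count in (1, 3, 5):
    print(count, "voting files tolerate",
          tolerated_voting_file_failures(count), "failure(s)")
```

Under that assumption, one voting file tolerates no failures, three tolerate one, and five tolerate two, which is consistent with the advice to configure three to five files on physically separate storage.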
Oracle Clusterware uses the Oracle Cluster Registry (OCR) to store and manage information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, virtual IP addresses (VIPs), services, and any applications. OCR stores configuration information in a series of key-value pairs in a tree structure. To ensure cluster high availability, Oracle recommends that you define multiple OCR locations. In addition:
You can have up to five OCR locations
Each OCR location must reside on shared storage that is accessible by all of the nodes in the cluster
You can replace a failed OCR location online if it is not the only OCR location
You must update OCR through supported utilities such as Oracle Enterprise Manager, the Oracle Clusterware Control Utility (CRSCTL), the Server Control Utility (SRVCTL), the OCR configuration utility (OCRCONFIG), or the Database Configuration Assistant (DBCA)
See Also: Chapter 2, "Administering Oracle Clusterware" for more information about voting files and OCR
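To picture what "key-value pairs in a tree structure" means, the following sketch models a tiny registry as nested dictionaries and resolves a dotted key path. The keys, values, and the `lookup` helper are hypothetical and do not reflect the real OCR schema or on-disk format. Real OCR contents must only be changed through the supported utilities listed above.

```python
from typing import Any

# Hypothetical keys and values, for illustration only; not the OCR schema.
registry = {
    "DATABASE": {"mydb": {"type": "db", "state": "configured"}},
    "LISTENER": {"listener1": {"port": 1521}},
}


def lookup(tree: dict, path: str) -> Any:
    """Walk a dotted key path down the tree and return the value found."""
    node: Any = tree
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(path)
        node = node[key]
    return node


print(lookup(registry, "DATABASE.mydb.type"))
```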
Oracle Clusterware enables a dynamic Oracle Grid Infrastructure through the self-management of the network requirements for the cluster. Oracle Clusterware 12c supports the use of Dynamic Host Configuration Protocol (DHCP) or stateless address autoconfiguration for the VIP addresses and the Single Client Access Name (SCAN) address, but not the public address. DHCP provides dynamic assignment of IPv4 VIP addresses, while Stateless Address Autoconfiguration provides dynamic assignment of IPv6 VIP addresses.
When you are using Oracle RAC, all of the clients must be able to reach the database, which means that the clients must resolve VIP and SCAN names to all of the VIP and SCAN addresses, respectively. This problem is solved by the addition of Grid Naming Service (GNS) to the cluster. GNS is linked to the corporate Domain Name Service (DNS) so that clients can resolve host names to these dynamic addresses and transparently connect to the cluster and the databases. Oracle supports using GNS without DHCP and zone delegation in Oracle Clusterware 12c (as with Oracle Flex ASM server clusters, which you can configure without zone delegation or dynamic networks).
Note: Oracle does not support using GNS without DHCP and zone delegation on Windows.
See Also: Oracle Automatic Storage Management Administrator's Guide for more information about Oracle Flex ASM
Beginning with Oracle Clusterware 12c, a GNS instance can now service multiple clusters rather than just one, thus only a single domain must be delegated to GNS in DNS. GNS still provides the same services as in previous versions of Oracle Clusterware.
The cluster in which the GNS server runs is referred to as the server cluster. A client cluster advertises its names with the server cluster. Only one GNS daemon process can run on the server cluster. Oracle Clusterware puts the GNS daemon process on one of the nodes in the cluster to maintain availability.
In previous, single-cluster versions of GNS, the single cluster could easily locate the GNS service provider within itself. In the multicluster environment, however, the client clusters must know the GNS address of the server cluster. Given that address, client clusters can find the GNS server running on the server cluster.
In order for GNS to function on the server cluster, you must have the following:
The DNS administrator must delegate a zone for use by GNS
A GNS instance must be running somewhere on the network and it must not be blocked by a firewall
All of the node names in a set of clusters served by GNS must be unique
See Also: "Overview of Grid Naming Service" for information about administering GNS
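The uniqueness requirement for node names across all clusters served by one GNS instance can be checked before a client cluster is added. The sketch below is an illustration only: the cluster and node names are made up, and the helper is not part of any Oracle tool. Names are compared case-insensitively because DNS names are case-insensitive.

```python
from collections import Counter


def duplicate_node_names(clusters: dict[str, list[str]]) -> set[str]:
    """Return node names that appear more than once across all clusters."""
    counts = Counter(
        name.lower() for nodes in clusters.values() for name in nodes
    )
    return {name for name, count in counts.items() if count > 1}


# Hypothetical inventory of one server cluster and one client cluster.
inventory = {
    "server_cluster": ["node1", "node2"],
    "client_cluster": ["node2", "node3"],
}
print(duplicate_node_names(inventory))
```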
The SCAN is a domain name registered to at least one and up to three IP addresses, either in DNS or GNS. When using GNS and DHCP, Oracle Clusterware configures the VIP addresses for the SCAN name that is provided during cluster configuration.
The node VIP and the three SCAN VIPs are obtained from the DHCP server when using GNS. If a new server joins the cluster, then Oracle Clusterware dynamically obtains the required VIP address from the DHCP server, updates the cluster resource, and makes the server accessible through GNS.
See Also: "Understanding SCAN Addresses and Client Service Connections" for more information about SCAN
One public host name for each node.
One VIP address for each node.
You must assign a VIP address to each node in the cluster. Each VIP address must be on the same subnet as the public IP address for the node and should be an address that is assigned a name in the DNS. Each VIP address must also be unused and unpingable from within the network before you install Oracle Clusterware.
Up to three SCAN addresses for the entire cluster.
Note: The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends that you configure the SCAN to resolve to three addresses on the public network.
See Also: Your platform-specific Oracle Grid Infrastructure Installation Guide for information about system requirements and configuring network addresses
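The rule that each node VIP must be on the same subnet as the node's public IP address can be verified with the Python standard library before installation. This is a sketch under stated assumptions: the addresses are documentation-range examples, and the prefix length is something you supply from your own network plan.

```python
import ipaddress


def vip_on_public_subnet(vip: str, public_ip: str, prefix_length: int) -> bool:
    """True if the VIP falls inside the subnet of the node's public address."""
    network = ipaddress.ip_interface(f"{public_ip}/{prefix_length}").network
    return ipaddress.ip_address(vip) in network


# Example addresses from the reserved documentation ranges.
print(vip_on_public_subnet("192.0.2.110", "192.0.2.10", 24))
print(vip_on_public_subnet("198.51.100.5", "192.0.2.10", 24))
```

The check covers only the subnet rule. Confirming that the VIP is unused and unpingable still has to be done on the network itself.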
When Oracle Clusterware is operational, several platform-specific processes or services run on each node in the cluster. This section describes these various processes and services.
Oracle Clusterware consists of two separate technology stacks: an upper technology stack anchored by the Cluster Ready Services (CRS) daemon (CRSD) and a lower technology stack anchored by the Oracle High Availability Services daemon (OHASD). These two technology stacks have several processes that facilitate cluster operations. The following sections describe these technology stacks in more detail:
The following list describes the processes that comprise CRS:
The CRSD manages cluster resources based on the configuration information that is stored in OCR for each resource. This includes start, stop, monitor, and failover operations. The CRSD process generates events when the status of a resource changes. When you have Oracle RAC installed, the CRSD process monitors the Oracle database instance, listener, and so on, and automatically restarts these components when a failure occurs.
Cluster Synchronization Services (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information.
The cssdagent process monitors the cluster and provides I/O fencing. This service formerly was provided by Oracle Process Monitor Daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure may result in Oracle Clusterware restarting the node.
Oracle ASM: Provides disk management for Oracle Clusterware and Oracle Database.
Cluster Time Synchronization Service (CTSS): Provides time management in a cluster for Oracle Clusterware.
Oracle Agent (oraagent): Extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g release 1 (11.1).
The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Notification Services (ONS) components communicate with other cluster component layers on other nodes in the same cluster database environment. These components are also the main communication links between Oracle Database, applications, and the Oracle Clusterware high availability components. In addition, these background processes monitor and manage database operations.
The following list describes the processes that comprise the Oracle High Availability Services technology stack:
See Also: "Resources" for more information about
Cluster Logger Service (ologgerd): Receives information from all the nodes in the cluster and persists it in an Oracle Grid Infrastructure Management Repository-based database. This service runs on only two nodes in a cluster.
Grid Plug and Play (GPNPD): Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
Multicast Domain Name Service (mDNS): Used by Grid Plug and Play to locate profiles in the cluster, and by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX and on Windows.
Oracle Agent (oraagent): Extends clusterware to support Oracle-specific requirements and complex resources. This process manages daemons that run as the Oracle Clusterware owner, like the GIPC and GPNPD daemons.
See Also: "Resources" for more information about
System Monitor Service (osysmond): The monitoring and operating system metric collection service that sends the data to the cluster logger service. This service runs on every node in a cluster.
Table 1-1 lists the processes and services associated with Oracle Clusterware components. In Table 1-1, if a UNIX or a Linux system process has an (r) beside it, then the process runs as the
Table 1-1 List of Processes and Services Associated with Oracle Clusterware Components (Footnote 1)
|Oracle Clusterware Component||Linux/UNIX Process||Windows Processes|
Oracle ASM (Footnote 2)
Grid Plug and Play
Oracle High Availability Services
Oracle root agent
Footnote 1 The only Windows services associated with the Oracle Grid Infrastructure are OracleOHService (OHASD), Oracle ASM, listener services (including node listeners and SCAN listeners), and management database. Oracle ASM can be considered part of the Oracle Clusterware technology stack when OCR is stored on Oracle ASM. The listeners and management database are Oracle Clusterware resources and are not properly part of the Oracle Clusterware technology stack.
Footnote 2 Oracle ASM is not just one process, but an instance. Given Oracle Flex ASM, Oracle ASM does not necessarily run on every cluster node but only some of them.
See Also: "Oracle Clusterware Diagnostic and Alert Log Data" for information about the location of log files created for processes
Note: Oracle Clusterware on Linux platforms can have multiple threads that appear as separate processes with unique process identifiers.
Figure 1-2 illustrates cluster startup.
The following section introduces the installation processes for Oracle Clusterware.
You can install different releases of Oracle Clusterware, Oracle ASM, and Oracle Database on your cluster. Follow these guidelines when installing different releases of software on your cluster:
You can only have one installation of Oracle Clusterware running in a cluster, and it must be installed into its own home (Grid_home). The release of Oracle Clusterware that you use must be equal to or higher than the Oracle ASM and Oracle RAC versions that run in the cluster. You cannot install a version of Oracle RAC that was released after the version of Oracle Clusterware that you run on the cluster. In other words:
Oracle Clusterware 12c supports Oracle ASM 12c only, because Oracle ASM is in the Oracle Grid Infrastructure home, which also includes Oracle Clusterware
Oracle Clusterware 12c supports Oracle Database 12c, Oracle Database 11g release 2 (11.2) and 11g release 1 (11.1), and Oracle Database 10g release 2 (10.2) and 10g release 1 (10.1)
Oracle ASM 12c requires Oracle Clusterware 12c and supports Oracle Database 12c, Oracle Database 11g release 2 (11.2), Oracle Database 11g release 1 (11.1), Oracle Database 10g release 2 (10.2), and 10g release 1 (10.1)
Oracle Database 12c requires Oracle Clusterware 12c
If you have Oracle Clusterware 12c installed as your clusterware, then you can have an Oracle Database 10g release 1 (10.1) single-instance database running on one node, and separate Oracle Real Application Clusters 10g release 1 (10.1), 10g release 2 (10.2), and Oracle Real Application Clusters 11g release 1 (11.1) databases also running on the cluster. However, you cannot have Oracle Clusterware 10g release 2 (10.2) installed on your cluster, and install Oracle Real Application Clusters 11g. You can install Oracle Database 11g single-instance on a node in an Oracle Clusterware 10g release 2 (10.2) cluster.
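The ordering rule above (the Oracle Clusterware release must be equal to or higher than the Oracle ASM and Oracle RAC releases in the cluster) reduces to a comparison of release numbers. The sketch below illustrates only that ordering rule and compares major and minor release numbers. It is not a substitute for the certification matrix, and the function names are hypothetical.

```python
def parse_release(release: str) -> tuple[int, ...]:
    """Turn '11.2' or '11.2.0.4' into (11, 2); only major.minor is compared."""
    return tuple(int(part) for part in release.split(".")[:2])


def clusterware_release_ok(clusterware: str, other: str) -> bool:
    """True if the Clusterware release is equal to or higher than the other."""
    return parse_release(clusterware) >= parse_release(other)


print(clusterware_release_ok("12.1", "11.2"))  # Clusterware 12c, RAC 11.2
print(clusterware_release_ok("10.2", "11.1"))  # Clusterware 10.2, RAC 11g
```

The second call mirrors the unsupported case described above, where Oracle Real Application Clusters 11g is installed on an Oracle Clusterware 10g release 2 cluster.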
When using different Oracle ASM and Oracle Database releases, the functionality of each depends on the functionality of the earlier software release. Thus, if you install Oracle Clusterware 11g and you later configure Oracle ASM, and you use Oracle Clusterware to support an existing Oracle Database 10g release 2 (10.2.0.3) installation, then the Oracle ASM functionality is equivalent only to that available in the 10g release 2 (10.2.0.3) release version. Set the compatible attributes of a disk group to the appropriate release of software in use.
See Also: Oracle Automatic Storage Management Administrator's Guide for information about compatible attributes of disk groups
There can be multiple Oracle homes for the Oracle database (both single instance and Oracle RAC) in the cluster. The Oracle homes for all nodes of an Oracle RAC database must be the same.
You can use different users for the Oracle Clusterware and Oracle database homes if they belong to the same primary group.
As of Oracle Clusterware 12c, there can only be one installation of Oracle ASM running in a cluster. Oracle ASM is always the same version as Oracle Clusterware, which must be the same release as, or a later release than, the Oracle database.
For Oracle RAC running Oracle9i you must run an Oracle9i cluster. For UNIX systems, that is HACMP, Serviceguard, Sun Cluster, or Veritas SF. For Windows and Linux systems, that is the Oracle Cluster Manager. To install Oracle RAC 10g, you must also install Oracle Clusterware.
Oracle recommends that you do not run different cluster software on the same servers unless they are certified to work together. However, if you are adding Oracle RAC to servers that are part of a cluster, either migrate to Oracle Clusterware or ensure that:
The clusterware you run is supported to run with Oracle RAC 12c.
You have installed the correct options for Oracle Clusterware and the other vendor clusterware to work together.
See Also: Oracle Grid Infrastructure Installation Guide for more version compatibility information
Oracle supports only out-of-place upgrades, because Oracle Clusterware 12c must have its own, new Grid home. For Oracle Clusterware 12c, Oracle supports in-place or out-of-place patching. Oracle supports patch bundles and one-off patches for in-place patching but only supports patch sets and major point releases for out-of-place upgrades.
In-place patching replaces the Oracle Clusterware software with the newer version in the same Grid home. Out-of-place upgrade has both versions of the same software present on the nodes at the same time, in different Grid homes, but only one version is active.
Rolling upgrades avoid downtime and ensure continuous availability of Oracle Clusterware while the software is upgraded to the new version. When you upgrade to Oracle Clusterware 12c, Oracle Clusterware and Oracle ASM binaries are installed as a single binary called the Oracle Grid Infrastructure. You can upgrade Oracle Clusterware in a rolling manner from Oracle Clusterware 10g and Oracle Clusterware 11g; however, you can only upgrade Oracle ASM in a rolling manner from Oracle Database 11g release 1 (11.1).
Oracle supports force upgrades in cases where some nodes of the cluster are down.
See Also: Your platform-specific Oracle Grid Infrastructure Installation Guide for procedures on upgrading Oracle Clusterware
The following list describes the tools and utilities for managing your Oracle Clusterware environment:
Cluster Health Monitor (CHM): Cluster Health Monitor detects and analyzes operating system and cluster resource-related degradation and failures to provide more details to users for many Oracle Clusterware and Oracle RAC issues, such as node eviction. The tool continuously tracks the operating system resource consumption at the node, process, and device levels. It collects and analyzes the clusterwide data. In real-time mode, when thresholds are met, the tool shows an alert to the user. For root-cause analysis, historical data can be replayed to understand what was happening at the time of failure.
See Also: "Cluster Health Monitor" for more information about CHM
Cluster Verification Utility (CVU): CVU is a command-line utility that you use to verify a range of cluster and Oracle RAC specific components. Use CVU to verify shared storage devices, networking configurations, system requirements, and Oracle Clusterware, and operating system groups and users.
Install and use CVU for both preinstallation and postinstallation checks of your cluster environment. CVU is especially useful during preinstallation and during installation of Oracle Clusterware and Oracle RAC components to ensure that your configuration meets the minimum installation requirements. Also use CVU to verify your configuration after completing administrative tasks, such as node additions and node deletions.
See Also: Your platform-specific Oracle Clusterware and Oracle RAC installation guide for information about how to manually install CVU, and Appendix A, "Cluster Verification Utility Reference" for more information about using CVU
Oracle Cluster Registry Configuration Tool (OCRCONFIG): OCRCONFIG is a command-line tool for OCR administration. You can also use the OCRCHECK and OCRDUMP utilities to troubleshoot configuration problems that affect OCR.
See Also: Chapter 2, "Administering Oracle Clusterware" for more information about managing OCR
Oracle Clusterware Control (CRSCTL): CRSCTL is a command-line tool that you can use to manage Oracle Clusterware. Use CRSCTL for general clusterware management, management of individual resources, configuration policies, and server pools for non-database applications.
Oracle Clusterware 12c introduces cluster-aware commands with which you can perform operations from any node in the cluster on another node in the cluster, or on all nodes in the cluster, depending on the operation.
You can use crsctl commands to monitor cluster resources (crsctl status resource) and to monitor and manage servers and server pools other than server pools that have names prefixed with ora.*, such as crsctl status server, crsctl status serverpool, crsctl modify serverpool, and crsctl relocate server. You can also manage Oracle High Availability Services on the entire cluster (crsctl start | stop | enable | disable | config crs), using the optional node-specific arguments -all. You also can use CRSCTL to manage Oracle Clusterware on individual nodes (crsctl start | stop | enable | disable | config crs).
See Also:
Chapter 2, "Administering Oracle Clusterware" for more information about using crsctl commands to manage Oracle Clusterware
Appendix E, "Oracle Clusterware Control (CRSCTL) Utility Reference" for a complete list of CRSCTL commands
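For scripted monitoring, the output of crsctl status resource can be collected and parsed. The sketch below assumes that the command prints one block of NAME=value lines per resource, separated by blank lines. The sample text is hand-written in that assumed format, not captured from a real cluster, and the helper names are hypothetical.

```python
import shutil
import subprocess


def parse_status(text: str) -> list[dict[str, str]]:
    """Parse blank-line-separated KEY=value blocks into dictionaries."""
    resources: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current resource block.
            if current:
                resources.append(current)
                current = {}
            continue
        key, separator, value = line.partition("=")
        if separator:
            current[key] = value
    if current:
        resources.append(current)
    return resources


def cluster_resources() -> list[dict[str, str]]:
    """Run 'crsctl status resource' if crsctl is on PATH and parse the output."""
    if shutil.which("crsctl") is None:
        raise RuntimeError("crsctl not found on PATH")
    result = subprocess.run(["crsctl", "status", "resource"],
                            capture_output=True, text=True, check=True)
    return parse_status(result.stdout)


# Hand-written sample in the assumed output format.
SAMPLE = """\
NAME=ora.mydb.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on node1

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on node2
"""

for resource in parse_status(SAMPLE):
    print(resource["NAME"], "->", resource["STATE"])
```

Keeping the parser separate from the command invocation lets it be exercised on saved output without access to a cluster.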
Oracle Enterprise Manager: Oracle Enterprise Manager has both the Cloud Control and Grid Control GUI interfaces for managing both single instance and Oracle RAC database environments. It also has GUI interfaces to manage Oracle Clusterware and all components configured in the Oracle Grid Infrastructure installation. Oracle recommends that you use Oracle Enterprise Manager to perform administrative tasks.
See Also: Oracle Database 2 Day + Real Application Clusters Guide, Oracle Real Application Clusters Administration and Deployment Guide, and Oracle Enterprise Manager online documentation for more information about administering Oracle Clusterware with Oracle Enterprise Manager
Oracle Interface Configuration Tool (OIFCFG): OIFCFG is a command-line tool for both single-instance Oracle databases and Oracle RAC environments. Use OIFCFG to allocate and deallocate network interfaces to components. You can also use OIFCFG to direct components to use specific network interfaces and to retrieve component configuration information.
Note: You can only use SRVCTL to manage server pools that have names prefixed with
See Also: Oracle Real Application Clusters Administration and Deployment Guide for more information about SRVCTL
Cloning nodes is the preferred method of creating new clusters. The cloning process copies Oracle Clusterware software images to other nodes that have similar hardware and software. Use cloning to quickly create several clusters of the same configuration. Before using cloning, you must install an Oracle Clusterware home successfully on at least one node using the instructions in your platform-specific Oracle Clusterware installation guide.
For new installations, or if you must install on only one cluster, Oracle recommends that you use the automated and interactive installation methods, such as Oracle Universal Installer or the Provisioning Pack feature of Oracle Enterprise Manager. These methods perform installation checks to ensure a successful installation. To add or delete Oracle Clusterware from nodes in the cluster, use the
Oracle Clusterware provides many high availability application programming interfaces called CLSCRS APIs that you use to enable Oracle Clusterware to manage applications or processes that run in a cluster. The CLSCRS APIs enable you to provide high availability for all of your applications.
See Also: Appendix G, "Oracle Clusterware C Application Program Interfaces" for more detailed information about the CLSCRS APIs
You can define a VIP address for an application to enable users to access the application independently of the node in the cluster on which the application is running. This is referred to as the application VIP. You can define multiple application VIPs, with generally one application VIP defined for each application running. The application VIP is related to the application by making it dependent on the application resource defined by Oracle Clusterware.
To maintain high availability, Oracle Clusterware components can respond to status changes to restart applications and processes according to defined high availability rules. You can use the Oracle Clusterware high availability framework by registering your applications with Oracle Clusterware and configuring the clusterware to start, stop, or relocate your application processes. That is, you can make custom applications highly available by using Oracle Clusterware to create profiles that monitor, relocate, and restart your applications.
The Cluster Time Synchronization Service (CTSS) is installed as part of Oracle Clusterware and runs in observer mode if it detects a time synchronization service or a time synchronization service configuration, valid or broken, on the system. For example, if the /etc/ntp.conf file exists on any node in the cluster, then CTSS runs in observer mode even if no time synchronization software is running.
If CTSS detects that there is no time synchronization service or time synchronization service configuration on any node in the cluster, then CTSS goes into active mode and takes over time management for the cluster.
If CTSS is running in active mode while another, non-NTP, time synchronization software is running, then you can change CTSS to run in observer mode by creating a file called /etc/ntp.conf. CTSS puts an entry in the alert log about the change to observer mode.
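The mode selection described above can be summarized as a single rule: a time synchronization service or configuration on any node puts CTSS in observer mode, and only its absence on every node puts CTSS in active mode. The sketch below models that rule. The node names and the function are hypothetical, and the real detection logic inside CTSS is not documented here.

```python
def ctss_mode(sync_config_detected: dict[str, bool]) -> str:
    """Return 'observer' if any node has a time sync service or configuration.

    sync_config_detected maps a node name to whether a time synchronization
    service or its configuration (valid or broken) was found on that node.
    """
    return "observer" if any(sync_config_detected.values()) else "active"


print(ctss_mode({"node1": False, "node2": True}))
print(ctss_mode({"node1": False, "node2": False}))
```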
When nodes join the cluster, if CTSS is in active mode, then it compares the time on those nodes to a reference clock located on one node in the cluster. If there is a discrepancy between the two times and the discrepancy is within a certain stepping limit, then CTSS performs step time synchronization, which is to step the time, forward or backward, of the nodes joining the cluster to synchronize them with the reference.
Clocks on nodes in the cluster become desynchronized with the reference clock (a time CTSS uses as a basis and is on the first node started in the cluster) periodically for various reasons. When this happens, CTSS performs slew time synchronization, which is to speed up or slow down the system time on the nodes until they are synchronized with the reference system time. In this time synchronization method, CTSS does not adjust time backward, which guarantees monotonic increase of the system time.
When Oracle Clusterware starts, if CTSS is running in active mode and the time discrepancy is outside the stepping limit (the limit is 24 hours), then CTSS generates an alert in the alert log, exits, and Oracle Clusterware startup fails. You must manually adjust the time of the nodes joining the cluster to synchronize with the cluster, after which Oracle Clusterware can start and CTSS can manage the time for the nodes.
When performing slew time synchronization, CTSS never runs time backward to synchronize with the reference clock. CTSS periodically writes alerts to the alert log containing information about how often it adjusts time on nodes to keep them synchronized with the reference clock.
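The difference between step and slew synchronization can be sketched in a few lines. The example below is a simplified model, not the CTSS implementation: the 24-hour stepping limit comes from the text, while the `max_rate` cap on clock speed-up or slow-down, the treatment of an offset of exactly 24 hours as within the limit, and the function names are assumptions made for illustration.

```python
STEPPING_LIMIT_SECONDS = 24 * 60 * 60  # the 24-hour limit stated in the text


def action_on_join(offset_seconds: float) -> str:
    """Decide what active-mode CTSS does for a node joining the cluster.

    offset_seconds is local time minus reference time.
    """
    if abs(offset_seconds) > STEPPING_LIMIT_SECONDS:
        # Outside the limit: alert, exit, and startup fails until the
        # administrator adjusts the time manually.
        return "fail_startup"
    if offset_seconds == 0:
        return "in_sync"
    # Within the limit: step the clock forward or backward to the reference.
    return "step"


def slewed_advance(real_elapsed: float, offset_seconds: float,
                   max_rate: float = 0.0005) -> float:
    """How far the local clock advances during real_elapsed seconds of slewing.

    max_rate is a hypothetical cap on the speed-up or slow-down fraction.
    """
    correction = min(abs(offset_seconds), real_elapsed * max_rate)
    if offset_seconds > 0:
        # Local clock is ahead of the reference: run slower, never backward.
        return real_elapsed - correction
    # Local clock is behind (or equal): run faster (or unchanged).
    return real_elapsed + correction


print(action_on_join(30.0))
print(slewed_advance(1.0, 10.0))
```

Because `max_rate` is below 1, the slewed clock always advances by a positive amount, which models the guarantee that system time increases monotonically.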
CTSS writes entries to the Oracle Clusterware alert log and syslog when it:
Detects a time change
Detects significant time difference from the reference node
Switches mode from observer to active or vice versa
Having CTSS running to synchronize time in a cluster facilitates troubleshooting Oracle Clusterware problems, because you will not have to factor in a time offset for a sequence of events on different nodes.
To activate CTSS in your cluster, you must stop and deconfigure the vendor time synchronization service on all nodes in the cluster. CTSS detects when this happens and assumes time management for the cluster.
For example, to deconfigure NTP, you must remove or rename the
Similarly, to deactivate CTSS in your cluster:
Configure the vendor time synchronization service on all nodes in the cluster. CTSS detects this change and reverts back to observer mode.
Use the crsctl check ctss command to ensure that CTSS is operating in observer mode.
Start the vendor time synchronization service on all nodes in the cluster.
Use the cluvfy comp clocksync -n all command to verify that the vendor time synchronization service is operating.
See Also: Oracle Grid Infrastructure Installation Guide for your platform for information about configuring NTP for Oracle Clusterware, or disabling it to use CTSS
Footnote Legend
Footnote 1: Oracle Clusterware supports up to 100 nodes in a cluster on configurations running Oracle Database 10g release 2 (10.2) and later releases.