12 Overview of Oracle RAC and Clusterware Best Practices
Oracle Clusterware and Oracle Real Application Clusters (RAC) are Oracle's strategic high availability and resource management database framework in a cluster environment, and an integral part of the Oracle MAA Silver reference architecture.
Adding Oracle RAC to a Bronze MAA reference architecture elevates it to a Silver MAA reference architecture. The Silver MAA reference architecture is designed for databases that can’t afford to wait for a cold restart or a restore from backup, should there be an unrecoverable database instance or server failure.
The Silver reference architecture has the potential to provide zero downtime for node or instance failures, and zero downtime for most database and system software updates, neither of which is achievable with the Bronze architecture. To learn more about the Silver MAA reference architecture, see High Availability Reference Architectures.
Oracle Clusterware and Oracle RAC provide the following benefits:
- High availability framework and cluster management solution
- Manages resources, such as Virtual Internet Protocol (VIP) addresses, databases, listeners, and services
- Provides an HA framework for Oracle database resources and non-Oracle database resources, such as third-party agents
- Active-active clustering for scalability and availability
- High Availability: If a server or database instance fails, connections to surviving instances are not affected; connections to the failed instance quickly fail over to surviving instances that are already running and open on other servers in the Oracle RAC cluster.
- Scalability and Performance: Oracle RAC is ideal for high-volume applications or consolidated environments where scalability and the ability to dynamically add or re-prioritize capacity across more than a single server are required. An individual database may have instances running on one or more nodes of a cluster. Similarly, a database service may be available on one or more database instances. Additional nodes, database instances, and database services can be provisioned online. The ability to easily distribute workload across the cluster makes Oracle RAC the ideal complement for Oracle Multitenant when consolidating many databases.
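
On the client side, fast failover to surviving instances usually depends on a connect descriptor that targets a database service and retries connection attempts. The following is a minimal sketch, not an authoritative configuration: it assembles an Oracle Net connect descriptor as a string. The host name, service name, and all timeout and retry values are illustrative assumptions, and the helper `build_connect_descriptor` is hypothetical. Confirm appropriate values against the Oracle Net Services documentation for your release.

```python
def build_connect_descriptor(
    scan_host: str,
    service_name: str,
    port: int = 1521,
    connect_timeout: int = 90,          # assumed value, in seconds
    retry_count: int = 50,              # assumed value
    retry_delay: int = 3,               # assumed value, in seconds
    transport_connect_timeout: int = 3  # assumed value, in seconds
) -> str:
    """Return an Oracle Net connect descriptor that retries on connection failure.

    The descriptor points at a single host name (for example, a SCAN name)
    and a database service rather than at a specific instance.
    """
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})"
        f"(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        "(ADDRESS_LIST=(LOAD_BALANCE=ON)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={scan_host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


if __name__ == "__main__":
    # Hypothetical SCAN name and service name, for illustration only.
    print(build_connect_descriptor("rac-scan.example.com", "sales_svc"))
```

Connecting by service name rather than by instance name lets a session be placed on any instance that currently offers the service.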
The following table highlights various Oracle Clusterware and Oracle Real Application Clusters configuration best practices.
Table 12-1 Oracle RAC HA Use Cases and Best Practices
Use Case | Best Practices |
---|---|
Certified and validated Clusterware software stack |
Use Oracle Clusterware and avoid third-party clusterware. See Oracle Database Clusterware Administration and Deployment Guide. Clusterware is built in to all Oracle Exadata Systems. |
Certified and validated storage architecture |
Use Oracle Automatic Storage Management (Oracle ASM) and Oracle Advanced Cluster File System (Oracle ACFS) instead of third-party volume managers and cluster file systems for the following MAA benefits:
When using ASM with external redundancy, ensure that the underlying storage and network are highly available with no single point of failure. When using ASM native redundancy, high redundancy disk groups are recommended to provide maximum protection for unplanned outages and during storage software updates. By default, Exadata deployments use high redundancy for all disk groups (both for data and recovery destinations). Oracle ACFS is a multi-platform, scalable file system and storage management technology that extends Oracle ASM functionality to support all customer files and can be leveraged for non-database files. These best practices are built in to all Oracle Exadata Systems. |
Certified and validated network architecture |
Ensure that the entire database and storage network topology has multiple network paths with no single point of failure. When connecting to the database service, use built-in Virtual Internet Protocol (VIP) addresses, Single Client Access Name (SCAN), and multiple local SCAN listeners configured over a bonded client network. Use a separate high bandwidth, bonded network for backup or Data Guard traffic. For the private network used as the cluster interconnect, Oracle recommends that
non-Exadata customers use Oracle HAIP for network redundancy instead of using bonded
networks. Bonding configurations have various attributes that behave differently
with different network cards and switch settings. This recommendation does not apply
to the private cluster interconnect in Exadata environments, because the bond setup
has been properly configured and validated. Further, Exadata uses the
|
Cluster configuration checks |
Use Cluster Verification Utility (CVU) at monthly intervals to validate a range of cluster and Oracle RAC components such as shared storage devices, networking configurations, system requirements, and Oracle Clusterware. See Cluster Verification Utility Reference. To perform a holistic, proactive health check and to evaluate whether Oracle RAC or
Exadata best practices are being followed, use See ORAchk - Health Checks for the Oracle Stack (Doc ID 1268927.2) and Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1). Note that both Incorporate configuration recommendations from CVU, |
Reduce downtime for database node or instance failures |
Typically, the default settings are sufficient for most use cases. If node
detection and instance recovery need to be expedited, evaluate lower values for
Reducing For Exadata systems, Instant Failure Detection capabilities use remote direct memory access (RDMA) to confirm server failures in less than 2 seconds, compared to the typical 30-second detection time found in most Oracle RAC clusters. |
Eliminate downtime for software updates |
Use Oracle RAC rolling updates for Clusterware or database software updates (for example, Release Updates) to avoid downtime. Use out-of-place software updates when possible, so rollback and fallback use cases are simplified. Use software gold images to eliminate the complexity of running database
For a fleet of databases on a single Oracle RAC cluster or multiple clusters, use Oracle Fleet Patching and Provisioning. |
Make applications and processes highly available on the cluster |
When an application, process, or server fails in a cluster, you want the disruption
to be as short as possible and transparent to users. For example, when an
application fails on a server, that application can be restarted on another server
in the cluster, minimizing or negating any disruption in the use of that
application. Similarly, if a server in a cluster fails, then all of the applications
and processes running on that server must fail over to another server to continue
providing service to the users. Using the built-in
Use Oracle Clusterware to manage third-party resources and agents that reside on the cluster. See Making Applications Highly Available Using Oracle Clusterware |
Reduce application downtime for planned and unplanned outages |
Leverage Clusterware-managed services and application best practices to achieve zero application downtime. Use Applications should subscribe to HA Fast Application Notifications (FAN) and be configured to respond and fail over if required. See Enabling Continuous Service for Applications and Continuous Availability - Application Checklist for Continuous Service for MAA Solutions |
Capacity planning |
Capacity planning and sizing should be done before deployment, and periodically afterward, to ensure that there are sufficient system resources to meet application performance requirements. Capacity planning needs to accommodate growth or consolidation of databases, additional application workloads, additional processes, or anything that strains existing system resources. Evaluating if performance requirements are still met during an unplanned outage or planned maintenance events is also crucial. |
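
The capacity planning guidance above, in particular checking whether requirements are still met during an outage or maintenance event, can be illustrated with simple arithmetic. The sketch below is a simplified model, not an Oracle sizing tool: it assumes the total workload stays constant and is redistributed evenly across the surviving nodes. The function names and the utilization figures are hypothetical.

```python
def utilization_after_failure(nodes: int, per_node_utilization: float, failed: int = 1) -> float:
    """Per-node utilization on surviving nodes after `failed` nodes are lost.

    Assumes a constant total workload spread evenly across surviving nodes.
    Utilization is a fraction, so 0.60 means 60 percent.
    """
    if nodes < 1 or not 0 <= failed < nodes:
        raise ValueError("at least one node must survive")
    total_load = nodes * per_node_utilization
    return total_load / (nodes - failed)


def max_steady_state_utilization(nodes: int, ceiling: float, failed: int = 1) -> float:
    """Highest normal per-node utilization that stays at or under `ceiling`
    on the surviving nodes after `failed` nodes are lost."""
    if nodes < 1 or not 0 <= failed < nodes:
        raise ValueError("at least one node must survive")
    return ceiling * (nodes - failed) / nodes


if __name__ == "__main__":
    # Assumed steady-state utilization of 60 percent per node.
    for n in (2, 3, 4, 8):
        after = utilization_after_failure(n, 0.60)
        print(f"{n} nodes at 60% -> {after:.0%} per surviving node after one failure")
```

Under these assumptions, a two-node cluster running at 60 percent per node would need 120 percent of a single node's capacity after one node fails, which is why headroom for a node outage or a rolling update should be part of sizing.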