Implementing Oracle best practices can provide high availability for the Oracle database and related technology.
Designing and implementing a high availability architecture can be a daunting task given the broad range of Oracle technologies and deployment options. A successful effort begins with clearly defined and thoroughly understood business requirements. Thorough analysis of the business requirements enables you to make intelligent design decisions and develop an architecture that addresses your business needs in the most cost-effective manner. The architecture you choose must achieve the required levels of availability, performance, scalability, and security. Moreover, the architecture should have a clearly defined plan for deployment and ongoing management that minimizes complexity and business risk.
After you understand your business requirements, begin designing your high availability architecture by reading the Oracle Database High Availability Overview, which gives a high-level view of the Oracle high availability solutions that make up the Oracle Maximum Availability Architecture (MAA). The result should be an architecture that you can validate and fully vet against the best practices documented in this book.
Oracle high availability best practices help you deploy a highly available architecture throughout your enterprise. Having a set of technical and operational best practices helps you achieve high availability and reduces the cost of implementation and ongoing maintenance. Employing best practices can also optimize the use of system resources.
By implementing the high availability best practices described in this book, you can:
Reduce the implementation cost of an Oracle Database high availability system by following detailed guidelines on configuring your database, storage, application failover, and backup and recovery, as described in Chapter 2, "Configuring for High Availability"
Avoid potential downtime by monitoring and maintaining your database using Oracle Grid Control as described in Chapter 3, "Monitoring Using Oracle Grid Control"
Recover quickly from unscheduled outages caused by computer failure, storage failure, human error, or data corruption as described in Chapter 4, "Managing Unscheduled Outages"
Eliminate or reduce downtime that might occur due to scheduled maintenance such as database patches or application upgrades as described in Chapter 4, "Managing Unscheduled Outages"
Oracle Maximum Availability Architecture (MAA) is a best practices blueprint based on proven Oracle high availability technologies and recommendations. MAA involves high availability best practices for all Oracle products: Oracle Database, Oracle Application Server, Oracle Applications, Oracle Collaboration Suite, and Oracle Grid Control.
Some key attributes of MAA include:
Using database grid servers and a storage grid with low-cost storage to provide a highly resilient, lower-cost infrastructure
Using results from extensive performance impact studies for different configurations to ensure that the high availability architecture is optimally configured to perform and scale to business needs
Giving you the ability to control both the time needed to recover from an outage and the amount of data loss that is acceptable in any outage
Evolving with each Oracle version while remaining completely independent of hardware and operating system
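The recovery-time and data-loss attribute above is easier to reason about when the business targets are written down explicitly. The following is a minimal sketch, not part of MAA itself: the class name, field names, and the sample targets (5 minutes, 30 seconds) are assumptions chosen for illustration. It records the two targets for one application and checks whether a tested or observed outage stayed within them.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Business targets for one application (hypothetical names)."""
    max_recovery_time: timedelta  # longest tolerable time to restore service
    max_data_loss: timedelta      # largest tolerable window of lost transactions


def meets_objectives(objectives, recovery_time, data_loss):
    """Return True if an outage stayed within both targets."""
    return (recovery_time <= objectives.max_recovery_time
            and data_loss <= objectives.max_data_loss)


# Assumed example targets: service restored within 5 minutes,
# and no more than 30 seconds of committed work lost.
order_entry = RecoveryObjectives(
    max_recovery_time=timedelta(minutes=5),
    max_data_loss=timedelta(seconds=30),
)

# Result of a hypothetical failover test: 3 minutes to recover, no data lost.
test_passed = meets_objectives(order_entry, timedelta(minutes=3), timedelta(0))
```

Recording the targets for each application in this form makes it straightforward to compare candidate architectures and test results against the stated business requirements.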
One of the best ways to reduce downtime is to incorporate operational best practices. You can often prevent problems and downtime before they occur by rigorously testing changes in your test environment, following stringent change control policies to guard your primary database from harm, and having a well-validated repair strategy for each outage type.
A monitoring infrastructure such as Grid Control is essential to quickly detect problems. Having an outage and repair decision tree and an automated repair facility reduces downtime by eliminating or reducing decision and repair times.
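An outage and repair decision tree can be captured as data so that the decision is made in advance rather than during a crisis. The sketch below is an illustration under stated assumptions: the `Decision` class, the questions, and every repair action string are hypothetical placeholders, not the MAA repair matrix. Only the outage types (computer failure, storage failure, human error, data corruption) come from this chapter.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Decision:
    """One question in the tree; each answer leads to another question or an action."""
    question: str
    branches: Dict[str, Union["Decision", str]]


ESCALATE = "escalate to the on-call administrator"  # fallback for unplanned cases

# Placeholder tree: replace the actions with your validated repair strategies.
OUTAGE_TREE = Decision("outage type", {
    "computer failure": "placeholder: restart or relocate the database service",
    "storage failure": "placeholder: switch to redundant storage or a standby",
    "human error": Decision("is the affected object known?", {
        "yes": "placeholder: restore only the affected object",
        "no": "placeholder: recover the database to a point before the error",
    }),
    "data corruption": "placeholder: repair the corrupted blocks from backup",
})


def resolve(node, answers):
    """Walk the tree using pre-collected answers; escalate if no branch matches."""
    while isinstance(node, Decision):
        answer = answers.get(node.question)
        node = node.branches.get(answer, ESCALATE)
    return node


action = resolve(OUTAGE_TREE, {
    "outage type": "human error",
    "is the affected object known?": "yes",
})
```

Keeping the tree as data means it can be reviewed under change control and exercised in the test environment like any other change.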
Key operational best practices include the following:
Create test environments
A good test environment accurately mimics the production system to test changes and prevent problems before they can affect your business. Testing best practices should include thorough functional testing of the applications for correctness and replication of the production workload as a whole to ensure that system performance is acceptable.
Establish change control and security procedures
Change control and security procedures maintain the stability of the system and ensure that no changes are incorporated in the primary database unless they have been rigorously evaluated on your test systems.
The biggest threat to corporate data comes from employees and contractors with internal access to networks and facilities. Corporate data can be at grave risk if placed on a system or database that does not have proper security measures in place. A well-defined security policy can help protect your systems from unwanted access and protect sensitive corporate information from sabotage. Proper data protection reduces the chance of outages due to security breaches.
Use Grid Control or another monitoring infrastructure to detect and react to potential failures and problems before they occur
Monitor system, network, and database statistics
Monitor performance statistics
Create performance thresholds as early warning indicators of system or application problems
Use MAA recommended repair strategies and create an outage and repair decision tree for crisis scenarios using the recommended MAA matrix
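The idea of performance thresholds as early warning indicators can be sketched independently of any monitoring product. The example below is not Grid Control configuration or its API: the metric names, the warning and critical values, and the function names are assumptions chosen for illustration. It classifies sampled statistics so that a warning is raised before a metric reaches its critical level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical levels for one metric (hypothetical values)."""
    metric: str
    warning: float
    critical: float


def classify(threshold, value):
    """Return 'critical', 'warning', or 'ok' for a sampled value."""
    if value >= threshold.critical:
        return "critical"
    if value >= threshold.warning:
        return "warning"
    return "ok"


# Assumed example thresholds; tune these to your own workload.
THRESHOLDS = [
    Threshold("cpu_utilization_pct", warning=80.0, critical=95.0),
    Threshold("disk_space_used_pct", warning=85.0, critical=97.0),
    Threshold("avg_response_time_ms", warning=500.0, critical=2000.0),
]


def evaluate(samples):
    """Return {metric: severity} for every sampled metric that is not 'ok'."""
    alerts = {}
    for threshold in THRESHOLDS:
        if threshold.metric in samples:
            severity = classify(threshold, samples[threshold.metric])
            if severity != "ok":
                alerts[threshold.metric] = severity
    return alerts


alerts = evaluate({
    "cpu_utilization_pct": 83.5,
    "disk_space_used_pct": 60.0,
    "avg_response_time_ms": 2500.0,
})
```

Setting the warning level well below the critical level is what gives operators time to react before the problem affects the application.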