2
Application Deployment in Real Application Clusters

This chapter discusses both the deployment of online e-commerce (OLTP) applications and the use of data warehouse applications in Oracle Real Application Clusters environments. This chapter also briefly describes application performance monitoring. The topics in this chapter are:

Deployment Strategies for Real Application Clusters-Based Applications

All single-instance application development and deployment techniques apply to Real Application Clusters. If your applications run well on a single-instance Oracle database, then they will run well on Real Application Clusters.

Deploying E-Commerce Applications with Real Application Clusters

Cache Fusion makes Real Application Clusters databases the optimal deployment servers for online e-commerce applications. This is because these types of applications require:

High availability in the event of failures
Scalability to accommodate increased system demands
Load balancing according to demand fluctuations

The high availability features of Oracle and Real Application Clusters can re-distribute and load balance the workloads to surviving instances without interrupting processing. Real Application Clusters also provides excellent scalability so that if you add or replace a node, Oracle re-masters resources and re-distributes processing loads without reconfiguration or application re-partitioning.

The application workload management feature of Real Application Clusters is highly dynamic. Real Application Clusters can alter workloads in real-time based on changing business requirements. This occurs in a manageable environment with minimal administrative overhead. The dynamic resource allocation capabilities of the Cache Fusion architecture provide optimal performance for online applications with great deployment flexibility.

Flexible Implementation with Cache Fusion

E-commerce requirements, especially the requirements of online transaction processing systems, have frequently changing workloads. To accommodate this, Real Application Clusters remains flexible and dynamic despite changes in system load and system availability. Real Application Clusters addresses a wide range of service levels that, for example, fluctuate due to:

Varying user demands
Peak scalability issues like trading storms (bursts of high volumes of transactions)
Varying availability of system resources

To accommodate these requirements, it is impractical and often too complex to partition packaged e-commerce applications. Once deployed, such applications can access hundreds or even thousands of tables. A common recommendation to meet these service levels is to purchase a larger server. Or you may be told to segment your application or application modules across distinct databases. Segmenting applications, however, can fragment data and constrain a global enterprise-wide view of your information.

Real Application Clusters eliminates the need to purchase excess hardware and it avoids application segmentation. Instead, Real Application Clusters meets these demands by dynamically migrating the mastering of database resources to adapt to changing business requirements and workloads.

Deploying Data Warehouse Applications with Real Application Clusters

This section discusses deploying data warehouse systems in Real Application Clusters environments by briefly describing the data warehouse features available in shared disk architectures. The topics in this section are:

Speed-Up for Data Warehouse Applications on Real Application Clusters
Dynamic Parallel-Aware Query Optimization

See Also:
Oracle9i Data Warehousing Guide for detailed information about implementing data warehouse applications in Real Application Clusters environments

Speed-Up for Data Warehouse Applications on Real Application Clusters

Real Application Clusters is ideal for data warehouse applications because it augments the single instance benefits of Oracle. Real Application Clusters does this by maximizing the processing available on all nodes of a cluster database to provide speed-up and scale-up for data warehouse systems.

For example, e-businesses strategically use data warehouse systems to acquire customers and expand markets. Companies can use product promotions to gather information using data warehouse systems. They can then customize client lists to best fit the profiles of the company's target demographics.

Flexible Parallelism within Real Application Clusters Environments

Oracle's parallel execution feature uses multiple processes to execute SQL statements on one or more CPUs. Parallel execution is available on both single instance and Real Application Clusters databases.

Real Application Clusters takes full advantage of parallel execution by distributing parallel processing to all the nodes in your cluster. The number of processes that can participate in parallel operations depends on the degree of parallelism (DOP) assigned to each table or index.

Function Shipping

On loosely coupled systems, Oracle's parallel execution technology uses a function shipping strategy to perform work on remote nodes. Oracle's parallel architecture uses function shipping when the target data is located on the remote node. This delivers efficient parallel execution and eliminates unneeded internode data transfers over the interconnect.

Exploitation of Data Locality

On some hardware systems, powerful data locality capabilities were more relevant when shared nothing hardware systems were popular. However, almost all current cluster systems use a shared disk architecture.

On shared nothing systems, each node has direct hardware connectivity to a subset of disk devices. On these systems it is more efficient to access local devices from the owning nodes. Real Application Clusters exploits this affinity of devices to nodes and delivers performance that is superior to shared nothing systems using cluster configurations and a shared disk architecture.

As with other elements of Cache Fusion, Oracle's strategy works transparently without data partitioning. Oracle dynamically detects the disk on which the target data resides and makes intelligent use of the data's location in the following two ways:

Oracle spawns parallel execution server processes on nodes where the data is waiting to be processed
Oracle assigns local data partitions to each sub-process to eliminate or minimize internode data movement

Dynamic Parallel-Aware Query Optimization

Oracle's cost-based optimizer considers parallel execution when determining the optimal execution plans. The optimizer dynamically computes intelligent heuristic defaults for parallelism based on the number of processors.

An evaluation of the costs of alternative access paths--table scans versus indexed access, for example--takes into account the degree of parallelism (DOP) available for the operation. This results in Oracle selecting execution plans that are optimized for parallel execution.

Oracle also makes intelligent decisions in Real Application Clusters environments with regard to intranode and internode parallelism. For intranode parallelism, for example, if a SQL statement requires six query sub-processes and six CPUs are idle on the local node, or the node to which the user is connected, then the SQL statement is processed using local resources. This eliminates query coordination overhead across multiple nodes.

Continuing with this example: if there are only two CPUs on the local node, then those two CPUs and four CPUs of another node are used to complete the SQL statement. In this manner, Oracle uses both internode and intranode parallelism to provide speed-up for query operations.

In all real world data warehouse applications, SQL statements are not perfectly partitioned across the different parallel execution servers. Therefore, some CPUs in the system complete their assigned work and become idle sooner than others. Oracle's parallel execution technology is able to dynamically detect idle CPUs and assign work to these idle CPUs from the execution queue of the CPUs with greater workloads. In this way, Oracle efficiently re-distributes the query workload across all of the CPUs in the system. Real Application Clusters extends these efficiencies to clusters by re-distributing the work across all the nodes of the cluster.

Load Balancing for Multiple Concurrent Parallel Operations

Load balancing distributes parallel execution server processes to spread CPU and memory use evenly among nodes. It also minimizes communication and remote I/O. Oracle does this by allocating parallel execution servers to the nodes that are running the fewest number of processes.

The load balancing algorithm maintains an even load across all nodes. For example, if a degree of parallelism of eight is requested on an eight-node system with one CPU for each node, the algorithm places two servers on each node. If the entire parallel execution server group fits on one node, then the load balancing algorithm places all the processes on a single node to avoid communications overhead. For example, if you use a DOP of 8 on a two-node cluster with 16 CPUs for each node, then the algorithm places all 8 parallel execution server processes on one node.

Using Parallel Instance Groups

You can control which instances allocate parallel execution server processes with instance groups. To do this, assign each active instance to at least one or more instance groups. Then dynamically control which instances spawn parallel processes by activating a particular group of instances.

Establish instance group membership on an instance-by-instance basis by setting the INSTANCE_GROUPS initialization parameter to a name representing one or more instance groups. For example, on a 32-node system owned by both a Marketing and a Sales organization, you could assign half the nodes to one organization and the other half to the other organization using instance group names. To do this, assign nodes 1-16 to the Marketing organization using the following parameter syntax in your initialization parameter file:

sid|1-16|.INSTANCE_GROUPS=marketing

Then assign nodes 17-32 to Sales using the following syntax in the parameter file:

sid|17-32|.INSTANCE_GROUPS=sales

Activate the nodes owned by Sales to spawn a parallel execution server process by entering the following:

ALTER SESSION SET PARALLEL_INSTANCE_GROUP = 'sales';

In response, Oracle allocates parallel execution server processes to nodes 17-32. The default value for PARALLEL_INSTANCE_GROUP is all active instances.

Note:

An instance can belong to one or more groups. You can enter several instance group names with the INSTANCE_GROUPS parameter using a comma as a separator.

Disk Affinity

Disk affinity refers to the relationship of an instance to the data that it accesses. The more often an instance accesses a particular set of data, the greater the affinity that instance has to the disk on which the data resides.

Disk affinity is used for parallel table scans, parallel temporary tablespace allocation, parallel DML, and parallel index scans. It is not used for parallel table creation or parallel index creation. Access to temporary tablespaces preferentially uses local datafiles. It guarantees optimal space management extent allocation. Disks striped by the operating system are treated by disk affinity as a single unit.

Without disk affinity, Oracle attempts to balance the allocation of parallel execution servers evenly across instances. With disk affinity, Oracle allocates parallel execution servers for parallel table scans on the instances that most frequently access the requested data.

Assume that Oracle is performing a full table scan on table T and that instances 1, 2, and 3 have a affinity for that table and instance 4 does not:

If a query requires two instances, then two instances from the set 1, 2, and 3 are used.
If a query requires three instances, then instances 1, 2, and 3 are used.
If a query requires four instances, then all four instances are used.
If there are two concurrent operations against table T, each requiring three instances, and enough processes are available on the instances for both operations, then both operations use instances 1, 2, and 3. Instance 4 is not used. In contrast, without disk affinity, instance 4 is used.

See Also:
Oracle9i Real Application Clusters Concepts for more information about instance affinity

Deploying Internet-Based Applications

Testing your application's scalability, availability, and load balancing capabilities are some of the most challenging aspects of internet-based application deployment. In rapid prototyping environments, for example, you can internally test and benchmark your site with a limited number of Real Application Clusters instances on a test hardware platform.

All applications have limits on the number of users they support given the constraints of specific hardware and software architectures. To accommodate anticipated demand, you can estimate the traffic loads on your system and determine how many nodes you need. You should also consider peak workloads.

Your scalability tests must also simulate user access to your Web site. To do this, configure your traffic generator to issue pseudo get commands as used in the Hypertext Transfer Protocol (http). This tests your system's performance and load processing capabilities under fairly realistic conditions.

You can then conduct structured, Web-based testing using traffic generators to stress test your system. This type of persistent testing against your most aggressive performance goals should reveal any capacity limitations.

After making further enhancements as dictated by your testing results, prototype your site to early adopters. This enables you to obtain real-world benchmarks against which you can further refine your system's performance.

Note:

These techniques are not Real Application Clusters-specific. You could also use them for deploying single-instance Web-based Oracle database applications.

Application Performance in Real Application Clusters

You use the same methodology to develop and tune applications in Real Application Clusters as you use for single-instance Oracle databases. If your application ran efficiently on a single-instance Oracle database, then it will run well on Real Application Clusters.

If necessary, refer to the deployment techniques outlined in Chapter 3, "Database Deployment Techniques in Real Application Clusters" in the unlikely event that your application experiences hot spots. You can also refer to the performance monitoring recommendations in Chapter 4, "Monitoring Real Application Clusters Performance".

Administrative Aspects of System Scalability for Real Application Clusters

You may need to add nodes before deployment or during production to accommodate growth requirements or to replace failed hardware. Adding a node to your Real Application Clusters environment involves two main steps:

Adding a node at the clusterware layer
Adding a node at the Oracle layer

To add a node, connect the hardware according to your vendor's installation instructions. Then install the operating system-dependent (OSD) clusterware. Complete the installation of the instance on the new node using the Oracle Universal Installer (OUI) and the Database Configuration Assistant (DBCA). On some platforms you can dynamically add nodes and instances.

See Also:

Oracle9i Real Application Clusters Administration for detailed procedures on adding nodes and instances

The next part of this book describes application deployment and performance monitoring considerations for Real Application Clusters databases.

2 Application Deployment in Real Application Clusters