Oracle7 Parallel Server Concepts and Administrator's Guide
To do this successfully you must understand how multiprocessing works, what resources it requires, and when you can--and cannot--effectively apply it. This chapter answers the following questions:
Note: A node is a separate processor, often on a separate machine. Multiple processors, however, can reside on a single machine.
Some tasks can be effectively divided, and thus are good candidates for parallel processing. Other tasks, however, do not lend themselves to this approach.
For example, in a bank with only one teller, all customers must form a single queue to be served. With two tellers, the task can be effectively split so that customers form two queues and are served twice as fast--or they can form a single queue to provide fairness. This is an instance in which parallel processing is an effective solution.
By contrast, if the bank manager must approve all loan requests, parallel processing will not necessarily speed up the flow of loans. No matter how many tellers are available to process loans, all the requests must form a single queue for bank manager approval. No amount of parallel processing can overcome this built-in bottleneck to the system.
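The effect of such a built-in bottleneck can be estimated with Amdahl's law, a standard result that this guide does not itself cite. The sketch below is illustrative only; the function name `amdahl_speedup` and the 50% serial fraction are assumptions chosen for the example.

```python
def amdahl_speedup(serial_fraction, workers):
    """Estimate speedup when part of the work cannot be parallelized.

    serial_fraction: share of the work that must run serially
                     (the "bank manager" step), between 0.0 and 1.0.
    workers:         number of parallel workers (the "tellers").
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0.0 and 1.0")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


# Hypothetical workload: half of each loan request is manager approval.
for tellers in (1, 2, 4, 100):
    print(tellers, round(amdahl_speedup(0.5, tellers), 2))
```

Under this model, speedup can never exceed 1 / serial_fraction, however many workers are added, which is why additional tellers do not help once the manager is the limiting step.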
Figure 1 - 1 and Figure 1 - 2 contrast sequential processing of a single query with parallel processing of the same query.
Figure 1 - 1. Sequential Processing of a Large Task
Figure 1 - 2. Parallel Processing: Executing Component Tasks in Parallel
In sequential processing, the query is executed as a single large task. In parallel processing, the query is divided into multiple smaller tasks, and each component task is executed on a separate node.
The following figures contrast sequential processing with parallel processing of multiple independent tasks from an online transaction processing (OLTP) environment.
Figure 1 - 3. Sequential Processing of Multiple Independent Tasks
In sequential processing (Figure 1 - 3), independent tasks compete for a single resource. Only task 1 runs without having to wait. Task 2 must wait until task 1 has completed; task 3 must wait until tasks 1 and 2 have completed, and so on. (Although the figure shows the independent tasks as the same size, the size of the tasks will vary.)
Figure 1 - 4. Parallel Processing: Executing Independent Tasks in Parallel
Figure 1 - 4 illustrates parallel processing (for example, a parallel server on a symmetric multiprocessor), where more CPU power is assigned to the tasks. Each independent task executes immediately on its own processor: no wait time is involved.
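The contrast between Figure 1 - 3 and Figure 1 - 4 can be sketched with a small scheduling model. This is a simplified illustration, not a description of how Oracle schedules work: the function name `completion_times` and the one-second task durations are assumptions, and each task is simply given to whichever processor becomes free first.

```python
import heapq


def completion_times(durations, processors):
    """Return the time at which each task finishes.

    Tasks are started in order, each on the processor that frees up first.
    """
    free_at = [0.0] * processors      # time at which each processor is idle
    heapq.heapify(free_at)
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest available processor
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


tasks = [1.0, 1.0, 1.0, 1.0]             # four independent one-second tasks
sequential = completion_times(tasks, processors=1)
parallel = completion_times(tasks, processors=4)
print(sequential)   # each task waits for all earlier tasks
print(parallel)     # each task starts immediately
```

With one processor the tasks finish at 1, 2, 3, and 4 seconds; with four processors every task finishes after 1 second, because none of them has to wait.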
Database management systems that support only one type of hardware limit the portability of applications, the potential to migrate applications to new hardware systems, and the scalability of applications. Oracle Parallel Server (OPS) exploits both clusters and MPP systems, and has no such limitations. (Oracle without the Parallel Server Option exploits single CPU or SMP machines.)
Parallel database software is often specialized--usually to serve as a query processor. Because specialized servers are designed to serve a single function, however, they do not provide a common foundation for integrated operations such as online decision support, batch reporting, data warehousing, OLTP, distributed operations, and high availability systems. Specialized servers have been used most successfully in the area of very large databases: in DSS applications, for example.
Versatile parallel database software should offer excellent price/performance on open systems hardware, and be designed to serve a wide variety of enterprise computing needs. Features such as online backup, data replication, portability, interoperability, and support for a wide variety of client tools can enable a parallel server to support application integration, distributed operations, and mixed application workloads.
A parallel server processes transactions in parallel by servicing a stream of transactions using multiple CPUs on different nodes, where each CPU processes an entire transaction. This is an efficient approach because many applications consist of online insert and update transactions which tend to have short data access requirements. In addition to balancing the workload among CPUs, the parallel database provides for concurrent access to data and protects data integrity.
See Also: "Is Parallel Server the Oracle Configuration You Need?" for a discussion of the available Oracle configurations.
Figure 1 - 5. Speedup
With good speedup, additional processors reduce system response time.
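You can measure speedup using this formula:
$$\text{Speedup} = \frac{T_{\text{original}}}{T_{\text{parallel}}}$$
where
$T_{\text{original}}$ is the elapsed time spent by a small system on the given task
$T_{\text{parallel}}$ is the elapsed time spent by a larger, parallel system on the given task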
For example, if the original system took 60 seconds to perform a task, and two parallel systems took 30 seconds, then the value of speedup would be equal to 2.
A speedup value of n, where n times as much hardware is used, indicates the ideal of linear speedup: twice as much hardware performs the same task in half the time, three times as much hardware performs it in a third of the time, and so on.
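As a minimal sketch of this measurement, speedup is the elapsed time on the original system divided by the elapsed time on the parallel system. The function names `speedup` and `is_linear` are illustrative assumptions; the figures are the ones from the example above.

```python
def speedup(time_original, time_parallel):
    """Ratio of elapsed time on the original system to the parallel system."""
    if time_original <= 0 or time_parallel <= 0:
        raise ValueError("elapsed times must be positive")
    return time_original / time_parallel


def is_linear(value, hardware_factor, tolerance=1e-9):
    """True when the measured ratio equals the factor by which hardware grew."""
    return abs(value - hardware_factor) <= tolerance


s = speedup(60, 30)          # original: 60 seconds, parallel: 30 seconds
print(s)                     # 2.0
print(is_linear(s, 2))       # True: twice the hardware, half the time
```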
Attention: For most OLTP applications, no speedup can be expected: only scaleup. The overhead due to synchronization may, in fact, cause speed-down.
Figure 1 - 6. Scaleup
With good scaleup, if transaction volumes grow, you can keep response time constant by adding hardware resources such as CPUs.
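You can measure scaleup using this formula:
$$\text{Scaleup} = \frac{V_{\text{parallel}}}{V_{\text{original}}}$$
where
$V_{\text{original}}$ is the transaction volume processed in a given amount of time on a small system
$V_{\text{parallel}}$ is the transaction volume processed in a given amount of time on a parallel system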
For example, if the original system can process 100 transactions in a given amount of time, and the parallel system can process 200 transactions in this amount of time, then the value of scaleup would be equal to 2. That is, 200/100 = 2.
A value of 2 indicates the ideal of linear scaleup: when twice as much hardware can process twice the data volume in the same amount of time.
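The same calculation can be sketched for scaleup, which compares volumes processed in a fixed amount of time rather than elapsed times. The function names `scaleup` and `scaleup_efficiency` are illustrative assumptions, and the efficiency ratio is an added convenience rather than a measure defined in this guide.

```python
def scaleup(volume_original, volume_parallel):
    """Ratio of volume processed by the parallel system to the original system."""
    if volume_original <= 0:
        raise ValueError("original volume must be positive")
    return volume_parallel / volume_original


def scaleup_efficiency(value, hardware_factor):
    """Fraction of ideal (linear) scaleup achieved; 1.0 means linear."""
    return value / hardware_factor


s = scaleup(100, 200)              # 100 transactions before, 200 after
print(s)                           # 2.0
print(scaleup_efficiency(s, 2))    # 1.0 with twice the hardware
```

An efficiency below 1.0 would suggest that some of the added capacity is being spent on overhead such as synchronization.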
Synchronization: A Critical Success Factor
Coordination of concurrent tasks is called synchronization. Synchronization is necessary for correctness. The key to successful parallel processing is to divide up tasks so that very little synchronization is necessary. The less synchronization necessary, the better the speedup and scaleup.
In parallel processing between nodes, a high-speed interconnect is required among the parallel processors. The overhead of this synchronization can be very expensive if a great deal of inter-node communication is necessary. For parallel processing within a node, messaging is not necessary: shared memory is used instead. Messaging and locking between nodes are handled by the distributed lock manager (DLM).
The amount of synchronization depends on the amount of resources and the number of users and tasks working on the resources. Little synchronization may be needed to coordinate a small number of concurrent tasks, but lots of synchronization may be necessary to coordinate many concurrent tasks.
Attention: Too much time spent in synchronization can diminish the benefits of parallel processing. With less time spent in synchronization, better speedup and scaleup can be achieved.
Response time equals time spent waiting plus time spent doing useful work. Table 1 - 1 illustrates how overhead increases as more concurrent processes are added. If 3 processes request a service at the same time, and they are served serially, then response time for process 1 is 1 second. Response time for process 2 is 2 seconds (waiting 1 second for process 1 to complete, then being serviced for 1 second). Response time for process 3 is 3 seconds (2 seconds waiting time plus 1 second service time).
|Process Number||Service Time||Waiting Time||Response Time|
|1||1 second||0 seconds||1 second|
|2||1 second||1 second||2 seconds|
|3||1 second||2 seconds||3 seconds|
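The rows of Table 1 - 1 can be reproduced with a short calculation: each process waits for the total service time of the processes ahead of it. The function name `serial_queue` is an illustrative assumption.

```python
def serial_queue(service_times):
    """Return (process number, service, waiting, response) for serial service."""
    rows = []
    waiting = 0
    for number, service in enumerate(service_times, start=1):
        rows.append((number, service, waiting, waiting + service))
        waiting += service        # the next process also waits for this one
    return rows


for row in serial_queue([1, 1, 1]):
    print("process %d: service %ds, waiting %ds, response %ds" % row)
```

Adding a fourth process with the same service time would give it a waiting time of 3 seconds and a response time of 4 seconds, so response time grows with every process that joins the queue.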
One task, in fact, may require multiple messages. If tasks must continually wait to synchronize, then several messages may be needed per task.
Sometimes synchronization can be accomplished very cheaply. In other cases, however, the cost of synchronization may be too high. For example, if one table takes inserts from many nodes, a lot of synchronization is necessary. The different nodes contend heavily to insert into the same data block, so the data block must be passed back and forth between them. This kind of synchronization can be done--but not efficiently.
See Also: "Application Analysis".
"Tuning the System to Optimize Performance".
"Distributed Lock Manager: Access to Resources".
A distributed lock manager (DLM) is the external locking facility used with Oracle Parallel Server. A DLM is operating system software which coordinates resource sharing between nodes running a parallel server. The instances of a parallel server use the distributed lock manager to communicate with each other and coordinate modification of database resources. Each node operates independently of other nodes, except when contending for the same resource.
The DLM allows applications to synchronize access to resources such as data, software, and peripheral devices, so that concurrent requests for the same resource are coordinated between applications running on different nodes.
The DLM performs the following services for applications:
See Also: "Distributed Lock Manager: Access to Resources".
Bandwidth is the total size of messages which can be sent per second. Latency is the time (in seconds) it takes to place a message on the interconnect. Latency thus indicates the number of messages which can be put on the interconnect per second.
An interconnect with high bandwidth is like a wide highway with many lanes to accommodate heavy traffic: the number of lanes affects the speed at which traffic can move. An interconnect with low latency is like a highway with an entrance ramp which permits vehicles to enter without delay: the cost of getting on the highway is low.
MPP systems characteristically use interconnects with high bandwidth and low latency; clusters use Ethernet connections with relatively low bandwidth and high latency.
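A common first-order model, not taken from this guide, treats the time to deliver one message as latency plus message size divided by bandwidth. The sketch below uses that model; the function name `transfer_time` and all of the interconnect figures are hypothetical values chosen only to show the trend.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Approximate seconds to deliver one message: latency + size / bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical interconnects (figures are assumptions, not measurements).
lock_message = 512                      # a small synchronization message
fast = transfer_time(lock_message, latency_s=0.00001,
                     bandwidth_bytes_per_s=100_000_000)
slow = transfer_time(lock_message, latency_s=0.001,
                     bandwidth_bytes_per_s=1_000_000)
print(fast, slow)
```

For small messages such as lock requests, the latency term dominates, so a wider "highway" helps little if the "entrance ramp" is slow.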
The following table shows which types of workload can attain speedup and scaleup with properly implemented parallel processing.
|Workload||Speedup||Scaleup|
|Parallel Query Option||Yes||Yes|
If processes can run ten times faster, then the system can accomplish ten times more in the original amount of time. The parallel query option, for example, permits scaleup: a system might maintain the same response time if the data queried increases tenfold, or if more users can be served. Oracle Parallel Server without the parallel query option also permits scaleup, but by running the same query sequentially on different nodes.
With a mixed workload of DSS, OLTP, and reporting applications, scaleup can be achieved by running multiple programs on different nodes. Speedup can also be achieved if you rewrite the batch programs, splitting them into a number of parallel streams to take advantage of the multiple CPUs which are now available.
For OLTP applications, however, no speedup can be expected: only scaleup. With OLTP applications each process is independent: even with parallel processing, each insert or update on an order table will still run at the same speed. In fact, the overhead due to synchronization may cause a slight speed-down.
Speedup can also be achieved with batch processing, but the degree of speedup depends on the synchronization between tasks.
Oracle Parallel Server provides the framework for the Parallel Query Option to work between nodes. The Parallel Query Option behaves the same way in Oracle with or without the Parallel Server Option. The only difference is that OPS enables the parallel query option to ship queries between nodes so that multiple nodes can execute on behalf of a single query.
In some applications (notably decision support or "DSS" applications), an individual query often consumes a great deal of CPU resource and disk I/O, unlike most online insert or update transactions. To take advantage of multiprocessing systems, the data server must parallelize individual queries into units of work which can be processed simultaneously. The following figure shows an example of parallel query processing.
Figure 1 - 7. Example of Parallel Query Processing
If the query were not processed in parallel, disks would be read serially with a single I/O. A single CPU would have to scan all rows in the LINE_ITEMS table and total the revenues across all rows. With the query parallelized, disks are read in parallel, with multiple I/Os. Several CPUs can each scan a part of the table in parallel, and aggregate the results. Parallel query benefits not only from multiple CPUs but also from more of the available I/O bandwidth.
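The divide-and-aggregate pattern described above can be sketched as follows. This shows only the structure (partition, scan each part, combine partial totals), not how the Parallel Query Option is implemented; the names `partition` and `parallel_total` and the sample data are assumptions. In CPython, threads do not speed up CPU-bound work, so the example illustrates the shape of the computation rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, parts):
    """Split rows into at most `parts` contiguous chunks of similar size."""
    if not rows:
        return []
    size = -(-len(rows) // parts)        # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_total(revenues, workers=4):
    """Total a column by summing each chunk separately, then combining."""
    chunks = partition(revenues, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_totals = list(pool.map(sum, chunks))   # one scan per chunk
    return sum(partial_totals)                          # final aggregation


line_items = list(range(1, 1001))        # stand-in for a revenue column
print(parallel_total(line_items))        # same result as sum(line_items)
```

The final aggregation step is itself serial, which is one reason the achievable speedup depends on how little coordination the partial results need.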
See Also: "The Parallel Query Option on OPS".
Note: Support for any given Oracle configuration is platform-dependent; check to confirm that your platform supports the configuration you want.
For optimal performance, configure your system according to your particular application requirements and available resources, then design and tune the database and applications to make the best use of the configuration. Consider also the migration of existing hardware or software to the new system or to future systems.
The following sections help you determine which Oracle configuration best meets your needs.
See Also: "Parallel Hardware Architecture".
Figure 1 - 8. Single Instance Database System
A single instance accessing a single database can improve performance by running on a larger computer. A large single computer does not require coordination between several nodes and generally performs better than two small computers in a multinode system. However, two small computers often cost less than one large one.
The cost of redesigning and tuning your database and applications for the Parallel Server Option might be significant if you want to migrate from a single computer to a multinode system. In situations like this, consider whether a larger single computer might be a better solution than moving to a parallel server.
See Also: Oracle7 Server Concepts for complete information about single instance Oracle.
Figure 1 - 9. Multi-Instance Database System
As noted in the preceding figure, this database system requires a distributed lock manager (DLM) which provides an LCK background process on each instance. Such a configuration minimizes the use of the distributed lock manager and eliminates unnecessary I/O.
In a parallel server, instances are decoupled from databases. In exclusive mode, there is a one-to-one correspondence of instance to database. In shared (parallel) mode, however, there can be many instances to a single database.
In general, any single application performs best when it has exclusive access to a database on a larger system, as compared with its performance on a smaller node of a multinode environment. This is because the cost of synchronization may become too high if you go to a multinode environment. The performance difference depends on characteristics of that application and all other applications sharing access to the database.
Applications with one or both of the following characteristics are well suited to run on separate instances of a parallel server:
See Also: "The Distributed Lock Manager: Access to Resources".
Oracle7 Server Concepts for more information on the DBWR, LGWR, and LCK background processes.
Note: Oracle Parallel Server can be one of the constituents of a distributed database.
The following figure illustrates a distributed database system. This database system requires the RECO background process on each instance. There is no LCK background process because this is not an Oracle Parallel Server configuration, and no distributed lock manager is needed.
Figure 1 - 10. Distributed Database System
The multiple databases of a distributed system can be treated as one logical database, because servers can access remote databases transparently, using SQL*Net.
If your data can be partitioned into multiple databases with minimal overlap, you can use a distributed database system instead of a parallel server, sharing data between the databases with SQL*Net. A parallel server provides automatic data sharing among nodes through the common database.
A distributed database system allows you to keep your data at several widely separated sites. Users can access data from databases which are geographically distant, as long as network connections exist between the separate nodes. A parallel server requires all data to be at a single site because of the requirement for low latency, high bandwidth communication between nodes, but it can also be part of a distributed database system. Such a system is illustrated in the following figure.
Figure 1 - 11. Oracle Parallel Server as Part of a Distributed Database
Multiple databases require separate database administration, and a distributed database system requires coordinated administration of the databases and network protocols. A parallel server can consolidate several databases to simplify administrative tasks.
Multiple databases can provide greater availability than a single instance accessing a single database, because an instance failure in a distributed database system does not prevent access to data in the other databases: only the database owned by the failed instance is inaccessible. A parallel server, however, allows continued access to all data when one instance fails, including data which was accessed by the instance running on the failed node.
A parallel server accessing a single consolidated database can avoid the need for distributed updates, inserts, or deletions and more expensive two-phase commits by allowing a transaction on any node to write to multiple tables simultaneously, regardless of which nodes usually write to those tables.
See Also: Oracle7 Server Distributed Systems, Volume I for complete information about Oracle with the Distributed Option.
The following figure illustrates an Oracle client-server system.
Figure 1 - 12. Client-Server System
Note: Client-server processing is suitable for any Oracle configuration. Check your Oracle platform-specific documentation to see whether it is implemented on your platform.
The client-server configuration allows you to offload processing from the computer which runs an Oracle server. If you have too many applications running on one machine, you can offload them to improve performance. However, if your database server is reaching its processing limits you might want to move either to a larger machine or to a multinode system.
For compute-intensive applications, you could run some applications on one node of a multinode system while running Oracle and other applications on another node, or on several other nodes. In this way you could effectively use various nodes of a parallel machine as client nodes, and one as a server node.
If the database consists of several distinct high-throughput parts, a parallel server running on high-performance nodes can provide quick processing for each part of the database while also handling occasional access across parts.
Remember that a client-server configuration requires that all communications between the client application and the database take place over the network. This may not be appropriate where a very high volume of such communications is required--as in many batch applications.
See Also: "Client-Server Architecture" in Oracle7 Server Concepts.
Copyright © 1996 Oracle Corporation.
All Rights Reserved.