|Oracle8i Parallel Server Concepts
Release 2 (8.1.6)
Part Number A76968-01
This chapter describes the scalability features of Oracle Parallel Server. Topics in this chapter include:
You can implement Oracle Parallel Server using several features that enhance application performance and maintain high availability levels. These features:
This section discusses these features in four parts:
The Oracle Database Configuration Assistant automatically configures most of these features for you. To manually configure these features, refer to the Net8 Administrator's Guide where noted in the following discussions.
If tasks can run independently of one another, Oracle Parallel Server can distribute them to different nodes. This permits you to achieve scaleup: more processes run through the database in the same amount of time. The number of nodes used, however, depends upon the purpose of the system.
If processes can run faster, then the system accomplishes more work. The parallel execution feature, for example, permits scaleup: a system might maintain the same response time if the data queried increases tenfold, or if more users can be served. Oracle Parallel Server without the parallel execution feature also provides scaleup, but by running the same query sequentially on different nodes.
With a mixed workload of DSS, OLTP, and reporting applications, you can achieve scaleup by running multiple programs on different nodes. You can also achieve speed-up by rewriting the batch programs and separating them into a number of parallel streams to take advantage of the multiple CPUs.
Oracle Parallel Server also offers improved flexibility by overcoming memory limitations so you can use it to accommodate thousand of users. You can allocate or deallocate instances as necessary. For example, as database demand increases, you can temporarily allocate more instances. Then you can deallocate the instances and use them for other purposes once they are no longer required. This feature is useful in internet-based computing environments.
You can measure the performance goals of parallel processing in two ways:
Scale-up is the factor that expresses how much more work can be done in the same time period by a larger system. With added hardware, a formula for scaleup holds the time constant, and measures the increased size of the job which can be done.
If transaction volumes grow and you have good scale-up, you can keep response time constant by adding hardware resources such as CPUs.
You can measure scaleup using this formula:
is the transaction volume processed in a given amount of time on a parallel system.
For example, if the original system processes 100 transactions in a given amount of time and the parallel system processes 200 transactions in this amount of time, then the value of scaleup would be equal to 2. A value of 2 indicates the ideal of linear scaleup: when twice as much hardware can process twice the data volume in the same amount of time.
Speed-up is the extent to which more hardware can perform the same task in less time than the original system. With added hardware, speed-up holds the task constant and measures time savings. Figure 8-2 shows how each parallel hardware system performs half of the original task in half the time required to perform it on a single system.
However, you may not experience direct, linear speed-up. Instead, speed-up may be more logarithmic. That is, assume the system can perform a task of size "x" in a time duration of "n". But for a task of size 2x, the system may require a time duration of 3n.
This section describes applications that commonly benefit from a parallel server.
Data warehousing applications that infrequently update, insert, or delete data are often appropriate for Oracle Parallel Server. Query-intensive applications and other applications with low update activity can access the database through different instances with little additional overhead.
If the data blocks are not modified, multiple nodes can read the same block into their buffer caches and perform queries on the block without additional I/O or lock operations.
Decision support applications are good candidates for Oracle Parallel Server because they only occasionally modify data, as in a database of financial transactions that is mostly accessed by queries during the day and is updated during off-peak hours.
Applications that either update different data blocks or update the same data blocks at different times are also well suited to the parallel server. An example is a time-sharing environment where users each own and use one set of tables.
An instance that needs to update blocks held in its buffer cache must hold one or more instance locks in exclusive mode while modifying those buffers. Tune parallel server and the applications that run on it to reduce this type of contention for instance locks. Do this by planning how each instance and application uses data and partition your tables accordingly.
Online transaction processing applications that modify different sets of data benefit the most from parallel server. One example is a branch banking system where each branch (node) accesses its own accounts and only occasionally accesses accounts from other branches.
Applications that access a database in a mostly random pattern also benefit from parallel server. This is true only if the database is significantly larger than any node's buffer cache. One example is a motor vehicle department's system where individual records are unlikely to be accessed by different nodes at the same time. Another example is an archived tax record or research data system. In cases like these, most access results in I/O even if the instance had exclusive access to the database. Oracle features such as 1:1 locking further improve performance of such applications.
Departmentalized applications are applications that you have effectively partitioned based on business or departmental processes. Such applications are also suitable for Oracle Parallel Server because they primarily modify different tables in the same database. An example is a system where one node is dedicated to inventory processing, another is dedicated to personnel processing, and a third is dedicated to sales processing. In this case there is only one database to administer, not three.
Online transaction processing (OLTP) applications tend to perform best on symmetric multiprocessors; they perform well on clusters and MPP systems if they can be well partitioned. Decision support (DSS) applications tend to perform well on SMPs, clusters, and massively parallel systems. Select the implementation providing the power you need for the application(s) you require.
Figure 8-3 illustrates the relative scalability of different application types. Data warehousing applications, depicted by the left-most bubble, typically scale well since updates are less common and the degree of partitioning is higher than other application types. OLTP and departmentalized applications with partitioning and increasing rates of change also scale well.
OLTP applications that make random changes to large databases were historically not considered good parallel server candidates. Such applications, however, are becoming more scalable because they use advanced inter-node communication channels such as an interconnect. This is particularly true if, for example, a table is modified on one instance and then another instance reads the table. Such configurations are now much more scalable than in previous releases.
You can execute operations in parallel using multi-node parallel execution. This can significantly reduce the duration of complex queries and DML statements.
In Oracle Parallel Server, client-to-server connections are established using several subcomponents. A client submits a connection request to a listener process that resides on the destination node in the cluster. A "listener" process is a process that manages in-coming connection requests. The listener grants a client-to-server connection by way of a particular node based on several factors depending on how you configure the connection options.
The Multi-Threaded Server allows enhanced scalability in that for a given amount of memory it allows you to support a larger number of users than when using dedicated servers. The incremental costs of memory as you add a user is less than when using a dedicated server. In addition to improving the scalability of the number of users, Multi-Threaded Server offers additional benefits.
An increasing number of Oracle8i features require the Multi-Threaded Server. You need the Multi-Threaded Server if you use certain database features such as:
An important feature of the Multi-Threaded Server is "Service Registration". This feature provides the listener with the service names, instance names and network addresses of the database, as well as the current state and load of all instances and MTS dispatchers. With this information, the listener can forward client connection requests to the appropriate dispatchers and dedicated servers.
Because this information is registered with the listener, you do not need to configure the
listener.ora file with this static information about a database service. Service registration also extends benefits to connection load balancing which is a feature of the Multi-Threaded Server.
The connection load balancing feature provides exceptional benefits in Oracle Parallel Server environments where there are multiple instances and dispatchers. Connection load balancing improves performance by balancing the number of active connections among various instances and MTS dispatchers for the same service.
Because of service registration's ability to register with remote listeners, a listener is always aware of all instances and dispatchers regardless of their location. This way, a listener can send an incoming client request for a specific service to the least loaded instance and least loaded dispatcher.
Service Registration also facilitates connect-time failover and client load balancing which you can use with or without MTS.
Service registration enables listeners to know whether an instance is up at all times. This allows you to use connect-time failover when multiple listeners support a service. Do this by configuring a client to failover client requests to a different listener if the first listener fails. The reconnection attempts continue until the client successfully connects to a listener. If an instance is down, a listener returns a network error.
When more than one listener supports a service, a client can randomly send connection requests to the various listeners. The random nature of the connection requests distributes the load to avoid overburdening a single listener.
In addition to load balancing, clients can also specify that their connection request automatically fail over to a different listener if a connection cannot be made with the listener chosen at random. A connection could fail either because the specified listener is not up or because the instance is not up and the listener therefore cannot accept the connection.
For configuration information on the features described in this section, please refer to the Net8 Administrator's Guide.
Successful implementation of parallel processing and parallel database requires optimal scalability on four levels:
Interconnects are key to hardware scalability. That is, every system must have some means of connecting the CPUs, whether this is a high speed bus or a low speed Ethernet connection. Bandwidth and latency of the interconnect then determine the scalability of the hardware.
Most interconnects have sufficient bandwidth. A high bandwidth may, in fact, disguise high latency.
Hardware scalability depends heavily on very low latency. Lock coordination traffic communication is characterized by a large number of very small messages among the LMD processes.
Consider the example of a highway and the difference between conveying a hundred passengers on a single bus, compared to one hundred individual cars. In the latter case, efficiency depends largely upon the ability of cars to quickly enter and exit the highway. Even if the highway has 5 lanes so multiple cars can pass, if there is only a one-lane entrance ramp, there can be a bottleneck getting onto the "fast" highway.
Other operations between nodes, such as parallel execution, rely on high bandwidth.
Local I/Os are faster than remote I/Os (those which occur between nodes). If a great deal of remote I/O is needed, the system loses scalability. In this case you can partition data so that the data is local.
The ultimate scalability of your system also depends upon the scalability of the operating system. This section explains how to analyze this factor.
Software scalability can be an important issue if one node is a shared memory system (that is, a system where multiple CPUs connect to a symmetric multiprocessor single memory). Methods of synchronization in the operating system can determine the scalability of the system. In asymmetrical multiprocessing, for example, only a single CPU can handle I/O interrupts. Consider a system where multiple user processes request resources from the operating system:
An important distinction in parallel server architectures is internal versus external parallelism; this has a strong effect on scalability. The key difference is whether the object-relational database management system (ORDBMS) parallelizes the query, or an external process parallelizes the query.
Disk affinity can improve performance by ensuring that nodes mainly access local, rather than remote, devices. An efficient synchronization mechanism enables better speed-up and scaleup.
Application design is key to taking advantage of the scalability of the other elements of the system.
No matter how scalable the hardware, software, and database may be, a table with only one row which every node is updating will synchronize on one data block. Consider the process of generating a unique sequence number:
Every node needing to update this sequence number must wait to access the same row of this table: the situation is inherently unscalable. A better approach is to use sequences to improve scalability:
In this example, you can preallocate and cache sequence numbers to improve scalability. However you may not be able to scale some applications due to business rules. In such cases, you must determine the cost of the rule.
Oracle8i Parallel Server Administration, Deployment, and Performance for information on designing databases and application analysis.
Oracle Parallel Server allows users on multiple instances to generate unique sequence numbers with minimal synchronization.
The sequence number generator allows multiple instances to access and increment a sequence without contention among instances for sequence numbers and without waiting for transactions to commit. Each instance can have its own sequence cache for faster access to sequence numbers. Distributed Lock Manager locks coordinate sequences across instances in Oracle Parallel Server.
This section describes the CREATE SEQUENCE statement and its options.
The SQL statement CREATE SEQUENCE establishes a database object from which multiple users can generate unique integers without waiting for other users to commit transactions to access the same sequence number generator.
Oracle Parallel Server allows users on multiple instances to generate unique sequence numbers with minimal cooperation or contention among instances. Instance locks coordinate sequences across instances in Oracle Parallel Server.
Sequence numbers are always unique, unless you use the CYCLE option. However, you can assign sequence numbers out of order if you use the CACHE option without the ORDER option, as described in the following section.
Oracle8i SQL Reference for more information about the CREATE SEQUENCE and CYCLE options.
The CACHE option of CREATE SEQUENCE pre-allocates sequence numbers and retains them in an instance's System Global Area for faster access. You can specify the number of sequence numbers cached as an argument to the CACHE option. The default value is 20.
Caching sequence numbers significantly improves performance but can cause the loss of some numbers in the sequence. Losing sequence numbers is unimportant in some applications, such as when sequences are used to generate unique numbers for primary keys.
A cache for a given sequence is populated at the first request for a number from that sequence. After the last number in that cached set of numbers is assigned, the cache is repopulated with another set of numbers.
Each instance keeps its own cache of sequence numbers in memory. When an instance shuts down, cached sequence values that have not been used in committed DML statements can be lost. The potential number of lost values can be as great as the value of the CACHE option multiplied by the number of instances shutting down. Cached sequence numbers can be lost even when an instance shuts down normally.
The ORDER option of CREATE SEQUENCE guarantees that sequence numbers are generated in the order of the requests. You can use the ORDER option for time-
stamp numbers and other sequences that must indicate the request order across multiple processes and instances.
If you do not need Oracle to issue sequence numbers in order, the NOORDER option of CREATE SEQUENCE can significantly reduce overhead in an Oracle Parallel Server environment.
Oracle Parallel Server does not support the CACHE option with the ORDER option of CREATE SEQUENCE when the database is mounted in parallel mode. Oracle cannot guarantee an order if each instance has some sequence values cached. Therefore, if you should create sequences with both the CACHE and ORDER options, they will be ordered but not cached.
Oracle Parallel Server provides the framework for parallel execution to operate between nodes. Parallel execution behaves the same way with or without the Oracle Parallel Server Option. The only difference is that Oracle Parallel Server enables parallel execution to distribute portions of statements among nodes so that the nodes execute on behalf of a single query. The server sub-divides the statement into smaller operations that run against a common database residing on shared disks. Because parallel execution is performed by the server, this parallelism can occur at a low level of server operation, rather than at an external SQL level.
If Oracle does not process a statement in parallel, Oracle reads disks serially with one I/O. In this case, a single CPU scans all rows in a table. With the statement parallelized, disks are read in parallel with multiple I/Os.
Several CPUs can each scan a part of the table in parallel, and aggregate the results. Parallel execution benefits not only from multiple CPUs but also from greater I/O bandwidth availability.
Oracle parallel execution can run with or without the Oracle Parallel Server. Without the Oracle Parallel Server option, parallel execution cannot perform multi-node parallelism. Oracle Parallel Server optimizes the Oracle8i Enterprise Edition running on clustered hardware using a parallel cache architecture to avoid shared memory bottlenecks in OLTP and DSS applications.