30
Distributed Databases

Good sense is of all things in the world the most equally distributed, for everybody thinks he is so well supplied with it, that even the most difficult to please in all other matters never desire more of it than they already possess.

René Descartes: Le Discours de la Methode

This chapter describes the basic concepts and terminology of Oracle's distributed database architecture. The chapter includes:

Oracle's Distributed Database Architecture

A distributed database is a set of databases stored on multiple computers that typically appears to applications as a single database. Consequently, an application can simultaneously access and modify the data in several databases in a network. Each Oracle database in the system is controlled by its local Oracle server but cooperates to maintain the consistency of the global distributed database.

Figure 30-1 illustrates a representative Oracle distributed database system.

Clients and Servers

A database server is the Oracle software managing a database, and a client is an application that requests information from a server. Each computer in a system is a node. A node in a distributed database system act as a client, a server, or both, depending on the situation. For example, in Figure 30-1, the computer that manages the HQ database is acting as a database server when a statement is issued against its local data (for example, the second statement in each transaction issues a query against the local DEPT table), and is acting as a client when it issues a statement against remote data (for example, the first statement in each transaction is issued against the remote table EMP in the SALES database).

Direct and Indirect Connections

A client can connect directly or indirectly to a database server. In Figure 30-1, when the client application issues the first and third statements for each transaction, the client is connected directly to the intermediate HQ database and indirectly to the SALES database that contains the remote data.

Figure 30-1 An Oracle Distributed Database System

The Network

To link the individual databases of a distributed database system, a network is necessary. The following sections explain more about network issues in an Oracle distributed database system.

Net8

All Oracle databases in a distributed database system use Oracle's networking software, Net8, to facilitate inter-database communication across a network. Just as Net8 connects clients and servers that operate on different computers of a network, it also allows database servers to communicate across networks to support remote and distributed transactions in a distributed database.

Net8 makes transparent the connectivity that is necessary to transmit SQL requests and receive data for applications that use the system. Net8 takes SQL statements from a client and packages them for transmission to an Oracle server over a supported industry-standard communication protocol or programmatic interfaces. Net8 also takes replies from a server and packages them for transmission back to the appropriate client. Net8 performs all processing independent of an underlying network operating system.

Additional Information:

See the Oracle Net8 Administrator's Guide for more information about Net8 and its features.

Oracle Names

Optionally, an Oracle network can use Oracle Names to provide the system with a global directory service. When an Oracle network supports a distributed database system, you can use Oracle Names servers as a central repositories of information about each database in the system to ease the configuration of distributed database access.

Databases and Database Links

Each database in a distributed database is distinct from all other databases in the system and has its own global database name. Oracle forms a database's global database name by prefixing the database's network domain with the individual database's name.

For example, Figure 30-2 illustrates a representative hierarchical arrangement of databases throughout a network.

Figure 30-2 Network Directories and Global Database Names

While several databases can have the same individual name, each database must have a unique global database name.

For example, the network domains US.AMERICAS.ACME_AUTO.COM and UK.EUROPE.ACME_AUTO.COM in Figure 30-2 each contain a SALES database:

SALES.US.AMERICAS.ACME_AUTO.COM 

SALES.UK.EUROPE.ACME_AUTO.COM

Database Links

To facilitate application requests in a distributed database system, Oracle uses database links. A database link defines a one-way communication path from an Oracle database to another database.

Database links are essentially transparent to the users of an Oracle distributed database system, because the name of a database link is the same as the global name of the database to which the link points. For example, the following SQL statement creates a database link in the local database that describes a path to the remote SALES.US.AMERICAS.ACME_AUTO.COM database.

CREATE DATABASE LINK sales.us.americas.acme_auto.com ... ;

After creating a database link, applications connected to the local database can access data in the remote SALES.US.AMERICAS.ACME_AUTO.COM database. The next section explains how applications can reference remote schema objects in a distributed database and includes examples of how SQL statements use database links.

Additional Information:

Oracle supports several different types of database links. See Oracle8 Distributed Database Systems for more information.

Schema Object Name Resolution

To resolve application references to schema objects (a process called name resolution) Oracle forms object names using a hierarchical approach. For example, within a single database, Oracle guarantees that each schema has a unique name, and that within a schema, each object has a unique name. As a result, a schema object's name is always unique within the database. Furthermore, Oracle can easily resolve application references to a schema object's local name.

In a distributed database, a schema object such as a table is accessible to all applications in the system. Oracle simply extends the hierarchical naming model with global database names to effectively create global object names and resolve references to the schema objects in a distributed database system. For example, a query can reference a remote table by specifying its fully qualified name, including the database in which it resides.

SELECT * FROM scott.emp@sales.us.americas.acme_auto.com;

To complete the request, the local database server implicitly uses a database link that connects to the remote SALES database.

Connecting Between Oracle Server Versions

An Oracle distributed database system can incorporate Oracle databases of different versions. All supported releases of Oracle can participate in a distributed database system. However, the applications that work with the distributed database must understand the functionality that is available at each node in the system. For example, a distributed database application cannot expect an Oracle7 database to understand the object SQL extensions that are available with Oracle8.

Distributed Databases and Distributed Processing

The terms "distributed database" and "distributed processing" are closely related, but have very distinct meanings.

Distributed Database

A distributed database is a set of databases stored on multiple computers that appears to applications as a single database.

Distributed Processing

Distributed processing occurs when an application system distributes its tasks among different computers in a network. For example, a database application typically distributes front-end presentation tasks to client PCs or NCs and allows a back-end database server to manage shared access to a database. Consequently, a distributed database application processing system is more commonly referred to as a "client-server" database application system.

Oracle distributed database systems employ a distributed processing architecture to function. For example, an Oracle server acts as a client when it requests data that another Oracle server manages.

Distributed Databases and Database Replication

The terms "distributed database" and "database replication" are also closely related, yet different. In a pure distributed database, the system manages a single copy of all data and supporting database objects. Distributed database applications typically use distributed transactions to access both local and remote data and modify the global database in real-time.

Note:

This chapter discusses pure distributed databases. See Chapter 31, "Database Replication" for a discussion of replication.

Replication is the process of copying and maintaining database objects in multiple databases that make up a distributed database system. While replication relies on distributed database technology to function, database replication can offer applications benefits that are not possible within a pure distributed database environment. Most commonly, replication is useful to improve the performance and protect the availability of applications because alternate data access options exist. For example, an application might normally access a local database rather than a remote server to minimize network traffic and achieve maximum performance. Furthermore, the application can continue to function if the local server experiences a failure, but other servers with replicated data remain accessible.

Additional Information:

See Oracle8 Replication for more information about Oracle's replication features.

Heterogeneous Distributed Databases

A heterogeneous distributed database is a distributed database in which at least one of the databases is a non-Oracle system. Access to non-Oracle systems from an Oracle server is provided by Oracle Open Gateways. The Oracle server, together with the gateway, can provide full heterogeneity transparency to the application. That is, the application doesn't have to be aware that a non-Oracle system is accessed.

Database links are used between the Oracle server and the non-Oracle system to access the data in the non-Oracle system, or to execute a remote procedure in the non-Oracle system. By integrating the transactional system of the non-Oracle system with the Oracle server, the integrity of the data can be guaranteed.

Transparent SQL Access

The application just issues Oracle SQL statements against the local Oracle server. The SQL statement can refer to data in a remote non-Oracle system, just as if it were a remote Oracle system. The Oracle server, together with the gateway, will perform the necessary translations to access the non-Oracle system in its own SQL dialect.

Procedural Access

Some non-Oracle systems need to be accessed procedurally. The application just issues a PL/SQL remote procedure call, and the Oracle server, together with the gateway, will perform translations to execute procedures or functions in the remote non-Oracle system. For example, the remote procedure could interface with a messaging system and put messages in the non-Oracle messaging system. If the messaging system has transactional support, the Oracle server together with the gateway, could perform operations in the messaging system in a larger Oracle distributed transaction, guaranteeing that either all changes (both in the messaging system and in the Oracle server) are committed or rolled back.

Gateway Features

In summary, features of Oracle Open Gateways include, but are not limited to:

Distributed Transactions. A transaction can span both Oracle and non-Oracle systems, while still guaranteeing, through Oracle's two phase commit mechanism, that changes are either all committed or all rolled back.
Transparent SQL access. Integrate data from non-Oracle systems into the Oracle environment as if the data is stored in one single, local database. SQL statements issued by the application are transparently transformed into SQL statement understood by the non-Oracle system.
Procedural Access. Procedural systems, like messaging and queuing systems, are accessed from an Oracle server using PL/SQL remote procedure calls.
Data Dictionary translations. To make the non-Oracle system appear as another Oracle server, SQL statements containing references to Oracle's data dictionary tables are transformed into SQL statements containing references to a non-Oracle system's data dictionary tables.
Pass-through SQL. Optionally, application programmers can directly access a non-Oracle system from an Oracle application using the non-Oracle system's SQL dialect.
Accessing stored procedures. Stored procedures in SQL-based non-Oracle systems are accessed as if they were PL/SQL remote procedures.
National Language Support. Gateways supports multibyte character sets, and will translate character sets between a non-Oracle system and the Oracle server.
Global query optimization. Cardinality and indexes on tables at the non-Oracle system are taken into account by the Oracle query optimizer and decomposed to produce efficient SQL statements to be executed at the non-Oracle system.

Additional Information:

Not all features listed above might apply to your particular gateway. See your gateway documentation for the supported features.

Version 8 Gateways

Version 8 Gateways are tightly integrated with the Oracle server, by using the Oracle feature Heterogeneous Services. Heterogeneous Services is integrated into the Oracle server, and therefore, administration tasks for heterogeneous access are now generic across the different gateways, and integrated into the Oracle server.

Additional Information:

See Oracle8 Replication for more information about Heterogeneous Services.

Version 4 Gateways

Version 4 gateways are supported against Oracle7 and Oracle8 servers.

Additional Information:

See Oracle Open Gateway Technology, Concepts and Administration Guide Version 4 for gateway information.

Developing Distributed Database Applications

When you build applications on top of a distributed database system, there are several issues to consider. The following sections explain how applications access data in a distributed database.

Remote and Distributed SQL Statements

A remote query is a query that selects information from one or more remote tables, all of which reside at the same remote node. For example:

SELECT * FROM scott.dept@sales.us.americas.acme_auto.com;

A remote update is an update that modifies data in one or more tables, all of which are located at the same remote node. For example:

UPDATE scott.dept@sales.us.americas.acme_auto.com 
  SET loc = 'NEW YORK' 
  WHERE deptno = 10;

Note:

A remote update may include a subquery that retrieves data from one or more remote nodes, but because the update happens at only a single remote node, the statement is classified as a remote update.

A distributed query retrieves information from two or more nodes. For example:

SELECT ename, dname 
  FROM scott.emp e, scott.dept@sales.us.americas.acme_auto.com d 
  WHERE e.deptno = d.deptno;

A distributed update modifies data on two or more nodes. A distributed update is possible using a PL/SQL subprogram unit, such as a procedure or trigger, that includes two or more remote updates that access data on different nodes. For example:

BEGIN 
  UPDATE scott.dept@sales.us.americas.acme_auto.com 
    SET loc = 'NEW YORK' 
    WHERE deptno = 10; 
  UPDATE scott.emp 
    SET deptno = 11 
    WHERE deptno = 10; 
END;

Statements in the program are sent to the remote nodes, and the execution of it succeeds or fails as a unit.

Remote Procedure Calls (RPCs)

Developers can code PL/SQL packages and procedures to support applications that work with a distributed database. Applications can make local procedure calls to perform work at the local database and remote procedure calls (RPCs) to perform work at a remote database. When a program calls a remote procedure, the local server passes all procedure parameters to the remote server in the call. For example:

BEGIN 
  emp_mgmt.del_emp@sales.us.americas.acme_auto.com(1257); 
END;

When developing packages and procedures for distributed database systems, developers must code with an understanding of what program units should do at remote locations, and how to return the results to a calling application.

Remote and Distributed Transactions

A remote transaction contains one or more remote statements, all of which reference the same remote node. For example:

UPDATE scott.dept@sales.us.americas.acme_auto.com 
  SET loc = 'NEW YORK' 
  WHERE deptno = 10; 
UPDATE scott.emp@sales.us.americas.acme_auto.com 
  SET deptno = 11 
  WHERE deptno = 10; 
COMMIT;

A distributed transaction contains one or more statements that, individually or as a group, update data on two or more distinct nodes of a distributed database. For example:

UPDATE scott.dept@sales.us.americas.acme_auto.com 
  SET loc = 'NEW YORK' 
  WHERE deptno = 10; 
UPDATE scott.emp 
  SET deptno = 11 
  WHERE deptno = 10; 
COMMIT;

Note:

If all statements of a transaction reference only a single remote node, the transaction is remote, not distributed.

Two-Phase Commit Mechanism

A DBMS must guarantee that all statements in a transaction, distributed or non-distributed, either commit or rollback as a unit, so that if the transaction is designed properly, the data in the logical database is always consistent. The effects of an ongoing transaction should be invisible to all other transactions at all nodes; this should be true for transactions that include any type of operation, including queries, updates, or remote procedure calls.

The general mechanisms of transaction control in a non-distributed database are discussed in Chapter 15, "Transaction Management". In a distributed database, Oracle must coordinate transaction control with the same characteristics over a network and maintain data consistency, even if a network or system failure occurs.

Oracle's two-phase commit mechanism guarantees that all database servers participating in a distributed transaction either all commit or all roll back the statements in the transaction. A two-phase commit mechanism also protects implicit DML operations performed by integrity constraints, remote procedure calls, and triggers.

Additional Information:

Oracle8 Distributed Database Systems has more information about Oracle's two-phase commit mechanism.

Transparency in a Distributed Database System

With minimal effort, you can make the functionality of an Oracle distributed database system transparent to users that work with the system. The goal of transparency is to make a distributed database system appear as though it is a single Oracle database. Consequently, the system does not burden developers and users of the system with complexities that would otherwise make distributed database application development challenging and detract from user productivity.

The following sections explain more about transparency in a distributed database system.

Location Transparency

An Oracle distributed database system has features that allow application developers and administrators to hide the physical location of database objects from applications and users. Location transparency exists when a user can universally refer to a database object such as a table, regardless of the node to which an application connects. Location transparency has several benefits, including:

Access to remote data is simple, because database users do not need to know the physical location of database objects.
Administrators can move database objects with no impact on end-users or existing database applications.

Most typically, administrators and developers use synonyms to establish location transparency for the tables and supporting objects in an application schema. For example, the following statements create synonyms in a database for tables in another, remote database.

CREATE PUBLIC SYNONYM emp 
  FOR scott.emp@sales.us.americas.acme_auto.com 
CREATE PUBLIC SYNONYM dept 
  FOR scott.dept@sales.us.americas.acme_auto.com

Now, rather than access the remote tables with a query such as:

SELECT ename, dname 
  FROM scott.emp@sales.us.americas.acme_auto.com e, 
       scott.dept@sales.us.americas.acme_auto.com d 
  WHERE e.deptno = d.deptno;

an application can issue a much simpler query that does not have to account for the location of the remote tables.

SELECT ename, dname 
  FROM emp e, dept d 
  WHERE e.deptno = d.deptno;

In addition to synonyms, developers can also use views and stored procedures to establish location transparency for applications that work in a distributed database system.

Statement and Transaction Transparency

Oracle's distributed database architecture also provides query, update, and transaction transparency. For example, standard SQL commands such as SELECT, INSERT, UPDATE, and DELETE work just as they do in a non-distributed database environment. Additionally, applications control transactions using the standard SQL commands COMMIT, SAVEPOINT, and ROLLBACK - there is no requirement for complex programming or other special operations to provide distributed transaction control.

The statements in a single transaction can reference any number of local or remote tables.
Oracle guarantees that all nodes involved in a distributed transaction take the same action: they either all commit or all roll back the transaction.
If a network or system failure occurs during the commit of a distributed transaction, the transaction is automatically and transparently resolved globally; that is, when the network or system is restored, the nodes either all commit or all roll back the transaction.

Internal Operations

Each committed transaction has an associated system change number (SCN) to uniquely identify the changes made by the statements within that transaction. In a distributed database, the SCNs of communicating nodes are coordinated when:

A connection is established using the path described by one or more database links.
A distributed SQL statement is executed.
A distributed transaction is committed.

Among other benefits, the coordination of SCNs among the nodes of a distributed database system allows global distributed read-consistency at both the statement and transaction level. If necessary, global distributed time-based recovery can also be completed.

Replication Transparency

Oracle also provides many features to transparently replicate data among the nodes of the system.

Additional Information:

See Oracle8 Replication for more information about Oracle's replication features.

Administering an Oracle Distributed Database System

Just as there are unique issues to consider when developing applications for an Oracle distributed database system, there are special issues to understand for distributed database administration. The following sections explain the some special topics for managing databases in an Oracle distributed database system.

Site Autonomy

Site autonomy means that each server participating in a distributed database is administered independently from all other databases, as though each database operates as a non-distributed database. Although several databases can work together, each database is a distinct, separate repository of data that you manage individually. Some of the benefits of site autonomy in an Oracle distributed database include:

Nodes of the system can mirror the logical organization of companies or cooperating organizations that need to maintain an "arms length" relationship.
Local database administrators control corresponding local data. Therefore, each database administrator's domain of responsibility is smaller and more manageable.
Independent failures are less likely to disrupt other nodes of the distributed database. The global Oracle database is partially available as long as one database and the network are available; no single database failure need halt all global operations or be a performance bottleneck.
Administrators can recovery from isolated system failures independent of other nodes in the system.
A data dictionary exists for each local database - a global catalog is not necessary to access local data.
Nodes can upgrade software independently.

Although Oracle allows you to manage each database in a distributed database system independently, that is not to say that you should ignore the global requirements of the system. For example, additional user accounts might be necessary in each database are necessary to support the links that you create to facilitate server-to-server connections. The following sections explain more about these particular topics and demonstrate the need for a global perspective of the entire distributed database environment when managing individual nodes in the system.

Distributed Database Security

Oracle supports all of the security features that are available with a non-distributed database environment for distributed database systems, including:

password or external service authentication for users and roles
login packet encryption for client-to-server and server-to-server connections

The following sections explain some additional topics to consider when configuring an Oracle distributed database system.

Supporting User Accounts and Roles

In a distributed database system, you must carefully plan the user accounts and roles that are necessary to support applications using the system.

The user accounts necessary to establish server-to-server connections must be available in all databases of the distributed database system.
The roles necessary to make available application privileges to distributed database application users must be present in all databases of the distributed database system.

As you create the database links for the nodes in a distributed database system, determine what user accounts and roles each site needs to support server-to-server connections that use the links.

Additional Information:

See Oracle8 Distributed Database Systems for more information about the user accounts that must be available to support different types of database links in the system.

Global Users and Roles

In a distributed environment, users typically require access to many network services. When it's necessary to configure separate authentications for each user to access each network service, security administration can become unwieldy, especially for large systems. The use of a global authentication service is a common technique for simplifying security management for distributed environments.

In an Oracle client/server or distributed database environment, you have two options to support global authentication for users and roles:

Oracle Security Server is a product that supports centralized authentication and distributed authentication in an Oracle network. Oracle Security Server is a standard option of Oracle.
When global database user and role authentication must work within the framework of a non-Oracle authentication service (for example, DCE), an Oracle distributed database environment can use Net8's Advanced Networking Option. The Net8 Advanced Networking Option is an optional product that bundles a number of features that you can use to enhance Net8 and the security of an Oracle distributed database system.

Attention:

The Advanced Networking Option is available only if you have purchased Oracle8 Enterprise Edition.

Data Encryption

The Net8 Advanced Networking Option also enables Net8 and related products to use network data encryption and checksumming so that data cannot be read or altered. It protects data from unauthorized viewing by using the RSA Data Security RC4 or the Data Encryption Standard (DES) encryption algorithm. To ensure that data has not been modified, deleted, or replayed during transmission, the security services of the Advanced Networking Option can generate a cryptographically secure message digest and include it with each packet sent across the network.

Additional Information:

See the Oracle Net8 Administrator's Guide for more information about these and other features of the Advanced Networking Option. Also see Getting to Know Oracle8 and the Oracle8 Enterprise Edition for information about the features and options that are available with Oracle8 Enterprise Edition.

Tools for Administering Oracle Distributed Databases

The database administrator has several choices for tools to use when managing an Oracle distributed database system:

Oracle Enterprise Manager
third-party administration tools
SNMP support

Oracle Enterprise Manager

Oracle Enterprise Manager is Oracle's database administration tool. The graphical component of Oracle Enterprise Manager allows you to perform database administration tasks with the convenience of a graphical user interface (GUI). The line mode component of Oracle Enterprise Manager provides a line-mode interface.

Oracle Enterprise Manager provides administrative functionality via an easy-to-use interface. You can use Oracle Enterprise Manager to:

Perform traditional administrative tasks, such as database startup, shutdown, backup, and recovery. Rather than manually entering the SQL commands to perform these tasks, you can use Oracle Enterprise Manager's graphical interface to execute the commands quickly and conveniently by pointing and clicking with the mouse.
Concurrently perform multiple tasks. Because you can open multiple windows simultaneously in Oracle Enterprise Manager, you can perform multiple administrative and non-administrative tasks concurrently.
Administer multiple databases. You can use Oracle Enterprise Manager to administer a single database or to simultaneously administer multiple databases.
Centralize database administration tasks. You can administer both local and remote databases running on any Oracle platform in any location worldwide. In addition, these Oracle platforms can be connected by any network protocol(s) supported by Net8.
Dynamically execute SQL, PL/SQL, and Oracle Enterprise Manager commands. You can use Oracle Enterprise Manager to enter, edit, and execute statements. Oracle Enterprise Manager also maintains a history of statements executed. Thus, you can re-execute statements without retyping them, a particularly useful feature if you need to execute lengthy statements repeatedly in a distributed database system.
Perform administrative tasks using Oracle Enterprise Manager's line-mode interface when a graphical user interface is unavailable or undesirable.

Third-Party Administration Tools

Currently more than 60 companies produce more than 150 products that help manage Oracle databases and networks, providing a truly open environment.

SNMP Support

Besides its network administration capabilities, Oracle Simple Network Management Protocol (SNMP) support allows an Oracle server to be located and queried by any SNMP-based network management system. SNMP is the accepted standard underlying many popular network management systems such as:

HP's OpenView
Digital's POLYCENTER Manager on NetView
IBM's NetView/6000
Novell's NetWare Management System
SunSoft's SunNet Manager

Additional Information:

See the Oracle SNMP Support Reference Guide.

National Language Support

Oracle supports client/server environments where clients and servers use different character sets. The character set used by a client is defined by the value of the NLS_LANG parameter for the client session. The character set used by a server is its database character set. Data conversion is done automatically between these character sets if they are different.

Additional Information:

See Oracle8 Reference for more information about National Language Support features.

30Distributed Databases