Oracle9i Database Utilities
Release 1 (9.0.1)

Part Number A90192-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to next page

11
External Tables Concepts

The Oracle9i external tables feature is a complement to existing SQL*Loader functionality. It allows you to access data in external sources as if it were in a table in the database.

External tables are read-only. No data manipulation language (DML) operations or index creation is allowed on an external table. Therefore, SQL*Loader may be the better choice in data loading situations that require additional indexing of the staging table.

To use the external tables feature, you must have some knowledge of the file format and record format of the datafiles on your platform. You must also know enough about SQL to be able to create an external table and execute DML statements that access the external table.

This chapter discusses the following topics:

The Access Driver

An external table describes how the external table layer must present the data to the server. This is accomplished using the external table API, which is implemented through an access driver. The access driver and the external table layer transform the data in the datafile to match the external table definition.

When you create an external table of a particular type, you provide access parameters that describe the external data source. See Chapter 12 for descriptions of these access parameters.

The description of the data in the data source is separate from the definition of the external table. This means that:

The access driver ensures that data from the data source is processed so that it matches the definition of the external table.

In the following example, a traditional table named emp is defined along with an external table named emp_load.

CREATE TABLE emp (emp_no CHAR(6), last_name CHAR(25), first_name CHAR(20), middle_initial 
CHAR(1));

CREATE TABLE emp_load (employee_number CHAR(5), employee_last_name CHAR(20),
                       employee_first_name CHAR(15), employee_middle_name CHAR(15))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
                       ACCESS PARAMETERS (FIXED 62 FIELDS (employee_number INTEGER(2), 
                                                           employee_dob CHAR(20),
                                                           employee_last_name CHAR(18),
                                                           employee_first_name CHAR(11),
                                                           employee_middle_name CHAR(11))
                       LOCATION ('foo.dat'));

INSERT INTO emp (emp_no, first_name, middle_initial, last_name) AS 
   SELECT employee_number, employee_first_name, substr(employee_middle_initial, 1, 1),    
          employee_last_name
   FROM emp_load;

Note the following in the preceding example:

External Table Restrictions

Location of Datafiles and Output Files

The access driver runs inside of the database server. This is different from SQL*Loader in that SQL*Loader is a client program that sends the data to be loaded over to the server. This difference has the following implications:

The access driver does not allow you to specify random names for a file. This is because the server may have access to files that you do not, and allowing you to read this data would affect security. Similarly, you cannot specify a location for an output file, because the server could overwrite a file that you might not normally have privileges to delete.

Instead, you are required to specify directory objects as the locations from which to read files and write files. A directory object maps a name to a directory name on the file system. Once a directory object is created by the system user, then the system user can grant READ or WRITE permission on the object. For example, the following statement creates a directory object load_src.

create directory load_src as '/usr/apps/datafiles';

After a directory is created, the user creating the directory object needs to grant READ or WRITE permission on the directory to other users. For example, to allow the server to read files on behalf of user scott in the directory named by load_src, the user who created the directory object must execute the following command:

GRANT READ ON DIRECTORY load_src TO scott;

The name of the directory object can appear in the following places in a CREATE TABLE...ORGANIZATION EXTERNAL statement:

The SYSTEM user is the only user that can create directory objects, but the SYSTEM user can grant other users the privilege to create directory objects. Note that READ or WRITE permission to a directory object only means that the Oracle database server will read or write that file on your behalf. You are not given direct access to those files outside of the Oracle database server unless you have the operating system privileges to do so. Similarly, the Oracle database server requires permission from the operating system to read and write files in the directories.

Using External Tables to Load Data

The main use for external tables is as a row source for loading data into a real table in the database. After you create an external table, you can issue a CREATE TABLE AS SELECT or INSERT INTO... AS SELECT statement using the external table as the source of the SELECT clause. Remember that external tables are read-only, so you cannot insert into them or update records in them.

When the external table is accessed through a SQL statement, the fields of the external table can be used just like any other field in a normal table. In particular, the fields can be used as arguments for any SQL built-in function, PL/SQL function, or Java function. This allows you to manipulate the data from the external source.

Although external tables cannot contain a column object, you can use constructor functions to build a column object from attributes in the external table. For example, assume a table in the database is defined as follows:

CREATE TYPE student_type AS object (
  student_no CHAR(5),
  name       CHAR(20);

CREATE TABLE roster (
  student    student_type,
  grade      CHAR(2));

Also assume there is an external table defined as follows:

CREATE TABLE roster_data (
  student_no CHAR(5),
  name       CHAR(20),
  grade      CHAR(2))
  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
                         ACCESS PARAMETERS (FIELDS TERMINATED BY  ,)
                         LOCATION ('foo.dat');

To load table roster from roster_data, you would specify something similar to the following:

INSERT INTO roster (student, grade) 
  AS SELECT (student_type(student_no, name), grade) FROM roster_data;

Parallel Access to External Tables

The external table feature supports parallel processing on the datafiles. You control the degree of parallelism using the standard parallel hints and with the PARALLEL clause when creating the external table. The degree of parallelism indicates the number of access drivers that can be started to process the datafiles. The access driver attempts to divide large datafiles into chunks that can be processed separately. The chunks are not too small (so that the overhead of managing the chunks is small) and not too big (so that all of the parallel access drivers have a chance to keep equally busy).

The following file, record, and data characteristics make it impossible for a file to be processed in parallel:

Performance Hints When Using External Tables

When you monitor performance, the most important measurement is the elapsed time for a load. Other important measurements are CPU usage, memory usage, and I/O rates.

You can alter performance by increasing or decreasing the degree of parallelism. The degree of parallelism allows you to choose on a scale between slower load with little resource usage and faster load with all resources utilized. The access driver cannot automatically tune itself, because it cannot determine how many resources you want to dedicate to the access driver.

In addition to changing the degree of parallelism to improve performance, consider the following information:

The access driver takes advantage of multithreading to streamline the work as much as possible.


Go to previous page Go to next page
Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback