Managing External Tables

Oracle Database allows you read-only access to data in external tables. External tables are defined as tables that do not reside in the database, and can be in any format for which an access driver is provided. By providing the database with metadata describing an external table, the database is able to expose the data in the external table as if it were data residing in a regular database table. The external data can be queried directly and in parallel using SQL.

You can, for example, select, join, or sort external table data. You can also create views and synonyms for external tables. However, no DML operations (UPDATE, INSERT, or DELETE) are possible, and no indexes can be created, on external tables.

External tables also provide a framework to unload the result of an arbitrary SELECT statement into a platform-independent Oracle-proprietary format that can be used by Oracle Data Pump.

Note:

The DBMS_STATS package can be used for gathering statistics for external tables. The ANALYZE statement is not supported for gathering statistics for external tables.

For information about using the DBMS_STATS package, see Oracle Database Performance Tuning Guide

The means of defining the metadata for external tables is through the CREATE TABLE...ORGANIZATION EXTERNAL statement. This external table definition can be thought of as a view that allows running any SQL query against external data without requiring that the external data first be loaded into the database. An access driver is the actual mechanism used to read the external data in the table. When you use external tables to unload data, the metadata is automatically created based on the datatypes in the SELECT statement (sometimes referred to as the shape of the query).

Oracle Database provides two access drivers for external tables. The default access driver is ORACLE_LOADER, which allows the reading of data from external files using the Oracle loader technology. The ORACLE_LOADER access driver provides data mapping capabilities which are a subset of the control file syntax of SQL*Loader utility. The second access driver, ORACLE_DATAPUMP, lets you unload data--that is, read data from the database and insert it into an external table, represented by one or more external files--and then reload it into an Oracle Database.

The Oracle Database external tables feature provides a valuable means for performing basic extraction, transformation, and loading (ETL) tasks that are common for data warehousing.

These following sections discuss the DDL statements that are supported for external tables. Only DDL statements discussed are supported, and not all clauses of these statements are supported.

Creating External Tables

You create external tables using the ORGANIZATION EXTERNAL clause of the CREATE TABLE statement. You are not in fact creating a table; that is, an external table does not have any extents associated with it. Rather, you are creating metadata in the data dictionary that enables you to access external data.

Note:

External tables cannot have virtual columns.

The following example creates an external table and then uploads the data to a database table. Alternatively, you can unload data through the external table framework by specifying the AS subquery clause of the CREATE TABLE statement. External table data pump unload can use only the ORACLE_DATAPUMP access driver.

EXAMPLE: Creating an External Table and Loading Data

The file empxt1.dat contains the following sample data:

360,Jane,Janus,ST_CLERK,121,17-MAY-2001,3000,0,50,jjanus
361,Mark,Jasper,SA_REP,145,17-MAY-2001,8000,.1,80,mjasper
362,Brenda,Starr,AD_ASST,200,17-MAY-2001,5500,0,10,bstarr
363,Alex,Alda,AC_MGR,145,17-MAY-2001,9000,.15,80,aalda

The file empxt2.dat contains the following sample data:

401,Jesse,Cromwell,HR_REP,203,17-MAY-2001,7000,0,40,jcromwel
402,Abby,Applegate,IT_PROG,103,17-MAY-2001,9000,.2,60,aapplega
403,Carol,Cousins,AD_VP,100,17-MAY-2001,27000,.3,90,ccousins
404,John,Richardson,AC_ACCOUNT,205,17-MAY-2001,5000,0,110,jrichard

The following hypothetical SQL statements create an external table in the hr schema named admin_ext_employees and load its data into the hr.employees table.

CONNECT  /  AS SYSDBA;
-- Set up directories and grant access to hr 
CREATE OR REPLACE DIRECTORY admin_dat_dir
    AS '/flatfiles/data'; 
CREATE OR REPLACE DIRECTORY admin_log_dir 
    AS '/flatfiles/log'; 
CREATE OR REPLACE DIRECTORY admin_bad_dir 
    AS '/flatfiles/bad'; 
GRANT READ ON DIRECTORY admin_dat_dir TO hr; 
GRANT WRITE ON DIRECTORY admin_log_dir TO hr; 
GRANT WRITE ON DIRECTORY admin_bad_dir TO hr;
-- hr connects. Provide the user password (hr) when prompted.
CONNECT hr
-- create the external table
CREATE TABLE admin_ext_employees
                   (employee_id       NUMBER(4), 
                    first_name        VARCHAR2(20),
                    last_name         VARCHAR2(25), 
                    job_id            VARCHAR2(10),
                    manager_id        NUMBER(4),
                    hire_date         DATE,
                    salary            NUMBER(8,2),
                    commission_pct    NUMBER(2,2),
                    department_id     NUMBER(4),
                    email             VARCHAR2(25) 
                   ) 
     ORGANIZATION EXTERNAL 
     ( 
       TYPE ORACLE_LOADER 
       DEFAULT DIRECTORY admin_dat_dir 
       ACCESS PARAMETERS 
       ( 
         records delimited by newline 
         badfile admin_bad_dir:'empxt%a_%p.bad' 
         logfile admin_log_dir:'empxt%a_%p.log' 
         fields terminated by ',' 
         missing field values are null 
         ( employee_id, first_name, last_name, job_id, manager_id, 
           hire_date char date_format date mask "dd-mon-yyyy", 
           salary, commission_pct, department_id, email 
         ) 
       ) 
       LOCATION ('empxt1.dat', 'empxt2.dat') 
     ) 
     PARALLEL 
     REJECT LIMIT UNLIMITED; 
-- enable parallel for loading (good if lots of data to load)
ALTER SESSION ENABLE PARALLEL DML;
-- load the data in hr employees table
INSERT INTO employees (employee_id, first_name, last_name, job_id, manager_id,
                       hire_date, salary, commission_pct, department_id, email) 
            SELECT * FROM admin_ext_employees;

The following paragraphs contain descriptive information about this example.

The first few statements in this example create the directory objects for the operating system directories that contain the data sources, and for the bad record and log files specified in the access parameters. You must also grant READ or WRITE directory object privileges, as appropriate.

Note:

When creating a directory object or BFILEs, ensure that the following conditions are met:
  • The operating system file must not be a symbolic or hard link.

  • The operating system directory path named in the Oracle Database directory object must be an existing OS directory path.

  • The operating system directory path named in the directory object should not contain any symbolic links in its components.

The TYPE specification indicates the access driver of the external table. The access driver is the API that interprets the external data for the database. Oracle Database provides two access drivers: ORACLE_LOADER and ORACLE_DATAPUMP. If you omit the TYPE specification, ORACLE_LOADER is the default access driver. You must specify the ORACLE_DATAPUMP access driver if you specify the AS subquery clause to unload data from one Oracle Database and reload it into the same or a different Oracle Database.

The access parameters, specified in the ACCESS PARAMETERS clause, are opaque to the database. These access parameters are defined by the access driver, and are provided to the access driver by the database when the external table is accessed. See Oracle Database Utilities for a description of the ORACLE_LOADER access parameters.

The PARALLEL clause enables parallel query on the data sources. The granule of parallelism is by default a data source, but parallel access within a data source is implemented whenever possible. For example, if PARALLEL=3 were specified, then more than one parallel execution server could be working on a data source. But, parallel access within a data source is provided by the access driver only if all of the following conditions are met:

  • The media allows random positioning within a data source

  • It is possible to find a record boundary from a random position

  • The data files are large enough to make it worthwhile to break up into multiple chunks

    Note:

    Specifying a PARALLEL clause is of value only when dealing with large amounts of data. Otherwise, it is not advisable to specify a PARALLEL clause, and doing so can be detrimental.

The REJECT LIMIT clause specifies that there is no limit on the number of errors that can occur during a query of the external data. For parallel access, this limit applies to each parallel execution server independently. For example, if REJECT LIMIT is specified, each parallel query process is allowed 10 rejections. Hence, the only precisely enforced values for REJECT LIMIT on parallel query are 0 and UNLIMITED.

In this example, the INSERT INTO TABLE statement generates a dataflow from the external data source to the Oracle Database SQL engine where data is processed. As data is parsed by the access driver from the external table sources and provided to the external table interface, the external data is converted from its external representation to its Oracle Database internal datatype.

See Also:

Oracle Database SQL Language Reference provides details of the syntax of the CREATE TABLE statement for creating external tables and specifies restrictions on the use of clauses

Altering External Tables

You can use any of the ALTER TABLE clauses shown in Table 18-3 to change the characteristics of an external table. No other clauses are permitted.

Table 18-3 ALTER TABLE Clauses for External Tables

ALTER TABLE Clause Description Example

REJECT LIMIT

Changes the reject limit

ALTER TABLE admin_ext_employees
   REJECT LIMIT 100;

PROJECT COLUMN

Determines how the access driver validates rows in subsequent queries:

  • PROJECT COLUMN REFERENCED: the access driver processes only the columns in the select list of the query. This setting may not provide a consistent set of rows when querying a different column list from the same external table. This is the default.

  • PROJECT COLUMN ALL: the access driver processes all of the columns defined on the external table. This setting always provides a consistent set of rows when querying an external table.

ALTER TABLE admin_ext_employees
   PROJECT COLUMN REFERNCED;

ALTER TABLE admin_ext_employees
   PROJECT COLUMN ALL;

DEFAULT DIRECTORY

Changes the default directory specification

ALTER TABLE admin_ext_employees 
    DEFAULT DIRECTORY admin_dat2_dir;

ACCESS PARAMETERS

Allows access parameters to be changed without dropping and re-creating the external table metadata

ALTER TABLE admin_ext_employees
    ACCESS PARAMETERS
       (FIELDS TERMINATED BY ';');

LOCATION

Allows data sources to be changed without dropping and re-creating the external table metadata

ALTER TABLE admin_ext_employees
   LOCATION ('empxt3.txt',
             'empxt4.txt');

PARALLEL

No difference from regular tables. Allows degree of parallelism to be changed.

No new syntax

ADD COLUMN

No difference from regular tables. Allows a column to be added to an external table. Virtual columns are not permitted.

No new syntax

MODIFY COLUMN

No difference from regular tables. Allows an external table column to be modified. Virtual columns are not permitted.

No new syntax

SET UNUSED

Transparently converted into an ALTER TABLE DROP COLUMN command. Because external tables consist of metadata only in the database, the DROP COLUMN command performs equivalently to the SET UNUSED command.

No new syntax

DROP COLUMN

No difference from regular tables. Allows an external table column to be dropped.

No new syntax

RENAME TO

No difference from regular tables. Allows external table to be renamed.

No new syntax


Dropping External Tables

For an external table, the DROP TABLE statement removes only the table metadata in the database. It has no affect on the actual data, which resides outside of the database.

System and Object Privileges for External Tables

System and object privileges for external tables are a subset of those for regular table. Only the following system privileges are applicable to external tables:

  • CREATE ANY TABLE

  • ALTER ANY TABLE

  • DROP ANY TABLE

  • SELECT ANY TABLE

Only the following object privileges are applicable to external tables:

  • ALTER

  • SELECT

However, object privileges associated with a directory are:

  • READ

  • WRITE

For external tables, READ privileges are required on directory objects that contain data sources, while WRITE privileges are required for directory objects containing bad, log, or discard files.