| Oracle® Database Utilities 10g Release 1 (10.1) Part Number B10825-01 | 
 | 
| 
 | View PDF | 
This chapter describes the ORACLE_DATAPUMP access driver. The following topics are discussed:
To use the information in this chapter, you must know enough about SQL to be able to create an external table and perform queries against it.
| Notes: 
 | 
When you create the external table, you can specify certain parameters in an access_parameters clause. This clause is optional, as are its individual parameters. For example, you could specify LOGFILE, but not VERSION, or vice versa. The syntax for the access_parameters clause is as follows:

Comments are lines that begin with two hyphens followed by text. Comments must be placed before any access parameters, for example:
--This is a comment. --This is another comment. NOLOG
All text to the right of the double hyphen is ignored, until the end of the line.
LOGFILE specifies the name of the log file that contains any messages generated while the dump file was being accessed. NOLOGFILE prevents the creation of a log file. If a directory object is not specified as part of the log file name, then the directory object specified by the DEFAULT DIRECTORY attribute is used. If a directory object is not specified and no default directory was specified, an error is returned.
If LOGFILE is not specified, a log file is created in the default directory and the name of the log file is generated from the table name and the process ID with an extension of .log. If a log file already exists by the same name, the access driver reopens that log file and appends the new log information to the end. See Filenames for LOGFILE for information about using wildcards to create unique filenames during parallel loads or unloads.
See Example of LOGFILE Usage for ORACLE_DATAPUMP.
The access driver does some symbol substitution to help make filenames unique in the case of parallel loads. The symbol substitutions supported are as follows:
%p is replaced by the process ID of the current process. For example, if the process ID of the access driver is 12345, then exttab_%p.log becomes exttab_12345.log.
%a is replaced by the agent number of the current process. The agent number is the unique number assigned to each parallel process accessing the external table. This number is padded to the left with zeros to fill three characters. For example, if the third parallel agent is creating a file and exttab_%a.log was specified as the filename, then the agent would create a file named exttab_003.log.
%% is replaced by '%'. If there is a need to have a percent sign in the filename, then this symbol substitution must be used.
If the '%' character is followed by anything other than one of the characters in the preceding list, then an error is returned.
If %p or %a is not used to create unique filenames for output files and an external table is being accessed in parallel, output files may be corrupted or agents may be unable to write to the files.
If no extension is supplied for the file, a default extension of .log will be used. If the name generated is not a valid filename, an error is returned and no data is loaded or unloaded.
In the following example, the dump file, dept_dmp, is in the directory identified by the directory object, load_dir, but the log file, deptxt.log, is in the directory identified by the directory object, log_dir.
CREATE TABLE dept_xt (dept_no INT, dept_name CHAR(20), location CHAR(20))
ORGANIZATION EXTERNAL (TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY load_dir 
ACCESS PARAMETERS (LOGFILE log_dir:deptxt) LOCATION ('dept_dmp'));
The VERSION clause is used when data is unloaded from an Oracle database version that is later than the database version where the data will be loaded. In this case, the VERSION clause should be specified for the external table used to unload the data and the value specified should be the version that will be used to read the data.
The ORACLE_DATAPUMP access driver can be used to populate a file with data. The data in the file is written in a binary format that can only be read by the ORACLE_DATAPUMP access driver. Once the file has been populated with data, that file can be used as the dump file for another external table in the same database or in a different database.
The following steps use the sample schema, oe, to show an extended example of how you can use the ORACLE_DATAPUMP access driver to unload and load data. (The example assumes that the directory object def_dir1 already exists, and that user oe has read and write access to it.)
An external table will populate a file with data only as part of creating the external table with the AS SELECT clause. The following example creates an external table named inventories_xt and populates the dump file for the external table with the data from table inventories in the oe schema.
SQL> CREATE TABLE inventories_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('inv_xt.dmp')
  7  )
  8  AS SELECT * FROM inventories;
Table created.
Describe both inventories and the new external table, as follows. They should both match.
SQL> DESCRIBE inventories Name Null? Type ---------------------------------------- --------- ---------------- PRODUCT_ID NOT NULL NUMBER(6) WAREHOUSE_ID NOT NULL NUMBER(3) QUANTITY_ON_HAND NOT NULL NUMBER(8) SQL> DESCRIBE inventories_xt Name Null? Type ----------------------------------------- -------- ----------------- PRODUCT_ID NOT NULL NUMBER(6) WAREHOUSE_ID NOT NULL NUMBER(3) QUANTITY_ON_HAND NOT NULL NUMBER(8)
Now that the external table is created, it can be queried just like any other table. For example, select the count of records in the external table, as follows:
SQL> SELECT COUNT(*) FROM inventories_xt;
  COUNT(*)
----------
      1112
Compare the data in the external table against the data in inventories. There should be no differences.
SQL> SELECT * FROM inventories MINUS SELECT * FROM inventories_xt; no rows selected
After an external table has been created and the dump file populated by the CREATE TABLE AS SELECT statement, no rows may be added, updated, or deleted from the external table. Any attempt to modify the data in the external table will fail with an error.
The following example shows an attempt to use data manipulation language (DML) on an existing external table. This will return an error, as shown.
SQL> DELETE FROM inventories_xt WHERE warehouse_id = 5;
DELETE FROM inventories_xt WHERE warehouse_id = 5
            *
ERROR at line 1:
ORA-30657: operation not supported on external organized table
The dump file created for the external table can now be moved and used as the dump file for another external table in the same database or different database. Note that when you create an external table that uses an existing file, there is no AS SELECT clause for the CREATE TABLE statement.
SQL> CREATE TABLE inventories_xt2
  2  (
  3    product_id          NUMBER(6),
  4    warehouse_id        NUMBER(3),
  5    quantity_on_hand    NUMBER(8)
  6  )
  7  ORGANIZATION EXTERNAL
  8  (
  9    TYPE ORACLE_DATAPUMP
 10    DEFAULT DIRECTORY def_dir1
 11    LOCATION ('inv_xt.dmp')
 12  );
Table created.
Compare the data for the new external table against the data in the inventories table. The product_id field will be converted to a compatible datatype before the comparison is done. There should be no differences.
SQL> SELECT * FROM inventories MINUS SELECT * FROM inventories_xt2; no rows selected
Create an external table with three dump files and with a degree of parallelism of three.
SQL> CREATE TABLE inventories_xt3
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('inv_xt1.dmp', 'inv_xt2.dmp', 'inv_xt3.dmp')
  7  )
  8  PARALLEL 3
  9  AS SELECT * FROM inventories;
Table created.
Compare the data unload against inventories. There should be no differences.
SQL> SELECT * FROM inventories MINUS SELECT * FROM inventories_xt3; no rows selected
Create an external table containing some rows from table inventories.
SQL> CREATE TABLE inv_part_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4  TYPE ORACLE_DATAPUMP
  5  DEFAULT DIRECTORY def_dir1
  6  LOCATION ('inv_p1_xt.dmp')
  7  )
  8  AS SELECT * FROM inventories WHERE warehouse_id < 5;
 
Table created.
Create another external table containing the rest of the rows from inventories.
SQL> drop table inv_part_xt;
 
Table dropped.
 
SQL> 
SQL> CREATE TABLE inv_part_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4  TYPE ORACLE_DATAPUMP
  5  DEFAULT DIRECTORY def_dir1
  6  LOCATION ('inv_p2_xt.dmp')
  7  )
  8  AS SELECT * FROM inventories WHERE warehouse_id >= 5;
 
Table created.
Create an external table that uses the two dump files created in Steps 10 and 11.
SQL> CREATE TABLE inv_part_all_xt
  2  (
  3  product_id NUMBER(6),
  4  warehouse_id NUMBER(3),
  5  quantity_on_hand NUMBER(8)
  6  )
  7  ORGANIZATION EXTERNAL
  8  (
  9  TYPE ORACLE_DATAPUMP
 10  DEFAULT DIRECTORY def_dir1
 11  LOCATION ('inv_p1_xt.dmp','inv_p2_xt.dmp')
 12  );
 
Table created.
Compare the new external table to the inventories table. There should be no differences. This is because the two dump files used to create the external table have the same metadata (for example, the same table name inv_part_xt and the same column information)
SQL> SELECT * FROM inventories MINUS SELECT * FROM inv_part_all_xt; no rows selected
The dump file must be on a disk big enough to hold all the data being written. If there is insufficient space for all of the data, then an error will be returned for the CREATE TABLE AS SELECT statement. One way to alleviate the problem is to create multiple files in multiple directory objects (assuming those directories are on different disks) when executing the CREATE TABLE AS SELECT statement. Multiple files can be created by specifying multiple locations in the form directory:file in the LOCATION clause and by specifying the PARALLEL clause. Each parallel I/O server process that is created to populate the external table writes to its own file. The number of files in the LOCATION clause should match the degree of parallelization because each I/O server process requires its own files. Any extra files that are specified will be ignored. If there are not enough files for the degree of parallelization specified, then the degree of parallelization will be lowered to match the number of files in the LOCATION clause.
Here is an example of unloading the inventories table into three files.
SQL> CREATE TABLE inventories_XT_3
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('inv_xt1.dmp', 'inv_xt2.dmp', 'inv_xt3.dmp')
  7  )
  8  PARALLEL 3
  9  AS SELECT * FROM oe.inventories;
Table created.
The degree of parallelization is not tied to the number of files in the LOCATION clause when reading from ORACLE_DATAPUMP external tables. There is information in the dump files so that multiple parallel I/O server processes can read different portions of the same file. So, even if there is only one dump file, the degree of parallelization can be increased to speed the time required to read the file.
Dump files populated by different external tables can all be specified in the LOCATION clause of another external table. For example, data from different production databases can be unloaded into separate files, and then those files can all be included in an external table defined in a data warehouse. This provides an easy way of aggregating data from multiple sources. The only restriction is that the metadata for all of the external tables be exactly the same. This means that the character set, time zone, schema name, table name, and column names must all match. Also, the columns must be defined in the same order, and their datatypes must be exactly alike. This means that after you create the first external table you must drop it so that you can use the same table name for the second external table. This ensures that the metadata listed in the two dump files is the same and they can be used together to create the same external table.
SQL> CREATE TABLE inv_part_1_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('inv_p1_xt.dmp')
  7  )
  8  AS SELECT * FROM oe.inventories WHERE warehouse_id < 5;
Table created.
SQL> DROP TABLE inv_part_1_xt;
SQL> CREATE TABLE inv_part_1_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT directory def_dir1
  6    LOCATION ('inv_p2_xt.dmp')
  7  )
  8  AS SELECT * FROM oe.inventories WHERE warehouse_id >= 5;
Table created.
SQL> CREATE TABLE inv_part_all_xt
  2  (
  3    PRODUCT_ID          NUMBER(6),
  4    WAREHOUSE_ID        NUMBER(3),
  5    QUANTITY_ON_HAND    NUMBER(8)
  6  )
  7  ORGANIZATION EXTERNAL
  8  (
  9    TYPE ORACLE_DATAPUMP
 10    DEFAULT DIRECTORY def_dir1
 11    LOCATION ('inv_p1_xt.dmp','inv_p2_xt.dmp')
 12  );
Table created.
SQL> SELECT * FROM inv_part_all_xt MINUS SELECT * FROM oe.inventories;
no rows selected
You may encounter the following situations when you use external tables to move data between databases:
The database character set and the database national character set may be different between the two platforms.
The endianness of the platforms for the two databases may be different.
The ORACLE_DATAPUMP access driver automatically resolves some of these situations.
The following datatypes are automatically converted during loads and unloads:
Character (CHAR, NCHAR, VARCHAR2, NVARCHAR2)
RAW
NUMBER
Date/Time
BLOB
CLOB and NCLOB
ROWID and UROWID
If you attempt to use a datatype that is not supported for external tables, you will receive an error. This is demonstrated in the following example, in which the unsupported datatype, LONG, is used:
SQL> CREATE TABLE bad_datatype_xt
  2  (
  3    product_id             NUMBER(6),
  4    language_id            VARCHAR2(3),
  5    translated_name        NVARCHAR2(50),
  6    translated_description LONG
  7  )
  8  ORGANIZATION EXTERNAL
  9  (
 10    TYPE ORACLE_DATAPUMP
 11    DEFAULT DIRECTORY def_dir1
 12    LOCATION ('proddesc.dmp')
 13  );
  translated_description LONG
  *
ERROR at line 6:
ORA-30656: column type not supported on external organized table
An external table supports a subset of all possible datatypes for columns. In particular, it supports character datatypes (except LONG), the RAW datatype, all numeric datatypes, and all date, timestamp, and interval datatypes.
This section describes how you can use the ORACLE_DATAPUMP access driver to unload and reload data for some of the unsupported datatypes, specifically:
BFILE
LONG and LONG RAW
Final object types
Tables of final object types
The BFILE datatype has two pieces of information stored in it: the directory object for the file and the name of the file within that directory object.
You can unload BFILE columns using the ORACLE_DATAPUMP access driver by storing the directory object name and the filename in two columns in the external table. The procedure DBMS_LOB.FILEGETNAME will return both parts of the name. However, because this is a procedure, it cannot be used in a SELECT statement. Instead, two functions are needed. The first will return the name of the directory object, and the second will return the name of the file.
The steps in the following extended example demonstrate the unloading and loading of BFILE datatypes.
Create a function to extract the directory object for a BFILE column. Note that if the column is NULL, then NULL is returned.
SQL> CREATE FUNCTION get_dir_name (bf BFILE) RETURN VARCHAR2 IS 2 DIR_ALIAS VARCHAR2(255); 3 FILE_NAME VARCHAR2(255); 4 BEGIN 5 IF bf is NULL 6 THEN 7 RETURN NULL; 8 ELSE 9 DBMS_LOB.FILEGETNAME (bf, dir_alias, file_name); 10 RETURN dir_alias; 11 END IF; 12 END; 13 / Function created.
Create a function to extract the filename for a BFILE column.
SQL> CREATE FUNCTION get_file_name (bf BFILE) RETURN VARCHAR2 is 2 dir_alias VARCHAR2(255); 3 file_name VARCHAR2(255); 4 BEGIN 5 IF bf is NULL 6 THEN 7 RETURN NULL; 8 ELSE 9 DBMS_LOB.FILEGETNAME (bf, dir_alias, file_name); 10 RETURN file_name; 11 END IF; 12 END; 13 / Function created.
You can then add a row with a NULL value for the BFILE column, as follows:
SQL> INSERT INTO PRINT_MEDIA (product_id, ad_id, ad_graphic) 2 VALUES (3515, 12001, NULL); 1 row created.
You can use the newly created functions to populate an external table. Note that the functions should set columns ad_graphic_dir and ad_graphic_file to NULL if the BFILE column is NULL.
Create an external table to contain the data from the print_media table. Use the get_dir_name and get_file_name functions to get the components of the BFILE column.
SQL> CREATE TABLE print_media_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE oracle_datapump
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('pm_xt.dmp')
  7  ) AS
  8  SELECT product_id, ad_id,
  9         get_dir_name (ad_graphic) ad_graphic_dir,
 10         get_file_name(ad_graphic) ad_graphic_file
 11  FROM print_media;
Table created.
Create a function to load a BFILE column from the data that is in the external table. This function will return NULL if the ad_graphic_dir column in the external table is NULL.
SQL> CREATE FUNCTION get_bfile (dir VARCHAR2, file VARCHAR2) RETURN BFILE is 2 bf BFILE; 3 BEGIN 4 IF dir IS NULL 5 THEN 6 RETURN NULL; 7 ELSE 8 RETURN BFILENAME(dir,file); 9 END IF; 10 END; 11 / Function created.
The get_bfile function can be used to populate a new table containing a BFILE column.
SQL> CREATE TABLE print_media_int AS 2 SELECT product_id, ad_id, 3 get_bfile (ad_graphic_dir, ad_graphic_file) ad_graphic 4 FROM print_media_xt; Table created.
The data in the columns of the newly loaded table should match the data in the columns of the print_media table.
SQL> SELECT product_id, ad_id, 2 get_dir_name(ad_graphic), 3 get_file_name(ad_graphic) 4 FROM print_media_int 5 MINUS 6 SELECT product_id, ad_id, 7 get_dir_name(ad_graphic), 8 get_file_name(ad_graphic) 9 FROM print_media; no rows selected
The ORACLE_DATAPUMP access driver can be used to unload LONG and LONG RAW columns, but that data can only be loaded back into LOB fields. The steps in the following extended example demonstrate the unloading of LONG and LONG RAW datatypes.
If a table to be unloaded contains a LONG or LONG RAW column, then define the corresponding columns in the external table as CLOB for LONG columns or BLOB for LONG RAW columns.
SQL> CREATE TABLE long_tab 2 ( 3 key SMALLINT, 4 description LONG 5 ); Table created. SQL> INSERT INTO long_tab VALUES (1, 'Description Text'); 1 row created.
Now, an external table can be created that contains a CLOB column to contain the data from the LONG column. Note that when loading the external table, the TO_LOB operator is used to convert the LONG column into a CLOB.
SQL> CREATE TABLE long_tab_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('long_tab_xt.dmp')
  7  )
  8  AS SELECT key, TO_LOB(description) description FROM long_tab;
Table created.
The data in the external table can be used to create another table exactly like the one that was unloaded except the new table will contain a LOB column instead of a LONG column.
SQL> CREATE TABLE lob_tab 2 AS SELECT * from long_tab_xt; Table created.
Verify that the table was created correctly.
SQL> SELECT * FROM lob_tab;
       KEY  DESCRIPTION
----------------------------------------------------------------------------
         1  Description Text
Final column objects are populated into an external table by moving each attribute in the object type into a column in the external table. In addition, the external table needs a new column to track whether the column object is atomically null. The following steps demonstrate the unloading and loading of columns containing final object types.
In the following example, the warehouse column in the external table is used to track whether the warehouse column in the source table is atomically NULL.
SQL> CREATE TABLE inventories_obj_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('inv_obj_xt.dmp')
  7  )
  8  AS
  9  SELECT oi.product_id,
 10         DECODE (oi.warehouse, NULL, 0, 1) warehouse,
 11         oi.warehouse.location_id location_id,
 12         oi.warehouse.warehouse_id warehouse_id,
 13         oi.warehouse.warehouse_name warehouse_name,
 14         oi.quantity_on_hand
 15  FROM oc_inventories oi;
Table created.
The columns in the external table containing the attributes of the object type can now be used as arguments to the type constructor function when loading a column of that type. Note that the warehouse column in the external table is used to determine whether to call the constructor function for the object or set the column to NULL.
Load a new internal table that looks exactly like the oc_inventories view. (The use of the WHERE 1=0 clause creates a new table that looks exactly like the old table but does not copy any data from the old table into the new table.)
SQL> CREATE TABLE oc_inventories_2 AS SELECT * FROM oc_inventories WHERE 1 = 0; Table created. SQL> INSERT INTO oc_inventories_2 2 SELECT product_id, 3 DECODE (warehouse, 0, NULL, 4 warehouse_typ(warehouse_id, warehouse_name, 5 location_id)), quantity_on_hand 6 FROM inventories_obj_xt; 1112 rows created.
Object tables have an object identifier that uniquely identifies every row in the table. The following situations can occur:
If there is no need to unload and reload the object identifier, then the external table only needs to contain fields for the attributes of the type for the object table.
If the object identifier (OID) needs to be unloaded and reloaded and the OID for the table is one or more fields in the table, (also known as primary-key-based OIDs), then the external table has one column for every attribute of the type for the table.
If the OID needs to be unloaded and the OID for the table is system-generated, then the procedure is more complicated. In addition to the attributes of the type, another column needs to be created to hold the system-generated OID.
The steps in the following example demonstrate this last situation.
Create a table of a type with system-generated OIDs:
SQL> CREATE TYPE person AS OBJECT (name varchar2(20)) NOT FINAL
  2  /
Type created.
SQL> CREATE TABLE people OF person;
Table created.
SQL> INSERT INTO people VALUES ('Euclid');
1 row created.
Create an external table in which the column OID is used to hold the column containing the system-generated OID.
SQL> CREATE TABLE people_xt
  2  ORGANIZATION EXTERNAL
  3  (
  4    TYPE ORACLE_DATAPUMP
  5    DEFAULT DIRECTORY def_dir1
  6    LOCATION ('people.dmp')
  7  )
  8  AS SELECT SYS_NC_OID$ oid, name FROM people;
Table created.
Create another table of the same type with system-generated OIDs. Then, execute an INSERT statement to load the new table with data unloaded from the old table.
SQL> CREATE TABLE people2 OF person; Table created. SQL> SQL> INSERT INTO people2 (SYS_NC_OID$, SYS_NC_ROWINFO$) 2 SELECT oid, person(name) FROM people_xt; 1 row created. SQL> SQL> SELECT SYS_NC_OID$, name FROM people 2 MINUS 3 SELECT SYS_NC_OID$, name FROM people2; no rows selected
When identifiers (for example, column or table names) are specified in the external table access parameters, certain values are considered to be reserved words by the access parameter parser. If a reserved word is used as an identifier, it must be enclosed in double quotation marks. The following are the reserved words for the ORACLE_DATAPUMP access driver:
BADFILE
COMPATIBLE
DATAPUMP
DEBUG
INTERNAL
JOB
LATEST
LOGFILE
NOBADFILE
NOLOGFILE
PARALLEL
TABLE
VERSION
WORKERID