Query External Partitioned Data

If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, then it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access. Use the procedure DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE to create an external partitioned table.

When you a create a partitioned external table, you include a partitioning clause in the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE statement. The partitioning clause that you include depends upon the data files in the Cloud and the type of partitioning you use. See Creating a Partitioned External Table for more information.

  1. Store your object store credentials using the procedure DBMS_CLOUD.CREATE_CREDENTIAL.

    For example:

    BEGIN
      DBMS_CLOUD.CREATE_CREDENTIAL (
        credential_name => 'DEF_CRED_NAME',
        username => 'adwc_user@example.com',
        password => 'password'
      );
    END;
    /
    

    This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for creating external tables.

    See CREATE_CREDENTIAL Procedure for information about the username and password parameters for different object storage services.

  2. Create an external partitioned table on top of your source files using the procedure  DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE.

    The procedure DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE supports external partitioned files in the supported cloud object storage services, including: Oracle Cloud Infrastructure Object Storage, Microsoft Azure, and AWS S3. The credential is a table level property; therefore, the external files must be on the same object store.

    For example:

    BEGIN
    DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE(
          table_name =>'PET1',  
          credential_name =>'DEF_CRED_NAME',
          format => json_object('delimiter' value ',', 'recorddelimiter' value 'newline', 'characterset' value 'us7ascii'),  
          column_list => 'col1 number, col2 number, col3 number'
          partitioning_clause => 'partition by range (col1)
                                    (partition p1 values less than (1000) location
                                        ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/adwc/b/adwc_user/o/file_11.txt'')
                                     ,
                                     partition p2 values less than (2000) location
                                        ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/adwc/b/adwc_user/o/file_21.txt'')
                                     ,
                                     partition p3 values less than (3000) location 
                                        ( ''https://objectstorage.us-phoenix-1.oraclecloud.com/n/adwc/b/adwc_user/o/file_31.txt'')
                                     )
         )
       END;
    / 
    
    

    The parameters are:

    • table_name: is the external table name.

    • credential_name: is the name of the credential created in the previous step.

    • partitioning_clause: is the complete partitioning clause, including the location information for individual partitions.

    • format: defines the options you can specify to describe the format of the source file.

    • column_list: is a comma delimited list of the column definitions in the source files.

    You can now run queries on the external partitioned table you created in the previous step. Your Autonomous Data Warehouse takes advantage of the partitioning information of your external partitioned table, ensuring that the query only accesses the relevant data files in the Object Store. For example, the following query only reads data files from partition P1:

    SELECT * FROM pet1 WHERE col1 < 750;

    If there are any rows in the source files that do not match the format options you specified, the query reports an error. You can use DBMS_CLOUD parameters, like rejectlimit, to suppress these errors. As an alternative, you can also validate the external partitioned table you created to see the error messages and the rejected rows so that you can change your format options accordingly. See Validate External Data for more information.

    See CREATE_EXTERNAL_PART_TABLE Procedure for detailed information about the parameters.

    See DBMS_CLOUD Package File URI Formats for more information on the supported cloud object storage services.