B Oracle Data Integrator Driver for XML Reference

This appendix describes how to work with the Oracle Data Integrator driver for XML.

This appendix includes the following sections:

B.1 Introduction to Oracle Data Integrator Driver for XML

Oracle Data Integrator Driver for XML (XML driver) handles an XML document as a JDBC data source. This allows Oracle Data Integrator to use XML documents as data servers.

With Oracle Data Integrator Driver for XML, Oracle Data Integrator can query XML documents using standard SQL syntax and perform changes in the XML files. These operations occur within transactions and can be committed or rolled back.

The Oracle Data Integrator driver for XML supports the following features:

  • Standard SQL (Structured Query Language) Syntax

  • Correlated subqueries, inner and outer joins

  • ORDER BY and GROUP BY

  • COUNT, SUM, MIN, MAX, AVG and other functions

  • Standard SQL functions

  • Transaction Management

  • Referential Integrity (foreign keys)

  • Saving Changes made on XML data into the XML files

B.2 XML Processing Overview

The XML driver works in the following way:

  1. The driver loads (upon connection or user request) the XML structure and data into a relational schema, using a XML to SQL Mapping.

  2. The user works on the relational schema, manipulating data through regular SQL statements or specific driver commands for driver operations.

  3. Upon disconnection or user request, the XML driver synchronizes the data and structure stored in the schema back to the XML file.

B.2.1 XML to SQL Mapping

The XML to SQL Mapping is a complex process that is used to map a hierarchical structure (XML) into a relational structure (schema). This mapping is automatic.

Elements and Attributes Mapping

The XML driver maps XML elements and attributes the following way:

  • Elements are mapped as tables with the same name.

  • Attributes are mapped as columns named like the attributes. Each column is created in the table representing the attribute's element.

Hierarchy & Order Mapping

Extra data may appear in the relational structure as follows:

  • In order to map the hierarchy of XML elements, or a one-to-many relation between elements, the XML driver generates in each table corresponding to an element the following extra columns:

    • <element_name>PK: This column identifies the element.

    • <parent_element_name>FK: This column links the current element to its parent in the hierarchy. It contains a value matching the parent element's <element_name>PK value. In case of XML recursion the parent element or ancestors of the parent element can be located in the same table.

  • Records in a table, unlike elements in an XML file, are not ordered, unless a specific column is used to define the order. The driver generates also a column named <element_name>ORDER to preserve the order of the elements. When adding new rows in the relational schema, make sure that the ORDER column is correctly set to have the elements correctly ordered under the parent element.

  • The root of the hierarchy is identified by a root table named after the root element. This table contains a single record with the following columns:

    • <root_element_name>PK: All level 1 sub-elements will refer to this PK entry.

    • SNPSFILENAME: This column contains the names of the XML file loaded into this schema.

    • SNPSFILEPATH: This column contains the XML file path.

    • SNPSLOADDATE: This column contains the date and time when the file was loaded into the schema.

    The values in this table are managed by the driver and should not be modified.

Mapping Exceptions

This section details some specific situations for the mapping of extra data.

  • Elements containing only #PCDATA are not mapped as tables, but as columns of the table representing their parent element. These columns are named <element_name>_DATA.

  • List Attributes are mapped as a new table with a link (PK, FK) to the table representing the element containing the list.

  • XML elements and attributes with names that match SQL reserved keywords are automatically renamed (an underscore is added after their name) in the relational structure to avoid naming conflict between table/column names and SQL reserved keywords. For example, an element named SELECT will be mapped to a table named SELECT_. Such elements are restored in the XML file with their original naming when a synchronize operation takes place.

Note that extra objects created by the driver are used to keep the XML file consistency. These records must be loaded in the relational schema before it is synchronized to an XML file.

B.2.2 XML Namespaces

The XML driver supports XML namespaces (xmlns:) specified for XML attributes and elements.

Elements or attributes specified with a namespace (using the syntax <namespace>:<element or attribute name>) are mapped as tables or columns prefixed with the namespace using the syntax: <namespace>_<element or attribute name>. When synchronizing the XML data back to the file, the namespace information is automatically generated.

Note:

In v3 mode, the table names are not prefixed with <namespace>_.

B.2.3 Managing Schemas

A schema corresponds to the concept used in Oracle database and other RDBM systems and is a container that holds a set of relational tables. A schema is a generic relational structure in which an entire set of XML file instances may be successfully parsed and extracted. The identified elements and attributes are inserted in the appropriate relational tables and fields.

This schema is generated by the XML driver from either an XML instance file, a DTD file, or an XSD file. It is recommended to generate the schema from a DTD or XSD file.

Note that only a single DTD or XSD file may be referenced in definition of an XML data server URL. In this case, this DTD or XSD may be considered as a master DTD or XSD file if the artifact includes references to other DTD / XSD files. Note that in certain cases multiple schemas may be required. In this case use the add_schema_bundle property.

B.2.3.1 Schema Storage

The schema may be stored either in a built-in engine or in an external database.

  • The built-in engine requires no other component to run. The XML schema is stored in memory within the driver. The SQL commands and functions available on this driver are detailed in the SQL Syntax.

  • The external database can be a relational database management system. The driver connects through JDBC to this engine, and uses it to store the schema. This enables the:

    • Use of the processing and storage power of the RDBMS engine

    • Use of the statements and functions of the RDBMS

    • Persistence of schema storage

    See Section B.3.3, "Using an External Database to Store the Data" for more information.

B.2.3.2 Multiple Schemas

It is possible to handle, within the same JDBC connection, multiple schemas and to load multiple XML files simultaneously. It is possible to CREATE, TRUNCATE, SET, and LOAD FILE INTO schemas. When connecting to the JDBC driver, you connect to the schema that is specified on the URL. It is possible to set the current schema to another one using the SET SCHEMA command. See Section B.4, "Detailed Driver Commands" for more information.

The default schema is a specific schema that is used for storing temporary data. The default schema is read-only and cannot be used to store XML files. It is recommeded to create a schema for each XML file.

It is also possible to automatically create additional schemas with different XML structures when creating the connection to the driver. See Section B.3.1, "Driver Configuration" for more information.

B.2.3.3 Accessing Data in the Schemas

Data in the schemas is handled using the SQL language.

It is possible to access tables in a schema that is different from the current schema. To access the tables of a different schema, prefix the table name with the schema name, followed by a period character (.). For example:

SELECT col1, schema2.table2.col2, table1.col3 FROM table1, schema2.table2.

This query returns data from table1 in the current schema, and from table2 from schema2.

Note:

Note that the other schema must be located on the same storage space - built-in engine or external database - as than the current schema.

B.2.3.4 Case Sensitivity

A schema cannot be case-sensitive. All elements in the schema (tables and columns) are in UPPERCASE. If the XML file element names contain lowercase letters, they are converted to upper case. When the elements are synchronized to the XML file, their names are created with their original case.

B.2.3.5 Loading/Synchronizing

A schema is usually automatically created when connecting to an XML file, and loaded with the data contained in the XML file. It is possible to force the schema creation and the data loading in the schema using specific driver commands. See Section B.4, "Detailed Driver Commands" for more information. It is also possible to force a synchronization process of the data by using the SYNCHRONIZE command, as described in Section B.4.9, "SYNCHRONIZE".

B.2.4 Locking

When accessing an XML file, the driver locks it in order to prevent other instances of the driver to connect to the file. The lock file has the same name as the XML file but an .lck extension.

If the driver is incorrectly disconnected, a lock may remain on the file. To remove it, delete the .lck file. It is also possible to unlock an XML file with the UNLOCK FILE command.

B.2.5 XML Schema (XSD) Support

XSD is supported by the XML driver for describing XML file structures. See Section B.8, "XML Schema Supported Features" for more information.

In addition, the XML driver supports document validation against XSD schemas specified within the XML file. This operation may be performed using the VALIDATE driver specific command.

B.3 Installation and Configuration

The Oracle Data Integrator driver for XML is automatically installed with Oracle Data Integrator. The following topics cover advanced configuration topics and reference information.

This section contains the following topics:

Note:

If using an External Database storage, you must also make sure that the JDBC driver used to connect the external database, as well as the.properties file are in the classpath.

B.3.1 Driver Configuration

This section details the driver configuration.

  • The driver name is: com.sunopsis.jdbc.driver.xml.SnpsXmlDriver

  • The URL Syntax is: jdbc:snps:xml?f=<filename>[&s=<schema>&<property>=<value>...]

The properties for the URL are detailed in Table B-1.

Table B-1 Driver Properties

Property Mandatory Type Default Description

blank_attribute_as_column or baac

No

boolean (true | false)

false

If this option is set to true, any empty element in the XML file that does not have child element of its own is considered as a column rather than a table.

file or f

Yes

string (file location)

-

XML file name. Use slash "/" in the path name instead of back slash "\". It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.

For an XML file, if this property is missing, a relational schema is created by the XML driver from the DTD/XSD file and no XML file is searched for.

dtd or d

No

string (file location)

-

Description file: This file may be a DTD or XSD file. It is possible to use an HTTP, FTP or File URL to locate the file. Files located by URL are read-only.

Note that the DTD or XSD file that is specified in the URL takes precedence over the DTD or XSD file that is specified within the XML file. References should be made with an absolute path.

For an XML file, if this property is missing, and no DTD or XSD is referenced in the XML file, the driver will automatically consider a DTD file name similar to the XML file name with .dtd extension.

A DTD file may be created from the XML file structure depending on the generate_dtd URL property.

Note that when no DTD or XSD file is present, the relational structure is built using only the XML file content. It is not recommended to reverse-engineer the data model from such a structure as one XML file instance may not contain all the possible elements described in the DTD or XSD, and data model may be incomplete.

root_elt or re

No

String

-

Name of the element to take as the root table of the schema. This value is case sensitive. This property can be used for reverse-engineering for example a specific message definition from a WSDL file, or when several possible root elements exist in a XSD file.

Important: This property is used to designate ONLY the Element in the XSD / DTD file which will serve as the Root Element DEFINITION of any XML instance file Root Element.

read_only or ro

No

boolean (true | false)

false

Open the XML file in read only mode.

schema or s

No

string

-

Name of the schema where the XML file will be loaded. If this property is missing, a schema name is automatically generated from the XML file name.

If this property is not specified in the XML data Server URL, the XML Driver will automatically create a schema name. This schema will be named after the five first letters of the XML file name.

Note: It is not possible to make more than one connection to a schema. Subsequent connections fail if trying to connect to a schema already in use.

Important: The schema name should be specified in uppercase.

Important: It is forbidden to have a schema name identical to an XML ELEMENT name.

standalone or st

No

boolean (true | false)

false

If this option is set to true, the schema for this connection is completely isolated from all other schemas. With this option, you can specify the same schema name for several connections, each schema being kept separated. When using this option, tables in this schema cannot be accessed from other schemas, and this connection cannot access tables from other schemas. The schema is restricted to this connection and only this one. Other connections cannot see this schema.

This option is active only for In-Memory HSQL intermediate database. Using this option causes increased memory consumption by the agent, as for every staging schema, an entirely new HSQL instance is created in the in-memory.

Useful for parallel jobs with the same topology in order to avoid that the jobs overlap each other.

Note: This option is not applicable when an external database is used.

ns_prefix_generation or nspg

No

auto | xml | xsd

auto

This option defines how namespace prefixes are generated and written in the XML file.

  • auto (default): Prefixes are automatically generated from the namespace names themselves when possible or generated as ns1, ns2, etc.

  • xml: Namespace prefixes are taken from the source XML file, if any.

  • xsd: Namespace prefixes are taken from the XSD file, if any.

Note that the xsd option value assumes that a similar prefix is not used in several XSD files to reference a different namespace.

no_default_ns or ndns

No

boolean (true | false)

false

If this property is set to true, the driver generates the target file with no default namespace entry.

no_closing_tags or nct

No

boolean (true | false)

false

If this property is set to true, the driver generates the empty tags without their closing tags (for example <element/>). If set to false the driver generates an empty element as <element></element>. This property is true by default if the v1_compatibility property is used.

db_props or dp

No

string

-

This property is used to use an external database instead of the memory engine to store the schema.

The db_props property indicates that the schema must be loaded in a database schema whose connection information are stored in a external database property file named like the db_props property with the extension .properties. This property file must be located in the application's classpath.

load_data_on_connect or ldoc

No

boolean (true | false)

true

Load automatically the data in the schema when performing the JDBC connection. If set to false, a SYNCHRONIZE statement is required after the connection to load the data.

This option is useful to test the connection or browse metadata without loading all the data.

drop_on_disc or dod

No

boolean (true | false)

false

Drop automatically the schema when closing the JDBC connection.

If true, the schema is stored in the built-in engine, it is always dropped.

If true and the data is on an external database, only the current reference to the schema in memory will be dropped, but the tables will remain in the external database. This means that if you try to connect to this schema again, it will reuse the tables in the external database rather than starting from scratch (as it would when the data is loaded in memory).

ignore_unknown_elements or iue

No

boolean (true | false)

false

Ignore all elements in the XML file that do not exist in the associated DTD (Document Type Definition) or XSD (XML Schema Definition) file.

useMaxValue

No

boolean (true | false)

false

When this property is set to true, elements for which maxOccurs is not specified in the XSD are considered as maxOccurs ="unbounded". Otherwise, the driver assumes that maxOccurs=1 when maxOccurs is not specified.

generate_dtd or gd

No

yes | no | auto

auto

Defines if a DTD file must be created from the XML file structure:

  • auto: create the DTD file if the it does not exist. if the DTD exists, does nothing.

  • yes: always create the DTD file. An existing DTD will be overwritten.

  • no: never create the DTD file. The DTD file must exist.

Warning: DTD files created using this option contain only the definition of XML elements appearing in the XML file, and may not be complete.

java_encoding or je

No

string (encoding code)

UTF8

Target file encoding (for example: ISO8859_1). You will find a list of supported encoding at the following URL: https://download.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html.

Note that if the Java encoding is specified, the XML encoding should also be specified.

useimplicitmaxvalue or uimv

No

boolean (true | false)

false

With this property set to yes, an elements for which maxOccurs is not specified in the XSD is considered as multivalued (maxOccurs="unbounded").

xml_encoding or xe

No

string (encoding code)

UTF8

Encoding specified in the generated XML File, in the tag (for example ISO-8859-1: <?xml version="1.0" encoding="ISO-8859-1"?>. You will find a list of supported encoding at the following URL: https://download.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html.

Note that if the XML encoding is specified, the Java encoding should also be specified.

v1_compatibility or v1

No

boolean (true | false)

false

With this property set to true, the driver performs the XML to SQL mapping as if in version 1.x. This property is provided for compatibility.

compat_mode

No

string

v3

Indicates the compatibility with mapping modes. This property can take the following values:

  • v1 is equivalent to v1_compatibility=true which is the 1.x compatibility mode

  • v2 indicates the 10g/11g compatibility mode where the custom written XSD parser is used

    Please note that when you use a DTD or only a XML file, you must specify compat_mode=v2 in the JDBC URL. For example:

    jdbc:snps:xml?file=/tmp/myfile.xml&compat_mode=v2

    jdbc:snps:xml?f=/tmp/myfile.xml&compat_mode=v2

  • v3 indicates the compability with the XDK XSD parser.

    Please note that compat_mode=v3 is not supported when you use a DTD or only a XML file. For example, the following syntaxes are not supported:

    jdbc:snps:xml?file=/tmp/myfile.xml&compat_mode=v3

    jdbc:snps:xml?f=/tmp/myfile.xml&compat_mode=v3

If compat_mode=v3, the v1_compatibility property will be ignored.

numeric_ids or ni

No

boolean (true | false)

true

If set to true, all internal Primary and Foreign Keys are of NUMERIC type. Otherwise, they are of the VARCHAR type.

id_length or il

No

integer

10 / 30

The length of the internal Primary and Foreign Key columns. The default is 10 for NUMERIC column types and 30 for VARCHAR column.

no_batch_update or nobu

No

boolean (true | false)

false

Batch update is not used for this connection. The command to set the batch update is not sent. This prevents errors to occur for external databases that do not support this JDBC feature, or allows to debug errors related to batch update usage.

add_schema_bundle or asb

No

string

-

Additional schemas bundle file. This property indicates that additional schemas must be created at connection time. The description for these extra schemas are located in an additional schemas property file named like the add_schema_bundle property with the extension ".properties". The additional schemas property file contains a list of valid JDBC driver's URL. In this file, the property names are ignored. Only the list of values is taken into account.

All these additional schemas are created with the drop_on_disconnect option set to true by default.

Example of additional schemas property files contents:

addschema_1=jdbc:snps:xml?f=c:/myfile.xml&ro=true&s=myschema1 addschema_2=jdbc:snps:xml?file=c:/myfile2.xml&s=myschema2 addschema_3=jdbc:snps:xml?d=c:/myfile3.dtd&s=myschema3

add_schema_path or asp

No

string (directory)

-

Directory containing a set of XSD files. For each XSD file in this directory, an additional schema is created in the built-in engine or external database storage, based on this XSD. Note that no object is created in the external database storage for these additional schemas. The schema names are default generated named (5 first characters of the file name, uppercased).

Note: This option is not supported in v3 mode.

transform_nonascii or tna

No

boolean (true|false)

true

Transform Non Ascii. Set to false to keep non-ascii characters.

max_table_name_length or mtnl

No

integer

-

Maximum length of table names irrespective of the value as supported by internal/external DB.

max_column_name_length or mcnl

No

integer

-

Maximum length of column names irrespective of the value as supported by internal/external DB.

case_sens or cs

No

boolean (true | false)

true

Indicates whether the table and column names are case sensitive or not. Name comparisons are carried out accordingly.


Table B-2 lists URL samples.

Table B-2 URL Samples

URL Sample Action

jdbc:snps:xml

Connects to the default schema.

jdbc:snps:xml?f=/tmp/myfile.xml&ro=true&d=/tmp/mydtd.dtd

Open the /tmp/myfile.xml file in read only mode, using the /tmp/mydtd.dtd DTD.

jdbc:snps:xml?file=/tmp/myfile.xml

Open the /tmp/myfile.xml file in read/write mode.

jdbc:snps:xml?s=myschema

Connect directly to the schema myschema


B.3.2 Automatically Create Multiple Schemas

It is possible to automatically create additional schemas with different XML structures when creating the connection with the driver. This is performed by:

  • Declaring in the add_schema_bundle URL property a property file that contains a list of JDBC URLs corresponding to the different additional schemas to create.

  • Declaring in the add_schema_path URL property a directory that contains a set of XSD files. For each XSD file an additional schema is created in the built-in engine, based on the XML schema description.

  • Specifying additional valid driver URLs as JDBC properties, named addschema_X (X is a number). An additional schema will be created for each URL found in a JDBC property called addschema_X.

Note that all these additional schemas are automatically dropped when their last connection is closed.

B.3.3 Using an External Database to Store the Data

In most cases, the XML driver stores the relational schema mapping of the XML schema in a built-in engine. It is also possible to store the relational schema in an external relational database.

Use external storage:

  • When loading very large XML files with the XML driver into the relational schema derived by the XML driver

  • To reduce the overall time taken to process the files with the built-in engine of the XML driver

  • To avoid timeouts to the ODI repositories. Please note that the time taken to process an XML file is dependent on:

    • The complexity of the XML file structure

    • The length of XML file content

    • The host server RAM resources

    • The host server CPU resources

Before using external storage, ensure that you have understood the impacts of its usage and that you have increased the ODI timeout to values which conform to your performance requirements.

Note:

Supported RDBMS for external storage include Oracle, Microsoft SQL Server, MySQL, and Hypersonic SQL 2.0. The complete list of technologies that support external storage is available on Oracle Technical Network (OTN) :

http://www.oracle.com/technology/products/oracle-data-integrator/index.html.

These schemas are created in addition to the one that may be created with the properties specified in the JDBC driver URL.

The external storage is configured with a set of properties described in Table B-3. These properties can be passed in several ways:

Passing the Properties in the Driver URL

The properties can be directly set in the driver URL. When using this method, the properties have to be prefixed with dp_ . For example, if connecting to an Oracle database, specify the Oracle JDBC driver name in the driver parameter as follows:

dp_driver=oracle.jdbc.OracleDriver.

Setting the Properties in ODI Studio

The properties can be specified on the Properties tab of the Data Server editor in Topology Navigator. When using this method, the properties have to be prefixed with dp_. For example, if you want to set the driver parameter:

  1. In the Key column, enter dp_driver

  2. In the Value column, enter oracle.jdbc.OracleDriver if you are connecting to an Oracle database.

Setting the Properties in a Properties File

The properties can be set in an external database properties file. This properties file, also called property bundle, is a text file with the .properties extension containing a set of lines with on each line a <property>=<value> pair.This external database porperties file contains the properties of a JDBC connection to the relational database schema. The properties file is referenced using the db_props property in the JDBC URL. When using this method, note the following:

  • The properties in the properties file are not prefixed and used as described in Table B-3.

  • The db_props property is set to the name of the properties file including the .properties extension. The db_props property indicates that the schema must be loaded in a database schema whose connection information is stored in a external database properties file.

  • The properties files has to be deployed by the agent using the XML connection. The location of the properties file depends on the agent you are using:

  • The properties file must be set in the classpath of Oracle Data Integrator that uses the XML driver. Typically, you can install it with your custom drivers.

Note:

When connecting to the external database, the XML driver uses JDBC connectivity. Make sure that the JDBC driver to access this external database is also available in the ODI classpath.

It is possible to set or override the external database properties on the URL. These properties must be prefixed with the string dp_. For example:

jdbc:snps:xml?file=/temp/payload.xml&dp_driver=<external_db_driver>&dp_url=<external_db_url>

The properties for configuring external storage are described in Table B-3.

Table B-3 Properties of the External Database Properties File

Property Mandatory Type Default Description

driver

Yes

string

-

JDBC driver name.

Important: The driver class file must be in the classpath of the java application.

url

Yes

string

-

JDBC URL

user

Yes

string

-

Login used to connect the database

password

Yes

string

-

Encrypted password of the user.

Note: To encrypt the password, use the encode.bat command. See the Installing and Configuring Oracle Data Integrator for more information.

schema

Yes

string

-

Database schema storing the relational schema and the XML data.

Note for MS SQLServer that:

  • If schema is not specified, tables will be created under the default schema of the user

  • If schema is specified, tables will be created under this schema

Limitation when using v3 mode: When using an external database, make sure that the provided or calculated schema name exists. The schema driver property value must match the schema property value of the external database. Otherwise an error is raised.

catalog

Yes

string

-

For Microsoft SQL Server only. Database catalog storing the XML data & information.

drop_on_connect or doc

No

string (Y|N)

N

Drop the tables from the database schema if they already exist. If set to N the existing tables are preserved.

create_tables or ct

No

(Y | N | AUTO)

AUTO

Y: create systematically the tables in the schema.

N: never create the tables in the schema

AUTO: Create the tables if they do not exist.

create_indexes or ci

No

string (Y|N)

Y

Y: create indexes on tables' PK and FK

N: do not create the indexes. This value provides faster INSERT but dramatically slows SELECT in the data. It also saves storage space on your RDB.

numeric_scale or ns

No

integer

empty

Scale of the numeric columns generated during the XML to SQL mapping.

truncate_before_load or tbl

No

string (Y|N)

Y

Y: truncate all data when connecting

N: preserve existing data

ids_in_db or iidb

No

string (Y|N)

Y

Y: preserve identifiers (counters) in the database for a future append connection

N: do not preserve identifiers. Future append is not possible.

drop_tables_on_drop_schema or dtods

No

string (Y|N)

Y

Y: a DROP SCHEMA does not only causes the reference to the database schema to be erased from the driver, but also causes all tables to be dropped.

N: DROP SCHEMA erases the reference to the database schema from the driver, but the tables are kept in the database schema.

use_prepared_statements or ups

No

string (Y|N)

Y

Y: use the prepared statements with the database connection to perform driver operation (load/unload files).

N: do not use the prepare statement.

Processing is usually faster with prepare statement. The database and driver must support prepared statements in order to use this option.

use_batch_update or ubu

No

string (Y|N)

Y

Y: use batch update with the database connection.

N: do not use batch update.

Inserting data is usually faster with batch update. Should be set to true only if the following conditions are met:

  • The database and driver support batch update

  • The database supports prepared statements

  • The use_prepared_statements parameter is set toYes

Note: The batch update options specified here are only used to load the data in the schema. To use batch update when manipulating data in the schema, you must specify batch update options in your Java application.

batch_update_size or bus

No

integer

30

Batch update size. Records will be written in the database schema by batches of this size, if the use_batch_update property is set to true.

commit_periodically or cp

No

string (Y|N)

Y

A COMMIT will be sent regularly when loading data from the XML file into the database schema. This regular COMMIT avoids overloading of the database log when loading large XML data files.

Should be set to true only if the following conditions are met:

  • The database supports batch update

  • The database supports prepared statements

  • The use_prepared_statements parameter is set to Yes

  • The use_batch_updates parameters is set to Yes

Note: The commit options specified here are only used to load the data in the schema. To commit when performing transactions in the schema, you must specify the commit in your Java application.

num_inserts_before_commit or nibc

No

integer

1000

Interval in records between each COMMIT, if the commit_periodically property is set to true.

reserve_chars_for_column or rcfc

No

integer

3

Long XML names are truncated to fit the maximum allowed size on the RDBMS, according to the maximum allowed size for column names returned by the JDBC driver.

However, there are some situations when you will want to reserve characters to make the driver-generated names shorter. The number of reserved character is defined in the reserve_chars_for_column value.

For example, on a database with a maximum of 30 characters and with this property set to 3 (which is the default), all column names will not be larger than 27 characters.

reserve_chars_for_table or rcft

No

integer

3

Same as reserve_chars_for_column (rcfc) property but applies to names of the table created in the RDBMS schema.

varchar_length or vl

No

integer

255

Size of all the columns of the relational structure that will be used to contain string data.

This property does not apply to Annotation or Documentation elements. For those elements dlvc should be used instead.

default_type_varchar or dtvc

No

string (Y|N)

N

If set to Yes, the default datatype used in the relational schema for columns storing XML annotation and documentation elements is VARCHAR of size 255. The length of this column is specified using the dlvc property. If set to false, the LONG datatype if used. This property should be set to yes for technologies that do not support multiple LONG columns within the same table, such as Oracle.

default_length_varchar or

dlvc

No

integer

255

Default length of the VARCHAR column used for storing XML annotation and documentation elements. This properties is valid only if dtvc is set to yes.

For example:

default_length_varchar=2000 where 2000 is the new desired default column size.

numeric_length or nl

No

integer

10

Size of all the columns of the relational structure that will be used to contain numeric data.

unicode

No

boolean (true | false)

 

For MS SQL Server:

If unicode = true, nvarchar is used.

If unicode = false or not set, varchar is used.

multi_user_safe or mus

No

boolean (true | false)

false

Its usage controls the way row ids are generated. If multi_user_safe is set to true, then each ID generation is tasked to the DB. If set to false at the very beginning of the data load, retrieve the IDs which are stored in the ID table and then work off that stored data in-memory. At the end of the data load this is then pushed to the DB.


The following sample is an example of a property file for using an Oracle Database as the external storage:

driver=oracle.jdbc.OracleDriver
url=jdbc:oracle:thin:@HOST:PORT:SID
user=USER_NAME
password=ENCODED_PASSWORD
schema=USER_NAME
drop_on_connect=Y
create_tables=AUTO
create_indexes=Y
truncate_before_load=Y
ids_in_db=Y
drop_tables_on_drop_schema=Y
use_prepared_statements=Y
use_batch_update=Y
batch_update_size=30
commit_periodically=Y
num_inserts_before_commit=1000
reserve_chars_for_column=3
reserve_chars_for_table=3

The following sample is an example of a property file for using a Microsoft SQL Server database as the external storage:

driver=com.microsoft.jdbc.sqlserver.SQLServerDriver
url=jdbc:microsoft:sqlserver://SERVER_NAME:PORT;SelectMethod=cursor
user=USER_NAME
password=ENCODED_PASSWORD
schema=OWNNER_NAME
drop_on_connect=Y
create_tables=AUTO
create_indexes=Y
truncate_before_load=Y
ids_in_db=Y
drop_tables_on_drop_schema=Y
use_prepared_statements=Y
use_batch_update=Y
batch_update_size=30
commit_periodically=Y
num_inserts_before_commit=1000
reserve_chars_for_column=3
reserve_chars_for_table=3

B.4 Detailed Driver Commands

Note:

The notion of SCHEMA referred to in these commands refers to the string value set with the s=.... parameter in the XML Driver Data Server URL present in the physical architecture.

The following statements are specific to the XML driver, and allow to manage XML files and schemas. They can be launched as standard SQL statements on the JDBC connection to the XML driver.

To manipulate the data stored in the schemas, you may use standard SQL syntax. This syntax is either the built-in engine's SQL Syntax, or the SQL Syntax of the External Database engine you use.

Conventions

The following conventions are used within this document:

  • [ A ] means A is optional

  • [ A | B ] means A or B but the parameter is optional.

  • { B | C } means B or C must be used.

  • [A] [B] means a set of arguments that are not ordered.

  • ( and ) are the characters '(' and ')'.

  • keywords are in UPPERCASE

This section details the following driver specific commands:

B.4.1 CREATE FILE

If the EMPTY option is specified, create an empty XML instance file containing all ELEMENTS (including optional ELEMENTS) present in the related XSD or DTD file. However, no XML ATTRIBUTES declared in these files will be referenced in the created XML instance file.

The attributes are handled differently between compat_mode v1/v2 and v3. In v1/v2 mode attributes are not written, while in v3 mode attributes are also written out.

CREATE [EMPTY] FILE <file_name> [FROM SCHEMA <schema_name>] 
    [JAVA_ENCODING <java_encoding>  XML_ENCODING <xml_encoding>] 
    [NO_CLOSING_TAGS] [NO_DEFAULT_NS]

Parameters 

FROM SCHEMA

Specify the schema in which data will be written in the XML file.

JAVA_ENCODING

Encoding of the generated File.

XML_ENCODING

Encoding generated in the file's xml tag.

Example of generated tag: <?xml version="1.0" encoding="ISO-8859-1"?>

Note that Java and XML encoding should always be specified together.

NO_CLOSING_TAGS

If this parameter is specified, the driver generates the empty tags with closing tag. By default, the driver generates an empty element as <element></element>. with the no_closing_tags parameter, it generates <element/>.

NO_DEFAULT_NS

If this parameter is specified, the driver generates the target file without a default namespace entry.

Remarks

  • If the file name contains spaces, enclose it in double quotes

  • The encoding values should be enclosed in double quotes as they may contain special characters.

B.4.2 CREATE FOREIGNKEYS

Create physically all the foreign keys joining the tables from the relational schema in the database. This command is helpful to enforce integrity constraints on the schema.

Note:

When requested, the driver always returns "virtual" foreign keys, corresponding to the relational structure mapping. It does not return the real foreign keys enforced at database level.
CREATE FOREIGNKEYS

Remarks

After using CREATE FOREIGNKEYS, it is not possible any longer to perform a LOAD FILE.

B.4.3 CREATE XMLFILE

Generate an XML file called <file_name> from the default schema data, or from a specific schema.

CREATE XMLFILE <file_name> [FROM SCHEMA <schema_name>] 
    [JAVA_ENCODING <java_encoding> XML_ENCODING <xml_encoding>] 
    [NO_CLOSING_TAGS][NO_DEFAULT_NS] 

Parameters 

FROM SCHEMA

Specify the schema in which data will be written in the XML file.

JAVA_ENCODING

Encoding of the generated File.

XML_ENCODING

Encoding generated in the file's xml tag. Example of generated tag: <?xml version="1.0" encoding="ISO-8859-1"?>.

Note that Java and XML encoding should always be specified together.

NO_CLOSING_TAGS

If this parameter is specified, the driver generates the empty tags with closing tag. By default, the driver generates an empty element as <element></element>. with the no_closing_tags parameter, it generates <element/>.

NO_DEFAULT_NS

If this parameter is specified, the driver generates the target file without a default namespace entry.

Remarks

  • If the file name contains spaces, enclose it in double quotes

  • The encoding values should be enclosed in double quotes as they may contain special characters.

B.4.4 CREATE SCHEMA

Create in <schema_name> an empty schema or a schema with tables mapping the structure of the description file specified as <dtd/xsd_name>.

Note:

This command cannot be used on an external database.
CREATE SCHEMA <schema_name> [WITH DTD <dtd/xsd_name>] [REPLACE] 
   [ROOTELT <root element>] [READONLY] [COMPAT_MODE <compatibility mode>]
   [JAVA_ENCODING <java_encoding> XML_ENCODING <xml_encoding>]

Parameters 

WITH DTD

Specify the description file (DTD or XSD) which structure will be created in the schema.

REPLACE

Specify if an existing schema structure must be replaced with the new one.

ROOTELT

Element in the description file considered as the root of the XML file. This element name is case sensitive.

READONLY

The schema loaded cannot have data inserted, deleted or updated.

COMPAT_MODE

Indicates the compatibility with mapping modes. This property can take the following values:

  • v1 is equivalent to v1_compatibility=true wich is the 1.x compatibility mode

  • v2 is the 10g/11g mode. This is the defa ult mode.

    Please note that when you use a DTD or only a XML file, you must specify compat_mode=v2 in the JDBC URL. For example:

    jdbc:snps:xml?d=/tmp/myDTD.dtd&compat_mode=v2

    jdbc:snps:xml?f=/tmp/myfile.xml&compat_mode=v2

  • v3 indicates the compatibility with the XDK XSD parser. Please note that compat_mode=v3 is not supported when you use a DTD or only a XML file. For example, the following syntaxes are not supported:

    • jdbc:snps:xml?d=/tmp/myDTD.dtd&compat_mode=v3

    • jdbc:snps:xml?f=/tmp/myfile.xml&compat_mode=v3

    If compat_mode=v3, the v1_compatibility property will be ignored.

Note:

When using the SYNCHRONIZE command, only those DB schemas that have been created with 'v3' option will parse the DTD/XSD in the 'v3' mode. In 'v3' mode all the restrictions on schema name value corresponding with DB property for schema name etc. will apply.
JAVA_ENCODING

Encoding of the target XML file(s) generated from schema.

Note: Java and XML encoding should always be specified together.

XML_ENCODING

Encoding generated in the target files' XML tag. Example of generated tag: <?xml version="1.0" encoding="ISO-8859-1"?>.

Remarks

  • The XML file data is not loaded. This command is similar to LOAD FILE but does not load the XML file data.

  • The schema is created in READONLY mode since no XML file is associated with it.

  • The connection schema does not automatically switch to the newly created schema.

  • If the file name contains spaces, enclose the name in double quotes.

  • The encoding values should be enclosed in double quotes as they may contain special characters.

B.4.5 DROP FOREIGNKEYS

Drop all the foreign keys on the tables of the relational schema in the database. This command is helpful to drop all integrity constraints on the schema.

DROP FOREIGNKEYS

B.4.6 DROP SCHEMA

Drop an existing schema. If <schema_name> is not specified, the current schema is dropped. It is not possible to drop a schema if there are pending connections to this schema. Trying to drop a schema with existing connections causes an exception.

DROP SCHEMA [<schema_name>]

B.4.7 LOAD FILE

Load the <file_name> XML file into the specified <schema_name> XML schema. If a schema name is not specified with the ON SCHEMA parameter, one is generated with the XML file name. If a schema with the specified or generated name is found, then the properties of that schema are inherited. If a schema with the specified or generated name does not exist at runtime, a new XML JDBC URL with only the properties specified in the LOAD FILE command is created. This schema does not inherit any of the properties of the current schema.

LOAD FILE <file_name> [WITH DTD <dtd/xsd_name> | INSERT_ONLY] [ON SCHEMA <schema_name>] [REPLACE] [READONLY] [ROOTELT <root element>] [AUTO_UNLOCK] [DB_PROPS <external database properties>]

Parameters 

WITH DTD

Specify the description file (DTD or XSD) which structure will be created in the schema.

INSERT_ONLY

Adds the data from the XML file in the schema if it already exists. The new XML file should have valid description file for the existing schema.

ON SCHEMA

Force the file to be loaded in <schema_name>. Note that the current schema is not set after the command automatically to <schema_name>.

REPLACE

Specify if an existing schema structure with the same name must be replaced with the one that is being loaded.

READONLY

The schema loaded cannot have data inserted, deleted or updated.

ROOTELT

Element in the description file considered as the root of the XML file. This element name is case sensitive.

AUTO_UNLOCK

If the XML file is already locked by another driver instance, an exception occurs unless the AUTO_UNLOCK is specified. This parameter unlocks automatically the file if it is locked.

DB_PROPS

Loads the file in the external database identified by the properties file called <external database properties>.properties.

Remarks

  • Enclose the file name in double quotes.

  • When no schema is specified, the driver automatically generates a schema name from the file name.

  • The connection schema does not automatically switch to the loaded schema.

  • If the XML file is already open in another schema, an exception occurs.

B.4.8 SET SCHEMA

Set the current schema to <schema_name>.

SET SCHEMA <schema_name>

Remarks

It is necessary to specify a name for the schema.

B.4.9 SYNCHRONIZE

Synchronize data in the schema with the file data.

SYNCHRONIZE [ALL | SCHEMA <schema_name>] [FROM FILE/FROM DATABASE] 
 [IGNORE CONFLICTS]

Parameters 

ALL

Synchronizes all schemas

SCHEMA

Synchronizes only <schema_name>

FROM FILE

Forces the data to be loaded from the file to the schema. Erases all changes in the schema.

FROM DATABASE

Forces the data to be loaded from the schema to the file. Erases all changes in the file.

IGNORE CONFLICTS

If FROM FILE/DATABASE are not specified, the driver automatically determines where data have been modified (in the FILE or DATABASE) and updates the unmodified data. If both the FILE and the DATABASE have been modified, the driver issues a Conflict Error. if the IGNORE CONFLICTS parameter is used, no error is issued, and if performing a SYNCHRONIZE ALL, the following schemas will be synchronized.

Note:

A schema is marked updated only when a data modification (update, delete, insert, drop) is executed in a connection to that schema. It is not marked as updated, when the order is launched from a connection to another schema.

B.4.10 UNLOCK FILE

Unlocks <file_name> if it is locked by another instance of the driver.

UNLOCK FILE <file_name>

B.4.11 TRUNCATE SCHEMA

Clears all data from the current schema, or from <schema_name>.

TRUNCATE SCHEMA [<schema_name>]

B.4.12 VALIDATE

Verifies that the XML file <file_name> is well-formed and validates the content of the XML file <file_name> against the XML Schema (XSD) if the schema is referenced in the XML file. This command returns an exception if the file is not valid. For a full description of the validation performed, see:

http://xerces.apache.org/xerces2-j/features.html#validation.schema

VALIDATE [FILE <file_name>] [ERROR_ON_WARNING|IGNORE_ON_WARNING] 
   [ERROR_ON_ERROR|IGNORE_ON_ERROR] 
   [ERROR_ON_FATAL_ERROR|IGNORE_ON_FATAL_ERROR] [VERBOSE]

Parameters 

FILE <file_name>

Name of the XML file to validate.

ERROR_ON_WARNING | IGNORE_ON_WARNING

Ignore or generate errors on XSD validation warnings, such as values out of range. The default value is IGNORE_ON_WARNING.

ERROR_ON_ERROR | IGNORE_ON_ERROR

Ignore or generate errors on XSD validation errors, such as non conform attribute or element. The default value is ERROR_ON_ERROR.

ERROR_ON_FATAL_ERROR | IGNORE_ON_FATAL_ERROR

Ignore or generate errors on XSD validation fatal errors, such as malformed XML. The default value is ERROR_ON_FATAL_ERROR.

VERBOSE

Displays on the Java console the detailed errors and number of the line causing the error. Nothing is displayed by default on the console.

B.4.13 WRITE MAPPING FILE

Writes out the element/attribute name to table/table.column name mapping for each element/attribute to the specified file. The mapping file helps to understand the relational structure that has been created for the XSD/DTD file. This command can be used only when the schema was created in v3 mode. Otherwise exception is thrown.

WRITEMAPPINGFILE FILE <file-path> [FROM SCHEMA <schema-name>] 
           [JAVA_ENCODING <java_encoding> XML_ENCODING <xml-encoding>]

Parameters 

file_p ath

Name of the generated mapping file

FROM_SCHEMA

If the optional FROM SCHEMA parameter is not provided, the current schema will be used.

JAVA_ENCODING

Encoding of the generated file, for example: ISO8859_1. You will find a list of supported encoding at the following URL: https://download.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html.

Note that if the Java encoding is specified, the XML encoding should also be specified.

XML_ENCODING

Encoding in the xml tag of the generated file.

Example of generated tag: <?xml version="1.0" encoding="ISO-8859-1"?>

You will find a list of supported encoding at the following URL: https://download.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html.

Note that if the XML encoding is specified, the Java encoding should also be specified.

Example B-1 Mapping File

<?xml version = '1.0' encoding = 'UTF-8'?>
<personnel xmlns:x2r="http://www.oracle.com/odi/xml-mapping/" x2r:tableName="PERSONNEL"> 
   <person x2r:tableName="PERSON" id="ID" select="SELECT_">   
      <email x2r:tableName="EMAIL"></email>
      <link x2r:tableName="LINK" manager="MANAGER" subordinates="SUBORDINATES"></link>
      <name x2r:tableName="NAME">
         <given x2r:columnName="GIVEN"></given>
         <family x2r:columnName="FAMILY"></family>
      </name>
      <url x2r:tableName="URL" href="HREF"></url>
   </person>
</personnel>

B.5 SQL Syntax

The following statements are available when using the built-in engine to store the XML schema. They enable the management of the data and data structure in the schema through Standard SQL Syntax.

This section contains the following topics:

Note:

If you are using an external database, you may use the database engine querying syntax instead of this one.

B.5.1 SQL Statements

Any number of commands may be combined. You can optionally use the semicolon character (;) to separate each command.

This section details the following commands:

B.5.1.1 COMMIT

Ends a transaction on the schema and makes the changes permanent.

COMMIT [WORK]

B.5.1.2 CREATE TABLE

Create a tables and its constraints in the relational schema.

CREATE TABLE <table_name> 
  ( <columnDefinition> [, ...] [, <constraintDefinition>...])
<columnDefinition> ::=
    <column_name> <datatype> [(anything)] [[NOT] NULL] [IDENTITY] [PRIMARY KEY]
<constraintDefinition> ::=
[ CONSTRAINT <constraint_name> ]
    UNIQUE ( <column_name> [,<column>...] ) |
    PRIMARY KEY ( <column_name> [,<column_name>...] ) |
    FOREIGN KEY ( <column_name> [,<column_name>...] ) 
    REFERENCES <referenced_table> ( <column_name> [,<column_name>...] )

Remarks

  • IDENTITY columns are automatically incremented integer columns. The last inserted value into an identity column for a connection is available using the IDENTITY() function.

  • Valid datatypes are: BIT, TINYINT, BIGINT, LONGVARBINARY, VARBINARY, BINARY, LONGVARCHAR, CHAR, NUMERIC, DECIMAL, INTEGER, SMALLINT, FLOAT, REAL, DOUBLE, VARCHAR, DATE, TIME, TIMESTAMP, OBJECT

B.5.1.3 DELETE

Remove rows in a table in the relational schema. This function uses a standard SQL Syntax.

DELETE FROM <table_name> [ WHERE <expression> ]

B.5.1.4 DISCONNECT

Closes this connection.

DISCONNECT

Remarks

  • It is not required to call this command when using the JDBC interface: it is called automatically when the connection is closed.

  • After disconnecting, it is not possible to execute other queries with this connection.

B.5.1.5 DROP TABLE

Remove a table, the data and indexes from the relational schema.

DROP TABLE <table_name>

B.5.1.6 INSERT INTO

Insert one or more new rows of data into a table.

INSERT INTO <table_name> [ ( <column_name> [,...] ) ] 
    { VALUES (<expression> [,...]) | <SELECT Statement> }

B.5.1.7 ROLLBACK

Undo the changes made since the last COMMIT or ROLLBACK.

ROLLBACK

B.5.1.8 SELECT

Retrieves information from one or more tables in the schema.

SELECT [DISTINCT] { <select_expression> | <table_name>.* | * } [, ... ]
[  INTO <new_table> ]
   FROM <table_list>
[  WHERE <expression> ]
[  GROUP BY <expression> [, ...] ]
[  ORDER BY <order_expression> [, ...] ]
[  { UNION [ALL] | {MINUS|EXCEPT} | INTERSECT } <select_statement> ]
<table_list> ::=
     <table_name> [ { INNER | LEFT [OUTER] } JOIN <table_name> 
     ON <expression> ]  [, ...]
<select_expression> ::=
      { <expression> | COUNT(*) | {COUNT | MIN | MAX | SUM | AVG} 
       (<expression>) <column_alias>}
<order_expression> ::=
      { <column_number> | <column_alias> | <select_expression> } [ ASC | DESC ]

B.5.1.9 SET AUTOCOMMIT

Switches on or off the connection's auto-commit mode. If switched on, then all statements will be committed as individual transactions. Otherwise, the statements are grouped into transactions that are terminated by either COMMIT or ROLLBACK. By default, new connections are in auto-commit mode.

SET AUTOCOMMIT { TRUE | FALSE }

B.5.1.10 UPDATE

Modifies data of a table in the database.

UPDATE table SET column = <expression> [, ...] [WHERE <expression>]

B.5.1.11 Expressions, Condition and Values

<expression> ::=
     [NOT] <condition> [ { OR | AND } <condition> ]
<condition> ::=
     { <value> [ || <value> ]
      | <value> { = | < | <= | > | >= | <> | != | IS [NOT] } <value>
      | EXISTS(<select_statement>)
      | <value> BETWEEN <value> AND <value>
      | <value> [NOT] IN ( {<value> [, ...] | selectStatement } )
      | <value> [NOT] LIKE <value> [ESCAPE] value }
<value> ::=
      [ + | - ] { term [ { + | - | * | / } term ]
      | ( condition )
      | function ( [parameter] [,...] )
      | selectStatement_giving_one_value
<term> ::=
      { 'string' | number | floatingpoint | [table.]column | TRUE | FALSE | NULL }
<string> ::=
  • Starts and ends with a single '. In a string started with ' use '' to create a '.

  • LIKE uses '%' to match any (including 0) number of characters, and '_' to match exactly one character. To search for '%' itself, '\%' must be used, for '_' use '\_'; or any other escaping character may be set using the ESCAPE clause.

<name> ::=
  • A name starts with a letter and is followed by any number of letters or digits. Lowercase is changed to uppercase except for strings and quoted identifiers. Names are not case-sensitive.

  • Quoted identifiers can be used as names (for example for tables or columns). Quoted identifiers start and end with ". In a quoted identifier use "" to create a ". With quoted identifiers it is possible to create mixed case table and column names. Example: CREATE TABLE "Address" ("Nr" INTEGER, "Name" VARCHAR); SELECT * FROM "Address". Quoted identifiers are not strings.

<values> ::=
  • A 'date' value starts and ends with ', the format is yyyy-mm-dd (see java.sql.Date).

  • A 'time' value starts and ends with ', the format is hh:mm:ss (see java.sql.Time).

  • Binary data starts and ends with ', the format is hexadecimal. '0004ff' for example is 3 bytes, first 0, second 4 and last 255 (0xff).

B.5.2 SQL FUNCTIONS

Table B-4 lists the numerical functions.

Table B-4 Numerical Functions

Function Description

ABS(d)

returns the absolute value of a double value

ACOS(d)

returns the arc cosine of an angle

ASIN(d)

returns the arc sine of an angle

ATAN(d)

returns the arc tangent of an angle

ATAN2(a,b)

returns the tangent of a/b

CEILING(d)

returns the smallest integer that is not less than d

COS(d)

returns the cosine of an angle

COT(d)

returns the cotangent of an angle

DEGREES(d)

converts radians to degrees

EXP(d)

returns e (2.718...) raised to the power of d

FLOOR(d)

returns the largest integer that is not greater than d

LOG(d)

returns the natural logarithm (base e)

LOG10(d)

returns the logarithm (base 10)

MOD(a,b)

returns a modulo b

PI()

returns pi (3.1415...)

POWER(a,b)

returns a raised to the power of b

RADIANS(d)

converts degrees to radians

RAND()

returns a random number x bigger or equal to 0.0 and smaller than 1.0

ROUND(a,b)

rounds a to b digits after the decimal point

SIGN(d)

returns -1 if d is smaller than 0, 0 if d==0 and 1 if d is bigger than 0

SIN(d)

returns the sine of an angle

SQRT(d)

returns the square root

TAN(d)

returns the trigonometric tangent of an angle

TRUNCATE(a,b)

truncates a to b digits after the decimal point

BITAND(a,b)

return a & b

BITOR(a,b)

returns a | b


Table B-5 lists the string functions.

Table B-5 String Functions

Function Description

ASCII(s)

returns the ASCII code of the leftmost character of s

CHAR(c)

returns a character that has the ASCII code c

CONCAT(str1,str2)

returns str1 + str2

DIFFERENCE(s1,s2)

returns the difference between the sound of s1 and s2

INSERT(s,start,len,s2)

returns a string where len number of characters beginning at start has been replaced by s2

LCASE(s)

converts s to lower case

LEFT(s,count)

returns the leftmost count of characters of s

LENGTH(s)

returns the number of characters in s

LOCATE(search,s,[start])

returns the first index (1=left, 0=not found) where search is found in s, starting at start

LTRIM(s)

removes all leading blanks in s

REPEAT(s,count)

returns s repeated count times

REPLACE(s,replace,s2)

replaces all occurrences of replace in s with s2

RIGHT(s,count)

returns the rightmost count of characters of s

RTRIM(s)

removes all trailing blanks

SOUNDEX(s)

returns a four character code representing the sound of s

SPACE(count)

returns a string consisting of count spaces

SUBSTRING(s,start[,len])

returns the substring starting at start (1=left) with length len

UCASE(s)

converts s to upper case

LOWER(s)

converts s to lower case

UPPER(s)

converts s to upper case


Table B-6 lists the date/time functions.

Note that a date value starts and ends with a single quote ('), the format is yyyy-mm-dd (see java.sql.Date). A time value starts and ends with a single quote ('), the format is hh:mm:ss (see java.sql.Time).

Table B-6 Date/Time Functions

Function Description

CURDATE()

returns the current date

CURTIME()

returns the current time

DAYNAME(date)

returns the name of the day

DAYOFMONTH(date)

returns the day of the month (1-31)

DAYOFWEEK(date)

returns the day of the week (1 means Sunday)

DAYOFYEAR(date)

returns the day of the year (1-366)

HOUR(time)

return the hour (0-23)

MINUTE(time)

returns the minute (0-59)

MONTH(date)

returns the month (1-12)

MONTHNAME(date)

returns the name of the month

NOW()

returns the current date and time as a timestamp

QUARTER(date)

returns the quarter (1-4)

SECOND(time)

returns the second (0-59)

WEEK(date)

returns the week of this year (1-53)

YEAR(date)

returns the year


Table B-7 lists the system functions.

Table B-7 System Functions

Function Description

IFNULL(exp,value)

if exp is null, value is returned else exp

CASEWHEN(exp,v2,v2)

if exp is true, v1 is returned, else v2

CONVERT(term,type)

converts exp to another data type

CAST(term AS type)

converts exp to another data type


B.6 JDBC API Implemented Features

Table B-8 lists the JDBC API features that are implemented in the Oracle Data Integrator Driver for XML:

Table B-8 JDBC API Features

Feature Groups JDBC Version Support

Batch Update

2.0 Core

Yes

Blob/Clob

2.0 Core

Yes

JNDI DataSources

2.0 Optional

Yes

Failover support

-

Yes

Transaction SavePoints

3.0

Yes

Unicode support

-

No

Distributed Transaction

2.0 Optional

No

Connection Pooling

2.0 Optional

No

Cluster support

-

No


Table B-9 lists JDBC Java classes.

Table B-9 JDBC Java Classes

JDBC Class JDBC Version Support

Array

2.0 Core

No

Blob

2.0 Core

Yes

CallableStatement

1.0

Yes

Clob

2.0 Core

Yes

Connection

1.0

Yes

ConnectionPoolDataSource

2.0 Optional

No

DatabaseMetaData

1.0

Yes

DataSource

2.0 Optional

No

Driver

1.0

Yes

Ref

2.0 Core

No

ResultSet

1.0

Yes

ResultSetMetaData

1.0

Yes

RowSet

2.0 Optional

No

Statement

1.0

Yes

Struct

2.0 Core

No

PreparedStatement

1.0

Yes

XAConnection

2.0 Optional

No

XADataSource

2.0 Optional

No


B.7 Rich Metadata

When creating RDB structures based on XML schema, there must be flexibility to supply the driver with metadata. For example, in situations where RDB table/column names can conflict if element/attributes have same local names.

The ODI XML driver attaches an attribute in the x2r namespace (http://www.oracle.com/odi/xml-mapping) to the elements/attribute namely: x2r:tableName/x2r:columnName. If conflicting names do not have the metadata attribute, then they are appended with an incrementing number until a non-conflicting table/column name is obtained.

The new object model maintains a map between xpath and table/table.column names for each element/attribute.

If two elements with same name and same type exist in two different locations, same table is used for storing the data but FK reference to parent element is used to differentiate the data. The new implementation creates new tables. Table B-10 lists the table attributes.

Table B-10 Table Attributes

Attribute Type Description

x2r:tableName

String

To be attached to elements that resolve to RDB tables/attributes that are lists or enumerations whose local names match.

x2r:columnName

String

To be attached to attributes whose local names match or for elements that map to columns, but whose local names match with each other or with an attribute of the containing type.

x2r:columnDataType

String

Lets you provide the datatype information as a string from a mapping table that we will provide.

May only be attached to elements that the driver will map to columns or to attributes. If this parameter is provided user must also supply x2r:columnLength and/or x2r:columnPrecision as required for the datatype.

x2r:columnLength

integer

Length of the column.

By default the values hard-coded in the driver are used. VARCHAR and NUMERIC have global override option in JDBC URL. This attribute, if provided, overrides both the default value and the global overrride.

May only be attached to elements that the driver will map to columns or to attributes.

x2r:columnPrecision

integer

Precision of the column. Used by driver only for those datatypes that allow it. Same logic as for columnLength is used when determining the value to be applied.

May only be attached to elements that the driver will map to columns or to attributes.


The following sample is an example of an XSD enriched with metadata.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:x2r="http://www.oracle.com/odi/xml-mapping/">
 <xs:element name="root">
  <xs:complexType>
   <xs:sequence>
     <!--  Example for redefining table name  -->
    <xs:element name="person" maxOccurs="unbounded" x2r:tableName="CUSTOMER">
     <xs:complexType>
      <xs:sequence>
       <!--  Example for redefining column name  -->
       <xs:element name="given" type="xs:string" x2r:columnName="FIRST"/>
       <xs:element name="last" type="xs:string"/>
       <!--  Example for redefining column length  -->
       <xs:element name="address" type="xs:string" x2r:columnLength="400"/>
       <!--  Example for redefining column type  -->
       <xs:element name="notes" type="xs:string" x2r:columnDataType="CLOB"/>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

B.7.1 Supported user-specified types for different databases

Table B-11 provides the details of the supported user-specified types for different databases. Using any other type name will raise exception.

Table B-11 Supported user-specified types for databases

Type HSQL Oracle MySQL MS SQL Server

SMALLINT

X

 

X

X

INTEGER

X

 

X

 

REAL

X

   

X

NUMERIC

X

 

X

 

NUMBER

 

X

   

FLOAT

X

 

X

X

DOUBLE

X

 

X

 

DECIMAL

X

 

X

 

CHAR

X

X

X

X

NCHAR

 

X

X

X

VARCHAR

X

X

X

X

VARCHAR2

 

X

   

NVARCHAR2

 

X

   

BLOB

X

X

X

 

CLOB

X

X

   

NCLOB

 

X

   

TEXT

   

X

X

DATE

X

X

X

 

TIME

X

X

X

 

TIMESTAMP

X

X

X

X


B.8 XML Schema Supported Features

The driver supports part of the XML Schema (XSD) specification. Supported elements are listed in this section.

For more information on the XML Schema specification, see the W3C specification at http://www.w3.org/TR/xmlschema-1/.

This section contains the following topics:

B.8.1 Datatypes

The following datatypes are supported:

  • These datatypes are converted to String columns: string, normalizedString, token, nmtoken, nmtokens, anyUri, id, idref, date, datetime, time, hexBinary

  • These datatypes are converted to Integer columns: int, positiveInteger, negativeInteger, nonNegativeInteger, onPositiveInteger, long, unsignedLong, unsignedInt, short, unsignedShort, byte, unsignedByte, boolean (Boolean are converted to a numeric column with 0 or 1, but they can take "true" or "false" values from the input files)

  • These datatypes are converted to Decimal (with 2 decimal places) columns: decimal, float, double

B.8.2 Supported Elements

This section lists all schema elements. Supported syntax elements are shown in bold. Unsupported syntax elements are shown in regular font. They are ignored by the driver.

This section details the following schema elements:

Note:

XML files generated or updated using the XML driver should ideally be validated against their corresponding XSD files using the VALIDATE command after generation.

B.8.2.1 All

This element specifies that child elements can appear in any order and that each child element can occur zero or one time.

Note that child elements mandatory properties (minOccurs=1) are not managed by the driver. This should be handled by checks on the data, and by validating the XML contents against the XSD.

<all
    id=ID
    maxOccurs=1
    minOccurs=0|1
    any attributes
>
(annotation?,element*)
</all>

B.8.2.2 Any

This element enables you to extend the XML document with elements not specified by the schema.

<any
    id=ID
    maxOccurs=(nonNegativeInteger|unbounded):1
    minOccurs=nonNegativeInteger:1
    namespace=((##any|##other)|List of (anyURI|(##targetNamespace|##local))):##any
    processContents=(lax|skip|strict):strict
    any attributes
>
(annotation?)
</any>

B.8.2.3 AnyAttribute

This element enables you to extend the XML document with attributes not specified by the schema.

<anyAttribute
    id=ID
    namespace=((##any|##other)|List of (anyURI|(##targetNamespace|##local))):##any
    processContents=(lax|skip|strict):strict
    any attributes
>
(annotation?)
</anyAttribute>

B.8.2.4 AnyType

This XML Schema type is the root type for all XML Schema types.

<xsd:element name="something" type="xsd:anyType"/>

B.8.2.5 Attribute

This element defines an attribute.

<attribute
    default=string
    id=ID
    name=NCName
    type=QName
    use=optional|prohibited|required
    ref=QName
    fixed=string
    form=qualified|unqualified
    any attributes
>
(annotation?,(simpleType?))
</attribute>

Note that the use attribute of this element defines the column mapped by the driver for the attribute as mandatory or not.

B.8.2.6 AttributeGroup

This element defines a set of attributes.

<attributeGroup
    id=ID
    name=NCName
    ref=QName
    any attributes
>
(annotation?),((attribute|attributeGroup)*,anyAttribute?))
</attributeGroup>

B.8.2.7 Choice

This element allows one and only of the elements to be present within the containing element.

<choice
    id=ID
    maxOccurs=nonNegativeInteger|unbounded
    minOccurs=nonNegativeInteger
    any attributes
>
(annotation?,(element|group|choice|sequence|any)*)
</choice>

Note that the child element's unique nature are not managed by the driver. This should be handled by checks on the data, and by validating the XML contents against the XSD.

B.8.2.8 ComplexContent

This element defines extensions or restrictions on a complex type.

<complexContent
    id=ID
    mixed=true|false
    any attributes
>
(annotation?,(restriction|extension))
</complexContent>

B.8.2.9 ComplexType

This element defines a complex type.

<complexType
    name=NCName
    id=ID
    abstract=true|false
    mixed=true|false
    block=(#all|list of (extension|restriction))
    final=(#all|list of (extension|restriction))
    any attributes
>
(annotation?,(simpleContent|complexContent|((group|all|choice|sequence)?,((attribute|attributeGroup)*,anyAttribute?))))
</complexType>

B.8.2.10 Element

This element defines an element of the XML file.

<element
    name=NCName
    maxOccurs=nonNegativeInteger|unbounded
    minOccurs=nonNegativeInteger
    type=QName
    id=ID
    ref=QName
    substitutionGroup=QName
    default=string
    fixed=string
    form=qualified|unqualified
    nillable=true|false
    abstract=true|false
    block=(#all|list of (extension|restriction))
    final=(#all|list of (extension|restriction))
    any attributes
>
annotation?,((simpleType|complexType)?,(unique|key|keyref)*))
</element>

Note:

The maxOccurs and minOccurs attributes of the element are used in the XML-to-SQL mapping. If a child element is of a simple type and is monovalued (one occurrence only), then this element is mapped to a simple column in the table corresponding to its parent element. Otherwise, a table linked to the parent element's table is created.

Note that if no reference to either minOccurs or maxOccurs is mentioned in an element then the element is consider as monovalued and is transformed to a column. This behavior can be changed using the useImplicitMaxValue URL property. When this property is set to yes, an elements for which maxOccurs is not specified in the XSD is considered as multivalued (maxOccurs ="unbounded").

Note:

Using different sub-elements with the same name but with different types is not supported by XML driver. An XSD with such a structure will not be processed correctly.

B.8.2.11 Extension

This element extends an existing simpleType or complexType element

<extension
    id=ID
    base=QName
    any attributes
>
(annotation?,((group|all|choice|sequence)?,((attribute|attributeGroup)*,anyAttribute?)))
</extension>

B.8.2.12 Group

The group element is used to define a group of elements to be used in complex type definitions.

<group
    id=ID
    name=NCName
    ref=QName
    maxOccurs=nonNegativeInteger|unbounded
    minOccurs=nonNegativeInteger
    any attributes
>
(annotation?,(all|choice|sequence)?)
</group>

B.8.2.13 Import

This element is used to add multiple schemas with different target namespace to a document.

<import
    id=ID
    namespace=anyURI
    schemaLocation=anyURI
    any attributes
>
(annotation?)
</import>

B.8.2.14 Include

This element is used to add multiple schemas with the same target namespace to a document.

<include
    id=ID
    schemaLocation=anyURI
    any attributes
>
(annotation?)
</include>

B.8.2.15 List

This element defines a simple type element as a list of values of a specified data type.

<list
    id=ID
    itemType=QName
    any attributes
>
(annotation?,(simpleType?))
</list>

B.8.2.16 Restriction

This element defines restrictions on a simpleType, simpleContent, or a complexContent.

<restriction
     id=ID
     base=QName
     any attributes
>
Content for simpleType:
(annotation?,(simpleType?,(minExclusive|minInclusive|maxExclusive|maxInclusive|
totalDigits|fractionDigits|length|minLength|maxLength|enumeration|whiteSpace|
pattern)*))
Content for simpleContent:
(annotation?,(simpleType?,(minExclusive|minInclusive|maxExclusive|maxInclusive|
totalDigits|fractionDigits|length|minLength|maxLength|enumeration|whiteSpace|
pattern)*)?, ((attribute|attributeGroup)*,anyAttribute?))
Content for complexContent:
(annotation?,(group|all|choice|sequence)?, ((attribute|attributeGroup)*,anyAttribute?))
</restriction>

B.8.2.17 Schema

This element defines the root element of a schema.

<schema
     id=ID
     attributeFormDefault=qualified|unqualified
     elementFormDefault=qualified|unqualified
     blockDefault=(#all|list of (extension|restriction|substitution))
     finalDefault=(#all|list of (extension|restriction|list|union))
     targetNamespace=anyURI
     version=token
     xmlns=anyURI
     any attributes
>
((include|import|redefine|annotation)*,(((simpleType|complexType|group|
attributeGroup)|element|attribute|notation),annotation*)*)
</schema>

B.8.2.18 Sequence

This element specifies that the child elements must appear in a sequence. Each child element can occur 0 or more times.

<sequence
     id=ID
     maxOccurs=nonNegativeInteger|unbounded
     minOccurs=nonNegativeInteger
     any attributes
>
(annotation?,(element|group|choice|sequence|any)*)
</sequence>

Note the following:

  • The Sequence order is not managed by the driver. The sequence order should be handled by loading the xxx_ORDER column generated by the driver.

  • The maxOccurs and minOccurs attributes are not managed by the driver. This should be handled by checks on the data, and by validating the XML contents against the XSD.

B.8.2.19 SimpleContent

This element contains extensions or restrictions on a text-only complex type or on a simple type as content.

<simpleContent
     id=ID
     any attributes
>
(annotation?,(restriction|extension))
</simpleContent>

B.8.2.20 SimpleType

This element defines a simple type element.

<simpleType
     name=NCName
     id=ID
     any attributes
>
(annotation?,(restriction|list|union))
</simpleType>

B.8.3 Unsupported Features

The following elements and features are not supported or implemented by the XML driver.

B.8.3.1 Unsupported Elements

The following schema elements are not supported by the XML driver.

  • Key/keyRef/Unique: These elements allow the definition of constraints in the schema. These elements and their child elements (selector, field) are ignored.

  • Redefine: The redefine element redefines simple and complex types, groups, and attribute groups from an external schema. This element is not supported.

In v3 mode an error is raised, if any unsupported XSD element is encountered.

WARNING:

Elements and attributes allowed in an XML file due to an Any or AnyAttribute clause in the XSD may cause errors when the file is loaded.

B.8.3.2 Unsupported Features

Multipass parsing is supported in v3 mode. The other modes do not support multipass parsing.

B.8.3.3 Unsupported Datatypes

The following datatypes are not supported:

  • gYear

  • gYearMonth

  • gMonth

  • gMonthDay

  • gDay

  • language

  • ENTITY

  • ENTITIES

  • NOTATION

  • IDREFS