This chapter provides an overview of how Oracle Data Service Integrator works with relational data, and describes what happens when a relational data source is imported into Oracle Data Service Integrator.
The chapter also explains how SQL data types are mapped to XQuery data types and describes what happens during runtime after deploying a data-service-enabled application. The chapter further explains how queries are handled and describes the kind of performance you can expect.
This chapter covers the following topics:
Note that while the graphical-user interface tools handle many of the details, SQL developers and application-performance tuning experts should understand how Oracle Data Service Integrator works with relational data so that they can:
Note: | For simplicity’s sake, this chapter refers to the XQuery engine throughout when in fact some of the specific functionality is handled by other, ancillary sub-systems (for example, the Data Source API or other system components depicted in the “Oracle Data Service Integrator Components Architecture” figure in the Concepts Guide). |
At the core of Oracle Data Service Integrator is the data processing engine, often referred to as simply the XQuery engine—the robust, enterprise-class implementation of the XQuery language based on the standards listed in “Supported XQuery Specifications” on page 1-3, with additional enhancements as detailed in “Oracle’s XQuery Implementation” on page 2-1.
In addition to compliance with XQuery and XML recommendations, Oracle Data Service Integrator XQuery engine also complies with the ANSI/ISO standard that bridges the SQL and XML worlds (the “SQL/XML (ISO-ANSI Working Draft) XML-Related Specifications” WD 9075-14 (SQL/XML), August, 2002). As a Java application (J2EE server application), Oracle Data Service Integrator uses JDBC to generate SQL queries and submit them to the appropriate RDBMSs that comprise a data service, which means Oracle Data Service Integrator must accommodate differences in both SQL and JDBC, as follows:
Given these factors, Oracle Data Service Integrator provides two different levels of SQL support for relational database management systems (RDBMS): base support and core support, as defined in the next section.
Oracle Data Service Integrator provides two different levels of support for relational data sources:
Oracle Data Service Integrator supports SQL (relational) data sources throughout the life-cycle of a data services project, from metadata import, through query plan optimization, through runtime execution of queries and delivery of data to an end-user (or other) application. Specifically, the XQuery engine provides:
For each of the tables and views whose metadata is imported into Oracle Data Service Integrator (using Import Source Metadata feature of the GUI), two files are generated:
At runtime, the XQuery engine:
Database connections (connection pools) are registered in the JNDI (Java naming and directory interface) tree of the WebLogic Server (an administrator with privileges on the server can configure connection pool, data source, and JNDI name by which connection pools are accessible).
When sub-plan execution completes, connections are typically not released back to the WebLogic Server. The XQuery engine holds the connection for the duration of the entire XQuery — not just the duration of the SQL — enabling subsequent queries to the same relational data source to be executed using an already obtained connection (which also improves performance). Whether the XQuery engine can share connections or not depends on the underlying data source and JDBC driver (see Table 3-2).
If the data source RDBMS or JDBC driver does not support connection sharing, and if the Oracle Data Service Integrator has opened multiple connections to the same data source, the XQuery engine keeps the initial connection to a data source open during XQuery execution but releases any subsequent connections to the same data source once the SQL result is received in its entirety by the XQuery engine. The initial connection will be re-used subsequent SQL queries when the connection becomes available.
XQuery-SQL data type mappings are specific to the RDBMS version and the JDBC driver, as discussed in Base and Core RDBMS Support. The specific data type mappings for each core RDBMS and the general mappings for any base RDBMS are detailed in the “XQuery-SQL Mapping Reference.” However, XQuery and SQL differ in some respects that may affect XQuery-to-SQL translation; these differences apply to all RDBMSs:
The XQuery language defines richer data types than SQL for handling date and time information (temporal data). These data types provide more information (timezone data, for instance) or greater degree of precision (unlimited number of fractional seconds as part of a time or date, for example). The three built-in XQuery data types for data and time information are:
Minimally, every RDBMS has a single datatype that conveys both date and time data. This datatype maps to XQuery’s xs:dateTime data type. Some RDBMSs offer additional SQL data types for storing date and time data separately (see Table 3-3)
(Of all the RDBMSs supported by Oracle Data Service Integrator, only Oracle 9.x (and higher) offers data types with timezone data (TIMESTAMP WITH TIMEZONE, TIMESTAMP WITH LOCAL TIMEZONE).
DATE3
|
|||
1Supports fractional seconds up to 3 digits (miliseconds). 2Accuracy of 1 minute. 3Provides both date and time data, but supports neither fractional seconds nor timezone data (fractional-second data is truncated). |
Oracle Data Service Integrator XQuery engine maps all SQL date and time data types to XQuery data types (for example, during metadata import of a new data source) without loss of data or precision.
However, the converse is not true: depending on the specific RDBMS (and JDBC driver) for a specific data source, the XQuery engine may need to perform additional processing to minimize data loss and to handle the timezone information when mapping XQuery temporal data types to SQL.
When a query is being pushed down to an RDBMS that does not support timezone data, the Oracle Data Service Integrator XQuery engine converts date and time data into the local time of the underlying application server and removes the timezone information. The conversion occurs each time a date or time value that includes timezone data is sent to the data source, as follows:
The XQuery language supports unlimited precision for fractional seconds, while the Oracle Data Service Integrator XQuery engine supports up to 7 digits only (for fractional seconds). However, depending on the specific RDBMS, fractional second support may be far less than 7 digits. Or there may be no fractional second support at all (Oracle 8.1.x, for example). In translating from XQuery to SQL, Oracle Data Service Integrator truncates fractional seconds to the precision supported by that RDBMS.
For example, since Microsoft’s DATETIME data type supports up to 3 digits (milliseconds) for fractional time precision, when Oracle Data Service Integrator sends a datetime value to Microsoft SQL Server 2000, the value is first converted into the local time zone and then any fractional seconds are converted to the 3-digit-milliseconds allowed.
If fractional-second-precision is required (but the data source does not support it appropriately), use the fn-bea:fence()
function to disable pushdown of date and time data types and operations, so that the XQuery engine processes the time- and date-related queries. (See
Preventing SQL Pushdown on page 3-33 for more information.)
See “XQuery-SQL Mapping Reference” for more information about time and date data types for core and base RDBMS.
The XQuery language is less restrictive than the SQL language in terms of the scope of expressions and data types. For example, for most all RDBMSs, an SQL query that returns a boolean can only be used inside a WHERE clause. XQuery does not have such restrictions, and as a result, in some cases, valid XQuery expressions cannot be pushed down. Expressions and data types that cannot be pushed include:
Oracle Data Service Integrator achieves optimal performance for queries by performing SQL pushdown. Pushdown is an optimization technique that offloads processing from the XQuery engine by sending native SQL queries to the data source so that minimal result sets necessary to answer the query get processed by the XQuery engine.
SQL pushdown reduces the amount of data transported and processed by Oracle Data Service Integrator XQuery processing engine. This technique dramatically improves overall performance, especially when joining tables.
For example, a JOIN operation on two tables can be done by the underlying RDBMS, returning only the final result, rather than delivering all the data to the XQuery engine for processing the JOIN condition. Sorting criteria are also handled by the data source, eliminating the need to re-sort the data inside the XQuery engine.
For all core RDBMSs, the XQuery engine identifies the XQuery constructs and operations that can be translated into equivalent SQL operations. These include:
Not all queries can (or should) get pushed down. The XQuery engine does not pushdown:
The remainder of this section covers SQL pushdown in more detail, providing syntax samples based on the table structures shown in Figure 3-4. (For ease of reading, namespace references are not shown in the example queries.) In some cases, the query may not get pushed down as SQL, but the fragments of the query — names of columns, for example — may get pushed to the project list.
XQuery functions and operators are translated into SQL only when:
Table 3-5 shows an XQuery statement and its corresponding “pushdown” or SQL translation. (Oracle syntax is used.)
If some arguments to a function or operator are not directly pushable, but can be replaced with parameters, the XQuery engine will replace the arguments with parameters and pushdown the SQL. For example, since the XQuery’s string-join() function has no explicit SQL equivalent, it is replaced with a parameter (see Table 3-6).
Oracle Data Service Integrator translates XQuery 1.0 and XPath 2.0 aggregate functions into corresponding SQL aggregate functions (Table 3-7).
Note that the distinct-values() XQuery aggregate function in conjunction with the fn:count() function is further translated into an SQL COUNT(DISTINCT...) operation, as shown in Table 3-8. See Grouping and Aggregation for some examples of how aggregate functions in conjunction with other expressions affect the outcome of SQL pushdown.
The Oracle Data Service Integrator XQuery engine generates parameters from variables, functions, operators, and cast operations as needed for use by the SQL engine. If all arguments to a function are parameters, the entire function gets pushed as a parameter.
The functions that can be pushed down depend on the database. See the “XQuery-SQL Mapping Reference” on page 7-1 for details.
As with functions and operators, support for cast operation pushdown is RDBMS-specific, although cast pushdown is available only for core (not base) RDBMSs. The XQuery engine can pushdown cast operations if the data source RDBMS:
Table 3-8 shows an example of how a cast in XQuery would get pushed down to a Microsoft SQL Server 2000 data source.
The XQuery engine maps table columns to XML elements that are children of the corresponding row elements. Simple XQuery path expressions are recognized by the XQuery engine as column accessors. For example, $c/ZIP_CODE and $c/LAST_NAME (see Table 3-10) provide access to ZIP_CODE and LAST_NAME columns.
The Oracle Data Service Integrator XQuery engine translates XQuery constants into SQL constants only if the data source has an equivalent SQL data type. Table 3-9 shows an example of a constant used in a FLWOR expression and how that constant gets translated in the SQL statement.
Both external and internal variables in XQuery expressions can be translated into SQL parameters (in generated SQL statements) when the variable’s datatype is supported by the XQuery engine and:
Table 3-10 shows an example of variable pushdown.
For each relational data source, the precise set of expressions pushed down depends on the capabilities of the underlying RDBMS; for details, see XQuery Engine and SQL.
Each of the example XQueries shown in Table 3-11 returns elements containing values of LAST_NAME columns from a CUSTOMER table. In all cases, the SQL statement generated by the XQuery engine is the same (see Table 3-11).
The difference between the first two queries and the last two queries is that the fn:data() function is used in the query to limit the results to values only. Without the fn:data() function, the result is a list of <LAST_NAME> elements containing corresponding column values. If a column value is NULL, the element is skipped. With the fn:data() function, the result is the actual values.
An XQuery where clause is usually translated into an SQL WHERE clause. An XQuery where clause gets pushed down as SQL when:
Table 3-12 shows an example of a where clause pushdown.
However, note that if the WHERE clause follows a group by clause, the WHERE clause is translated into a HAVING clause. See Group-By with a Nested Where Clause Translates to SQL HAVING Clause).
An XQuery order by expression comprises:
The XQuery engine can pushdown SQL for ordering expressions, including properties, only when the ordering expression:
Table 3-13 shows an example of an order by clause pushdown.
Table 3-14 shows an example of the SQL pushdown that occurs when ordering by a NULLable column (ADDRESS2) in the XQuery clause when the RDBMS supports dynamic setting of NULL order.
If the data source RDBMS does not support the required empty (NULL) order, the order by will not be pushed down.
As another optimization, the Oracle Data Service Integrator XQuery engine can insert order by clauses into generated SQL statements—even when the original XQuery statement does not include them—to offload expensive sorting operations to the RDBMS. They are automatically inserted by the XQuery optimizer prior to execution. You can see these as well in the Query Plan View.
Joining data from multiple sources is a very common data integration task. In SQL terms, an inner join relates each row in one table (or view) to one or more corresponding rows in another table or view. In XQuery, an inner join is expressed as a FLWR expression comprising several for clauses that iterate over the data sources, where clauses that specify the join predicates, and a return clause returning data values.
If two relational sources are located in the same database, the inner join can sometimes be pushed down as a single SQL statement using either SQL-92 or SQL-89 syntax, depending on the RDBMS of the data source.
An inner join can be pushed down when:
Although the example in Figure 3-15 shows a simple inner join between two branches, the XQuery engine also supports n-way joins, with each branch comprising a different for statement. See also Table 3-16.
The XQuery engine interprets nested FLWR expressions (see Figure 3-17) as an outer join and can generate SQL for a data source when:
The SQL code generated by the XQuery engine depends on the SQL dialect supported by the source database (see “XQuery-SQL Mapping Reference” for details). Table 3-18 shows example SQL-92 and proprietary syntax for the query shown in Figure 3-17.
Variations of the outer-join pattern are obtained from the original query by using equivalent XQuery expressions. Figure 3-19 is an example of a query equivalent to that shown in Figure 3-17 that will also result in a SQL statement with an outer join.
A semi-join returns data from a single branch of the join condition, when the join condition is satisfied. An anti-semi-join returns data from a single branch when the join condition is false. Although the XQuery language does not have specific constructs for semi-joins and anti-semi-joins, the XQuery engine translates several specific FLWR patterns into SQL semi-join or anti-semi-join patterns, assuming that:
The XQuery interprets a FLWR query containing an inner existential quantified expression as a semi-join, translating the expression into an SQL query with the EXISTS check in the WHERE clause.
Universal quantified expressions are also supported, but their SQL generation is slightly more complicated. The XQuery engine translates FLWRs with exist() or empty() predicates in the where clause into semi-joins. Table 3-20 shows several examples of such patterns.
The XQuery engine supports several patterns for group by pushdown and aggregate function pushdown.
The Group By clause is a Oracle extension to the XQuery language (see “Generalized FLWGOR (group by)” on page 2-52 for more information). The XQuery engine implicitly adds a group by expression to some patterns to enable more efficient pushdown and query execution.
The XQuery engine translates group-by clauses into equivalent SQL GROUP BY clauses if:
Since the query shown in Figure 3-21 meets these requirements, the following SQL statement is generated:
SELECT t1."CATEGORY" AS c1, COUNT(*) AS c2
FROM "PRODUCT" t1
GROUP BY t1."CATEGORY"
The group-by pushdown is closely related to the Distinct-by Pushdown. When a group-by clause does not include a partition variable, the XQuery engine generates SQL that includes the DISTINCT keyword, as described in the next section.
An XQuery containing a Group By clause (without a partition definition), can be generated into SQL query that uses SQL’s DISTINCT keyword to eliminate duplicates in the result. For example, the XQuery statement in Table 3-22 uses a group-by clause but has no partition defined, and the SQL statement created by Oracle Data Service Integrator refines the result by using the DISTINCT keyword.
An aggregate function operating on a single column from a data source is one of the simplest aggregate patterns that the XQuery engine supports, although it does so in a slightly non-intuitive way. It uses a constant as a single grouping expression (...GROUP ...BY n). The XQuery engine can pushdown the SQL if the RDBMS supports either a GROUP BY operation on a constant or supports sub-queries in the sub-clause (see Table 3-23).
SQL Statement1
|
SQL Statement2
|
|
1RDBMS supports GROUP BY constant 2RDBMS does not support GROUP BY, but does support sub-queries in the FROM clause |
If a relational data source supports nested WHERE clauses, the XQuery engine can translate a where clause after a group-by clause into a SQL HAVING clause (see Table 3-24), provided that the where clause meets other requirements for XQuery-SQL translation.
Another common pattern supported by the XQuery engine is outer join with aggregation of the right branch, which is expressed in XQuery as nested FLWR expressions with aggregate functions in the inner level (Table 3-25).
With this type of query, in order to fully push as much of the query as possible to the data source RDBMS, the XQuery engine evaluates the outer join first and then performs the group-by on the left branch’s primary key column, to compute the aggregate. The XQuery engine can perform this optimization only if the left branch of the query has a key column. As shown in Table 3-26, the CUSTOMER does, so the optimization will be performed.
The net effect is that only the XML creation is performed in the XQuery engine.
The CASE expression, introduced in SQL:1992, provides a way to use if-then-else logic in SQL statements without having to invoke procedures. The CASE expression correlates a list of values and alternatives.
An XQuery if-then-else pattern can be translated into an SQL CASE expression if:
The then and else expressions can contain (or fully consist of) parameters. If the if-then-else expression does not depend on the data source, the entire expression is pushed as a parameter.
An example can be seen in Table 3-26.
In the typical RDBMS application, it is quite common to paginate the results — output just 20 customer records per page, for example, for printing or other purposes. XQuery meets this need with its subsequence( ) function. XQuery provides two different subsequence functions, shown in Table 3-27 and in Table 3-28.
The two-argument variant returns the remaining items of an input sequence, starting from the $startingLoc. The three-argument variant returns $length items of the input sequence starting from the $startingLoc. Table 3-29 shows several different examples of the subsequence function in the context of specific queries.
An XQuery subsequence pattern can be translated into an SQL subsequence expression if:
Oracle Data Service Integrator can pushdown the subsequence pattern to the underlying RDBMS, thereby enhancing performance, as long as the underlying RDBMS supports it.
Note: | Subsequence pushdown is not supported for PointBase, Sybase, or any base RDBMS (see “XQuery-SQL Mapping Reference” on page 7-1 for other core and base RDBMS information.) |
Oracle Data Service Integrator lets you create data services not only from relational tables and views, but also from SQL queries. These direct SQL data services, as they are called, can also be composed by the XQuery engine, and pushed down as native SQL to the target RDBMS, if:
If the RDBMS does not support sub-queries (the FROM clause), the pushdown will not occur.
For example, a user-defined SQL query, “recent_order” is configured as a relational source:
SELECT * from RECENT_ORDER
The XQuery that gets created in the data service and the resulting generated SQL that gets pushed down by the XQuery engine are shown in Table 3-30.
SQL pushdown on top of direct SQL is not limited to simple select-project queries. Any operation for which pushdown is supported for table and view sources is also supported for data services created for direct SQL queries. For example, Table 3-31 shows a join query and its generated result.
Oracle Data Service Integrator uses SQL pushdown to off-load query processing to the underlying data source RDBMS whenever possible. However, as mentioned in How the XQuery Engine Supports SQL Data Sources, SQL pushdown is not always possible, nor beneficial. For example, when two data sources are running on two different systems, or when a query combines relational data with non-relational data, SQL pushdown may not provide any performance benefit.
In cases such as these, Oracle Data Service Integrator uses special techniques to batch-process the outside portion of a query (the left branch) and send a cluster (or chunk) of data to the right branch as parameters (see Table 3-32).
The XQuery engine chooses this optimization technique (a “clustered parameter passing join,” also known as PPK) for a distributed query when:
Unless all these conditions are met, the XQuery engine cannot use this optimization technique but will instead use the single parameter join instead (PP1 join).
Developers can exercise control over SQL pushdown by using the fn-bea:fence()
function (an Oracle extension to XQuery functions and operations) to demarcate sections of XQuery code that the XQuery engine should ignore when it is evaluating query fragments for SQL pushdown.
For the example shown in Table 3-33, even though the upper-case function could be pushed down to the RDBMS, its pushdown is blocked by the fence() function and the upper-case function will be executed by the XQuery engine. Only the fragment comprising the lower-case function is included in the query plan as SQL pushdown. The result of the SQL will be returned to the XQuery engine, which will use the XQuery upper-case function on the result.
Use the fence() function whenever you want SQL to be sent as is, to the RDBMS. For example, if you are accessing an Oracle 8.5.x RDBMS that uses hints and Oracle’s rule-based optimizer, you should send the hinted SQL queries to the data source by wrapping them in the fence() function.
To circumvent SQL pushdown for specific clauses, extract those clauses into separate FLWOR expressions with the fence( ) function at the top of the clause, as shown here:
for $x in
fn-bea:fence
(
for $c in CUSTOMER()
return $c/LAST_NAME
)
order by $x
return $x
As you develop data services that use relational data sources, use the Oracle Data Service Integrator Query Plan View to see the results of using the fence( ) function (Figure 3-34). In this example, the order by clause will be executed by the XQuery engine rather than pushed down as SQL.
Note that the red triangles displayed in the SQL portions of Figure 3-34 are alerts calling attention to the fact that a where clause is missing from the XQuery statement.