This chapter offers a series of best practices for creating data services using XQuery. The chapter introduces a data service design model, and describes a conceptual model for layering data services to maximize management, maintainability, and reusability.
This chapter includes the following topics:
When designing data services, you should strive to maximize the ability to maintain, manage, and reuse queries. One approach is to adopt a layered design model that partitions services into the following levels:
Application Services. Data services at the Application Services level are defined by client application requirements. Functions defined in this layer can additionally be used to constraint queries and to aggregate data, among other tasks.
Logical Services. The Logical Services contain functions that perform general purpose logical operations and transformations on data accessed through Canonical and Physical Services.
Canonical Services. Data services defined at the Canonical Services level normalize data obtained from the Physical Services level.
Physical Services. The Physical Services are defined by the system based on introspection of physical data sources. The system creates data service functions that retrieve all rows in a table, offering the greatest flexibility for data service functions defined in higher layers. The system also defines relationships between data services, as required.
Figure 5-1 illustrates the data service design model.
Using this design model, you can design and develop data services in the following manner:
Develop the Physical Services based on introspection of physical data sources.
Define the Application Services based on precise client application requirements.
Design the Canonical Services to normalize and create relationships between data accessed using the Physical Services.
Design the Logical Services to manipulate and transform data accessed through the Canonical and Physical Services, providing general purpose reusable services to the Application Services layer.
Work through the layers from the top down, determining optimal functions for each level and factoring out reusable queries.
This section describes best practices for designing and developing services at each layer of the data service design model. Table 5-1 describes the data service design principles.
Table 5-1 Data Service Design Principles
Level | Design Principle | Description |
---|---|---|
Application Services |
Base design on client needs |
Design data services and queries at the Application Services level specifically tuned to client needs, using functions defined at the Logical and Canonical Service levels. |
Nest or relate information, as required by the application |
Use the XML practice of nesting related information in a single XML structure. Alternatively, use navigation functions to relate associated information, as required by the application. |
|
Introduce constraints at the highest level |
Oracle Data Service Integrator propagates constraints down function levels when generating queries. By keeping constraints, such as function parameters, at the highest level, you encourage reuse of lower level functions and permit the system to efficiently optimize the final generated query. |
|
Aggregate data at the highest level |
Aggregate data in functions at the highest level possible, preferably at the Application Services level. |
|
Logical Services |
Create common functions to serve multiple applications |
Design functions that provide common services required by applications. Base function design at the Logical Services level on requirements already established at the Application Services level, based on client needs. |
Refactor to reduce the number of functions |
Refactor the functions, as necessary, to reduce the overall number of functions to as few as possible. This reduces complexity, simplifies documentation, and eases future maintenance. |
|
Canonical Services |
Use function defined in the Physical Services level |
Create (public) read functions can then all be expressed in terms of the main "get all instances" function. |
Canonical Services |
Create navigation functions to represent relationships |
Use separate data services with relationships (implemented through navigation functions) rather than nesting data. For example, create navigation functions to relate customers and orders or customers and addresses instead of nesting this information. This keeps data services and their queries small, making them more manageable, maintainable, and reusable. |
Define keys to improve performance |
Defining keys enables the system to use this information when optimizing queries. |
|
Establish relationships between unique identifiers and primary keys |
Establish relationships between unique identifiers or primary keys that refer to the same data (such as Customer ID or SSN) but vary across multiple data sources. You can use either of the following methods:
|
|
Physical Services |
Employ functions that get all records |
Using protected functions that get all records at the Physical Services level provides the system with the most flexibility to optimize data access based on constraints specified in higher level functions. |
Do not perform data type transformations |
The system is unable to generate optimizations based on constraints specified at higher levels when data type transformations are performed at the Physical Services level. |
|
Do not aggregate |
Use aggregates at the highest level possible to enable the system to optimize data access. |
Table 5-2 describes implementation guidelines to apply when designing and developing data services.
Table 5-2 Data Service Implementation Guidelines
Level | Design Principle | Description |
---|---|---|
Application Services |
Use the group clause to aggregate |
When performing a simple aggregate operation (such as count, min, max, and so forth) over data stored in a relational source, use a group clause as illustrated by the following: for $x in f1:CUSTOMER() group $x as $g by 1 return count($g) instead of: count( f1:CUSTOMER() ) in order to enable pushdown of the aggregation operation to the underlying relational data source. Note that the two formulations are semantically equivalent except for the case where the sequence returned by |
Application Services |
Use |
Define function arguments and return types in data services as |
Use |
Use xs:string when casting data instead of fn:string(). The two approaches are not equivalent when handling empty input, and the use of xs:string enables cast operations to be executed by the database. |
|
Be aware of Oracle treating empty strings as NULL, and how this affects XQuery semantics |
The Oracle RDBMS treats empty strings as NULL, without providing a method of distinguishing between the two. This can affect the semantics of certain XQuery functions and operations. For example, the
When using Oracle, consider using the |
|
Use the function |
Oracle Data Service Integrator does not create elements for fields with NULL value. For example, Oracle Data Service Integrator will not create a You can therefore use the function |
|
Application Services |
Return plural for functions that contain FLWOR expressions |
When a function body contains a FLWOR expression, or references to functions that contains FLWOR, the function should return plural. For example, consider the following XQuery expression: For $c in CUSTOMER() Return <CUSTOMER> <LAST_NAME>$c/LAST_NAME</LAST_NAME> <FIRST_NAME>$c/FIRST_NAME </FIRST_NAME> <ADDRESS>{ For $a in ADDRESS() Where $a/CUSTOMER_ID = $c/CUSTOMER_ID Return $a }</ADDRESS> </CUSTOMER> Defining a one-to-one relationship between a CUSTOMER and an ADDRESS, as in the following, can block optimizations. <element name=CUSTOMER> <element name=LAST_NAME/> <element name=FIRST_NAME/> <element name=ADDRESS/> </element> |
Application Services |
Return plural for functions that contain FLWOR expressions (continued) |
This is because Oracle Data Service Integrator determines that there can be multiple addresses for one CUSTOMER. This leads the system to insert a The Query Plan Viewer shows <element name=CUSTOMER> <element name=LAST_NAME/> <element name=FIRST_NAME/> <element name=ADDRESS minOccurs="0" maxOccurs="unbounded"/> </element> |
Avoid cross product situations |
Avoid cross product (Cartesian Product) situations when including conditions. For example, the following XQuery sample results in poor performance due to a cross product situation: define fn ($p string) for $c in CUSTOMER() for $o in ORDER() where $c/id eq $p and $o/id eq $p Instead, use the following form to specify the same query: define fn ($p string) for $c in CUSTOMER() for $o in ORDER() where $c/id eq $o/id and $c/id eq $p |