Best Practices Using XQuery

This chapter offers a series of best practices for creating data services using XQuery. The chapter introduces a data service design model, and describes a conceptual model for layering data services to maximize management, maintainability, and reusability.

Introducing Data Service Design

When designing data services, you should strive to maximize the ability to maintain, manage, and reuse queries. One approach is to adopt a layered design model that partitions services into the following levels:

Using this design model, you can design and develop data services in the following manner:

Understanding Data Service Design Principles

This section describes best practices for designing and developing services at each layer of the data service design model. Table 5-2 describes the data service design principles.

Table 5-2 Data Service Design Principles
Level	Design Principle	Description
Application Services	Base design on client needs	Design data services and queries at the Application Services level specifically tuned to client needs, using functions defined at the Logical and Canonical Service levels.
	Nest or relate information, as required by the application	Use the XML practice of nesting related information in a single XML structure. Alternatively, use navigation functions to relate associated information, as required by the application.
	Introduce constraints at the highest level	AquaLogic Data Services Platform propagates constraints down function levels when generating queries. By keeping constraints, such as function parameters, at the highest level, you encourage reuse of lower level functions and permit the system to efficiently optimize the final generated query.
	Aggregate data at the highest level	Aggregate data in functions at the highest level possible, preferably at the Application Services level.
Logical Services	Create common functions to serve multiple applications	Design functions that provide common services required by applications. Base function design at the Logical Services level on requirements already established at the Application Services level, based on client needs.
Logical Services	Refactor to reduce the number of functions	Refactor the functions, as necessary, to reduce the overall number of functions to as few as possible. This reduces complexity, simplifies documentation, and eases future maintenance.
Canonical Services	Use function defined in the Physical Services level	Create (public) read functions can then all be expressed in terms of the main “get all instances” function.
Canonical Services	Create navigation functions to represent relationships	Use separate data services with relationships (implemented through navigation functions) rather than nesting data. For example, create navigation functions to relate customers and orders or customers and addresses instead of nesting this information. This keeps data services and their queries small, making them more manageable, maintainable, and reusable.
	Define keys to improve performance	Defining keys enables the system to use this information when optimizing queries.
	Establish relationships between unique identifiers and primary keys	Establish relationships between unique identifiers or primary keys that refer to the same data (such as Customer ID or SSN) but vary across multiple data sources. You can use either of the following methods: Create navigation functions to create relationships between the data. Create a new table in the database to relate the unique identifiers and primary keys.
Physical Services	Employ functions that get all records	Using protected functions that get all records at the Physical Services level provides the system with the most flexibility to optimize data access based on constraints specified in higher level functions.
	Do not perform data type transformations	The system is unable to generate optimizations based on constraints specified at higher levels when data type transformations are performed at the Physical Services level.
	Do not aggregate	Use aggregates at the highest level possible to enable the system to optimize data access.

Applying Data Service Implementation Guidelines

Table 5-3 describes implementation guidelines to apply when designing and developing data services.

Table 5-3 Data Service Implementation Guidelines
Level	Design Principle	Description
Application Services	Use the group clause to aggregate	When performing a simple aggregate operation (such as count, min, max, and so forth) over data stored in a relational source, use a group clause as illustrated by the following: `for $x in f1:CUSTOMER() group $x as $g by 1 return count($g)` instead of: `count( f1:CUSTOMER() )` in order to enable pushdown of the aggregation operation to the underlying relational data source. Note that the two formulations are semantically equivalent except for the case where the sequence returned by f1:CUSTOMER() is the empty sequence. Of course performance will be better for the pushed down statement.
Application Services	Use element(foo) instead of schema-element(foo)	Define function arguments and return types in data services as element(foo) instead of schema-element(foo). Using schema-element instead of element causes AquaLogic Data Services Platform to perform validation, potentially blocking certain optimizations.
	Use `xs:string` to cast data	Use xs:string when casting data instead of fn:string(). The two approaches are not equivalent when handling empty input, and the use of xs:string enables cast operations to be executed by the database.
	Be aware of Oracle treating empty strings as NULL, and how this affects XQuery semantics	The Oracle RDBMS treats empty strings as NULL, without providing a method of distinguishing between the two. This can affect the semantics of certain XQuery functions and operations. For example, the `fn:lower-case()` function is pushed down to the database as LOWER, though the two have different semantics when handling an empty string, as summarized by the following: `fn:lower-case()` returns an empty string LOWER in Oracle returns NULL When using Oracle, consider using the `fn-bea:fence()` function and performing additional computation if precise XQuery semantics are required.
Application Services	Return plural for functions that contain FLWOR expressions	When a function body contains a FLWOR expression, or references to functions that contains FLWOR, the function should return plural. For example, consider the following XQuery expression: `For $c in CUSTOMER() Return <CUSTOMER> <LAST_NAME>$c/LAST_NAME</LAST_NAME> <FIRST_NAME>$c/FIRST_NAME </FIRST_NAME> <ADDRESS>{ For $a in ADDRESS() Where $a/CUSTOMER_ID = $c/CUSTOMER_ID Return $a }</ADDRESS> </CUSTOMER>` Defining a one-to-one relationship between a CUSTOMER and an ADDRESS, as in the following, can block optimizations. `<element name=CUSTOMER> <element name=LAST_NAME/> <element name=FIRST_NAME/> <element name=ADDRESS/> </element>`
Application Services	Return plural for functions that contain FLWOR expressions (continued)	This is because AquaLogic Data Services Platform determines that there can be multiple addresses for one CUSTOMER. This leads the system to insert a `TypeMatch` operation to ensure that there is exactly one ADDRESS. The TypeMatch operation blocks optimizations, thus producing a less efficient query plan. The Query Plan Viewer shows `TypeMatch` operations in red and should be avoided. Instead, the schema definition for ADDRESS should indicate that there could be zero or more ADDRESSes. `<element name=CUSTOMER> <element name=LAST_NAME/> <element name=FIRST_NAME/> <element name=ADDRESS minOccurs=”0” maxOccurs=”unbounded”/> </element>`
Application Services	Avoid cross product situations	Avoid cross product (Cartesian Product) situations when including conditions. For example, the following XQuery sample results in poor performance due to a cross product situation: `define fn ($p string) for $c in CUSTOMER() for $o in ORDER() where $c/id eq $p and $o/id eq $p` Instead, use the following form to specify the same query: `define fn ($p string) for $c in CUSTOMER() for $o in ORDER() where $c/id eq $o/id and $c/id eq $p`

XQuery and XQSE Developer’s Guide

Best Practices Using XQuery

Introducing Data Service Design

Understanding Data Service Design Principles

Applying Data Service Implementation Guidelines