Consider the following tips to help you create more efficient SQL data sets:
Ensure that your query returns only the data you need for your reports. Returning excessive data risks OutOfMemory exceptions.
For example, never simply return all columns as in:
SELECT * FROM EMPLOYEES;
Always avoid the use of *.
Two best practices for restricting the data returned are:
Always select only the columns you need
For example:
SELECT DEPARTMENT_ID, DEPARTMENT_NAME FROM EMPLOYEES;
Use a WHERE clause and bind parameters whenever possible to restrict the returned data more precisely.
This example selects only the columns needed and only those that match the value of the parameter:
SELECT DEPARTMENT_ID, DEPARTMENT_NAME FROM EMPLOYEES WHERE DEPARTMENT_ID IN (:P_DEPT_ID)
The shorter the column name, the smaller the resulting XML file; the smaller the XML file the faster the system parses it.
Shorten your column names using aliases to shorten I/O processing time and enhance report efficiency.
In this example, DEPARTMENT_ID
is shortened to "id
" and DEPARTMENT_NAME
is shortened to "name
":
SELECT DEPARTMENT_ID id, DEPARTMENT_NAME nameFROM EMPLOYEES WHERE DEPARTMENT_ID IN (:P_DEPT_ID)
Although the Data Model Group Filter feature enables you to remove records retrieved by your query, this process takes place in the middle tier, which is much less efficient than the database tier.
It is a better practice to remove unneeded records through your query using WHERE clause conditions instead.
PL/SQL function calls in the WHERE clause of the query can result in multiple executions.
These function calls execute for each row found in the database that matches. Moreover, this construction requires PL/SQL to SQL context switching, which is inefficient.
As a best practice, avoid PL/SQL calls in the WHERE clause; instead, join the base tables and add filters.
The use of the system DUAL table for returning the sysdate or other constants is inefficient. You should avoid using the system DUAL table when not required.
For example, instead of:
SELECT DEPARTMENT_ID ID, (SELECT SYSDATE FROM DUAL) TODAYS_DATE FROM DEPARTMENTS WHERE DEPARTMENT_ID IN (:P_DEPT_ID)
Consider:
SELECT DEPARTMENT_ID ID, SYSDATE TODAYS_DATE FROM DEPARTMENTS WHERE DEPARTMENT_ID IN (:P_DEPT_ID)
In the first example, DUAL is not required. You can access SYSDATE directly.
Package function calls at the element, within the group or row level, are not allowed. You can include package function calls at the global element level because these functions are executed only once per data model execution request.
Example:
<dataStructure> <group name="G_order_short_text" dataType="xsd:string" source="Q_ORDER_ATTACH"> <element name="order_attach_desc" dataType="xsd:string" value="ORDER_ATTACH_DESC"/> <element name="order_attach_pk" dataType="xsd:string" value="ORDER_ATTACH_PK"/>
The following element is incorrect:
<element name="ORDER_TOTAL _FORMAT" dataType="xsd:string" value=" WSH_WSHRDPIK_XMLP_PKG.ORDER_TOTAL _FORMAT "/> <!-- This is wrong should not be called within group.--> </group>
<element name="S_BATCH_COUNT" function="sum" dataType="xsd:double" value="G_mo_number.pick_slip_number"/> </dataStructure>
It can seem desirable to create one data model with multiple data sets to serve multiple reports, but this practice results in very poor performance.
When a report runs, the data processor executes all data sets irrespective of whether the data is used in the final output.
For better report performance and memory efficiency, consider carefully before using a single data model to support multiple reports.
The data model provides a mechanism to create parent-child hierarchy by linking elements from one data set to another.
At run time, the data processor executes the parent query and for each row in the parent executes the child query. When a data model has many nested parent-child relationships slow processing can result.
A better approach to avoid nested data sets is to combine multiple data set queries into a single query using the WITH clause.
Following are some general tips about when to combine multiple data sets into one data set:
When the parent and child have a 1-to-1 relationship; that is, each parent row has exactly one child row, then merge the parent and child data sets into a single query.
When the parent query has many more rows compared to the child query. For example, an invoice distribution table linked to an invoice table where the distribution table has millions of rows compared to the invoice table. Although the execution of each child query takes less than a second, for each distribution hitting the child query can result in STUCK threads.
Example of when to use a WITH clause:
Query Q1: SELECT DEPARTMENT_ID EDID,EMPLOYEE_ID EID,FIRST_NAME FNAME,LAST_NAME LNAME,SALARY SAL,COMMISSION_PCT COMMFROM EMPLOYEES Query Q2: SELECT DEPARTMENT_ID DID,DEPARTMENT_NAME DNAME,LOCATION_ID LOCFROM DEPARTMENTS
Combine the these queries into one using WITH clause as follows:
WITH Q1 as (SELECT DEPARTMENT_ID DID,DEPARTMENT_NAME DNAME,LOCATION_ID LOC FROM DEPARTMENTS), Q2 as (SELECT DEPARTMENT_ID EDID,EMPLOYEE_ID EID,FIRST_NAME FNAME,LAST_NAME LNAME,SALARY SAL,COMMISSION_PCT COMM FROM EMPLOYEES) SELECT Q1.*, Q2.* FROM Q1 LEFT JOIN Q2 ON Q1.DID=Q2.EDID
In-line queries execute for each column for each row. For example, if a main query has 100 columns, and brings 1000 rows, then each column query executes 1000 times.
Altogether, it is 100 multiplied by 1000 times. This is not scalable and cannot perform well. Avoid using in-line sub queries whenever possible.
Avoid the following use of in-line queries. If this query returns only a few rows this approach may work satisfactorily. However, if the query returns 10000 rows, then each sub or in-line query executes 10000 times and the query would likely result in Stuck threads.
SELECT NATIONAL_IDENTIFIERS,NATIONAL_IDENTIFIER, PERSON_NUMBER, PERSON_ID, STATE_CODE FROM (select pprd.person_id,(select REPLACE(national_identifier_number,'-') from per_ national_identifiers pni where pni.person_id = pprd.person_id and rownum<2) national_identifiers,(select national_identifier_number from per_national identifiers pni where pni.person_id = pprd.person_id and rownum<2) national_ identifier,(select person_number from per_all_people_f ppf where ppf.person_id = pprd.person_id and :p_effective_start_date between ppf.effective_start_date and ppf.effective_ end_date) PERSON_NUMBER (Select hg.geography_code from hz_geographies hg where hg.GEOGRAPHY_NAME = paddr.region_2 and hg.geography_type = 'STATE') state_code
Oracle Database allows bind maximum of 1000 values per parameter.
Binding a large number of parameter values is inefficient. Avoid binding more than 100 values to a parameter.
When you create a Menu type parameter and your list of values contains many values, ensure that you enable both the Multiple Selection and Can Select All options, then also select NULL value passed to ensure that too many values are not passed.
Report consumers often must run reports that support the certain conditions.
If no parameter is selected (null), then return all.
Allow selection of multiple parameter values
In these cases the use of NVL() does not work, you should therefore use
COALESCE() for queries against Oracle Database
CASE / WHEN for Oracle BI EE (logical) queries
Example:
SELECT EMPLOYEE_ID ID, FIRST_NAME FNAME, LAST_NAME LNAME FROM EMPLOYEES WHERE DEPARTMENT_ID = NVL(:P_DEPT_ID, DEPARTMENT_ID
The preceding query syntax is correct only when the value of P_DEPT_ID
is a single value or null. This syntax does not work when you pass more than a single value.
To support multiple values, use the following syntax:
For Oracle Database:
SELECT EMPLOYEE_ID ID, FIRST_NAME FNAME, LAST_NAME LNAME FROM EMPLOYEES WHERE (DEPARTMENT_ID IN (:P_DEPT_ID) OR COALESCE (:P_DEPT_ID, null) is NULL)
For Oracle BI EE data source:
(CASE WHEN ('null') in (:P_YEAR) THEN 1 END =1 OR "Time"."Per Name Year" in (:P_YEAR))
For Oracle BI EE the parameter data type must be string. Number and date data types are not supported.
The data model provides a feature to group breaks and sort data.
Sorting is supported for parent group break columns only. For example, if a data set of employees is grouped by department and manager, you can sort the XML data by department. If you know how the data should be sorted in the final report or template, you specify sorting at data generation time to optimize document generation. The column order specified in the SELECT clause must exactly match the element orders in the data structure. Otherwise group break and sort may not work. Due to complexity, multiple grouping with multiple sorts at different group levels is not allowed.
Example: In the example shown below, sort and group break are applied to the parent group only, that is, G_1. Notice the column order in the query, data set dialog, and data structure. The SQL column order must exactly match the data structure element field order; otherwise, it may result in data corruption.
Example:
SELECT d.DEPARTMENT_ID DEPT_ID, d.DEPARTMENT_NAME DNAME, E.FIRST_NAME FNAME,E.LAST_NAME LNAME,E.JOB_ID JOB,E.MANAGER_ID FROM EMPLOYEES E,DEPARTMENTS D WHERE D.DEPARTMENT_ID = E.DEPARTMENT_ID ORDER BY d.DEPARTMENT_ID, d.DEPARTMENT_NAME
Once you define the query, you can use the data model designer to select data elements and create group breaks as shown below.
The Data Structure with breaks is:
<output rootName="DATA_DS" uniqueRowName="false"> <nodeList name="data-structure"> <dataStructure tagName="DATA_DS"> <group name="G_1" label="G_1" source="q1"> <element name="DEPT_ID" value="DEPT_ID" label="DEPT_ID" fieldOrder="1"/> <element name="DNAME" value="DNAME" label="DNAME" fieldOrder="2"/> <group name="G_2" label="G_2" source="q1"> <element name="FNAME" value="FNAME" label="FNAME" fieldOrder="3"/> <element name="LNAME" value="LNAME" label="LNAME" fieldOrder="4"/> <element name="JOB" value="JOB" label="JOB" fieldOrder="5"/> <element name="MANAGER_ID" value="MANAGER_ID" label="MANAGER_ID" fieldOrder="6"/> </group> </group> </dataStructure> </nodeList> </output>