SQL Worksheet Guidelines

Adhering to operational standards ensures that your queries align with system performance, data security, and query efficiency. Consider the following guidelines when you use SQL Worksheet to develop SQL queries for an Autonomous Database Warehouse (ADW) instance.

  • Query execution within SQL Worksheets is subject to a 5-minute timeout. You can see if the query has timed out by looking at the indicator in the lower-left bar. If you are creating complex queries which are likely to take longer than five minutes, Oracle recommends:
    • Creating the query using DSS or SQL Developer, which are not subject to timeout limitations.
    • Use the EXPLAIN PLAN statement to generate the execution plan for the statement. Executing the EXPLAIN PLAN statement internally to identify opportunities for query optimization. The execution plan is automatically displayed in the Explain Plan tab in the worksheet output pane.
    • Use the Consumer Group list to select the consumer group to run your SQL or PL/SQL code. ADW consumer groups are predefined resource management profiles like High, Medium, and Low and set different levels of CPU use. Oracle recommends LOW, but if that option is not feasible, then select MEDIUM to executing the statement. Do not use HIGH because it is reserved for administrative tasks.
  • Avoid large IN lists in the Where clause. Optimize queries with large IN lists (for example, SELECT * FROM RWD_ED_2024 WHERE person_id IN ('101', '102', '103'....) ) by joining to a table or using a CTE.
  • Avoid unnecessarily selecting all data. Retrieve only the needed columns SELECT <COLUMN_A> FROM ... instead of SELECT * FROM...
  • Use FETCH FIRST n ROWS ONLY to limit the number of rows returned.
  • Share subsets of RWD data with OAC: OAC is an enterprise cloud-based analytics and business intelligence (BI) platform, designed to let organizations analyze data, build dashboards and workbooks, and apply advanced analytics (AI/ML) without needing to manage heavy on-premises infrastructure. Oracle does not recommend connecting OAC to the full RWD dataset. Even though OAC can connect to large datasets (even datasets with hundreds of billions of rows), performance and report response time could be long. OAC is intended for smaller, curated datasets. The ORWD dataset is multiple terabytes, and some tables have more than 30 billion records. To improve performance, Oracle recommends preparing the data for use in OAC as follows:
    • Generate a subset of RWD with data for a patient cohort, yielding a smaller dataset representing 200,000 patients as opposed to the 100 million patients in the entire RWD dataset.
    • Group by attributes of interest instead of data at the most granular level to generate aggregates.
    • Once the subset of RWD is prepared and stored in a schema, use OAC to build visualizations and dashboards or workbooks based on that subset.