Manage Query Caching

Oracle Analytics Cloud maintains a local cache of query results sets in the query cache.

Topics:

About the Query Cache

The query cache enables Oracle Analytics Cloud to satisfy many subsequent query requests without accessing back-end data sources and this increases query performance. However, the query cache entries might get stale as updates occur on the back-end data sources.

Advantages of Caching

The fastest way to process a query is to skip the bulk of the processing and use a precomputed answer.

With query caching, Oracle Analytics Cloud stores the precomputed results of queries in a local cache. If another query can use those results, then all database processing for that query is eliminated. This can result in dramatic improvements in the average query response time.

In addition to improving performance, being able to answer a query from a local cache conserves network resources and processing time on the database server. Network resources are conserved because intermediate results aren't returned to Oracle Analytics Cloud. Not running the query on the database frees the database server to do other work. If the database uses a charge back system, then running less queries might also cut costs in the budget.

Another benefit of using the cache to answer a query is savings in processing time on Oracle Analytics Cloud, especially if the query results are retrieved from multiple databases. Depending on the query, there might be considerable join and sort processing in the server. If the query is already calculated, then this processing is avoided, freeing server resources for other tasks.

To summarize, query caching can dramatically improve query performance and reduce network traffic, database processing, and processing overhead.

Costs of Caching

Query caching has many obvious benefits, but also certain costs.

  • Potential for cached results being stale

  • Administrative costs of managing the cache

With cache management, the benefits typically far outweigh the costs.

Administrative Tasks Associated with Caching

Some administrative tasks are associated with caching. You must set the cache persistence time for each physical table appropriately, knowing how often data in that table is updated.

When the frequency of the update varies, you must keep track of when changes occur and purge the cache manually when necessary.

Keep the Cache Up To Date

If the cache entries aren't purged when the data in the underlying databases changes, then queries can potentially return results that are out of date.

You must evaluate whether this is acceptable. It might be acceptable to allow the cache to contain some stale data. You must decide what level of stale data is acceptable and then configure (and follow) a set of rules to reflect those levels.

For example, suppose an application analyzes corporate data from a large conglomerate, and you're performing yearly summaries of the different divisions in the company. New data doesn't materially affect the queries because the new data affects only next year's summaries. In this case, the trade-offs for deciding whether to purge the cache might favor leaving the entries in the cache.

Suppose, however, that the databases are updated three times a day and you're performing queries on the current day's activities. In this case, you must purge the cache much more often, or perhaps consider not using the cache at all.

Another scenario is that you rebuild the dataset from the beginning at periodic intervals (for example, once per week). In this example, you can purge the entire cache as part of the process of rebuilding the dataset, ensuring that you never have stale data in the cache.

Whatever your situation, you must evaluate what is acceptable for noncurrent information returned to the users.

Cache Sharing Across Users

If shared logon is enabled for a particular connection pool, then the cache can be shared across users and doesn't need to be seeded for each user.

If shared logon isn't enabled and a user-specific database login is used, then each user generates their own cache entry.

Enable or Disable Query Caching

In Oracle Analytics Cloud, the query cache is enabled by default. You can enable or disable query caching on the System Settings page.

  1. Click Console.
  2. Click System Settings.
  3. Click Performance and Compatibility .
  4. Set Cache Enable on or off.
    • On — Data query caching is enabled.
    • Off — Caching is disabled.
  5. Click Apply.
    Wait a few moments for the changes to refresh through the system.

Monitor and Manage the Cache

To manage the changes in the underlying databases and to monitor cache entries, you must develop a cache management strategy.

You need a process to invalidate cache entries when the data in the underlying tables that compose the cache entry changes, and a process to monitor, identify, and remove any undesirable cache entries.

This section contains the following topics:

Choose a Cache Management Strategy

The choice of a cache management strategy depends on the volatility of the data in the underlying databases and the predictability of the changes that cause this volatility.

It also depends on the number and types of queries that comprise your cache and the usage those queries receive. This section provides an overview of the various approaches to cache management.

Disable Caching For the System

You can disable caching for the entire system to stop all new cache entries and stop any new queries from using the existing cache. Disabling caching lets you enable it at a later time without losing any entries that are stored in the cache.

Temporarily disabling caching is a useful strategy in situations where you might suspect having stale cache entries, but want to verify if they're actually stale before purging those entries or the entire cache. If you find that the data stored in the cache is still relevant, or after you have safely purged problem entries, then you can safely enable the cache. If necessary, purge the entire cache or the cache that's associated with a particular business model before enabling the cache again.

Cache and Cache Persistence Timing For Specified Physical Tables

You can set a cacheable attribute for each physical table, enabling you to specify whether queries for that table are added to the cache to answer future queries.

If you enable caching for a table, then any query involving the table is added to the cache. All tables are cacheable by default, but some tables mightn't be good candidates to include in the cache unless you set up suitable cache persistence settings. For example, suppose that you've a table that stores stock ticker data that's updated every minute. You can specify that you want to purge the entries for that table every 59 seconds.

You can also use cache persistence settings to specify how long the entries for this table are stored in the query cache. This is useful for data sources that are updated frequently.

  1. In Model Administration Tool, in the Physical layer, double-click the physical table.

    If you use Semantic Modeler, see What Are a Physical Table's General Properties?.

  2. In the Physical Table properties dialog, in the General tab, make one of the following selections:

    • To enable caching, select Cacheable.

    • To prevent a table from being cached, deselect Cacheable.

  3. To set a cache expiration time, specify a Cache persistence time and specify a unit of measure (days, hours, minutes, or seconds). If you don't want cache entries to automatically expire, select Cache never expires.

  4. Click OK.

How Semantic Model Changes Affect the Query Cache

When you modify semantic models using Semantic Modeler or Model Administration Tool, the changes can have implications for entries that are stored in the cache. For example, if you change the definition of a physical object or a dynamic semantic model variable, cache entries that reference that object or variable might no longer be valid. These changes might result in the need to purge the cache. There are two scenarios to be aware of: when you modify your existing semantic model, and when you create (or upload) a new semantic model.

Changes to the Semantic Model

When you modify a semantic model or upload a different .rpd file, any changes that you make that affect cache entries automatically result in a purge of all cache entries that reference the changed objects. The purge occurs when you upload the changes. For example, if you delete a physical table from a semantic model, then all cache entries that reference that table are purged upon check in. Any changes made to a semantic model in the Logical layer purge all cache entries for that semantic model.

Changes to Global Semantic Model Variables

The values of global semantic model variables are refreshed by data that's returned from queries. When you define a global semantic model variable, you create an initialization block or use a preexisting one that contains a SQL query. You also configure a schedule to run the query and periodically refresh the value of the variable.

If the value of a global semantic model variable changes, then any cache entry which uses this variable in a column becomes stale, and a new cache entry is generated when data in that entry is needed again. The old cache entry isn't removed immediately, but remains until it is cleaned through the usual caching mechanism.

Strategies For Using the Cache

One of the main advantages of query caching is to improve apparent query performance.

Query caching might be valuable to seed the cache during off hours by running queries and caching their results. A good seeding strategy requires that you know when cache hits occur.

If you want to seed the cache for all users, you might seed the cache with the following query:

SELECT User, SRs

After seeding the cache using SELECT User, SRs, the following queries are cache hits:

SELECT User, SRs WHERE user = valueof(nq_SESSION.USER) (and the user was USER1)
SELECT User, SRs WHERE user = valueof(nq_SESSION.USER) (and the user was USER2)
SELECT User, SRs WHERE user = valueof(nq_SESSION.USER) (and the user was USER3)

This section contains the following topics:

About Cache Hits

When caching is enabled, each query is evaluated to determine whether it qualifies for a cache hit.

A cache hit means that Oracle Analytics Cloud was able to use cache to answer the query and didn't go to the database at all. Oracle Analytics Cloud can use the query cache to answer queries at the same or higher level of aggregation.

Many factors determine whether cache is hit. The table below describes these factors.

Factor or Rule Description

A subset of columns in the SELECT list must match

All of the columns in the SELECT list of a new query have to exist in the cached query to qualify for a cache hit, or they must be able to be calculated from the columns in the query.

This rule describes the minimum requirement to hit the cache, but meeting this rule doesn't guarantee a cache hit. The other rules listed in this table also apply.

Columns in the SELECT list can be composed of expressions on the columns of the cached queries

Oracle Analytics Cloud can calculate expressions on cached results to answer the new query, but all the columns must be in the cached result. For example, the query:

SELECT product, month, averageprice FROM sales WHERE year = 2000

hits cache on the query:

SELECT product, month, dollars, unitsales FROM sales WHERE year = 2000

because averageprice can be computed from dollars and unitsales (averageprice = dollars/unitsales).

WHERE clause must be semantically the same or a logical subset

For the query to qualify as a cache hit, the WHERE clause constraints must be either equivalent to the cached results, or a subset of the cached results.

A WHERE clause that's a logical subset of a cached query qualifies for a cache hit if the subset meets one of the following criterion:

  • A subset of IN list values. Queries requesting fewer elements of an IN list cached query qualify for a cache hit. For example, the following query:

    SELECT employeename, region
    FROM employee, geography
    WHERE region in ('EAST', 'WEST')

    qualifies as a hit on the following cached query:

    SELECT employeename, region
    FROM employee, geography
    WHERE region in ('NORTH', 'SOUTH', 'EAST', 'WEST')
  • It contains fewer (but identical) OR constraints than the cached result.

  • It contains a logical subset of a literal comparison. For example, the following predicate:

    WHERE revenue < 1000

    qualifies as a cache hit on a comparable query with the predicate:

    WHERE revenue < 5000
  • There is no WHERE clause. If a query with no WHERE clause is cached, then queries that satisfy all other cache hit rules qualify as cache hits regardless of their WHERE clause.

In addition columns that are used on the WHERE clause must be on the projection list. For example, the following query:

SELECT employeename
FROM employee, geography
WHERE region in ('EAST', 'WEST')

Doesn't result in a cache hit for the seeding query in the previous list because REGION isn't on the projection list.

Dimension-only queries must be an exact match

If a query is dimension only, meaning that no fact or measure is included in the query, then only an exact match of the projection columns of the cached query hits the cache. This behavior prevents false positives when there are multiple logical sources for a dimension table.

Queries with special functions must be an exact match

Other queries that contain special functions such as time series functions (AGO, TODATE, and PERIODROLLING), limit and offset functions (OFFSET and FETCH), relationship functions (ISANCESTOR, ISLEAF, ISROOT, and ISSIBLING), external aggregation functions, and generally filter metrics must also be an exact match with the projection columns in the cached query. In these cases, the filter must also be an exact match. For filter metrics, if the filter metric can be rewritten as a WHERE clause, then the subset cache might be leveraged.

Set of logical tables must match

To qualify as a cache hit, all incoming queries must have the same set of logical tables as the cache entry. This rule avoids false cache hits. For example, SELECT * FROM product doesn't match SELECT * FROM product, sales.

Session variable values must match, including security session variables

If the logical SQL or physical SQL statement refers to any session variable, then the session variable values must match. Otherwise, the cache isn't hit.

In addition, the value of session variables that are security sensitive must match the security session variable values that are defined in the semantic model, even though the logical SQL statement itself doesn't reference session variables. See Ensure Correct Cache Results When Using Row-Level Database Security.

Equivalent join conditions

The resultant joined logical table of a new query request has to be the same as (or a subset of) the cached results to qualify for a cache hit.

DISTINCT attribute must be the same

If a cached query eliminates duplicate records with DISTINCT processing (for example, SELECT DISTINCT...), then requests for the cached columns must also include the DISTINCT processing; a request for the same column without the DISTINCT processing is a cache miss.

Queries must contain compatible aggregation levels

Queries that request an aggregated level of information can use cached results at a lower level of aggregation. For example, the following query requests the quantity sold at the supplier and region and city level:

SELECT supplier, region, city, qtysold
FROM suppliercity

The following query requests the quantity sold at the city level:

SELECT city, qtysold
FROM suppliercity

The second query results in a cache hit on the first query.

Limited additional aggregation

For example, if a query with the column qtysold is cached, then a request for RANK(qtysold) results in a cache miss. Additionally, a query that requests qtysold at the country level can get a cache hit from a query that requests qtysold at the country, region level.

ORDER BY clause must be comprised of columns in the select list

Queries that order by columns that aren't contained in the select list result in cache misses.

Diagnosing cache hit behavior

To better assess cache hit behavior, set the ENABLE_CACHE_DIAGNOSTICS session variable to 4, as shown in the following example:

ENABLE_CACHE_DIAGNOSTICS=4
Ensure Correct Cache Results When Using Row-Level Database Security

When using a row-level database security strategy, such as a Virtual Private Database (VPD), the returned data results are contingent on the authorization credentials of the user.

Because of this, Oracle Analytics Cloud must know whether a data source is using row-level database security and which variables are relevant to security.

To ensure that cache hits only occur on cache entries that include and match all security-sensitive variables, you must correctly configure the database object and session variable objects in the Model Administration Tool, as follows:

  • Database object. In the Physical layer, in the General tab of the Database dialog, select Virtual Private Database to specify that the data source is using row-level database security.

    If you're using row-level database security with shared caching, then you must select this option to prevent the sharing of cache entries whose security-sensitive variables don't match.

  • Session Variable object. For security-related variables, in the Session Variable dialog, select Security Sensitive to identify them as sensitive to security when using a row-level database security strategy. This option ensures that cache entries are marked with the security-sensitive variables, enabling security-sensitive variable matching on all incoming queries.

Run a Suite of Queries to Populate the Cache

To maximize potential cache hits, one strategy is to run a suite of queries to populate the cache.

The following are some recommendations for the types of queries to use when creating a suite of queries with which to seed the cache.

  • Common prebuilt queries. Queries that are commonly run, particularly ones that are expensive to process, are excellent cache seeding queries. Queries whose results are embedded in dashboards are good examples of common queries.

  • SELECT lists with no expressions. Eliminating expressions on SELECT list columns expands the possibility for cache hits. A cached column with an expression can only answer a new query with the same expression; a cached column with no expressions can answer a request for that column with any expression. For example, a cached request such as:

    SELECT QUANTITY, REVENUE...
    

    can answer a new query such as:

    SELECT QUANTITY/REVENUE... 
    

    but not the reverse.

  • No WHERE clause. If there is no WHERE clause in a cached result, then it can be used to answer queries that satisfy the cache hit rules for the select list with any WHERE clause that includes columns in the projection list.

In general, the best queries to seed cache with are queries that heavily consume database processing resources and that are likely to be reissued. Be careful not to seed the cache with simple queries that return many rows. These queries (for example, SELECT * FROM PRODUCTS, where PRODUCTS maps directly to a single database table) require very little database processing. Their expense is network and disk overhead, which are factors that caching doesn't alleviate.

When Oracle Analytics Cloud refreshes semantic model variables, it examines business models to determine if they reference those semantic model variables. If they do, Oracle Analytics Cloud purges all cache for those business models. See How Semantic Model Changes Affect the Query Cache.

Use Agents to Seed the Query Cache

You can configure agents to seed the Oracle Analytics Cloud query cache.

Seeding the cache can improve response times for users when they run analyses or view analyses that are embedded on their dashboards. You can accomplish this by scheduling agents to run requests that refresh this data.

  1. In Oracle Analytics Cloud, open the Classic Home page, and select Agent (Create section).
  2. On the General tab, select Recipient for the Run As option. Personalized cache seeding uses the data visibility of each recipient to customize agent delivery content for each recipient.
  3. On the Schedule tab, specify when you want the cache to be seeded.
  4. Optional: Select Condition and create or select a conditional request. For example, you might have a business model that determines when the ETL process is complete. You could use a report based on this business model to be the conditional trigger for the cache seed to begin.
  5. On the Delivery Content tab, select an individual request or an entire dashboard page for which you want to seed the cache. Selecting a dashboard page can save time.
  6. On the Recipients tab, select individual users or groups to be the recipients.
  7. On the Destinations tab, clear all user destinations and select Oracle Analytics Server Cache.
  8. Save the agent by selecting the Save button in the upper-right corner.

The only difference between cache seeding agents and other agents is that they clear the previous cache automatically and don't appear on the dashboard as alerts.

Note:

Cache seeding agents only purge exact match queries, so stale data might still exist. Ensure that the caching strategy always include cache purging, because agent queries don't address ad-hoc queries or drills.

Use Model Administration Tool to Automatically Purge the Cache for Specific Tables

Purging the cache deletes entries from the query cache and keeps your content fresh. You can automatically purge cache entries for specific tables, by setting the Cache Persistence Time field for each table in Model Administration Tool.

Note:

If you use Semantic Modeler, see What Are a Physical Table's General Properties?

This is useful for data sources that are updated frequently. For example, if you have a table that stores stock ticker data that is updated every minute you can use the Cache Persistence Time setting to purge the entries for that table every 59 seconds. See Cache and Cache Persistence Timing For Specified Physical Tables.