27 inCache Framework

Oracle WebCenter Sites and Satellite Server ship with a caching system called the inCache framework, or simply inCache. This system is built on top of Ehcache, an open source, standards-based product from Terracotta. Compared to the legacy method of caching to the database, inCache offers improved website performance. This chapter introduces the inCache framework.

This chapter contains the following sections:

Section 27.1, "What is inCache?"
Section 27.2, "How inCache Works"
Section 27.3, "Restarting a Node"
Section 27.4, "Page Regeneration During RealTime Publishing"
Section 27.5, "Double-Buffered Caching"
Section 27.6, "inCache Features for Remote Satellite Server"
Section 27.7, "Summary"
Section 27.8, "Next Steps"

27.1 What is inCache?

inCache is a high performance, memory-based page and asset caching system that eliminates the need to cache Oracle WebCenter Sites' data in a central, shared repository and shared file system. inCache is based on Ehcache, an open source Java caching framework from Terracotta, and can be implemented on top of any caching system. (For more information about Ehcache, see http://ehcache.org/.)

The inCache framework replaces the legacy page caching framework. inCache also supports asset caching by default. The inCache framework offers the following benefits:

High performance. Fast response times allow for a greater frequency of publishing sessions. Efficient page and asset invalidation ensures that updated content is quickly served.
Decentralized architecture. Nodes maintain their own local caches. The inCache framework eliminates database caching and shared disks.
Broadcasting system. Each individual node no longer has to have a complete view of the entire cache. Nodes listening for changes to content respond to broadcasts as necessary, depending on their individual content.
Improved linear scalability, enhanced for data on disk and in memory.
Failover and persistence. Nodes that are shut down retain data in their local caches and update themselves upon restart against a centrally managed record of invalidations.
On-demand page evaluation and invalidation. Nodes validate and update their currently cached content only when the content is requested.

This chapter mainly describes page caching. Throughout, the term "page" means a rendered page rather than a page asset. To follow the discussion of asset caching, we recommend first reading the rest of this chapter, especially Section 27.2, "How inCache Works," for a basic understanding of cache containers and how they function in relation to each other.

27.2 How inCache Works

In a clustered environment, inCache is implemented in a decentralized architecture outside the cluster's database. Each node has its own local cache, as shown in Figure 27-1, "Local Caches". Each Oracle WebCenter Sites node keeps its local cache current by listening for broadcasts from other Oracle WebCenter Sites nodes and communicates updates to remote Satellite Server via HTTP.

Figure 27-1 Local Caches

A node's local cache is partitioned:

The pageByQry cache stores the node's web pages.
The dependencyRepository cache stores identifiers of the assets that make up the web pages.
The notifier cache broadcasts identifiers of assets that have been modified by editorial or publishing processes.

Note:
The pageByQry cache is also called the page cache.
The dependencyRepository cache is also called the dependency cache.

Broadcasts are initiated by the nodes on which assets are modified. How a listening node responds to a broadcast depends on the node. For example:
1. A WebCenter Sites user at node A edits asset A.
2. The notifier cache on node A broadcasts a notice of change to all other nodes.
3. Every WebCenter Sites node that contains asset A responds by invalidating the asset in its own local cache. Each node refers to its own dependencyRepository cache, marks the asset's identifier as invalid, and increments the dependency generation counter for the entire cache. Because the invalidated asset is no longer available to pages that reference the asset, the pages themselves are invalidated. However, the node does not evaluate the pages until they are requested. Herein lies the performance benefit.
  
  When a node responds to a request for a page that has a dependency on the invalidated asset, the node refers to its pageByQry cache, evaluates the page, determines the page to be invalid, flushes the page, generates the new page, serves the page to the visitor's browser, and records the new page along with its dependencies in the local cache (dependencies are recorded as identifiers of the assets that make up the page). From that point on, the same page is served from the node's local cache until the page expires or its assets are again invalidated.
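
The sequence above can be sketched in a few lines of Python. This is only an illustrative model, not the inCache or Ehcache API: the Node class, its method names, and the per-asset version numbers used to detect stale pages are all assumptions made for the example. It shows that a broadcast only marks the asset, and that the page is evaluated and regenerated on the next request.

```python
class Node:
    """Hypothetical model of one node's local cache (not the inCache API)."""

    def __init__(self, render):
        self.render = render      # callable: page_key -> (html, set of asset ids)
        self.page_cache = {}      # stands in for pageByQry: key -> (html, versions seen)
        self.asset_versions = {}  # stands in for dependencyRepository: asset id -> version
        self.render_count = 0

    def on_broadcast(self, asset_id):
        # Only nodes that hold the asset react; no page is evaluated here.
        if asset_id in self.asset_versions:
            self.asset_versions[asset_id] += 1

    def get_page(self, page_key):
        entry = self.page_cache.get(page_key)
        if entry is not None:
            html, seen = entry
            # The page is valid only if none of its assets changed since rendering.
            if all(self.asset_versions.get(a) == v for a, v in seen.items()):
                return html
            del self.page_cache[page_key]  # flush the invalid page
        html, deps = self.render(page_key)
        self.render_count += 1
        seen = {a: self.asset_versions.setdefault(a, 0) for a in deps}
        self.page_cache[page_key] = (html, seen)
        return html


# Usage: asset A is edited elsewhere, then broadcast to this node.
content = {"asset-A": "v1"}

def render(page_key):
    return f"<p>{content['asset-A']}</p>", {"asset-A"}

node_b = Node(render)
first = node_b.get_page("/home")               # generated and cached
content["asset-A"] = "v2"                      # edit made on another node
node_b.on_broadcast("asset-A")                 # invalidation arrives
renders_after_broadcast = node_b.render_count  # unchanged: evaluation is deferred
second = node_b.get_page("/home")              # evaluated, flushed, regenerated
third = node_b.get_page("/home")               # served from the local cache
```

In this model the cost of an invalidation is a single counter update, and the regeneration cost is paid only for pages that visitors actually request.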

WebCenter Sites and Satellite Server use the same caching framework, with one main difference: Satellite Server nodes still communicate with WebCenter Sites nodes via HTTP.

Note:

While the pageByQry cache is used only for page caching, it illustrates how a cache container works. For asset caching, the inCache framework introduces a counterpart container called AssetCache. It works in a similar manner to pageByQry and interacts with the dependencyRepository and notifier caches, described above. Additional information about asset caching is available in Chapter 29, "inCache for Asset Caching."

27.3 Restarting a Node

When a WebCenter Sites or Satellite Server node is shut down, data in its cache persists, but can quickly become outdated as the active nodes continue to invalidate assets. As a result, restarting a node requires updating its cache. Update on restart is ensured by a common invalidation memory, which is stored as a table in the database of the WebCenter Sites cluster and kept available to all nodes in the cluster for use upon restart. The table is named FW_InvalidationMemory.

The invalidation memory stores records of asset invalidations, specifically, identifiers of assets that have been modified or deleted during a content management or publishing process. Table growth is checked by a timer-based cleanup mechanism that runs at 15-minute intervals to purge the oldest invalidation records.

When a node is restarted, it attempts to recover information that it missed during its inactive period and therefore refers to the invalidation memory:

If invalidation records exist for the node's inactive period, the node replays them on itself.
If no assets were invalidated during the inactive period, the node continues to operate as if it were never shut down.
If invalidation records were purged by the cleanup mechanism while the node was inactive, the node's cache self-destructs and must be rebuilt.
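
The three restart outcomes can be sketched as a small decision function. This is a simplified model under stated assumptions, not the actual recovery code: the function name, the use of numeric timestamps, and the oldest_retained purge boundary are hypothetical, and the real layout of FW_InvalidationMemory is not described here.

```python
def recover_on_restart(local_cache, invalidation_memory, oldest_retained, went_down_at):
    """Bring a restarted node's cache up to date (illustrative model only).

    local_cache:         dict of page_key -> set of asset ids the page depends on
    invalidation_memory: list of (timestamp, asset_id) records still retained
    oldest_retained:     records older than this timestamp were purged (assumption)
    went_down_at:        timestamp at which the node was shut down
    """
    # Records for part of the inactive period are gone: the cache cannot be trusted.
    if went_down_at < oldest_retained:
        local_cache.clear()
        return "rebuilt"

    missed = {asset for ts, asset in invalidation_memory if ts >= went_down_at}
    if not missed:
        return "unchanged"

    # Replay the missed invalidations against the node's own cache.
    for page_key in [k for k, deps in local_cache.items() if deps & missed]:
        del local_cache[page_key]
    return "replayed"


cache = {"/home": {"a1"}, "/news": {"a2"}}
outcome_replay = recover_on_restart(cache, [(105, "a1")], 50, 100)
cache_after_replay = dict(cache)
outcome_same = recover_on_restart(cache, [(90, "a1")], 50, 100)
outcome_purged = recover_on_restart(cache, [(130, "a2")], 120, 100)
```

The purge check comes first because a node that was inactive longer than the retention window cannot tell which invalidations it missed, so the conservative choice in this sketch is to discard everything.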

When a remote Satellite Server is restarted, it obtains the information it missed by sending a request to WebCenter Sites for an update.

27.4 Page Regeneration During RealTime Publishing

Enabling inCache deactivates the donotregenerate flag for the RealTime publishing process. Because the flag is no longer recognized, crawling is used to regenerate pages during RealTime publishing sessions. Crawling is optional and computationally expensive. If it is not configured, pages are regenerated only when they are requested.

The crawling option requires specifying a set of URLs to be analyzed by the WebCenter Sites page regenerator. Typically, they are the URLs of the home page and other high-traffic pages. In addition, you can specify the depth to which those pages will be crawled. For example, a depth of 1 means that the specified pages and pages they link to will be crawled, while a depth of 0 means that only the specified pages will be crawled. Crawled pages are regenerated only if their component assets have been invalidated during the publishing session, or the pages are not cached. All pagelets on the specified pages are regenerated in the process.
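
The effect of the depth setting can be illustrated with a breadth-first walk over a link graph. This is a sketch, not the WebCenter Sites page regenerator: the function name, the links dictionary, and the two predicate callbacks are assumptions introduced for the example.

```python
from collections import deque


def pages_to_regenerate(seed_urls, depth, links, is_cached, has_invalid_assets):
    """Return crawled pages that need regeneration, in crawl order (illustrative)."""
    seeds = list(dict.fromkeys(seed_urls))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    to_regenerate = []
    while queue:
        url, level = queue.popleft()
        # Regenerate only pages that are not cached or whose assets were invalidated.
        if not is_cached(url) or has_invalid_assets(url):
            to_regenerate.append(url)
        # Follow links only while the configured depth has not been reached.
        if level < depth:
            for target in links.get(url, ()):
                if target not in seen:
                    seen.add(target)
                    queue.append((target, level + 1))
    return to_regenerate


links = {"/home": ["/news", "/about"], "/news": ["/news/1"]}
cached = {"/home", "/news", "/about"}
invalidated = {"/news"}

def is_cached(url):
    return url in cached

def has_invalid_assets(url):
    return url in invalidated

depth0 = pages_to_regenerate(["/home"], 0, links, is_cached, has_invalid_assets)
depth1 = pages_to_regenerate(["/home"], 1, links, is_cached, has_invalid_assets)
depth2 = pages_to_regenerate(["/home"], 2, links, is_cached, has_invalid_assets)
```

With a depth of 0 only the home page is examined. Each additional level widens the set of pages that are checked, which is why deeper crawls cost more during a publishing session.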

The list of URLs to crawl and the crawl depth must be specified in the FW_RegenCriteria table, which is created on the delivery system during the first publishing session after inCache is configured. The ft_ss parameter can be included in the URL to specify that requests are handled either directly by WebCenter Sites or by remote Satellite Server. Instructions for configuring inCache and enabling page regeneration can be found in Chapter 28, "inCache for Page Caching."

27.5 Double-Buffered Caching

WebCenter Sites' double-buffered page caching method uses the WebCenter Sites and Satellite Server caches in tandem on live websites. Double buffering ensures that pages are always kept in cache, either on WebCenter Sites or Satellite Server, to protect WebCenter Sites from an overload of page requests and prevent the live website from displaying blank pages and broken links.

To maintain the traditional system of double-buffered caching in the inCache framework, remote Satellite Server continues to communicate with WebCenter Sites via HTTP requests. Satellite Server still reads page data via HTTP requests and caches it in the usual way. However, page data now includes dependency information in the form of a comma-separated list of asset identifiers, which is also streamed to remote Satellite Server.

27.6 inCache Features for Remote Satellite Server

Remote Satellite Server is used only in page caching. It can be configured to support advanced functionality such as page propagation and page regeneration in background.

This section contains the following topics:

Section 27.6.1, "Page Propagation"
Section 27.6.2, "Page Regeneration in Background"

27.6.1 Page Propagation

The page propagation option enables all WebCenter Sites nodes and Satellite Server nodes to host the same pages without each node having to regenerate the pages. Instead of referring to the database to regenerate pages, nodes receive newly generated and regenerated pages into their local caches from the nodes on which the pages were (re)generated and cached. Caching the pages triggers their propagation. Instructions for configuring page propagation can be found in Chapter 28, "inCache for Page Caching."
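
As a rough illustration of the idea, the following sketch models nodes that push a newly cached page to their peers. The class and method names are hypothetical, and the actual transport between nodes is not modeled; the sketch only shows that the receiving node does not have to generate the page itself.

```python
class PropagatingNode:
    """Hypothetical node that propagates pages it generates (not the inCache API)."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.page_cache = {}
        self.render_count = 0

    def receive(self, page_key, html):
        # Store a propagated page; do not forward it again, to avoid loops.
        self.page_cache[page_key] = html

    def get_page(self, page_key, render):
        if page_key in self.page_cache:
            return self.page_cache[page_key]
        html = render(page_key)
        self.render_count += 1
        self.page_cache[page_key] = html
        # Caching the page triggers its propagation to the other nodes.
        for peer in self.peers:
            peer.receive(page_key, html)
        return html


def render(page_key):
    return f"<html>{page_key}</html>"

node_a, node_b = PropagatingNode("A"), PropagatingNode("B")
node_a.peers, node_b.peers = [node_b], [node_a]

page_from_a = node_a.get_page("/home", render)  # generated on node A
page_from_b = node_b.get_page("/home", render)  # already present on node B
```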

27.6.2 Page Regeneration in Background

Remote Satellite Server can be configured to serve invalidated pagelets while they are being regenerated by a background process. See Section 28.3.4, "Configuring for Pagelet Regeneration in Background" for more information.
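
The behavior can be pictured with the following sketch, which serves the old copy of an invalidated pagelet until a regeneration pass replaces it. It is a deliberately simplified model: the class is hypothetical, and the background process is represented by an explicit run_background_pass call rather than a real thread so that the example stays deterministic.

```python
class StaleWhileRegenerating:
    """Hypothetical pagelet cache that serves stale content during regeneration."""

    def __init__(self, render):
        self.render = render
        self.cache = {}    # pagelet key -> (html, is_valid)
        self.pending = []  # pagelets waiting for background regeneration

    def invalidate(self, key):
        if key in self.cache:
            html, _ = self.cache[key]
            self.cache[key] = (html, False)

    def get(self, key):
        if key not in self.cache:
            # Nothing to serve yet, so the first request must render.
            self.cache[key] = (self.render(key), True)
            return self.cache[key][0]
        html, valid = self.cache[key]
        if not valid and key not in self.pending:
            self.pending.append(key)  # schedule regeneration
        return html                   # serve the existing copy immediately

    def run_background_pass(self):
        # Stand-in for the background process that regenerates pagelets.
        while self.pending:
            key = self.pending.pop(0)
            self.cache[key] = (self.render(key), True)


version = {"n": 1}

def render(key):
    return f"{key}#{version['n']}"

server = StaleWhileRegenerating(render)
initial = server.get("banner")
version["n"] = 2
server.invalidate("banner")
stale = server.get("banner")   # old copy is still served
server.run_background_pass()
fresh = server.get("banner")   # regenerated copy is served
```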

27.7 Summary

The inCache framework improves website performance over the legacy method of caching to the database. Nodes can retain cache on disk and recover from failure. The decentralized architecture prevents bottlenecks (although the lack of a central repository for cached items can make it difficult to determine the overall state of all caches). Page propagation eliminates the need for each node to regenerate pages, while page regeneration in background enables remote Satellite Servers to continue serving pages while their replacements are being generated.

This new framework deactivates the donotregenerate flag for RealTime publishing. Pages that must be regenerated during a publishing session must be specified in the FW_RegenCriteria table. Otherwise, pages are regenerated when they are requested.

27.8 Next Steps

For more information about page caching and configuration methods, see Chapter 28, "inCache for Page Caching." The same chapter contains information about options such as striping the disk, enabling page regeneration to occur during RealTime publishing, setting up page propagation, configuring page regeneration in background, and returning to the traditional system of page caching.

For information about asset caching, see Chapter 29, "inCache for Asset Caching."