Oracle9iAS Web Cache Administration and Deployment Guide
Release 2.0.0

Part Number A90372-04

2
Oracle Web Cache Concepts

This chapter explains how Oracle Web Cache is populated with content, how that content maintains consistency, and how dynamically generated content is cached.

This chapter contains these topics:

Populating Oracle Web Cache

Oracle Web Cache uses cacheability rules to determine which documents to cache. When cacheability rules for a particular URL are first configured, the documents that match the URL are not cached until a browser requests them. When the first request for a document comes in, Oracle Web Cache appends a Surrogate-Capability request-header field to the request. The Surrogate-Capability request-header field identifies that the request passed through the cache. Oracle Web Cache then sends the request to the application Web server. This is a cache miss. If the requested document is specified as one of the documents to cache, then Oracle Web Cache caches the document for subsequent requests. For a subsequent request for the document, Oracle Web Cache serves the document from its cache to the browser. This is a cache hit.
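The miss-then-hit sequence described above can be sketched in a few lines of Python. This is only an illustration of the flow, not Oracle Web Cache code: make_cache, origin, and the cacheable predicate are hypothetical names, and the rule "URLs ending in .htm are cacheable" is an assumption made for the example.

```python
from typing import Callable, Dict, Tuple


def make_cache(origin: Callable[[str], str],
               cacheable: Callable[[str], bool]) -> Callable[[str], Tuple[str, str]]:
    """Return a fetch(url) function that reports 'M' (miss) or 'H' (hit)."""
    store: Dict[str, str] = {}

    def fetch(url: str) -> Tuple[str, str]:
        if url in store:
            return store[url], "H"      # cache hit: served from the cache
        body = origin(url)              # cache miss: forwarded to the origin
        if cacheable(url):
            store[url] = body           # kept for subsequent requests
        return body, "M"

    return fetch


calls = []


def origin(url: str) -> str:
    """Stand-in for the application Web server (hypothetical)."""
    calls.append(url)
    return f"<html>{url}</html>"


fetch = make_cache(origin, cacheable=lambda url: url.endswith(".htm"))
first = fetch("/page1.htm")
second = fetch("/page1.htm")
```

Here the first request reaches the stand-in origin and the second is answered from the store, so the origin is contacted only once for a cacheable URL.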

When a browser sends a GET method request with an If-Modified-Since request-header field for a cached document, Oracle Web Cache compares the time stamp in that header with the Last-Modified response-header field of the cached document to determine if the document needs to be served. When the Last-Modified header does not exist, Oracle Web Cache uses the time the document entered the cache as the time stamp. If the cached document is more current than the time stamp sent by the browser, then Oracle Web Cache serves the cached document to the browser. If the cached document is not more current, then Oracle Web Cache sends a 304 (Not Modified) status code to the browser.
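As a rough sketch of this comparison, assuming time stamps are available as timezone-aware datetime values, the decision could look like the following. The function name conditional_status and its parameters are hypothetical; only the comparison rule comes from the text.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(if_modified_since: str,
                       last_modified: Optional[datetime],
                       cached_at: datetime) -> int:
    """Return 200 to serve the cached document, or 304 if the browser copy is current."""
    requested = parsedate_to_datetime(if_modified_since)
    # Fall back to the cache entry time when there is no Last-Modified value.
    reference = last_modified if last_modified is not None else cached_at
    return 200 if reference > requested else 304


ims = "Sat, 29 Oct 1994 19:43:31 GMT"
newer = datetime(1994, 10, 30, tzinfo=timezone.utc)
older = datetime(1994, 10, 1, tzinfo=timezone.utc)
```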

If a document contains a cookie, then Oracle Web Cache evaluates the cookie value of the browser request and application Web server response. If the values match and there is a corresponding cacheability rule, then Oracle Web Cache caches the response. Because a session value change does not necessarily indicate a change of state on the application Web servers, session cookie values are not evaluated. For documents that use these cookies, the response is cached, regardless of whether or not the cookie values match.


Notes:

  • You can populate the cache with the aid of the Apache Benchmark tool. See the Apache documentation for further information.

  • When you stop Oracle Web Cache, all objects are cleared from the cache. In addition, all statistics are cleared.

 

See Also:

"Caching Dynamically Generated Content" for an overview of cookies 

Request and Response Header Fields

For each document requested from the cache, Oracle Web Cache appends a Surrogate-Capability request-header field to the document's HTTP request message. The Surrogate-Capability request-header field enables Oracle Web Cache to identify to application Web servers the operations it is capable of performing. The Surrogate-Capability request-header field has the following syntax:

Surrogate-Capability: orcl="operation_value"

where "operation_value" is one of the following:

For documents sent to browsers, Oracle Web Cache adds cache hit or cache miss information to the Server response-header field of the HTTP response message:

Server: Oracle9iAS Web Cache/release M|H|S /Apache/release (operating_system)

where:

In the following example, the Server field specifies that the document was a cache miss:

Server: Oracle9iAS Web Cache/2.0.0.2.0 M /Apache/1.3.12 (Unix) (Red Hat/Linux)

Using this information, you can determine whether a request was served from the cache or the application Web server.
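A monitoring script could pull the status letter out of the Server field with a small parser such as the sketch below. The regular expression is an assumption derived only from the syntax line and the example above, and cache_status is a hypothetical helper name. The example confirms that M marks a cache miss; the sketch returns the letter as-is and does not interpret it further.

```python
import re
from typing import Optional

# Pattern inferred from: Oracle9iAS Web Cache/release M|H|S /Apache/release (os)
_SERVER_RE = re.compile(
    r"Oracle9iAS Web Cache/(?P<release>\S+)\s+(?P<code>[MHS])\s*/"
)


def cache_status(server_header: str) -> Optional[str]:
    """Return the status letter from a Server header value, or None if absent."""
    match = _SERVER_RE.search(server_header)
    return match.group("code") if match else None


example = "Oracle9iAS Web Cache/2.0.0.2.0 M /Apache/1.3.12 (Unix) (Red Hat/Linux)"
status = cache_status(example)
```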

Cache Freshness and Performance Assurance

Consistency and performance are crucial for the reliability of Oracle Web Cache. Invalidation and expiration ensure consistency between the cache and the application Web servers. With invalidation, an HTTP message is sent that specifies which documents to mark as invalid. With expiration, documents are marked as invalid after a certain amount of time in the cache. When documents are marked as invalid and a browser requests them, they are removed and then refreshed with new content from the application Web servers. You can choose to remove and refresh invalid documents immediately, or base the removal and refresh on the current load of the application Web servers.

Expirations are useful if it can be accurately predicted when content will change on an application Web server or database. An invalidation message is intended for less predictable, more frequently changing content.

One could logically assume that widespread cache invalidation or expiration would negatively impact performance of the application Web servers, resulting in the generation of HTTP 503 Server Busy errors to browsers. For this reason, Oracle Web Cache intelligently serves some of the documents stale until the application Web servers have the capacity to refresh them.

Oracle Web Cache provides minimal trade-off between performance and consistency through performance assurance heuristics that determine which documents can be served stale. These heuristics are based on a number of factors including:

Validity

Validity is based on the expiration time, invalidation time, and removal time of an object.

Oracle Web Cache calculates validity by comparing the current time relative to an object's expiration/invalidation time and the object's scheduled removal time. Prior to expiration/invalidation time, the object is considered valid. Between expiration/invalidation time and removal time, the object's validity level decreases linearly. During this interim state, objects with a higher validity level have a higher propensity to be served stale. When current time reaches removal time, the object is considered totally invalid and can no longer be served stale. Scheduled removal time is something that administrators can control. When expiring/invalidating content, administrators have the option to remove objects immediately, which may be necessary for sensitive objects that should never be served stale. Likewise, where some degree of inconsistency is tolerable, administrators can specify a removal time in the near future.
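The linear decrease can be written as a small function. The 1.0-to-0.0 scale and the name validity are assumptions made for illustration; the text states only that the level falls linearly between the expiration/invalidation time and the removal time.

```python
def validity(now: float, invalid_at: float, remove_at: float) -> float:
    """Validity level: 1.0 while valid, falling linearly to 0.0 at removal time."""
    if now < invalid_at:
        return 1.0                      # not yet expired or invalidated
    if now >= remove_at:
        return 0.0                      # totally invalid, never served stale
    # Interim state: linear decrease between the two times.
    return (remove_at - now) / (remove_at - invalid_at)
```

With invalid_at=10 and remove_at=20, an object is halfway through its interim state at now=15. Setting remove_at equal to invalid_at models immediate removal, where the object is never served stale.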

Popularity

Popularity is determined by:

  • The number of times the object has been requested since insertion into the cache

  • The number of recent requests for the object

Load of the Application Web Server

The current load on the application Web server is determined by the number of open connections from Oracle Web Cache to the application Web servers, that is, the total number of pending requests to the application Web servers.

Limit on the Application Web Server

The configured limit on the application Web server load is the configured number of concurrent connections the application Web server can safely handle.

Together, these factors provide Oracle Web Cache with a logical queue of content to update from the application Web servers.
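The guide does not spell out how the factors are combined, so the following is only a simplified sketch under stated assumptions: spare capacity is taken as the configured limit minus the current load, and invalid documents are refreshed in order of popularity. The name refresh_batch and the single popularity count are hypothetical, and validity is ignored here for brevity.

```python
from typing import Dict, List


def refresh_batch(popularity: Dict[str, int], load: int, limit: int) -> List[str]:
    """Pick which invalid documents to refresh now, most popular first."""
    slots = max(limit - load, 0)        # spare origin connections
    ordered = sorted(popularity, key=lambda url: popularity[url], reverse=True)
    return ordered[:slots]              # the rest wait and may be served stale


batch = refresh_batch({"/a": 5, "/b": 50, "/c": 20}, load=8, limit=10)
```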

Figure 2-1 illustrates how performance assurance heuristics are used during widespread invalidation.

Figure 2-1 Performance Assurance Heuristics Graph



Right after invalidation, the number of fresh documents served decreases to 20 documents per second. However, the number of fresh cache hits quickly increases over a short amount of time. This is because Oracle Web Cache refreshes the most popular documents first so that these documents have little chance of being served stale. Once the popular documents are refreshed, the less popular documents are refreshed. The total number of documents that can be revalidated in a given period of time depends on application Web server capacity. At the end of invalidation, only fresh content is served.


Note:

Performance assurance heuristics do not apply when you configure documents to be removed and refreshed immediately. 


Caching Dynamically Generated Content

Most Web pages today are dynamically generated before delivery to the browser. Web developers frequently use database-driven technologies like Java Server Pages (JSP), Active Server Pages (ASP), PL/SQL Server Pages (PSP), Java Servlets, and Common Gateway Interface (CGI) to design their applications. These technologies are used for complex Web sites, as they are easier to modify and maintain when information is stored in a database. Examples of pages that are dynamically generated include:

Because of invalidation, Oracle Web Cache knows what documents are valid and what documents are invalid. This is especially important for dynamically generated content that changes frequently.

Most static caches and content distribution services have no mechanism to verify the consistency of dynamically generated Web pages with the data sources used to create them. Therefore, it is difficult for these services to know when content has changed. Oracle Web Cache, on the other hand, receives invalidation messages from the application Web server that contains the original content.

For dynamically generated pages, browsers pass information about themselves to the application Web server, enabling the application Web server to serve appropriate content to the browser.

The HTTP protocol has a way for browsers and application Web servers to share information, such as session or category information, in message headers that browsers pass with every request to the application Web server. This message header can contain a cookie.

Cookies are stored on the browser's file system and are often used for identifying users who revisit Web sites. Many users choose to disable cookies in their browsers out of privacy concerns. For this reason, application Web servers often embed parameter information in the URL. Oracle Web Cache accepts requests that use the following characters as delimiters for embedded URL parameters: ampersand (&), dollar sign ($), or semi-colon (;).


Note:

Examples in this guide use ampersand (&) as the delimiter. 
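To illustrate the three delimiters, the sketch below splits the query portion of a URL on ampersand, dollar sign, or semicolon. The helper name embedded_params is hypothetical, the second sample URL is made up for the example, and no URL decoding is attempted.

```python
import re
from typing import Dict


def embedded_params(url: str) -> Dict[str, str]:
    """Split embedded URL parameters on '&', '$', or ';' into a name/value map."""
    _, _, query = url.partition("?")
    params: Dict[str, str] = {}
    for pair in re.split(r"[&$;]", query):
        if not pair:
            continue                    # skip empty pieces such as a trailing '&'
        name, _, value = pair.partition("=")
        params[name] = value
    return params


store_params = embedded_params(
    "http://store.oracle.com/cec/cstage?eccookie=&ecsid=1225&ecaction=ecproditemlistbysupersect"
)
mixed_params = embedded_params("/page?a=1;b=2$c=3")
```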


Oracle Web Cache is able to recognize both cookies and embedded URL parameters, enabling it to recognize cacheability rules for pages with:

Multiple Versions of the Same Document

Some pages have multiple versions, enabling categorization. Figure 2-2 shows the same document, http://store.oracle.com/cec/cstage?eccookie=&ecsid=1225&ecaction=ecproditemlistbysupersect&template=decsectview_mp.en.htm, with different prices for customers and internal Oracle employees. While customers pass a cookie name and value of ec-400-id-acctcat=WALKIN, employees pass a cookie name and value of ec-400-id-acctcat=CUSTOMER.

Figure 2-2 Multiple-Version Document



You can configure Oracle Web Cache to recognize and cache multiple-version pages by using the:

For those documents that use a cookie (sometimes referred to as a category cookie), you set cacheability rules that specify the cookie name and whether to cache versions of the document that do not use the cookie.

When a browser sends a request to an application Web server for a multiple-version document and the value of the browser's cookie matches the value of the application Web server's response, the version of the document is cached. If the cookie values do not match, then the response is not cached. Once versions of the document are cached, Oracle Web Cache uses the value of the cookie in the browser's request to serve the appropriate version of the document to the browser.

Table 2-1 shows four different versions of the same URL, http://www.dot.com/page1.htm. The URL uses a cookie named user_type, which supports browser requests that contain cookie values of Customer, Internal, and Promotional. You can configure Oracle Web Cache to recognize the user_type cookie, enabling Oracle Web Cache to cache three different documents. In addition, you can configure Oracle Web Cache to cache a fourth document for those requests that do not use a cookie.

Table 2-1 Multiple-Version Document with Different Cookie Values

Version  URL                           Cookie Name/Value
1        http://www.dot.com/page1.htm  user_type=Customer
2        http://www.dot.com/page1.htm  user_type=Internal
3        http://www.dot.com/page1.htm  user_type=Promotional
4        http://www.dot.com/page1.htm  No cookie
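One way to picture multiple-version caching is a store keyed by the URL together with the category cookie value. The sketch below follows the matching rule described above. VersionedCache and its method names are hypothetical and do not reflect the actual Oracle Web Cache configuration interface.

```python
from typing import Dict, Optional, Tuple


class VersionedCache:
    """Cache keyed by (URL, value of one category cookie)."""

    def __init__(self, cookie_name: str, cache_without_cookie: bool = True):
        self.cookie_name = cookie_name
        self.cache_without_cookie = cache_without_cookie
        self._entries: Dict[Tuple[str, Optional[str]], str] = {}

    def store(self, url: str, request_cookies: Dict[str, str],
              response_cookies: Dict[str, str], body: str) -> bool:
        requested = request_cookies.get(self.cookie_name)
        if requested is None:
            if not self.cache_without_cookie:
                return False            # rule excludes the no-cookie version
        elif requested != response_cookies.get(self.cookie_name):
            return False                # cookie values differ: do not cache
        self._entries[(url, requested)] = body
        return True

    def lookup(self, url: str, request_cookies: Dict[str, str]) -> Optional[str]:
        return self._entries.get((url, request_cookies.get(self.cookie_name)))


url = "http://www.dot.com/page1.htm"
cache = VersionedCache("user_type")
stored_customer = cache.store(url, {"user_type": "Customer"},
                              {"user_type": "Customer"}, "customer page")
stored_mismatch = cache.store(url, {"user_type": "Internal"},
                              {"user_type": "Customer"}, "wrong page")
stored_generic = cache.store(url, {}, {}, "generic page")
```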

For those documents that use HTTP request headers, you set cacheability rules that specify the HTTP request header whose values to use for disambiguation. HTTP request headers enable Web browsers to pass additional information about the request and about themselves. Oracle Web Cache uses the header to serve the appropriate version of the URL to browsers.

Table 2-2 lists the standard HTTP request-header fields supported.

Table 2-2 HTTP Request-Header Fields
Header Field     Description

Accept           Specifies which media types are acceptable for the response
                 Example: Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*

Accept-Charset   Specifies which character sets are acceptable for the response
                 Example: Accept-Charset: iso-8859-1,*,utf-8

Accept-Encoding  Restricts the content-encodings that are acceptable in the response
                 Example: Accept-Encoding: gzip

Accept-Language  Specifies the set of languages that are preferred as a response
                 Example: Accept-Language: en

User-Agent       Contains information about the Web browser that initiated the request
                 Example: User-Agent: Mozilla/4.61 [en] (WinNT; U)


Note:

Oracle Web Cache does not interpret the values of these HTTP request headers. If the values for two pages are different, Oracle Web Cache caches both pages separately. For example, if one request sends an HTTP request-header field of User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0) and another request sends an HTTP request-header field of User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt) for different versions of Internet Explorer, Oracle Web Cache caches and serves two separate pages for the two requests.


Personalized Attributes

Many Web sites support pages with personalized attributes, such as personalized greetings like "Hello, Name," icons, addresses, or shopping cart snippets, on an otherwise generic page. You can configure the page with the personalized attribute information contained within the HTML tags <!-- WEBCACHETAG--> and <!-- WEBCACHEEND-->, which Oracle Web Cache can process.

Oracle Web Cache processes these tags and caches the instructions for substituting values for personalized attributes based on the information contained within a cookie or an embedded URL parameter.

This functionality enables Oracle Web Cache to use the same page for multiple users. Because only one page needs to be cached, only one application Web server request is required to initially populate the cache with the page. The initial request sets the personalized attribute cookie or embedded URL parameter. All subsequent requests for the page that pass the cookie or embedded URL parameter are served from the cache.

Figure 2-3 shows two users, Jane Doe and John Doe, accessing the same page, http://store.oracle.com/cec/cstage?eccookie=&ecaction=ecpassthru2&template=walkin1.en.htm. This page contains a personalized greeting suited for the user. The personalized greeting for Jane Doe uses the following HTML code:

<b>
<!-- WEBCACHETAG="person01"-->
Jane Doe
<!-- WEBCACHEEND-->
</b>

The personalized greeting for John Doe uses the following HTML code:

<b>
<!-- WEBCACHETAG="person01"-->
John Doe
<!-- WEBCACHEEND-->
</b>

person01 represents the session name assigned to the person_name cookie that Jane and John pass to Oracle Web Cache. Jane passes a cookie name value pair of person_name=Jane Doe and John Doe passes a cookie name value pair of person_name=John Doe. When Oracle Web Cache receives the cookie information from Jane and John, it maps the person_name cookie to the person01 session name and substitutes the cookie value.

If, instead of cookies, the page supported embedded URL parameters, then the URL would contain the person_name parameter. For example, the page for Jane Doe could be http://store.oracle.com/cec/cstage?person_name=Jane+Doe and the page for John Doe could be http://store.oracle.com/cec/cstage?person_name=John+Doe. Oracle Web Cache is configured with the person01 session, which maps to the person_name embedded URL parameter. Oracle Web Cache uses the value of the embedded parameter to substitute the appropriate name.

Figure 2-3 Page with a Personalized Attribute



If a request does not contain the cookie or embedded URL parameter, Oracle Web Cache substitutes the personalized attribute with a default string. If you instead want to always require a value for the personalized attribute, then set a session-related caching rule and require that the request get the cookie or embedded URL parameter settings from the application Web server.
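The substitution step can be approximated with a regular expression over the cached page. This is a sketch under assumptions: personalize and the session_to_cookie mapping are hypothetical names, the default string "Guest" is invented for the example, and the served output is assumed to drop the marker tags.

```python
import re
from typing import Dict

_TAG_RE = re.compile(
    r'<!-- WEBCACHETAG="(?P<session>[^"]+)"-->.*?<!-- WEBCACHEEND-->',
    re.DOTALL,
)


def personalize(cached_page: str, session_to_cookie: Dict[str, str],
                cookies: Dict[str, str], default: str = "") -> str:
    """Replace each tagged region with the requesting user's cookie value."""

    def replace(match: "re.Match[str]") -> str:
        cookie_name = session_to_cookie.get(match.group("session"), "")
        # Fall back to the default string when the cookie is absent.
        return cookies.get(cookie_name, default)

    return _TAG_RE.sub(replace, cached_page)


page = '<b>\n<!-- WEBCACHETAG="person01"-->\nJane Doe\n<!-- WEBCACHEEND-->\n</b>'
mapping = {"person01": "person_name"}
for_john = personalize(page, mapping, {"person_name": "John Doe"}, default="Guest")
for_anonymous = personalize(page, mapping, {}, default="Guest")
```

The single cached copy, originally generated for Jane Doe, is reused for John Doe and for a request that carries no cookie.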

See Also:

"Configuring Rules for Pages with Simple Personalization" 

Session Information

Some Web sites keep track of user sessions by assigning each user a unique session ID. Session IDs are typically used for Web sites with catalog pages. The session ID can be used for either session tracking or session-encoded URLs.

When a user first accesses a Web site that uses session IDs, Oracle Web Cache passes the request to the application Web server to establish the session. In turn, the application Web server assigns the user a session ID through a cookie (sometimes referred to as a session cookie) or embeds it in the URL as a parameter. As users request pages that use session cookies or embedded URL parameters, the application Web server tracks the sessions. You can configure Oracle Web Cache to serve pages that support session tracking and session-encoded URLs.

Session Tracking

Because session tracking does not alter the actual content of a page, you can configure Oracle Web Cache to cache the page and serve it to multiple users.

If you configure Oracle Web Cache to cache a page that uses session information and a subsequent request for the page contains a session cookie or embedded URL parameter, then Oracle Web Cache serves the page with the user's session information from its cache.

To better understand how session tracking works, consider the HTML pages shown in Figure 2-3. When Jane Doe and John Doe first access the Oracle Store Web site, their initial requests are sent to the application Web server, which assigns them cookie name value pairs of session_ID=33436 and session_ID=33437, respectively. If their browsers did not support cookies, then the URL for the pages could contain the session ID. For example, the page for Jane Doe could be http://store.oracle.com/cec/cstage?session_ID=33436 and the page for John Doe could be http://store.oracle.com/cec/cstage?session_ID=33437. Oracle Web Cache can be configured to cache one version of this page and other session tracking pages and serve it to multiple users. By using the value of the session_ID cookie or embedded URL parameter, Oracle Web Cache can serve the same page to both Jane Doe and John Doe.

Unlike category cookies used for multiple versions of the same URL, Oracle Web Cache ignores the values of session cookies. The response from the application Web server is cached, even if the response session cookie value does not match the request session cookie value. If you do not want the response cached when there is a value mismatch, then modify the application to instead send a non-200 status code as the response.

See Also:

"Configuring Rules for Pages with Session Tracking" 

Session-Encoded URLs

You can configure Oracle Web Cache to cache the instructions for substituting session information for one user with another based on the session information contained within a cookie or an embedded URL parameter.

Continuing with the example in "Session Tracking", assume that Jane Doe and John Doe are again assigned cookies or embedded URL parameters of session_ID=33436 and session_ID=33437 by the application Web server. The page shown in Figure 2-4 has several <A HREF=...> links that include the session_ID cookie or parameter. The Release 3 (8.1.7) link under the Oracle8i Documentation heading for Jane Doe uses the following HTML code:

<A HREF="/cec/cstage?ecaction=ecproditemlistbysupersect&
ecsid=20330&eccookie=&template=decsectview_pub.en.htm&
session_ID=33436">Release 3 (8.1.7)</A>

The same link for John Doe uses the following HTML code:

<A HREF="/cec/cstage?ecaction=ecproditemlistbysupersect&
ecsid=20330&eccookie=&template=decsectview_pub.en.htm&
session_ID=33437">Release 3 (8.1.7)</A>

By using the value of the session_ID cookie or embedded URL parameter, Oracle Web Cache is able to substitute the correct session information for Jane Doe and John Doe.
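A minimal sketch of this substitution, assuming the cached page still carries the session value of whichever user populated the cache, rewrites every occurrence of the parameter in the page's links. The function rewrite_session_links is a hypothetical name, and the link below is shortened from the example above.

```python
import re


def rewrite_session_links(cached_page: str, param: str, session_value: str) -> str:
    """Replace the value of one session parameter wherever it appears in links."""
    # Match 'param=value', where the value ends at a delimiter, quote, or space.
    pattern = re.compile(r"\b" + re.escape(param) + r'=[^&$;"\s>]*')
    return pattern.sub(lambda m: f"{param}={session_value}", cached_page)


cached = '<A HREF="/cec/cstage?ecsid=20330&session_ID=33436">Release 3 (8.1.7)</A>'
for_john = rewrite_session_links(cached, "session_ID", "33437")
```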

Figure 2-4 Session-Encoded URLs


Whereas a session tracking page requires that each user establish a session with the application Web server, a page with session-encoded URLs requires only the initial request to establish a session. Once the cache is populated with the page, other requests are served from the cache, regardless of whether the request has a session cookie or embedded URL parameter. This has a twofold effect for those requests without the session cookie or embedded URL parameter:

If you want to instead require session establishment, then set a session-related caching rule.

See Also:

"Configuring Rules for Pages with Simple Personalization" 

Content Assembly and Partial Page Caching

Oracle Web Cache provides dynamic assembly of Web pages with both cacheable and non-cacheable page fragments. It provides for assembly by enabling Web pages to be broken down into fragments of differing cacheability profiles. These fragments are each maintained as separate elements in the application Web server or content delivery network. The fragments are assembled into HTML pages as appropriate when requested by end users.

By enabling dynamic assembly of Web pages on Oracle Web Cache rather than on the application Web servers, you can choose to cache some of the fragments of assembled pages. This means that much more HTML content can be cached, then assembled and delivered by Oracle Web Cache when requested. Furthermore, page assembly can be conditional, based on information provided in HTTP request headers or end-user cookies.

Page Assembly Components

The basic structure a content provider uses to create dynamic content is a template page containing HTML fragments. As depicted in Figure 2-5, the template consists of common elements, such as a logo, navigation bars, framework, and other "look and feel" elements of the page. The HTML fragments represent dynamic subsections of the page.

Figure 2-5 Template Page



The template page is associated with the URL that end users request. To include the HTML fragments, the template page is configured with Edge Side Includes (ESI) markup tags that tell Oracle Web Cache to fetch and include the HTML fragments. The fragments themselves are HTML files containing discrete text or other objects.

Each included fragment is a separate object with its own cacheability rule. Content providers may want to cache the template for several days, but only cache a particular fragment, such as an advertisement or stock quote, for a matter of seconds or minutes. Other fragments (such as a user's bank account total) may be declared non-cacheable.
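The idea of per-fragment cacheability can be sketched as a tiny assembler that expands include tags and keeps each fragment for its own lifetime. Everything here is illustrative: FragmentCache, assemble, the fragment paths, and the 60-second lifetime are assumptions, and only the simplest self-closing form of the include tag is handled.

```python
import re
from typing import Callable, Dict, Tuple

_INCLUDE_RE = re.compile(r'<esi:include\s+src="(?P<src>[^"]+)"\s*/>')


class FragmentCache:
    """Fetch fragments through a callable and cache each for its own lifetime."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: Dict[str, float]):
        self.fetch = fetch
        self.ttl_seconds = ttl_seconds          # missing entry: non-cacheable
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, src: str, now: float) -> str:
        entry = self._entries.get(src)
        if entry is not None and now < entry[1]:
            return entry[0]                     # still fresh
        body = self.fetch(src)
        ttl = self.ttl_seconds.get(src, 0)
        if ttl > 0:
            self._entries[src] = (body, now + ttl)
        return body


def assemble(template: str, fragments: FragmentCache, now: float) -> str:
    """Replace each include tag in the template with its fragment body."""
    return _INCLUDE_RE.sub(lambda m: fragments.get(m.group("src"), now), template)


fetched = []


def fetch(src: str) -> str:
    """Stand-in for a request to the application Web server (hypothetical)."""
    fetched.append(src)
    return f"[{src}]"


template = '<h3>Quotes</h3><esi:include src="/quote.jsp" /><esi:include src="/balance.jsp" />'
fragments = FragmentCache(fetch, ttl_seconds={"/quote.jsp": 60})
pages = [assemble(template, fragments, now) for now in (0, 30, 90)]
```

Across the three assemblies, the quote fragment is fetched twice (at 0 and again after its 60 seconds lapse), while the non-cacheable balance fragment is fetched every time.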

Table 2-3 provides a summary of the main ESI tags.

Table 2-3 Summary of ESI Tags

Tag              Description
<esi:include>    Includes an HTML fragment
<esi:choose>     Performs conditional processing based on Boolean expressions
<esi:try>        Specifies alternate processing when a request fails because the application Web server is not accessible
<esi:vars>       Permits variable substitution for environment variables
<esi:remove>     Specifies non-ESI markup if ESI processing is not enabled
<!--esi ... -->  Specifies content to be processed

Figure 2-6 shows the ESI markup language for the template page shown in Figure 2-5.

Figure 2-6 ESI Markup

<html>
<head>
<title>
Company.com
</title>
</head>
<body>
...
<!-- The following HTML comment tag with an immediate following 'esi' is a 
special ESI tag that is removed if and only if this page is processed by an ESI 
processor. -->
<!--esi

 <esi:comment text="This is the HTML source when ESI is enabled." />

 <esi:comment text="Start: The quick link section. You cannot use the standard   
HTML comments because the end of that comment tag would disrupt the HTML comment 
tag with 'esi' following the two '-'. " />

 <esi:comment text="The URI query string parameter 'sessionID' is used to carry 
session identifiers. The session ID is encoded in all links. 'type' is used to 
categorize this user."/>
 <esi:vars>
   <a href="/shopping.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">
     <img src="/img/shopping.gif">
   </a>
   <a href="/news.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">
     <img src="/img/news.gif">
   </a>
   <a href="/sports.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">
     <img src="/img/sports.gif">
   </a>
   <a href="/fun.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">
     <img src="/img/fun.gif">
   </a>
   <a href="/about.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">
     <img src="/img/about.gif">
   </a>
 </esi:vars>

 <esi:comment text="End: The quick link section" />
...
 <h3>Local Weather</h3>
 <esi:include src="/weather.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})" />
...

 <h3>Stock Quotes</h3>
 <esi:try>
   <esi:attempt>
     <esi:include src="/CompanyStack.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})" />
   </esi:attempt>
   <esi:except>
     The company stock quote is temporarily unavailable.
   </esi:except>
 </esi:try>
...

 <h3>What's New at Company</h3>
 <esi:comment text="This section is a static file that does not carry session information" />
 <esi:include src="/whatisnew.html" />

...

 <h3>Today's News</h3>
 <esi:choose>

   <esi:when test="$(QUERY_STRING{type}) == 'Sport'">
     <h4>Sport News</h4>
     <esi:include src="/SportNews.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})" />
   </esi:when>

   <esi:when test="$(QUERY_STRING{type}) == 'Career'">
     <h4>Financial News</h4>
     <esi:include src="/FinancialNews.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})" />
   </esi:when>

   <esi:otherwise>
     <h4>General News</h4>
     <esi:include src="/DefaultNews.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})" />
   </esi:otherwise>

 </esi:choose>

...

-->

<!-- This is the HTML source when ESI is disabled. -->
<esi:remove>
Alternative HTML source that does not use ESI goes here. This tag enables you 
to disable ESI on the fly without redeveloping or redeploying a different home 
page. 
</esi:remove>

</body>
</html>
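To see what the $(QUERY_STRING{...}) variables in Figure 2-6 expand to, the following sketch substitutes them from a request's query string. It models only this one variable form and is not an ESI processor. The helper name expand_query_vars is hypothetical, and expanding a missing parameter to an empty string is an assumption.

```python
import re
from urllib.parse import parse_qs

_VAR_RE = re.compile(r"\$\(QUERY_STRING\{(?P<name>\w+)\}\)")


def expand_query_vars(markup: str, query_string: str) -> str:
    """Replace $(QUERY_STRING{name}) with the named query-string value."""
    params = parse_qs(query_string, keep_blank_values=True)
    # Use the first value for a name; a missing name becomes an empty string.
    return _VAR_RE.sub(lambda m: params.get(m.group("name"), [""])[0], markup)


link = '<a href="/news.jsp?sessionID=$(QUERY_STRING{sessionID})&type=$(QUERY_STRING{type})">'
expanded = expand_query_vars(link, "sessionID=33436&type=Sport")
```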

ESI Features

ESI can be used with HTML, XML, and any Web programming technology. The ESI language includes the following features:

ESI for Java (JESI)

Edge Side Includes for Java (JESI) is a specification and custom JSP tag library that developers can use to automatically generate ESI code using JSP syntax. Even though JSP developers can always use ESI, JESI provides an even easier way for JSP developers to express the modularity of pages and the cacheability of those modules, without requiring developers to learn a new syntax.

Copyright © 2001 Oracle Corporation.

All Rights Reserved.