ALI Portlet Development

Portlet Caching

The importance of caching cannot be overstated. If every portlet had to be freshly generated for each request, portal performance could become unacceptably slow. Portlets should not require heavy or frequent processing. Caching is the functionality that allows the Portal Server to request portlet content, save this content, and return the saved content to users when appropriate. Caching is indexed on the settings sent by the portlet.

Caching on the Portal Server can be set in two ways: programmatically through HTTP headers and/or using the administrative settings in the Portlet Web Service Editor. You should always implement caching programmatically, although the administrator can still choose to override caching through administrative settings.

This page explains how caching should be implemented in portlet development. For a full explanation of HTTP caching, see RFC 2616 (http://www.w3.org/Protocols/rfc2616/rfc2616.html).

Portlet Caching Framework

Caching is the temporary storage of Web objects (such as HTML documents) for later retrieval.  There are three significant advantages to caching:

Efficient caching makes every Web application faster and less expensive. The only portlets that should not be cached are those in which data must be continuously updated.

The Portal Server relies on caching to improve performance. Portlet content is cached by the Portal Server and returned when later requests match the cache’s existing settings.

Note: By default, the page ID is not factored in when content is added to the Portal Server cache, so you could get a cached copy of a portlet from page A while on page B. You can choose to add the page ID to the cache key by sending it to the portlet. For details, see Portlet Cache Key below.

How Does Portlet Caching Work?

When the Portal Server processes a request for a portal page, it looks individually at each portlet on the page and checks it against the cache:

  1. The Portal Server assembles a cache key used to uniquely identify each portlet in the cache.

  2. The Portal Server checks the cache for a matching cache key entry:

  3. The response comes back from the remote server; the Portal Server checks for caching headers:

The portal caches gatewayed content to complement, not replace, browser caching. Public content is accessible to multiple users without any user-specific information (based on HTTP headers). The gateway calculates the cache headers sent to the browser to ensure that the content is properly cached on the client side.

In version 5.0 and above, the portlet cache contains sections of finished markup and sections of markup that require further transformation. Post-cache processing means content can be more timely and personalized. Adaptive tags enable certain portlets (e.g., Community banners) to be cached publicly for extended periods of time and yet contain user-specific and page-specific information, as well as the current date and time.

Portlet Cache Key

The cache key for a Portlet entry consists of the following:

The data below can be added to the cache key by setting options in the Web Service editor (on the Advanced Settings page).

The data below is deliberately not included in the cache key:

Caching and Gatewayed Content

The Portal Server caches all text (i.e., nonbinary) content returned by GET requests. Even if gateway caching is disabled (via PTSpy), portlet caching still takes place.

Gatewayed content can be cached by a proxy server or by the user’s browser. Beware browser caching of gatewayed content; it is a good idea to clear your browser cache often during development. (In Internet Explorer, select Tools | Internet Options and click the Delete Files button on the General tab under Temporary Internet Files.)

An incorrectly set Expires header can cause browsers to cache gatewayed content. See HTTP Headers below for details on using the Expires header.

Creating a Caching Strategy

Portlet caching is controlled both by the programmer and by the administrator who registers the Portlet object using the Portlet editor. A portlet’s caching strategy should take all possibilities into account and use the most efficient combination for its specific functionality.

A portlet that takes too long to generate can degrade the performance of every My Page that displays it, so each portlet needs a caching strategy tailored to its functionality.

These questions can help you determine the appropriate caching strategy:

Determine how often portlet content must be updated, dependent on data update frequency and business needs. Find the longest time interval between data refreshes that will not negatively affect the validity of the content or the business goals of the portlet.

Since caching is indexed on the settings used by a portlet, new content is always requested when settings change (assuming that no cached content exists for that combination of settings).
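
The behavior described above can be pictured with a small sketch. This is a simplified model, not the Portal Server's implementation: the class name, the key format, and the idea of building the key from a portlet ID plus sorted settings are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Hypothetical sketch: a cache indexed on portlet settings.
final class SettingsKeyedCache {
    private final Map<String, String> entries = new HashMap<>();
    private int remoteCalls = 0;

    // Illustrative key: portlet ID plus the settings in sorted order, so that
    // equal settings always produce the same key.
    static String cacheKey(int portletId, Map<String, String> settings) {
        return portletId + "|" + new TreeMap<>(settings);
    }

    // Returns cached content for the key; calls the remote portlet only on a miss.
    String getContent(int portletId, Map<String, String> settings,
                      Supplier<String> remotePortlet) {
        String key = cacheKey(portletId, settings);
        String cached = entries.get(key);
        if (cached != null) {
            return cached;
        }
        remoteCalls++;
        String fresh = remotePortlet.get();
        entries.put(key, fresh);
        return fresh;
    }

    int remoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) {
        SettingsKeyedCache cache = new SettingsKeyedCache();
        Map<String, String> celsius = Map.of("units", "C");
        Map<String, String> fahrenheit = Map.of("units", "F");
        cache.getContent(42, celsius, () -> "<div>21 C</div>");
        cache.getContent(42, celsius, () -> "<div>21 C</div>");    // served from cache
        cache.getContent(42, fahrenheit, () -> "<div>70 F</div>"); // new settings, new request
        System.out.println("Remote calls: " + cache.remoteCalls());
    }
}
```

The second request with identical settings never reaches the remote portlet, while the request with changed settings does. This is why a portlet whose output varies by setting can usually still be cached.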

There are two common situations in which a developer might mistakenly decide that a portlet cannot be cached:

Configuring Cache Settings

There are many ways to control caching on the Portal Server. The sections that follow explain basic HTTP cache headers and portal settings. For sample code that shows how to implement caching within the structure of an application, see the sample code on the ALUI Developer Center on dev2dev. Note: In ALI 6.0, portlet caching cannot exceed 15 minutes.

HTTP Headers

RFC 2616 provides thorough guidelines and requirements for HTTP caching. There are two major types of HTTP caching headers: Expiration and ETag/Last-Modified.

Expires

The Expires header specifies when content will expire, or how long content is “fresh.” After this time, the Portal server will always check back with the remote server to see if the content has changed.

Most Web servers allow setting an absolute time to expire, a time based on the last time that the client saw the object (last access time), or a time based on the last time the document changed on your server (last modification time).

In JSP, setting caching to forever using the Expires header is as simple as using the code that follows:

JSP

<%
response.setDateHeader("Expires",Long.MAX_VALUE);
%>
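
Most portlets need a finite lifetime rather than "forever". The sketch below shows one way to compute the value passed to response.setDateHeader for a lifetime measured from the current time, and what that value looks like as an HTTP date. The class and method names are hypothetical, and the 15-minute lifetime is only an example.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for computing Expires values.
final class ExpiresHeaderSketch {
    // HTTP dates are always expressed in GMT with a two-digit day.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    // Milliseconds since the epoch for the moment `lifetime` after `now`;
    // this is the kind of value response.setDateHeader("Expires", ...) expects.
    static long expiresMillis(Instant now, Duration lifetime) {
        return now.plus(lifetime).toEpochMilli();
    }

    // The text form of that moment as it would appear in the header.
    static String httpDate(long epochMillis) {
        return HTTP_DATE.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long expires = expiresMillis(Instant.now(), Duration.ofMinutes(15));
        // In a JSP: response.setDateHeader("Expires", expires);
        System.out.println("Expires: " + httpDate(expires));
    }
}
```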

The .NET Framework’s System.Web.HttpCachePolicy class provides a range of methods to handle caching, but it can also be used to set HTTP headers explicitly (see MSDN for API documentation: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemwebhttpcachepolicyclasssetexpirestopic.asp).

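The Response.Cache.SetExpires method allows you to set the Expires header in a number of ways. The following code snippet sets it one year ahead, which RFC 2616 treats as “never expires.” (DateTime.AddYears accepts at most 10,000 years; a larger argument throws an ArgumentOutOfRangeException.)

.NET

Response.Cache.SetExpires(DateTime.Now.AddYears(1));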

Note: In .NET, the Web Form page (.aspx) can also use standard ASP methods to set HTTP headers.

Don’t use Expires = 0 to prevent caching. The Expires header is sent by the remote server and passed through to the browser by the Portal Server. Unless the time on all three machines is synchronized, an Expires=0 header can mistakenly return cached content. For example, imagine the following situation:

Assume the remote server is three minutes ahead of the user's machine (i.e., it's 12:03 on remote server when it is 12:00 for the user). The user views a portlet page at 12:00 (client time) and gets an Expires = 0 header from the remote server, telling content to expire immediately, at 12:03 (remote server time). However, the browser on the user's machine caches the page, since from the browser's standpoint, the expires date is in the future. When the user looks at the same page again, at 12:01, the browser shows the cached copy, not the new page.

This situation commonly happens with gatewayed preference pages. If the time is always synchronized between the Portal Server, remote server, and the user's machine, none of this will happen. However, that cannot be realistically achieved; even if time on Portal Server and remote server is synchronized, time on the user's machine might still not be accurate.
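
The clock-skew scenario can be reproduced with a few lines of arithmetic. The sketch below is a deliberately simplified model of the browser's freshness check: it compares only the Expires time with the browser's own clock and ignores any correction a real browser might apply. The class and method names are hypothetical.

```java
import java.time.Instant;

// Hypothetical model of the scenario: remote server clock is three minutes ahead.
final class ClockSkewSketch {
    // Simplified freshness rule: content is fresh while the browser's clock
    // is still before the Expires time.
    static boolean browserTreatsAsFresh(Instant expires, Instant browserNow) {
        return browserNow.isBefore(expires);
    }

    public static void main(String[] args) {
        Instant clientNoon = Instant.parse("2003-01-01T12:00:00Z"); // 12:00 on the user's machine
        Instant remoteNow = clientNoon.plusSeconds(180);            // 12:03 on the remote server
        Instant secondVisit = clientNoon.plusSeconds(60);           // 12:01 on the user's machine

        // "Expire immediately" stamped with the remote server's clock.
        boolean staleShown = browserTreatsAsFresh(remoteNow, secondVisit);
        // A fixed date that is in the past on every machine.
        boolean staleShownWithFixedDate = browserTreatsAsFresh(Instant.EPOCH, secondVisit);

        System.out.println("Expires = remote 'now': cached copy shown = " + staleShown);
        System.out.println("Expires = Jan 1, 1970: cached copy shown = " + staleShownWithFixedDate);
    }
}
```

Under this model the first check is true (the browser shows the cached page at 12:01) and the second is false, because a date decades in the past is unaffected by a skew of a few minutes.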

To solve this problem, instead of using Response.Expires = 0, set the Expires header to a fixed date that is definitely in the past. For example, set the date to January 1, 1970. In JSP, the response.setDateHeader method takes a header name and a long value that specifies the date in milliseconds since midnight GMT on January 1, 1970. The following code expires the page on January 1, 1970.

JSP

<%
// 0 milliseconds since the epoch is midnight GMT on January 1, 1970
response.setDateHeader("Expires", 0L);
%>

.NET

Response.Cache.SetExpires(DateTime.Parse("01/01/1970"));

Cache-Control

The Cache-Control header can be used to expire content immediately or disable caching altogether. The value of this header determines whether cached portlet content can be shared among different users.

In JSP, use the setHeader method to configure the Cache-Control header:

JSP

<%
response.setHeader("Cache-Control","public");
%>

The example below expires the content immediately using the maximum age header.

JSP

<%
response.setHeader("Cache-Control","max-age=0");
%>
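
Cache-Control directives can be combined in one header value, separated by commas. The sketch below assembles such a value from three choices. The helper and its parameters are hypothetical and cover only the directives discussed on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds a Cache-Control header value.
final class CacheControlSketch {
    // shared = true marks content as cacheable for all users ("public");
    // false restricts it to a single user ("private").
    static String cacheControl(boolean shared, long maxAgeSeconds, boolean mustRevalidate) {
        if (maxAgeSeconds < 0) {
            throw new IllegalArgumentException("max-age must not be negative");
        }
        List<String> directives = new ArrayList<>();
        directives.add(shared ? "public" : "private");
        directives.add("max-age=" + maxAgeSeconds);
        if (mustRevalidate) {
            directives.add("must-revalidate");
        }
        return String.join(", ", directives);
    }

    public static void main(String[] args) {
        // In a JSP: response.setHeader("Cache-Control", cacheControl(true, 600, false));
        System.out.println(cacheControl(true, 600, false));
    }
}
```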

In the .NET Framework, the Cache-Control header is accessed through the System.Web.HttpCachePolicy class. To set the header to public, private or no-cache, use the Response.Cache.SetCacheability method:

.NET

Response.Cache.SetCacheability(HttpCacheability.Public);

To set a maximum age for content in .NET, use the Response.Cache.SetMaxAge method. The example below expires the content immediately.

.NET

TimeSpan ts = new TimeSpan(0,0,0);
Response.Cache.SetMaxAge(ts);

To set the header to must-revalidate in .NET, use the Response.Cache.SetRevalidation method.

.NET

Response.Cache.SetRevalidation(HttpCacheRevalidation.AllCaches);

Pragma

A value of no-cache in the Pragma header prevents caching only when used over a secure connection. However, a Pragma=no-cache tag is treated identically to Expires=-1 if used in a nonsecure page; the page is cached, but immediately expired.

Note: There are some known issues in IE with disabling caching using the Pragma tag, due to browser buffering. See the Microsoft Knowledge Base for details.

Last-Modified

The Last-Modified response header specifies the last time a change was made in the returned content, in the form of a time stamp. When an object stored in the cache includes a Last-Modified header, the Portal Server can use this value to ask the remote server if the object has changed since the last time it was seen. The Portal Server sends the value from the Last-Modified header to the remote server in the If-Modified-Since request header. The portlet code on the remote server uses this header to determine if the content being requested has changed since this date, and responds with either fresh content or a 304 Not Modified response. If the Portal Server receives the latter, it displays the cached content.

Java servlets can support this exchange by overriding the getLastModified(HttpServletRequest req) method provided by the Java class HttpServlet; the default GET handling in HttpServlet compares the returned time with the If-Modified-Since request header and sends a 304 Not Modified response when the content has not changed. JSP portlets can read the header value directly with request.getDateHeader("If-Modified-Since").

In .NET, the Response.Cache.SetLastModified method allows you to set the Last-Modified header to the date of your choice. Alternately, the SetLastModifiedFromFileDependencies method sets the header based on the time stamps of the handler’s file dependencies.

.NET

Response.Cache.SetLastModified(DateTime.Now);

Note: In order for validators such as the Last-Modified header to work, the server must generate valid Last-Modified headers and the server clock must be reliable.
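
The decision a portlet makes when it receives an If-Modified-Since header can be sketched as a single comparison. The class and method names below are hypothetical. The -1 convention for a missing header matches what HttpServletRequest.getDateHeader returns.

```java
// Hypothetical sketch of the server-side Last-Modified check.
final class LastModifiedSketch {
    static final int SC_OK = 200;
    static final int SC_NOT_MODIFIED = 304;

    // lastModifiedMillis: when the content last changed.
    // ifModifiedSinceMillis: header value in milliseconds, or -1 if the header is absent.
    static int statusFor(long lastModifiedMillis, long ifModifiedSinceMillis) {
        if (ifModifiedSinceMillis < 0) {
            return SC_OK; // no validator sent, so return full content
        }
        // HTTP dates have one-second resolution, so compare whole seconds.
        long lastModifiedSeconds = lastModifiedMillis / 1000;
        long sinceSeconds = ifModifiedSinceMillis / 1000;
        return lastModifiedSeconds <= sinceSeconds ? SC_NOT_MODIFIED : SC_OK;
    }

    public static void main(String[] args) {
        long cachedCopyTime = 1_000_000_000_000L;
        System.out.println(statusFor(cachedCopyTime, cachedCopyTime));          // unchanged content
        System.out.println(statusFor(cachedCopyTime + 5_000L, cachedCopyTime)); // changed afterwards
    }
}
```

Truncating to whole seconds avoids a common mistake: a modification time with a millisecond component would otherwise always look newer than the header value the Portal Server sends back.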

ETag

The ETag header is new to HTTP 1.1; it is very similar to Last-Modified. The main difference is that ETag does not have to be a time stamp. ETag values are unique identifiers that are generated by the server and changed every time the object is modified.

The remote server sends the ETag header to the Portal Server with portlet content. When another request is made for the same content, the Portal Server sends the value in the ETag header back to the remote server in the If-None-Match header. The portlet then determines, based on the value received, whether to send back fresh content or a 304 Not Modified response.

Some methods in the JSP IDK use the ETag header to implement caching. The methods use an entity key based on the Portlet ID and Community ID or allow the developer to pass in a unique key.

In .NET, use the Response.Cache.SetETag method to pass in the string to be used as the ETag. The SetETagFromFileDependencies method creates an ETag by combining the file names and last modified timestamps for all files on which the handler is dependent.
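
An ETag can be any value that changes whenever the content changes. The sketch below derives one from a hash of the content and compares it with an incoming If-None-Match value. This is an illustration only: it is not how the IDK builds its entity keys, the names are hypothetical, and the comparison handles a single tag rather than the lists and wildcard that RFC 2616 also permits.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of content-based ETag generation and validation.
final class ETagSketch {
    // Returns a quoted tag built from the first 8 bytes of a SHA-256 hash.
    static String etagFor(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder tag = new StringBuilder("\"");
            for (int i = 0; i < 8; i++) {
                tag.append(String.format("%02x", hash[i] & 0xff));
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    // True when the caller already holds the current version, so a 304 can be sent.
    static boolean notModified(String ifNoneMatch, String currentEtag) {
        return ifNoneMatch != null && ifNoneMatch.trim().equals(currentEtag);
    }

    public static void main(String[] args) {
        String etag = etagFor("<div>portlet content</div>");
        System.out.println("ETag: " + etag);
        System.out.println("Send 304: " + notModified(etag, etagFor("<div>portlet content</div>")));
    }
}
```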

Portlet Settings

On the HTTP Configuration page of the Portlet Web Service editor, the portal administrator can set minimum and maximum amounts of time for validation of cached portlet content. The default is a minimum of 0 seconds and a maximum of 20 days. (For details on the Portlet Web Service editor, see ALI Portlet Configuration.)

The minimum and maximum caching settings in the Portlet editor affect caching as follows.

Note: Setting the Cache-Control header to “no-cache” overrides editor settings; content will never be cached.

For example, the minimum caching time for a particular portlet is set to ten minutes, and the maximum caching time is set to one hour. Client A requests the portlet content. Five minutes later, Client B, with an identical set of preferences, requests the same content. Five minutes is under the minimum caching time set in the Portlet editor, so cached content is returned, no matter what type of programmatic caching has been implemented by the portlet. (Remember, the Portal Server only abides by headers if cached content was generated between the minimum and maximum caching times set in the editor. An Expires header set to two minutes does not refresh the cache in this example.) If no copies of the content existed for Client B’s particular collection of settings or no content was cached, the remote server would be called to generate content that matched that group of settings.

To continue the example, Client A requests the portlet content again, and there is a matching copy of the content in the cache that is 15 minutes old. This is over the minimum caching time and under the maximum. In this case, whether or not new content is generated depends on the HTTP headers sent by the portlet. If the portlet has not specified any caching programmatically, the Portal Server asks the remote server for fresh content. If the portlet set the Expires header to 30 minutes, new content is not generated. If ETag or Last-Modified caching was implemented, new content is only returned if content has changed.

Finally, Client A requests the same content two hours later, and the matching copy was generated more than an hour before. Since this is over the maximum caching time set in the Portlet editor, the Portal Server requests new content from the remote server, regardless of the caching specified programmatically by the portlet. Of course, if the portlet has implemented ETag or Last-Modified caching, new content is only returned if content has changed.
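
The three cases in this example reduce to a comparison of the cached copy's age against the editor's minimum and maximum. The sketch below models only that comparison, with hypothetical names. How the Portal Server treats an age exactly equal to either limit is an assumption here, and the "no-cache" override described in the note above is not modeled.

```java
import java.time.Duration;

// Hypothetical model of how editor settings bound programmatic caching.
final class EditorCacheWindowSketch {
    enum Decision { SERVE_CACHED, FOLLOW_HEADERS, REQUEST_FROM_REMOTE }

    static Decision decide(Duration age, Duration minimum, Duration maximum) {
        if (age.compareTo(minimum) < 0) {
            return Decision.SERVE_CACHED;        // under the minimum: headers are ignored
        }
        if (age.compareTo(maximum) > 0) {
            return Decision.REQUEST_FROM_REMOTE; // over the maximum: always ask the remote server
        }
        return Decision.FOLLOW_HEADERS;          // in between: Expires, ETag, Last-Modified decide
    }

    public static void main(String[] args) {
        Duration min = Duration.ofMinutes(10);
        Duration max = Duration.ofHours(1);
        System.out.println(decide(Duration.ofMinutes(5), min, max));  // Client B, five minutes later
        System.out.println(decide(Duration.ofMinutes(15), min, max)); // Client A, 15-minute-old copy
        System.out.println(decide(Duration.ofHours(2), min, max));    // Client A, two hours later
    }
}
```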

If your portlet requires specific editor settings for its caching strategy, you must include this information in your Installation Guide.

Always use programmatic caching. Using HTTP headers to control caching is always preferable. Administrators can override some programmatic caching, but they cannot be relied upon to set caching correctly. For example, instead of requiring the portal administrator to set a minimum caching setting, send an Expires header.

Caching and Development

While caching is an integral and necessary part of portlet design, it is helpful to disable it while developing and debugging. Otherwise, it can be very difficult to view the results of any modifications you have made. To disable the caching implemented by the Portal Server, go to the HTTP Configuration page of the Portlet Web Service editor (shown under Portlet Settings above) and set the minimum and maximum caching times to 0. Clear the checkbox marked “Suppress errors where possible (use cached content instead).”

Note: After the code has been developed and debugged, make sure to turn caching on and test the performance of your portlet. For details on troubleshooting portlets, see Portlet Debugging. If you are using the ALI Logging Utilities to debug caching, turn on all types of tracing for the OpenKernel.OpenHttp.Cache component.
