Oracle iPlanet Web Proxy Server 4.0.14 Administration Guide

Caching HTTP Documents

HTTP documents offer caching features that documents of the other protocols do not. However, by setting up and configuring the cache properly, you can ensure that your Proxy Server will cache HTTP, FTP, and Gopher documents effectively.


Note –

Proxy Server 4 does not support caching HTTPS documents.


All HTTP documents have a descriptive header section that the Proxy Server uses to compare and evaluate the document in the proxy cache and the document on the remote server. When the proxy does an up-to-date check on an HTTP document, the proxy sends one request to the server that tells the server to return the document if the version in the cache is out of date. Often, the document has not changed since the last request and therefore is not transferred. This method of checking to see if an HTTP document is up-to-date saves bandwidth and decreases latency.

To reduce transactions with remote servers, the Proxy Server enables you to set a Cache Expiration setting for HTTP documents. The Cache Expiration setting provides information to the proxy to estimate whether the HTTP document needs an up-to-date check before sending the request to the server. The proxy makes this estimate based on the HTTP document’s Last-Modified date found in the header.

With HTTP documents, you can also use a Cache Refresh setting. This option specifies whether the proxy always does an up-to-date check, which would override an Expiration setting or whether the proxy waits a specific period of time before doing a check. The following table shows what the proxy does if both an Expiration setting and a Refresh setting are specified. Using the Refresh setting decreases latency and saves bandwidth considerably.

Table 12–1 Using the Cache Expiration and Cache Refresh settings With HTTP

Refresh setting  

Expiration setting  

Results  

Always do an up-to-date check 

(Not applicable) 

Always do an up-to-date check 

User-specified interval 

Use document’s “expires” header 

Do an up-to-date check if interval expired 

 

Estimate with document’s Last-Modified header 

Smaller value* of the estimate and expires header 


Note –

* Using the smaller value guards against getting stale data from the cache for documents that change frequently.


Setting the HTTP Cache Refresh Interval

If you decide that you want your Proxy Server to cache HTTP documents, determine whether it should always do an up-to-date check for documents in the cache or whether it should check based on a Cache Refresh setting (up-to-date check interval). For HTTP documents, a reasonable refresh interval would be four to eight hours, for example. The longer the refresh interval, the fewer the number of times the proxy connects with remote servers. Even though the proxy does not do up-to-date checking during the refresh interval, users can force a refresh by clicking the Reload button in the client. This action makes the proxy force an up-to-date check with the remote server.

You can set the refresh interval for HTTP documents on either the Set Cache Specifics page or the Set Caching Configuration page. The Set Cache Specifics page enables you to configure global caching procedures, and the Set Caching Configuration page enables you to control caching procedures for specific URLs and resources.

Setting the HTTP Cache Expiration Policy

You can also set up your server to check if the cached document is up-to-date by using a last-modified factor or explicit expiration information only.

Explicit expiration information is a header found in some HTTP documents that specifies the date and time when that file will become outdated. Not many HTTP documents use explicit Expires headers, so you should estimate based on the Last-modified header.

If you decide to have your HTTP documents cached based upon the Last-modified header, you need to select a fraction to use in the expiration estimation. This fraction, known as the LM factor, is multiplied by the interval between the last modification and the time that the last up-to-date check was performed on the document. The resulting number is compared with the time since the last up-to-date check. If the number is smaller than the time interval, the document is not expired. Smaller fractions make the proxy check documents more often.

For example, suppose you have a document that was last changed ten days ago. If you set the last-modified factor to 0.1, the proxy interprets the factor to mean that the document is probably going to remain unchanged for one day (10 * 0.1 = 1). The proxy would, in that case, return the document from the cache if the document was checked less than a day ago.

In this same example, if the cache refresh setting for HTTP documents is set to less than one day, the proxy does the up-to-date check more than once a day. The proxy always uses the value, cache refresh or cache expiration, that requires the more frequent update.

You can set the expiration setting for HTTP documents on either the Set Cache Specifics page or the Set Caching Configuration page. The Set Cache Specifics page enables you to configure global caching procedures and the Set Caching Configuration page enables you to control caching procedures, for specific URLs and resources.

Reporting HTTP Accesses to the Remote Server

When a document is cached by Proxy Server, it can be accessed many times before it is refreshed again. For the remote server, sending one copy to the proxy that will cache it represents only one access, or “hit.” The Proxy Server can count how many times a given document is accessed from the proxy cache between up-to-date checks and then send that hit count back to the remote server in an additional HTTP request header (Cache-Info) the next time the document is refreshed. This way, if the remote server is configured to recognize this type of header, it receives a more accurate account of how many times a document is accessed.