Oracle iPlanet Web Proxy Server 4.0.14 Administration Guide

Setting Cache Specifics

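You can enable caching and control which protocols your Proxy Server caches by setting the cache specifics. As the procedure below shows, cache specifics include the following items:

  • Whether the cache is enabled or disabled

  • The cache working directory

  • The cache partitions and their sizes

  • The cache capacity

  • Which protocols (HTTP, FTP, and Gopher) are cached

  • When cached documents are refreshed

  • Whether cache hits are reported to the remote server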


Note –

Setting the specifics for a large cache is time-consuming and may cause the administration interface to time out. Therefore, if you are creating a large cache, use the command-line utilities to set cache specifics. For more information on the cache command-line utilities, see Using the Cache Command-Line Interface.


To Set Cache Specifics

  1. Access the Server Manager, and click the Caching tab.

  2. Click the Set Cache Specifics link.

    The Set Cache Specifics page is displayed.

  3. Enable or disable the cache by selecting the appropriate option.

    The cache is enabled by default.

  4. Provide the working directory.

    By default the working directory is present under the proxy instance. This location can be changed. For more information, see Creating a Cache Working Directory.

  5. Click the partition configuration link.

    The Add/Edit Cache Partitions page is displayed. You can add a new cache partition or edit existing cache partitions. Cache size is the maximum size to which the cache is allowed to grow. The maximum cache size is 32 Gbytes. For more information, see Setting Cache Size.

  6. Click the cache capacity configuration link.

    The Set Cache Capacity page is displayed, where you can set the cache capacity.

  7. Select the Cache HTTP option to enable caching of HTTP documents.

    If you want your proxy server to cache HTTP documents, determine whether it should always do an up-to-date check for documents in the cache or check only after a specified interval. You can also enable or disable reporting of cache hits to the remote server. For more information, see Caching HTTP Documents. The available options are:

    • Select the Always Check That The Document Is Up To Date option to ensure that the HTTP document is always up-to-date.

    • Select the number of hours from the Check Only If Last Check More Than drop-down list to specify the refresh interval for the proxy server. The up-to-date check is performed using one of the following options:

      • Use Last-modified Factor. The proxy server estimates freshness from the Last-Modified header that the origin server sends along with the document.

      • Use Only Explicit Expiration Information. The proxy server uses the Expires header to decide if the cache entry is fresh or stale.

    • Select the Never Report Accesses To Remote Server option to prevent the proxy server from reporting the number of accesses to the remote server.

    • Select the Report Cache Hits To Remote Server option to track the number of times a document was accessed and report it back to the remote server.

  8. Set the refresh interval for cached FTP documents by selecting the Yes; Reload If Older Than checkbox and choosing the time interval from the drop-down list. For more information, see Caching FTP and Gopher Documents.

  9. Set the refresh interval for cached Gopher documents by selecting the Yes; Reload If Older Than checkbox and choosing the time interval from the drop-down list. For more information, see Caching FTP and Gopher Documents.

  10. Click OK.

  11. Click Restart Required. The Apply Changes page is displayed.

  12. Click the Restart Proxy Server button to apply the changes.

Creating a Cache Working Directory

Cache files are stored in cache partitions. The working directory you specify on the Set Cache Specifics page is often the parent directory for the cache. All cached files appear in an organized directory structure under the caching directory. If you rename the cache directory or move it to another location, you must provide the proxy with the new location.

You can extend the cache directory structure across multiple file systems so that a large cache is divided among several smaller disks instead of residing on one large disk. Each proxy server must have its own cache directory structure; cache directories cannot be shared concurrently by multiple proxy servers.

Setting Cache Size

The cache size indicates the partition size. Cache size should always be less than the cache capacity as it is the maximum size to which the cache can grow. The sum of all the partition sizes must be less than or equal to the cache size.

The amount of disk space available for the proxy cache has a considerable effect on cache performance. If the cache is too small, the Cache GC must remove cached documents to make room on the disk more often, and documents must be retrieved from content servers more often. These activities slow performance.

Large caches are more efficient: the more documents are cached, the lower the network traffic load and the faster the proxy responds. Also, the GC removes cached documents if users no longer need them. Barring any file system limitations, cache size can never be too large. The excess space simply remains unused.

You can also have the cache split on multiple disk partitions.

Caching HTTP Documents

HTTP documents offer caching features that documents of the other protocols do not. However, by setting up and configuring the cache properly, you can ensure that your Proxy Server will cache HTTP, FTP, and Gopher documents effectively.


Note –

Proxy Server 4 does not support caching HTTPS documents.


All HTTP documents have a descriptive header section that the Proxy Server uses to compare and evaluate the document in the proxy cache and the document on the remote server. When the proxy does an up-to-date check on an HTTP document, the proxy sends one request to the server that tells the server to return the document if the version in the cache is out of date. Often, the document has not changed since the last request and therefore is not transferred. This method of checking to see if an HTTP document is up-to-date saves bandwidth and decreases latency.
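The up-to-date check described above corresponds to a conditional HTTP request. The following Python sketch illustrates the idea. It assumes that the check is carried in an If-Modified-Since request header and that the origin server answers with status 304 (Not Modified) when the cached copy is still current. The function names are hypothetical and are not part of Proxy Server.

```python
from email.utils import formatdate


def build_conditional_headers(cached_last_modified):
    """Return request headers asking for the document only if it changed.

    cached_last_modified is the cached copy's Last-Modified time in
    seconds since the epoch (an assumption made for this sketch).
    """
    return {"If-Modified-Since": formatdate(cached_last_modified, usegmt=True)}


def choose_body(status, cached_body, response_body):
    """Pick the body to serve after the up-to-date check."""
    if status == 304:
        # Not Modified: no document was transferred, so serve the cached copy.
        return cached_body
    # In this simplified sketch, any other status carries a fresh copy.
    return response_body


headers = build_conditional_headers(0)
print(headers["If-Modified-Since"])
```

Because the 304 response carries no document body, an unchanged document costs one small round trip instead of a full transfer.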

To reduce transactions with remote servers, the Proxy Server enables you to set a Cache Expiration setting for HTTP documents. The Cache Expiration setting provides information to the proxy to estimate whether the HTTP document needs an up-to-date check before sending the request to the server. The proxy makes this estimate based on the HTTP document’s Last-Modified date found in the header.

With HTTP documents, you can also use a Cache Refresh setting. This option specifies whether the proxy always does an up-to-date check, which overrides any Expiration setting, or whether the proxy waits a specific period of time before doing a check. The following table shows what the proxy does if both an Expiration setting and a Refresh setting are specified. Using the Refresh setting decreases latency and saves bandwidth considerably.

Table 12–1 Using the Cache Expiration and Cache Refresh settings With HTTP

Refresh setting                 Expiration setting                              Results

Always do an up-to-date check   (Not applicable)                                Always do an up-to-date check

User-specified interval         Use document’s “expires” header                 Do an up-to-date check if interval expired

                                Estimate with document’s Last-Modified header   Smaller value* of the estimate and expires header


Note –

* Using the smaller value guards against getting stale data from the cache for documents that change frequently.


Setting the HTTP Cache Refresh Interval

If you want your Proxy Server to cache HTTP documents, determine whether it should always do an up-to-date check for documents in the cache or whether it should check based on a Cache Refresh setting (up-to-date check interval). For HTTP documents, a reasonable refresh interval is, for example, four to eight hours. The longer the refresh interval, the less often the proxy connects with remote servers. Even though the proxy does not do up-to-date checking during the refresh interval, users can force a refresh by clicking the Reload button in the client. This action makes the proxy force an up-to-date check with the remote server.

You can set the refresh interval for HTTP documents on either the Set Cache Specifics page or the Set Caching Configuration page. The Set Cache Specifics page enables you to configure global caching procedures, and the Set Caching Configuration page enables you to control caching procedures for specific URLs and resources.

Setting the HTTP Cache Expiration Policy

You can also set up your server to check if the cached document is up-to-date by using a last-modified factor or explicit expiration information only.

Explicit expiration information is a header found in some HTTP documents that specifies the date and time when that file will become outdated. Not many HTTP documents use explicit Expires headers, so you should estimate based on the Last-modified header.

If you decide to have your HTTP documents cached based upon the Last-modified header, you need to select a fraction to use in the expiration estimation. This fraction, known as the LM factor, is multiplied by the interval between the last modification and the time that the last up-to-date check was performed on the document. The resulting number is compared with the time since the last up-to-date check. If the time since the last check is smaller than the resulting number, the document is not expired. Smaller fractions make the proxy check documents more often.

For example, suppose you have a document that was last changed ten days ago. If you set the last-modified factor to 0.1, the proxy interprets the factor to mean that the document is probably going to remain unchanged for one day (10 * 0.1 = 1). The proxy would, in that case, return the document from the cache if the document was checked less than a day ago.

In this same example, if the cache refresh setting for HTTP documents is set to less than one day, the proxy does the up-to-date check more than once a day. The proxy always uses the value, cache refresh or cache expiration, that requires the more frequent update.
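The last-modified estimate and its interaction with the refresh setting can be sketched in Python as follows. This is only an illustration of the rule described in this section, not the Proxy Server's actual implementation. The function names and the use of seconds as the time unit are assumptions.

```python
DAY = 24 * 60 * 60  # seconds in one day


def lm_estimate(last_modified, last_check, lm_factor):
    """Estimated time the document stays fresh after its last check."""
    return lm_factor * (last_check - last_modified)


def needs_check(now, last_modified, last_check, lm_factor, refresh_interval):
    """Return True if the proxy should do an up-to-date check.

    The shorter of the refresh interval and the last-modified estimate
    wins, because that value requires the more frequent update.
    """
    estimate = lm_estimate(last_modified, last_check, lm_factor)
    allowed_age = min(refresh_interval, estimate)
    return (now - last_check) >= allowed_age


# Document last changed ten days before its most recent check, LM factor 0.1.
last_modified = 0
last_check = 10 * DAY
print(lm_estimate(last_modified, last_check, 0.1) / DAY)  # estimate in days
```

With a refresh interval of eight hours, for example, needs_check returns True twelve hours after the last check even though the one-day estimate has not yet elapsed.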

You can set the expiration setting for HTTP documents on either the Set Cache Specifics page or the Set Caching Configuration page. The Set Cache Specifics page enables you to configure global caching procedures, and the Set Caching Configuration page enables you to control caching procedures for specific URLs and resources.

Reporting HTTP Accesses to the Remote Server

When a document is cached by Proxy Server, it can be accessed many times before it is refreshed again. For the remote server, sending one copy to the proxy that will cache it represents only one access, or “hit.” The Proxy Server can count how many times a given document is accessed from the proxy cache between up-to-date checks and then send that hit count back to the remote server in an additional HTTP request header (Cache-Info) the next time the document is refreshed. This way, if the remote server is configured to recognize this type of header, it receives a more accurate account of how many times a document is accessed.
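The hit-reporting mechanism can be pictured as a per-document counter that is reset whenever the count is sent. The Python sketch below is hypothetical: the class and method names are invented, and the header value format "hits=<n>" is a placeholder, because this guide names the Cache-Info header but does not describe its syntax.

```python
from collections import Counter


class HitReporter:
    """Counts cache hits per URL between up-to-date checks."""

    def __init__(self):
        self._hits = Counter()

    def record_hit(self, url):
        # Called each time the document is served from the cache.
        self._hits[url] += 1

    def refresh_headers(self, url):
        """Return extra headers for the next refresh request for url.

        The value format is a placeholder chosen for this sketch only.
        """
        count = self._hits.pop(url, 0)  # reset the count once it is reported
        return {"Cache-Info": f"hits={count}"} if count else {}
```

A remote server that does not recognize the header can simply ignore it, which is why the report is useful only if the remote server is configured for it.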

Caching FTP and Gopher Documents

FTP and Gopher do not include a method for checking to see whether a document is up-to-date. Therefore, the only way to optimize caching for FTP and Gopher documents is to set a Cache Refresh interval. The Cache Refresh interval is the amount of time the Proxy Server waits before retrieving the latest version of the document from the remote server. If you do not set a Cache Refresh interval, the proxy will retrieve these documents even if the versions in the cache are up to date.

If you are setting a cache refresh interval for FTP and Gopher, choose one that you consider safe for the documents the proxy gets. For example, if you store information that rarely changes, use a long interval of several days. If the data changes constantly, you will want the files to be retrieved at least every few hours. During the refresh interval, you risk sending an out-of-date file to the client. If the interval is short enough, for example, a few hours, you eliminate most of this risk while getting noticeably faster response time.
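Because FTP and Gopher offer no up-to-date check, the decision reduces to comparing the age of the cached copy with the refresh interval. The following Python sketch shows that comparison. The function name and the use of seconds are assumptions made for illustration.

```python
def should_refetch(now, fetched_at, refresh_interval=None):
    """Decide whether to retrieve an FTP or Gopher document again.

    Times are in seconds. refresh_interval=None models the case where no
    interval is set, so the document is retrieved on every request.
    """
    if refresh_interval is None:
        return True
    # Reload only if the cached copy is older than the interval.
    return (now - fetched_at) > refresh_interval
```

For example, with a four-hour interval a copy fetched one hour ago is served from the cache, while a copy fetched five hours ago is retrieved again.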

You can set the cache refresh interval for FTP and Gopher documents on either the Set Cache Specifics page or the Set Caching Configuration page. The Set Cache Specifics page enables you to configure global caching procedures, and the Set Caching Configuration page enables you to control caching procedures for specific URLs and resources. For more information about using the Set Cache Specifics page, see Setting Cache Specifics. For more information about using the Set Caching Configuration page, see Configuring the Cache.


Note –

If your FTP and Gopher documents vary widely (some change often, others rarely), use the Set Caching Configuration page to create a separate template for each kind of document (for example, create a template with resources ftp://.*.gif) and then set a refresh interval that is appropriate for that resource.
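The idea of one template per kind of document can be sketched in Python as a list of patterns, each with its own refresh interval. Only the first pattern comes from the note above. The second pattern, the interval values, and the first-match rule are assumptions for this sketch, and the way Proxy Server resolves overlapping templates may differ.

```python
import re

# Hypothetical templates: (regular expression, refresh interval in hours).
TEMPLATES = [
    (re.compile(r"ftp://.*.gif"), 72),  # images that rarely change
    (re.compile(r"ftp://.*"), 4),       # all other FTP documents
]


def refresh_hours(url, default=None):
    """Return the interval of the first template whose pattern matches url."""
    for pattern, hours in TEMPLATES:
        if pattern.fullmatch(url):
            return hours
    return default


print(refresh_hours("ftp://ftp.example.com/pub/logo.gif"))
```

Listing the more specific pattern first keeps the general FTP template from shadowing it in this sketch.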