1 Understanding Reverse Proxying

This chapter provides a general introduction to Oracle Web Cache and its role in providing secure reverse proxying.

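This chapter includes the following topics:

  • About the Web Tier

  • Request Flow in Web Tier

  • Compatibility with Oracle Fusion Middleware Components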

1.1 About the Web Tier

The Web tier of a J2EE application server is responsible for interacting with end-user clients, such as Web browsers, primarily in the form of HTTP requests and responses. It is the outermost tier in the HTTP stack, closest to the end user. At the highest level, the Web tier performs four basic tasks:

  • Interprets client requests

  • Dispatches those requests to an object (for example, an enterprise Java bean) that encapsulates business logic

  • Selects the next view for display

  • Generates and delivers the next view

The Web tier receives each incoming HTTP request and invokes the requested business logic operation in the application. Based on the results of the operation and state of the model, the next view is selected to display. The selected view is transmitted to the client for presentation.

Oracle Web Cache is a content-aware server accelerator, or reverse proxy, for the Web tier that improves the performance, scalability, and availability of Web sites that run on any Web server or application server, such as Oracle HTTP Server and Oracle WebLogic Server.

Oracle Web Cache is the primary caching mechanism provided with Oracle Fusion Middleware. Caching improves the performance, scalability, and availability of Web sites that run on Oracle Fusion Middleware by storing frequently accessed URLs in memory.

By storing frequently accessed URLs in memory, Oracle Web Cache eliminates the need to repeatedly process requests for those URLs on the application Web server and database tiers. Unlike legacy proxies that handle only static objects, Oracle Web Cache caches both static and dynamically generated content from one or more application Web servers. Because Oracle Web Cache can cache more content than legacy proxies, it provides optimal performance by greatly reducing the load on application Web server and database tiers. As an external cache, Oracle Web Cache is also an order of magnitude faster than object caches that run within the application tier.

Because Oracle Web Cache is fully compliant with the HTTP 1.0 and 1.1 specifications, it can accelerate Web sites that are hosted by any standard Web server, such as Apache Tomcat and Microsoft IIS. In Oracle Fusion Middleware, Oracle Web Cache resides in front of one or more instances of Oracle HTTP Server. Browser-based HTTP requests directed to the Oracle HTTP Server instance, and the responses to them, are transmitted through Oracle Web Cache. The Oracle Web Cache instance can handle any Web content transmitted with the standard HTTP protocol.

1.1.1 Reverse Proxying

You can configure Oracle Web Cache as a reverse proxy to origin servers, such as Oracle HTTP Server.

A reverse proxy appears to be the content server to clients but internally retrieves its objects from other back-end origin servers as a proxy. A reverse proxy acts as a gateway to the origin servers. It relays requests from outside the firewall to origin servers behind the firewall, and delivers retrieved content back to the client.

Figure 1-1 shows an overview of how reverse proxy Web caching works. Oracle Web Cache has an IP address of 144.25.190.241 and the application Web server has an IP address of 144.25.190.242.

The steps for browser interaction with Oracle Web Cache are as follows:

  1. A browser sends a request to a Web site named www.company.com:80.

    This request in turn generates a request to Domain Name System (DNS) for the IP address of the Web site.

  2. DNS returns the IP address of the load balancer for the site, that is, 144.25.190.240.

  3. The browser sends the request for a Web page to the load balancer. In turn, the load balancer sends the request to Oracle Web Cache server 144.25.190.241.

  4. If the requested content is in its cache, then Oracle Web Cache sends the content directly to the browser. This is called a cache hit.

  5. If Oracle Web Cache does not have the requested content or the content is stale or invalid, it hands the request off to application Web server 144.25.190.242. This is called a cache miss.

  6. The application Web server sends the content to Oracle Web Cache.

  7. Oracle Web Cache sends the content to the client and stores a copy of the page in cache.

    A page stored in the cache is removed when it becomes invalid or outdated.
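The hit and miss handling in steps 4 through 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Oracle Web Cache code: the `ReverseProxyCache` class, the `fetch_from_origin` callable that stands in for the application Web server, and the fixed time-to-live used to decide when a page is outdated are all hypothetical names and policies introduced for the example.

```python
import time
from typing import Callable, Dict, Tuple


class ReverseProxyCache:
    """Toy reverse-proxy cache: serve hits from memory, forward misses to the origin."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin   # hypothetical stand-in for the application Web server
        self._ttl = ttl_seconds           # assumed expiration policy, for illustration only
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def handle(self, url: str) -> bytes:
        entry = self._store.get(url)
        if entry is not None:
            stored_at, body = entry
            if self._clock() - stored_at < self._ttl:
                self.hits += 1            # step 4: cache hit, answer from memory
                return body
            del self._store[url]          # the outdated copy is removed
        self.misses += 1                  # step 5: cache miss, hand off to the origin
        body = self._fetch(url)           # step 6: the origin returns the content
        self._store[url] = (self._clock(), body)  # step 7: keep a copy for later requests
        return body


if __name__ == "__main__":
    origin_calls = []

    def origin(url: str) -> bytes:
        origin_calls.append(url)
        return f"<html>{url}</html>".encode()

    cache = ReverseProxyCache(origin)
    cache.handle("/index.html")   # first request: miss, fetched from the origin
    cache.handle("/index.html")   # second request: hit, origin is not contacted
    print(cache.hits, cache.misses, len(origin_calls))
```

In the demonstration, the second request for the same URL is answered from memory, so the origin function is called only once.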

Figure 1-1 Web Server Acceleration


1.2 Request Flow in Web Tier

Figure 1-2 shows further details of the request flow within the Web tier.

Figure 1-2 Request Flow to Oracle Web Cache within the Web Tier


As shown in Figure 1-2, the following occurs within the Web tier:

  1. The incoming browser request is analyzed for the correct HTTP format.

  2. The browser request is then further analyzed to determine if it is in HTTPS format:

    1. If the browser request is in HTTPS format, SSL decryption is performed.

    2. If the browser request is not in HTTPS format, the request is parsed.

  3. After the request is understood, it is filtered by a set of prescribed filtering rules.

  4. A cache lookup is performed to determine whether the requested content is present in the cache.

    If the content is present in the cache (a cache hit), it is compressed and sent directly to the browser.

    If the content is not present in the cache (a cache miss), then either:

    1. The request is sent directly to a single origin server.

    2. The request is sent to load-balanced origin servers.

Each load-balanced origin server pings each Oracle Web Cache server on a periodic basis to check the status of the cache. The load balancer distributes any incoming requests among cache cluster members. If Oracle Web Cache does not have the requested content or the content is stale or invalid, it hands the request off to the application Web server. The application Web server sends the content to Oracle Web Cache. Oracle Web Cache sends the content to the client and stores a copy of the page in cache.

The proxy server, rather than the origin server, is placed in a less secure zone, the demilitarized zone (DMZ).

Caching rules determine which objects to cache. When you establish a caching rule for a particular URL, those objects contained within the URL are not cached until there is a client request for them. When a client first requests an object, Oracle Web Cache sends the request to the origin server. This request is a cache miss. Because this URL has an associated caching rule, Oracle Web Cache caches the object for subsequent requests. When Oracle Web Cache receives a second request for the same object, Oracle Web Cache serves the object from its cache to the client. This request is a cache hit.
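The interaction between caching rules and client requests can be illustrated with a short Python sketch. It assumes that a caching rule is simply a regular expression matched against the URL, which is a simplification made for this example; the `RuleBasedCache` class and its `fetch` callable are hypothetical and do not reflect the product's configuration format.

```python
import re
from typing import Callable, Dict, List, Pattern, Tuple


class RuleBasedCache:
    """Cache an object only if its URL matches a caching rule, and only once it is requested."""

    def __init__(self, cacheable_patterns: List[str], fetch: Callable[[str], bytes]) -> None:
        # Assumption: each caching rule is a regular expression over the URL.
        self._rules: List[Pattern[str]] = [re.compile(p) for p in cacheable_patterns]
        self._fetch = fetch               # hypothetical stand-in for the origin server
        self._store: Dict[str, bytes] = {}

    def _has_rule(self, url: str) -> bool:
        return any(rule.search(url) for rule in self._rules)

    def get(self, url: str) -> Tuple[str, bytes]:
        if url in self._store:
            return "hit", self._store[url]    # served from the cache
        body = self._fetch(url)               # every first request goes to the origin
        if self._has_rule(url):
            self._store[url] = body           # cached only because a rule matches
        return "miss", body


if __name__ == "__main__":
    cache = RuleBasedCache([r"\.html$"], lambda url: url.encode())
    for url in ["/catalog.html", "/catalog.html", "/cart", "/cart"]:
        status, _ = cache.get(url)
        print(url, status)
```

With the single rule `\.html$`, the second request for `/catalog.html` is a hit, while `/cart` has no associated rule and is a miss every time.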

When you stop Oracle Web Cache, the cache clears all objects. In addition, Oracle Web Cache clears and resets statistics.

1.2.1 HTTP Traffic Management

You can deploy Oracle Web Cache inside or outside a firewall. Deploying Oracle Web Cache inside a firewall ensures that HTTP traffic enters the DMZ, but only authorized traffic from the application Web servers can directly interact with the database. When deploying Oracle Web Cache outside a firewall, the throughput burden is placed on Oracle Web Cache rather than the firewall. The firewall receives only requests that must go to the application Web servers. This topology requires securing Oracle Web Cache from intruders.

Security experts disagree about whether caches should be placed outside the DMZ. Oracle recommends that you check your company's policy before deploying Oracle Web Cache outside the DMZ.

1.2.2 Request Filtering and Routing

Request filtering checks either the normalized request (for most filter types) or the original raw un-normalized request (for the following format filter rules: null byte, strict encoding, and double encoding). If a match is found on a rule and it is a deny rule, then the request is denied. If the match is for an allow rule, then the request is allowed. For a deny rule, if the rule is in monitor only mode, then the request is logged (to the audit log and the event log), but the request is not denied.
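The allow, deny, and monitor-only outcomes described above can be sketched as follows. The `FilterRule` structure and the `evaluate` function are hypothetical. Three behaviors are assumptions made for the illustration rather than documented product behavior: rules are checked in order, evaluation continues after a monitor-only match, and a request that matches no rule is allowed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FilterRule:
    name: str
    matches: Callable[[str], bool]   # predicate over the request URL (hypothetical shape)
    action: str                      # "allow" or "deny"
    monitor_only: bool = False       # log a deny match instead of enforcing it
    use_raw_request: bool = False    # format rules inspect the un-normalized request


def evaluate(raw_url: str, normalized_url: str, rules: List[FilterRule],
             audit_log: List[str]) -> str:
    for rule in rules:
        target = raw_url if rule.use_raw_request else normalized_url
        if not rule.matches(target):
            continue
        if rule.action == "allow":
            return "allowed"
        if rule.monitor_only:
            # Deny rule in monitor-only mode: record the match, do not deny.
            audit_log.append(f"monitor-only match: {rule.name}")
            continue                 # assumption: keep checking later rules
        return "denied"
    return "allowed"                 # assumption: no matching rule means allow


if __name__ == "__main__":
    log: List[str] = []
    rules = [
        FilterRule("null-byte", lambda u: "%00" in u, "deny", use_raw_request=True),
        FilterRule("admin-area", lambda u: u.startswith("/admin"), "deny", monitor_only=True),
    ]
    print(evaluate("/page%00.html", "/page.html", rules, log))
    print(evaluate("/admin/users", "/admin/users", rules, log))
    print(log)
```

The first request is denied because the null byte rule inspects the raw request. The second matches a deny rule in monitor-only mode, so it is logged and still allowed.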

For more information about request filtering, see Chapter 4, "Configuring Request Filtering."

1.2.3 Origin Server Load Balancing and Failover

Origin server load balancing is a feature in which HTTP requests are distributed among origin servers so that no single origin server is overloaded.

Oracle Web Cache supports load balancing and failover detection for application Web servers.

Oracle Web Cache ensures that cache misses are directed to the most available, highest-performing Web server in the server farm. A capacity heuristic guarantees performance and provides surge protection when the application Web server load increases.
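The text does not describe the capacity heuristic itself, so the following Python sketch uses a deliberately simple stand-in: among the origin servers that are up and below their connection limit, pick the one with the largest fraction of free capacity. The `OriginServer` fields and the `pick_origin` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OriginServer:
    host: str
    capacity: int        # maximum concurrent connections (assumed configuration value)
    active: int = 0      # connections currently in use
    up: bool = True      # cleared when the server is considered failed

    @property
    def available_fraction(self) -> float:
        return (self.capacity - self.active) / self.capacity


def pick_origin(servers: List[OriginServer]) -> Optional[OriginServer]:
    """Return the live server with the most free capacity, or None if none can take traffic."""
    candidates = [s for s in servers if s.up and s.active < s.capacity]
    if not candidates:
        return None      # every server is down or full
    return max(candidates, key=lambda s: s.available_fraction)


if __name__ == "__main__":
    farm = [
        OriginServer("app1.example.com", capacity=100, active=90),
        OriginServer("app2.example.com", capacity=50, active=10),
    ]
    print(pick_origin(farm).host)   # app2 has the larger share of free capacity
    farm[1].up = False              # simulate a failure of app2
    print(pick_origin(farm).host)   # failover: the cache miss goes to app1
```

Skipping servers that are marked down models failover, and excluding servers at their connection limit is one simple way to approximate surge protection.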

For more information about load balancing and failover, see Section 3.1.

1.2.4 Caching

Caching improves the performance, scalability, and availability of Web sites that run on Oracle Fusion Middleware. By storing frequently accessed URLs in memory, Oracle Web Cache eliminates the need to repeatedly process requests for those URLs on the application Web server and database tiers. Unlike legacy proxies that handle only static objects, Oracle Web Cache caches both static and dynamically generated content from one or more application Web servers. Because Oracle Web Cache can cache more content than legacy proxies, it provides optimal performance by greatly reducing the load on application Web server and database tiers. As an external cache, Oracle Web Cache is also an order of magnitude faster than object caches that run within the application tier.

Oracle Web Cache sits in front of application Web servers, caching their content, and providing their content to clients that request it. When Web browsers access the Web site, they send HTTP protocol or HTTPS protocol requests to Oracle Web Cache. Oracle Web Cache, in turn, acts as a virtual server on behalf of the application Web servers. If the requested content has changed, Oracle Web Cache retrieves the new content from the application Web servers. The application Web servers may retrieve their content from an Oracle database. Oracle Web Cache can be deployed on its own dedicated tier of computers or on the same computer as the application Web servers.

Web caching provides the following benefits for Web-based applications:

  • Performance: Running on inexpensive hardware, caching combined with compression can increase the throughput of a Web site by several orders of magnitude. In addition, Oracle Web Cache significantly reduces response time to client requests by storing objects in memory and by serving compressed versions of objects to clients that support the GZIP encoding method. See Section 1.2.5 for more information about compression.

  • Scalability: In addition to unparalleled throughput, Oracle Web Cache can sustain thousands of concurrent client connections, meaning that visitors to a site see fewer application Web server errors, even during periods of peak load.

  • High availability: Oracle Web Cache supports load balancing and failover detection for application Web servers. These features ensure that cache misses are directed to the most available, highest-performing Web server in the server farm. Moreover, a patent-pending capacity heuristic guarantees performance and provides surge protection when the application Web server load increases.

  • Cost savings: Better performance, scalability, and availability translate into cost savings for Web site operators. Because fewer application Web servers are required to meet the challenges posed by traffic spikes and denial of service attacks, Oracle Web Cache offers a simple and inexpensive means of reducing a Web site's cost for each request.

  • Developer productivity: Application developers can use Oracle Web Cache to cache content rather than design and develop application-specific caches.

For more information about caching, see Chapter 6, "Caching and Compressing Content."

1.2.5 Compression

Oracle Web Cache can compress both cacheable and non-cacheable objects. You can specify compression settings from either Oracle Enterprise Manager Fusion Middleware Control or the compress control directive of the Surrogate-Control response-header field. Oracle Web Cache provides compression configuration at both the site and caching-rule level. If you enable compression for a site, then Oracle Web Cache performs automatic compression for that site. Fine tuning of compression settings can be done by configuring individual caching rules.

Oracle Web Cache correctly handles compression of different types of content and different types of browsers. It enables compression automatically for common compressible content types such as HTML, JavaScript, or cascading style sheets (CSS). It disables compression automatically where compression either breaks the application in browsers or does not provide any gain. These file types include GIF, JPEG, and PNG images, and files that are already compressed with utilities like WinZip or GZIP. Similarly, Oracle Web Cache disables compression for Netscape 4 browsers and for some file types for Internet Explorer 5.5 browsers due to known bugs with these browsers.

Because compressed objects are smaller, they are delivered faster to browsers with fewer round-trips, reducing overall latency. Browsers that list GZIP in the Accept-Encoding request-header field then expand the compressed content.

On average, Oracle Web Cache can compress text files by a factor of 4. For example, 300 KB files are compressed down to 75 KB.
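The decision to compress, and its effect on repetitive text content, can be illustrated with Python's standard `gzip` module. This is a sketch under stated assumptions: the `COMPRESSIBLE_TYPES` set and the `accepts_gzip` and `maybe_compress` helpers are hypothetical, and the header check ignores quality values for brevity. The ratio printed depends entirely on the sample data and is not a measurement of Oracle Web Cache.

```python
import gzip
from typing import Dict, Tuple

# Assumed list for illustration; it mirrors the content types named in the text.
COMPRESSIBLE_TYPES = {"text/html", "text/css", "application/javascript"}


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip (q-values are ignored here)."""
    codings = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    return "gzip" in codings


def maybe_compress(body: bytes, content_type: str,
                   accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Compress compressible content for clients that accept gzip; otherwise pass it through."""
    if content_type in COMPRESSIBLE_TYPES and accepts_gzip(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


if __name__ == "__main__":
    html = b"<tr><td>Oracle Web Cache</td><td>reverse proxy</td></tr>\n" * 2000
    packed, headers = maybe_compress(html, "text/html", "gzip, deflate")
    print(headers, len(html), len(packed), round(len(html) / len(packed), 1))

    # Already-compressed formats such as PNG are passed through unchanged.
    image, image_headers = maybe_compress(b"\x89PNG...", "image/png", "gzip")
    print(image_headers)
```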

For more information about compression, see:

  • Section 2.11.3 for instructions on configuring compression at the site level

  • Section 2.11.3.1 for instructions on disabling compression for all requests

  • Section 6.8.1 for instructions on configuring compression for individual caching rules

  • Section 6.10 for instructions on configuring the Surrogate-Control response-header field

1.2.6 Session Binding

Oracle Web Cache supports sites that use a session ID or session cookie to bind user sessions to a given origin server to maintain state for a period. To use the session binding feature, the origin server itself must maintain state, that is, it must be stateful. A site binds user sessions by including session data in the HTTP header or body it sends to a client in such a way that the client is forced to include it with its next request. This data is transferred between the origin server and the client through Oracle Web Cache either with an embedded URL parameter or through a cookie, which is a text string that is sent to and stored on the client. Oracle Web Cache does not process the value of the parameter or cookie; it simply passes the information back and forth between the origin server and the client.

For more information about session binding, see Section 3.2.

Note:

If an origin server cannot accept any more connections because of the load, Oracle Web Cache disables session binding to that origin server and attempts to connect to another origin server.
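Session binding, including the overload behavior in the preceding note, can be sketched as a small routing table. The `SessionBinder` class is hypothetical. The session ID is treated as an opaque key, the round-robin choice of a new origin server is a stand-in for the real selection logic, and rebinding the session to the replacement server is an assumption made for this example.

```python
from typing import Dict, FrozenSet, List, Optional


class SessionBinder:
    """Route requests that carry the same session ID to the same origin server."""

    def __init__(self, origins: List[str]) -> None:
        self._origins = list(origins)
        self._bindings: Dict[str, str] = {}   # opaque session ID -> origin server
        self._next = 0

    def _pick(self, overloaded: FrozenSet[str]) -> str:
        # Round-robin stand-in for the real load-balancing decision.
        for _ in range(len(self._origins)):
            origin = self._origins[self._next % len(self._origins)]
            self._next += 1
            if origin not in overloaded:
                return origin
        raise RuntimeError("all origin servers are overloaded")

    def route(self, session_id: Optional[str],
              overloaded: FrozenSet[str] = frozenset()) -> str:
        bound = self._bindings.get(session_id) if session_id else None
        if bound is not None and bound not in overloaded:
            return bound                          # keep the session on its origin server
        origin = self._pick(overloaded)           # unbound, or the bound server is overloaded
        if session_id:
            self._bindings[session_id] = origin   # assumption: rebind to the new server
        return origin


if __name__ == "__main__":
    binder = SessionBinder(["app1", "app2"])
    print(binder.route("session-A"))                                  # bound on first use
    print(binder.route("session-A"))                                  # same server again
    print(binder.route("session-A", overloaded=frozenset({"app1"})))  # moved elsewhere
```

Because the binder only compares session IDs for equality, it never needs to interpret their contents, which mirrors how the value is passed through unprocessed.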

1.3 Compatibility with Oracle Fusion Middleware Components

Table 1-1 describes Oracle Web Cache compatibility with several Oracle Fusion Middleware components. It is not an exhaustive list.

Table 1-1 Compatibility with Other Oracle Fusion Middleware Components

Component: Oracle HTTP Server

Description: In Oracle Fusion Middleware, Oracle Web Cache resides in front of one or more instances of Oracle HTTP Server. Browser-based HTTP requests directed to the Oracle HTTP Server instance, and the responses to them, are transmitted through Oracle Web Cache. The Oracle Web Cache instance can handle any Web content transmitted with the standard HTTP protocol.

See Also: Oracle Fusion Middleware Administrator's Guide for Oracle HTTP Server

Component: Oracle Business Intelligence Discoverer

Description: Oracle BI Discoverer is closely integrated with Oracle Web Cache to improve Discoverer Viewer's overall scalability, performance, and availability. Oracle BI Discoverer uses ESI Surrogate-Control headers to govern cacheability of other non-configured responses. Because of this integration, the load on mid-tier and database servers in Oracle BI Discoverer deployments is reduced, more Discoverer Viewer users are able to access the system concurrently, and those users experience significantly better response times for workbook operations and common business intelligence queries.

See Also: Oracle Business Intelligence Discoverer Configuration Guide

Component: Oracle Forms Services

Description: You can deploy Oracle Web Cache as a load balancer with Oracle Forms Services applications.

See Also: Oracle Fusion Middleware Forms Services Deployment Guide

Component: Oracle Portal

Description: Oracle Web Cache has been closely integrated with Oracle Portal to improve its overall scalability, performance, and availability. Oracle Portal ships with several pre-defined caching and invalidation policies that ensure optimal use of Oracle Web Cache. Oracle Web Cache controls have been built into the Oracle Portal administrative user interface and can also be specified by content providers through the Portal Developer Kit (PDK).

See Also: Oracle Fusion Middleware Administrator's Guide for Oracle Portal