Sun Java System Web Proxy Server 4.0.9 Administration Guide

How Reverse Proxying Works

You can use two different methods for reverse proxying. One method takes advantage of Proxy Server’s security features to handle transactions. The other method uses caching to provide load balancing on a heavily used server. Both of these methods differ from conventional proxy usage because they do not operate strictly on a firewall.

Proxy as a Stand-in for a Server

If you have a content server that holds sensitive information that must remain secure, such as a database of credit card numbers, you can set up a proxy outside the firewall as a stand-in for your content server. When outside clients try to access the content server, they are sent to the proxy server instead. The real content resides on your content server, safely inside the firewall. The proxy server resides outside the firewall, and appears to the client to be the content server.

When a client makes a request to your site, the request goes to the proxy server. The proxy server then sends the client’s request through a specific passage in the firewall to the content server. The content server passes the result through the passage back to the proxy. The proxy sends the retrieved information to the client, as if the proxy were the actual content server, as shown in Figure 14–1. If the content server returns an error message, the proxy server can intercept the message and change any URLs listed in the headers before sending the message to the client. This behavior prevents external clients from getting redirection URLs to the internal content server.

In this way, the proxy provides an additional barrier between the secure database and the possibility of malicious attack. In the unlikely event of a successful attack, the perpetrator is more likely to be restricted to the information involved in a single transaction, as opposed to having access to the entire database. The unauthorized user cannot get to the real content server because the firewall passage allows only the proxy server to have access.
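The header rewriting described above, in which the proxy changes internal URLs before a response reaches the client, can be illustrated with a minimal Python sketch. This is not Proxy Server's implementation. The host names and the `rewrite_location` function are hypothetical, and the sketch assumes the URL to rewrite arrives in a `Location` response header.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names, chosen only for illustration.
INTERNAL_HOST = "content.internal.example.com:8080"
PUBLIC_HOST = "www.example.com"


def rewrite_location(location: str,
                     internal_host: str = INTERNAL_HOST,
                     public_host: str = PUBLIC_HOST) -> str:
    """Return a Location header value with the internal host replaced.

    URLs that do not point at the internal content server (including
    relative URLs) are returned unchanged.
    """
    parts = urlsplit(location)
    if parts.netloc.lower() != internal_host.lower():
        return location
    # Keep scheme, path, query, and fragment; swap only the host part.
    return urlunsplit(
        (parts.scheme, public_host, parts.path, parts.query, parts.fragment)
    )
```

With these assumed names, a redirect to `http://content.internal.example.com:8080/login` would reach the client as `http://www.example.com/login`, so the internal address is never exposed.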

Figure 14–1 Reverse Proxy Process

Diagram showing a reverse proxy that appears like the content server.

You can configure the firewall router to allow a specific server on a specific port (in this case, the proxy on its assigned port) to have access through the firewall without allowing any other machines in or out.
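The idea of a single firewall passage can be modeled as a simple allow rule. The following Python sketch is only an illustration of the logic, not router configuration syntax. The addresses, the port, and the names `Connection` and `firewall_allows` are hypothetical, and the sketch assumes the passage is defined by the proxy's address and the content server's address and port.

```python
from dataclasses import dataclass

# Hypothetical addresses and port, chosen only for illustration.
PROXY_IP = "203.0.113.10"
CONTENT_SERVER_IP = "10.0.0.5"
CONTENT_SERVER_PORT = 8080


@dataclass(frozen=True)
class Connection:
    source_ip: str
    dest_ip: str
    dest_port: int


def firewall_allows(conn: Connection) -> bool:
    """Allow only the proxy to reach the content server on its one port."""
    return (
        conn.source_ip == PROXY_IP
        and conn.dest_ip == CONTENT_SERVER_IP
        and conn.dest_port == CONTENT_SERVER_PORT
    )
```

Under this rule, a connection from any machine other than the proxy, or to any other port, is refused.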

Secure Reverse Proxying

Secure reverse proxying occurs when one or more of the connections between the proxy server and another machine use the Secure Sockets Layer (SSL) protocol to encrypt data.

Secure reverse proxying has many uses:

Secure reverse proxying makes each secure connection slower because of the overhead involved in encrypting your data. However, because SSL provides a session caching mechanism, two connecting parties can reuse previously negotiated security parameters, dramatically reducing the overhead on subsequent connections.

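The three ways to configure a secure reverse proxy are a secure client connection to the proxy (Figure 14–2), a secure proxy connection to the content server (Figure 14–3), and a combination of both (Figure 14–4).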

Figure 14–2 Secure Client Connection to Proxy

Diagram showing a secure client connection to proxy.

Figure 14–3 Secure Proxy Connection to Content Server

Diagram showing a secure proxy connection to content server.

Figure 14–4 Secure Client Connection to Proxy and Secure Proxy Connection to Content Server

Diagram showing a secure client connection to proxy and a secure proxy connection to content server.

For information about how to set up each of these configurations, see Setting up a Reverse Proxy.

In addition to SSL, the proxy can use client authentication, which requires that a computer making a request to the proxy provide a certificate or other form of identification to verify its identity.

Proxying for Load Balancing

You can use multiple proxy servers within an organization to balance the network load among web servers. This model takes advantage of the caching features of the proxy server to create a server pool for load balancing. In this case, the proxy servers can be on either side of the firewall. If you have a web server that receives a high number of requests per day, you could use proxy servers to take the load off the web server and make the network access more efficient.

The proxy servers act as go-betweens for client requests to the real server. The proxy servers cache the requested documents. If you have more than one proxy server, DNS can distribute the requests among them using a “round-robin” selection of their IP addresses. The client uses the same URL each time, but the request might pass through a different proxy each time.
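Round-robin selection can be pictured as a rotation through the proxies' addresses. The following Python sketch shows only the rotation, not how a DNS server is configured. The addresses and the `round_robin` helper are hypothetical.

```python
from itertools import cycle

# Hypothetical proxy addresses, chosen only for illustration.
PROXY_ADDRESSES = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]


def round_robin(addresses):
    """Return an iterator that hands out addresses in rotating order."""
    return cycle(addresses)


resolver = round_robin(PROXY_ADDRESSES)

# Five lookups of the same name rotate through the three proxies.
picks = [next(resolver) for _ in range(5)]
print(picks)
```

Every lookup uses the same name, yet successive answers point at different proxies, which is how the load is spread across the pool.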

The advantage of using multiple proxies to handle requests to one heavily used content server is that the site can handle a heavier load, and handle it more efficiently, than the content server could alone. After an initial start-up period in which the proxies retrieve documents from the content server for the first time, the number of requests to the content server can drop dramatically.

Only CGI requests and occasional new requests must go all the way to the content server. The rest can be handled by a proxy. For example, suppose that 90% of the requests to your server are not CGI requests, which means they can be cached, and that your content server receives 2 million hits per day. If you connect three reverse proxies and each of them handles 2 million hits per day, the site could serve about 6 million hits per day. The 10% of requests that reach the content server would add up to about 200,000 hits from each proxy per day, or about 600,000 in total. The number of hits served could therefore increase from approximately 2 million to 6 million, while the load on the content server could decrease from 2 million to about 600,000. Your actual results would depend upon your situation.
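The arithmetic in this example can be reproduced with a short Python sketch. The `load_estimate` function and its parameter names are hypothetical, and the model assumes that every cacheable request is served from a proxy's cache, which ignores the start-up period and occasional new requests.

```python
def load_estimate(num_proxies: int, hits_per_proxy: int,
                  uncacheable_percent: int) -> tuple[int, int]:
    """Return (total hits served, hits that reach the content server).

    Assumes every cacheable request is answered from a proxy cache.
    """
    total_hits = num_proxies * hits_per_proxy
    # Integer arithmetic avoids floating-point rounding in the estimate.
    content_server_hits = total_hits * uncacheable_percent // 100
    return total_hits, content_server_hits


# Three proxies, 2 million hits each, 10% of requests are CGI.
total, content_server = load_estimate(3, 2_000_000, 10)
print(total, content_server)  # prints "6000000 600000"
```

Changing the percentage of uncacheable requests shows how sensitive the content server's load is to the proportion of CGI traffic.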

Figure 14–5 Proxy Used for Load Balancing

Diagram showing a proxy used for load balancing where all requests go to a central DNS server that routes the requests to any proxy.