Sun ONE Web Proxy Server 3.6 SP3 Administrator's Guide - NT Version
Chapter 8 Reverse Proxy
This chapter describes how to use Sun ONE Web Proxy Server as a reverse proxy. Reverse proxy is the name for certain alternate uses of a proxy server. A reverse proxy can be placed outside the firewall to represent a secure content server to outside clients, preventing direct, unmonitored access to your server's data from outside your company. It can also be used for replication; that is, multiple proxies can be placed in front of a heavily used server for load balancing. This chapter describes the alternate ways that Sun ONE Web Proxy Server can be used inside or outside a firewall.
How Reverse Proxying Works
Reverse proxying with Sun ONE Web Proxy Server uses caching features to provide load balancing on a heavily used server. This model of reverse proxying differs from conventional proxy usage in that the proxy is not required to operate on a firewall.
Proxy as a Stand-in for a Server
If you have a content server that has sensitive information that must remain secure, such as a database of credit card numbers, you can set up a proxy outside the firewall as a stand-in for your content server. When outside clients try to access the content server, they are sent to the proxy server instead. The real content resides on your content server, safely inside the firewall. The proxy server resides outside the firewall, and appears to the client to be the content server.
When a client makes a request to your site, the request goes to the proxy server. The proxy server then sends the client's request through a specific passage in the firewall to the content server. The content server passes the result through the passage back to the proxy. The proxy sends the retrieved information to the client, as if the proxy were the actual content server (see Figure 8-1). If the content server returns an error message, the proxy server can intercept the message and change any URLs listed in the headers before sending the message to the client. This prevents external clients from getting redirection URLs to the internal content server.
In this way, the proxy provides an additional barrier between the secure database and the possibility of malicious attack. In the unlikely event of a successful attack, the perpetrator is more likely to be restricted to only the information involved in a single transaction, as opposed to having access to the entire database. The unauthorized user can't get to the real content server because the firewall passage allows only the proxy server to have access.
Figure 8-1    A reverse proxy appears to be the real content server
You can configure the firewall router to allow a specific server on a specific port (in this case, the proxy on its assigned port) to have access through the firewall without allowing any other machines in or out.
Proxying for Load Balancing
You can use multiple proxy servers within an organization to balance the network load among web servers. This model lets you take advantage of the caching features of the proxy server to create a server pool for load balancing. In this case, the proxy servers can be on either side of the firewall. If you have a web server that receives a high number of requests per day, you could use proxy servers to take the load off the web server and make the network access more efficient.
The proxy servers act as go-betweens for client requests to the real server, and they cache the requested documents. If there is more than one proxy server, DNS can distribute the requests among them using a "round-robin" selection of their IP addresses. The client uses the same URL each time, but the request might go through a different proxy each time.
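The round-robin selection described above can be illustrated with a minimal Python sketch. It is not how a DNS server is configured; it only shows the rotation idea. The three IP addresses and the function name are hypothetical, chosen for illustration.

```python
from itertools import cycle

# Hypothetical addresses for three reverse proxies that share one host name.
PROXY_ADDRESSES = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]


def round_robin(addresses):
    """Return an iterator that hands out the addresses in rotating order."""
    return cycle(addresses)


resolver = round_robin(PROXY_ADDRESSES)

# Six successive lookups of the same host name rotate through the proxies.
chosen = [next(resolver) for _ in range(6)]
print(chosen)
```

Each client still uses the same URL; only the address handed back for the host name changes from one lookup to the next.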
The advantage of using multiple proxies to handle requests to one heavily used content server is that the site can handle a heavier load, and handle it more efficiently, than the content server could alone. After an initial start-up period in which the proxies retrieve documents from the content server for the first time, the number of requests to the content server can drop dramatically.
Only CGI requests and occasional new requests must go all the way to the content server. The rest can be handled by a proxy. For example, suppose that 90% of the requests to your server are not CGI requests (which means they can be cached), and that your content server receives 2 million hits per day. If you connect three reverse proxies, and each of them handles 2 million hits per day, the site could serve about 6 million hits per day. The 10% of requests that reach the content server would add up to about 200,000 hits from each proxy per day, or only 600,000 in total. The number of hits served could thus increase from around 2 million to 6 million, while the load on the content server could decrease from 2 million to 600,000. Your actual results would depend upon your situation.
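The arithmetic in this example can be checked with a short Python sketch. The figures come from the example itself; the function and variable names are hypothetical.

```python
# Figures from the example in the text.
HITS_PER_PROXY_PER_DAY = 2_000_000
PROXY_COUNT = 3
CACHEABLE_PERCENT = 90  # share of requests that are not CGI requests


def content_server_load(hits_per_proxy, proxy_count, cacheable_percent):
    """Return (total hits served, misses per proxy, total hits on the content server)."""
    total_hits = hits_per_proxy * proxy_count
    # Requests that cannot be answered from cache go on to the content server.
    misses_per_proxy = hits_per_proxy * (100 - cacheable_percent) // 100
    return total_hits, misses_per_proxy, misses_per_proxy * proxy_count


total, per_proxy, on_server = content_server_load(
    HITS_PER_PROXY_PER_DAY, PROXY_COUNT, CACHEABLE_PERCENT
)
print(total, per_proxy, on_server)  # 6000000 200000 600000
```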
Figure 8-2    Proxy used for load balancing
Setting up a Reverse Proxy
To set up a reverse proxy, you need two mappings: a regular mapping and a reverse mapping.
- The regular mapping redirects requests to the content server. When a client requests a document from the proxy server, the proxy server needs a regular mapping to tell it where to get the actual document.
You shouldn't use a reverse proxy with a proxy that serves autoconfiguration files. This is because the proxy could return the wrong result. See Chapter 12 for more information on using autoconfiguration files with a reverse proxy.
- The reverse mapping makes the proxy server trap redirects from the content server. The proxy intercepts the redirect and then changes the redirected URL so that it maps to the proxy server. For example, if the client requests a document that was moved or not found, the content server returns a message to the client explaining that it can't find the document at the requested URL. In that returned message, the content server adds an HTTP header that lists a URL to use to get the moved file. To maintain the privacy of the internal content server, the proxy can rewrite that URL using a reverse mapping.
Suppose you have a web server called http://http.site.com/ and you want to set up a reverse proxy server for it. You could call the reverse proxy http://proxy.site.com/.
You would create a regular mapping and a reverse mapping as follows:
- In the Server Manager, choose URLs|Create Mappings. In the form that appears, enter the information for the first mapping (the regular mapping). For example:
Source prefix: http://proxy.site.com/
Source destination: http://http.site.com
- Click OK. Return to the form and create the second mapping (the reverse mapping):
Source prefix: http://http.site.com
Source destination: http://proxy.site.com/
- To make the change, click the OK button.
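The effect of the two mappings created in these steps can be sketched in Python. This is an illustration of prefix rewriting only, not the proxy server's actual implementation. The function name and the sample document paths are hypothetical, and the sketch normalizes trailing slashes, which the proxy may treat differently.

```python
# (source prefix, destination) pairs as entered in the example above.
REGULAR_MAPPING = ("http://proxy.site.com/", "http://http.site.com")
REVERSE_MAPPING = ("http://http.site.com", "http://proxy.site.com/")


def apply_mapping(url, mapping):
    """Rewrite url if it starts with the mapping's source prefix; else return it unchanged."""
    source, destination = mapping
    # Assumption: treat prefixes with and without a trailing slash alike.
    source = source.rstrip("/") + "/"
    destination = destination.rstrip("/") + "/"
    if url.startswith(source):
        return destination + url[len(source):]
    return url


# Regular mapping: a client request to the proxy is fetched from the content server.
origin_url = apply_mapping("http://proxy.site.com/docs/index.html", REGULAR_MAPPING)
print(origin_url)  # http://http.site.com/docs/index.html

# Reverse mapping: a redirect URL from the content server is rewritten
# so that the client is sent back to the proxy, not to the internal server.
client_location = apply_mapping("http://http.site.com/docs/moved.html", REVERSE_MAPPING)
print(client_location)  # http://proxy.site.com/docs/moved.html
```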
Once you click the OK button, the proxy server adds one or more additional mappings. To see the mappings, click the link called View/Edit Mappings. Additional mappings would be in the following format:
These additional automatic mappings are for users who connect to the reverse proxy as a normal server. The first mapping is to catch users connecting to the reverse proxy as a regular proxy. Depending on the setup, usually the second is the only one required, but it doesn't cause problems in the proxy to have them both.
If the web server has several DNS aliases, each alias should have a corresponding regular mapping. If the web server generates redirects with several DNS aliases to itself, each of those aliases should have a corresponding reverse mapping.
CGI applications still run on the origin server; the proxy server never runs CGI applications on its own. However, if the CGI script indicates that the result can be cached (that is, it implies a non-zero time-to-live by issuing a Last-Modified or Expires header), the proxy will cache the result.
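As a rough illustration of this rule, the following Python sketch tests whether a CGI response carries either of the two headers. It is a simplification under stated assumptions: it checks only that a header is present, whereas the proxy derives a time-to-live from the header values. The function name and sample headers are hypothetical.

```python
def may_cache_cgi_result(headers):
    """Return True if the response carries a Last-Modified or Expires header."""
    # HTTP header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in headers}
    return "last-modified" in names or "expires" in names


cacheable = may_cache_cgi_result(
    {"Content-Type": "text/html", "Expires": "Thu, 01 Dec 2005 16:00:00 GMT"}
)
not_cacheable = may_cache_cgi_result({"Content-Type": "text/html"})
print(cacheable, not_cacheable)  # True False
```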
When authoring content for the web server, keep in mind that the content will be served by the reverse proxy, too, so all links to files on the web server should be relative links. There must be no reference to the host name in the HTML files; that is, all links must give only a path, as opposed to a URL that includes a fully qualified host name.
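One way to check authored pages against this rule is sketched below in Python. The function name and the sample links are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def names_a_host(link):
    """Return True if the link includes a host name, which the rule above forbids."""
    return bool(urlsplit(link).netloc)


# A relative link carries only a path, so it works through the proxy as well.
print(names_a_host("/docs/index.html"))  # False

# A fully qualified link points clients straight at the web server.
print(names_a_host("http://http.site.com/docs/index.html"))  # True
```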