iPlanet Web Proxy Server 3.6 Administrator's Guide - Unix Version



Chapter 10   Filtering Content Through the Proxy


This chapter describes how to filter URLs so that your proxy server either doesn't allow access to the URL or modifies the HTML and JavaScript content it returns to the client. This chapter also describes how you can restrict access through the proxy based on the web browser (user agent) that the client is using.

The proxy server lets you use a URL filter file to determine which URLs the server supports. For example, instead of manually typing in wildcard patterns of URLs to support, you can create or purchase one text file that contains URLs you want to restrict. This feature lets you create one file of URLs that you can use on many different proxy servers.

You can also filter URLs based on their MIME type. For example, you might allow the proxy to cache and send HTML and GIF files but not allow it to get binary or executable files because of the risk of computer viruses.



Filtering URLs



You can use a file of URLs to configure what content the proxy server retrieves. You can set up a list of URLs the proxy always supports and a list of URLs the proxy never supports.

For example, if you're an Internet service provider who runs a proxy server with content appropriate for children, you might set up a list of URLs that are approved for viewing by children. You can then have the proxy server retrieve only the approved URLs; if a client tries to go to an unsupported URL, either you can have the proxy return the default "Forbidden" message or you can create a custom message explaining why the client could not access that URL.

To restrict access based on URLs, you need to create a file of URLs to allow or restrict. You can do this through the Server Manager. Once you have the file, you can set up the restrictions. These processes are discussed in the following sections.


Creating a Filter File of URLs

A filter file is a file that contains a list of URLs. The filter files the proxy server uses are plain text files with lines of URLs in the following pattern:

protocol://host:port/path/filename

You can use regular expressions in each of the three sections: protocol, host:port, and path/filename. For example, if you want to create a URL pattern for all protocols going to the netscape.com domain, you'd have the following line in your file:

.*://.*\.netscape\.com/.*

This pattern matches only URLs that don't specify a port number. For more information on regular expressions, see "Understanding Regular Expressions" on page 44.



Note: When these regular expression patterns are written to the obj.conf file as ppath parameters, each backslash character is replaced by a double backslash. For example, the pattern above would appear in the obj.conf file as: .*://.*\\.netscape\\.com/.*
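As a rough illustration of the pattern above, the following Python sketch checks two URLs against it and reproduces the backslash doubling described in the note. It assumes that Python's re module treats this simple pattern the same way the proxy's regular expression engine does, which may not hold in general. The sample URLs and the helper names matches and to_objconf are hypothetical.

```python
import re

# Pattern from the filter file example above.
PATTERN = r".*://.*\.netscape\.com/.*"

def matches(url):
    """Return True if the whole URL matches the filter pattern."""
    return re.fullmatch(PATTERN, url) is not None

def to_objconf(pattern):
    """Double each backslash, as the note describes for ppath parameters."""
    return pattern.replace("\\", "\\\\")

print(matches("http://home.netscape.com/index.html"))       # URL without a port
print(matches("http://home.netscape.com:8080/index.html"))  # URL with an explicit port
print(to_objconf(PATTERN))
```

The second URL is rejected because the pattern expects the slash to follow .com directly, which is why the pattern matches only URLs without a port number.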



You can use the Server Manager forms to create a file. If you want to create your own file without using the Server Manager, you should use the Server Manager forms to create an empty file, and then add your text in that file or replace the file with one containing the regular expressions.

To create a file using the Server Manager:

  1. In the Server Manager, choose Filters|URL Filters. In the URL Filter Access Restriction form that appears, choose New Filter from the drop-down list next to the Create/Edit URL Filter button.

  2. Type a name for the filter file in the text box to the right of the drop-down list and then click the Create/Edit URL Filter button.

  3. The Filter Editor form appears. Use the Filter Content scrollable text box to enter URLs and regular expressions of URLs. The Reset button clears all the text in this field.

    For more information on regular expressions, see "Understanding Regular Expressions" on page 44.

  4. When finished, click OK and confirm your changes.

    The proxy server creates the file and returns you to the URL Filter Access Restriction form. The filter file is created in server-root/admserv/proxy-id.
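If you keep your list of hosts somewhere else, a short script can generate the filter lines for you to paste into the Filter Content text box or into the file that the Server Manager created. The sketch below is one hypothetical way to do this in Python; the function name host_patterns and the sample domains are illustrative and are not part of the product.

```python
def host_patterns(domains):
    """Build one filter line per domain, matching any protocol and any path."""
    lines = []
    for domain in domains:
        escaped = domain.replace(".", r"\.")  # a literal dot must be escaped
        lines.append(r".*://.*\." + escaped + "/.*")
    return lines

for line in host_patterns(["netscape.com", "example.org"]):
    print(line)
```

Like the example pattern earlier in this section, the generated lines match only URLs that don't specify a port number.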


Setting Default Access for a Filter File

Once you have a filter file that contains the URLs you want to use, you can set the default access for those URLs.

To set default access for a filter file:

  1. In the Server Manager, choose Filters|URL Filters.

  2. Choose the template you want to use with the filters.

    Typically, you'll want to create filter files for the entire proxy server, but you might want one set of filter files for HTTP and another for FTP.

  3. Use the "URL filter to allow" list to choose a filter file that contains the URLs you want the proxy server to support.

  4. Use the "URL filter to deny" list to choose a filter file that contains the URLs to which you want the proxy server to deny access.

  5. Choose the text you want the proxy server to return to clients who request a denied URL. You can choose one of two options:

    • You can send the default "Forbidden" message that the proxy generates.

    • You can send a text or HTML file with customized text. Type the absolute path to this file using the text box on the form.



Restricting Access to Specific Web Browsers

You can restrict access to the proxy server based on the type and version of the client's web browser. For example, you can specify that all proxy server users must use Netscape Navigator 3.0. Restriction occurs based on the user-agent header that all web browsers send to servers when making requests.

To restrict access to the proxy based on the client's web browser:

  1. In the Server Manager, choose Filters|User-Agent.

  2. Select the "allow only User-agents matching" radio button.

  3. Type a regular expression that matches the user-agent string for the browsers you want the proxy server to support. If you want to specify more than one client, enclose the regular expression in parentheses and use the | character to separate the multiple entries. For more information on regular expressions, see "Understanding Regular Expressions" on page 44.
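The parenthesized form with the | separator can be tried out before you type it into the form. The following Python sketch is only an approximation: the expression, the sample user-agent strings, and the function name agent_allowed are hypothetical, and it assumes the proxy compares the expression against the whole user-agent header, which this guide does not state.

```python
import re

# Hypothetical expression: two alternatives, enclosed in parentheses
# and separated by the | character.
ALLOWED = r"(Mozilla/3\.0.*|Mozilla/4\..*)"

def agent_allowed(user_agent):
    """Return True if the user-agent string matches one of the alternatives."""
    return re.fullmatch(ALLOWED, user_agent) is not None

for agent in ("Mozilla/3.0 (X11; I; SunOS 5.5 sun4u)",
              "Mozilla/4.7 [en] (WinNT; U)",
              "Lynx/2.8.3"):
    print(agent, agent_allowed(agent))
```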



Request Blocking

You may want to block file uploads and other requests based on the upload content type.

To block requests based on MIME type:

  1. From the Server Manager, choose Filters|Request Blocking. The Request Blocking form appears.

  2. Click the radio button for the type of request blocking you want. The options are:

    • disabled - disables request blocking

    • multipart MIME (file upload) - blocks all file uploads

    • MIME types matching regular expression - blocks requests for MIME types that match the regular expression you enter. For more information on creating regular expressions, see "Understanding Regular Expressions" on page 44.

  3. Choose whether you want to block requests for all clients or for user-agents that match a regular expression you enter.

  4. Click the radio button for the methods for which you want to block requests. The options are:

    • any method with request body - blocks all requests with a request body, regardless of the method

    • only for:
      POST - blocks file upload requests using the POST method
      PUT - blocks file upload requests using the PUT method

    • methods matching - blocks all file upload requests using the method you enter

  5. Click OK.
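The way the three choices on the form combine can be pictured with a small model. The Python sketch below is an assumption about how the options interact, not the proxy's actual logic: the function should_block, its parameters, and the MULTIPART expression are all hypothetical.

```python
import re

def should_block(method, content_type, user_agent, *,
                 mime_regex, agent_regex=None, method_regex=None):
    """Model of the Request Blocking form; not the proxy's actual logic.

    mime_regex:   MIME types to block (None models "disabled")
    agent_regex:  user agents affected (None models "all clients")
    method_regex: methods affected (None models "any method with request body")
    """
    if mime_regex is None or content_type is None:
        return False  # blocking is disabled, or the request has no body type
    if not re.fullmatch(mime_regex, content_type, re.IGNORECASE):
        return False
    if agent_regex and not re.fullmatch(agent_regex, user_agent):
        return False
    if method_regex and not re.fullmatch(method_regex, method):
        return False
    return True

# The "multipart MIME (file upload)" option, modeled as a regular expression.
MULTIPART = r"multipart/.*"

print(should_block("POST", "multipart/form-data; boundary=x", "Mozilla/4.7",
                   mime_regex=MULTIPART, method_regex=r"(POST|PUT)"))
```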



Suppressing Outgoing Headers

You can configure the proxy server to remove outgoing headers from the request (usually for security reasons). For example, you might want to prevent the "from" header from going out because it reveals the user's email address (although Netscape Navigator does not send the from header unless specifically configured to do so). Or, you might want to filter out the user-agent header so external servers can't determine what web browsers your organization uses. You may also want to remove logging or client-related headers that are to be used only in your intranet before a request is forwarded to the Internet.

This feature doesn't affect headers that are specially handled or generated by the proxy itself or that are necessary to make the protocol work properly (such as If-Modified-Since and Forwarded).

Although it's not possible to stop the forwarded header from originating from a proxy, this isn't a security problem. The remote server can detect the connecting proxy host from the connection. In a proxy chain, a forwarded header coming from an inner proxy can be suppressed by an outer proxy. Setting your servers up this way is recommended when you don't want to have the inner proxy or client host name revealed to the remote server.

To suppress outgoing headers:

  1. In the Server Manager, choose Filters|Suppress Outgoing Headers.

  2. In the form that appears, type a regular expression that matches the headers you want to suppress. For example, to suppress the from and user-agent headers, type (from|user-agent). The headers you type are not case-sensitive. For more information on regular expressions, see "Understanding Regular Expressions" on page 44.
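To see which headers an expression such as (from|user-agent) would remove, you can try it against a sample set of headers. This Python sketch assumes the expression is compared against the complete header name and ignores case, as described above; the function name strip_headers and the sample header values are hypothetical.

```python
import re

# Expression from the example above; header names are not case-sensitive.
SUPPRESS = re.compile(r"(from|user-agent)", re.IGNORECASE)

def strip_headers(headers):
    """Return a copy of headers without the names that match SUPPRESS."""
    return {name: value for name, value in headers.items()
            if not SUPPRESS.fullmatch(name)}

request_headers = {
    "From": "user@example.com",
    "User-Agent": "Mozilla/3.0",
    "Accept": "*/*",
}
print(strip_headers(request_headers))
```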



Filtering by MIME Type

You can configure the proxy server to block certain files that match a MIME type. For example, you could set up your proxy server to block any executable or binary files so that any clients using your proxy server can't download a possible computer virus.

If you want the proxy server to support a new MIME type, in the Server Manager, choose System Settings|MIME Types and add the type. See "Creating MIME Types" on page 59 for more information.

You can combine filtering MIME types with templates, so that only certain MIME types are blocked for specific URLs. For example, you could block executables coming from any computer in the .edu domain.

To filter by MIME type:

  1. In the Server Manager, choose Filters|MIME Filters.

  2. Choose the template you want to use for filtering MIME types, or make sure you're editing the entire server.

  3. In the Current filter text box, you can type a regular expression that matches the MIME types you want to block.

    For example, to filter out all applications, you could type (application/.*) for the regular expression. This is faster than checking each MIME type for every application type (as described in the following step). The regular expression is not case-sensitive. For more information on regular expressions, see "Understanding Regular Expressions" on page 44.

  4. Check the MIME types you want to filter. When a client attempts to access a file that is blocked, the proxy server returns a "forbidden" message.

  5. Click OK to submit the form. Be sure to save and apply your changes.
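Before you type an expression into the Current filter text box, you can check which MIME types it would cover. The following Python sketch uses the (application/.*) example from step 3 and assumes the expression is compared against the whole MIME type without regard to case; the function name is_blocked and the sample types are hypothetical.

```python
import re

# Expression from step 3; the match is not case-sensitive.
BLOCKED = re.compile(r"(application/.*)", re.IGNORECASE)

def is_blocked(mime_type):
    """Return True if the MIME type matches the filter expression."""
    return BLOCKED.fullmatch(mime_type) is not None

for mime_type in ("application/octet-stream", "Application/ZIP",
                  "text/html", "image/gif"):
    print(mime_type, is_blocked(mime_type))
```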



Filtering out HTML Tags

The proxy server lets you specify HTML tags you want to filter out before passing the file to the client. This lets you filter out objects such as Java applets and JavaScript embedded in the HTML file. To filter HTML tags, you specify the beginning and ending HTML tags. The proxy then substitutes blanks for all text and objects in those tags before sending the file to the client.



Note: The proxy stores the original (unedited) file in the cache, if the proxy is configured to cache that resource.



To filter out HTML tags:

  1. In the Server Manager, choose Filters|HTML Tag Filters.

  2. In the form that appears, choose the template you want to modify. You might choose HTTP, or you might choose a template that specifies only certain URLs (such as those from hosts in the .edu domain).

  3. Check the filter box for any of the default HTML tags you want to filter. These are the default tags:

    • APPLET usually surrounds Java applets.

    • SCRIPT indicates the start of JavaScript code.

    • IMG specifies an inline image file.

  4. You can enter any HTML tags you want to filter. Type the beginning and ending HTML tags.

    For example, to filter out forms, you could type FORM in the Start Tag box (the HTML tags are not case-sensitive) and /FORM in the End Tag box. If the tag you want to filter does not have an end tag, such as IMG, you can leave the End Tag box empty.

  5. Click OK to submit the form. You need to save and apply your changes and restart the proxy before the filtering will begin.
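The effect of a tag filter on a page can be approximated with a few lines of Python. This is a sketch under stated assumptions, not the proxy's implementation: it assumes the tags themselves are blanked along with everything between them, and the function name blank_out and the sample HTML are hypothetical.

```python
import re

def blank_out(html, start_tag, end_tag=None):
    """Replace each start_tag..end_tag span (or lone start tag) with spaces.

    start_tag and end_tag are given as on the form, for example
    "FORM" and "/FORM". Leave end_tag empty for tags without an end tag.
    """
    start = r"<" + re.escape(start_tag) + r"\b[^>]*>"
    if end_tag:
        pattern = start + r".*?<" + re.escape(end_tag) + r"\s*>"
    else:
        pattern = start
    regex = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    # Substitute one blank for every character in the matched span.
    return regex.sub(lambda m: " " * len(m.group(0)), html)

page = "<p>Hi</p><script>alert(1)</script><p>Bye</p>"
print(blank_out(page, "SCRIPT", "/SCRIPT"))
```

Because each removed character is replaced by a blank, the filtered page keeps the same length as the original.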


Copyright © 2001 Sun Microsystems, Inc. Some preexisting portions Copyright © 2001 Netscape Communications Corp. All rights reserved.

Last Updated September 27, 2001