Oracle iPlanet Web Proxy Server 4.0.14 Administration Guide

About Templates

A template is a collection of URLs, called resources. A resource might be a single URL, a group of URLs that have something in common, or an entire protocol. You create and name a template, and then assign URLs to it by using regular expressions. In this way, you can configure the proxy server to handle requests for different URLs differently. Any URL pattern that you can express as a regular expression can be included in a template. The following table lists the default resources and provides some ideas for other templates.

Table 16–1 Resource regular expression wildcard patterns

Regular expression pattern     What it configures
ftp://.*                       All FTP requests
http://.*                      All HTTP requests
https://.*                     All secure HTTP requests
gopher://.*                    All Gopher requests
connect://.*:443               All SSL (secure) transactions to HTTPS port.
http://home\.example\.com.*    All documents on the home.example.com web site.
.*\.gif.*                      Any URL that includes the string .gif
.*\.edu.*                      Any URL that includes the string .edu
http://.*\.edu.*               Any URL going to a computer in the .edu domain
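The idea of assigning URLs to templates can be sketched outside the proxy server. The following Python example is only an illustration: the template names are hypothetical, and it assumes that a pattern must match the entire URL and that the first matching template wins. Python's re module may also differ in detail from the regular expression engine in Proxy Server.

```python
import re

# Hypothetical template names paired with patterns from Table 16-1.
# The order matters in this sketch: the first matching entry wins.
TEMPLATES = [
    ("home-site", r"http://home\.example\.com.*"),
    ("gif-images", r".*\.gif.*"),
    ("all-http", r"http://.*"),
    ("all-ftp", r"ftp://.*"),
]


def template_for(url):
    """Return the name of the first template whose pattern matches the whole URL."""
    for name, pattern in TEMPLATES:
        if re.fullmatch(pattern, url):
            return name
    return None  # no template applies


if __name__ == "__main__":
    for url in (
        "http://home.example.com/index.html",
        "http://www.example.org/logo.gif",
        "ftp://ftp.example.org/pub/",
        "gopher://gopher.example.org/",
    ):
        print(url, "->", template_for(url))
```

Listing the more specific patterns before the catch-all http://.* pattern keeps the general template from hiding the specific ones in this sketch.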

Understanding Regular Expressions

Proxy Server allows you to use regular expressions to identify resources. Regular expressions specify a pattern of character strings. In the proxy server, regular expressions are used to find matching patterns in URLs.

The following example shows a regular expression:

[a-z]*://[^:/]*\.abc\.com.*

This regular expression matches any document from a host in the .abc.com domain. The documents can be requested over any protocol and can have any file extension.
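To see which URLs this expression accepts, you can try it in any regular expression tool. The following Python sketch assumes that the pattern must match the whole URL; the sample URLs and the function name is_abc_url are made up for illustration.

```python
import re

# The example pattern from the text, compiled with Python's re module.
ABC_PATTERN = re.compile(r"[a-z]*://[^:/]*\.abc\.com.*")


def is_abc_url(url):
    """Return True if the whole URL matches the .abc.com pattern."""
    return ABC_PATTERN.fullmatch(url) is not None


if __name__ == "__main__":
    for url in (
        "http://www.abc.com/index.html",   # host ends in .abc.com
        "ftp://files.abc.com/archive.zip", # any protocol is accepted
        "http://www.abcd.com/",            # different domain
        "http://abc.com/",                 # no leading period before abc
    ):
        print(url, is_abc_url(url))
```

Because the pattern requires a literal period before abc, a URL whose host is exactly abc.com does not match in this sketch.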

The following table lists the regular expressions and their corresponding meanings.

Table 16–2 Regular expressions and their meanings

Expression    Meaning
.             Matches any single character except a newline.
x?            Matches zero or one occurrences of regular expression x.
x*            Matches zero or more occurrences of regular expression x.
x+            Matches one or more occurrences of regular expression x.
x{n,m}        Matches the character x where x occurs at least n times but no more than m times.
x{n,}         Matches the character x where x occurs at least n times.
x{n}          Matches the character x where x occurs exactly n times.
[abc]         Matches any of the characters enclosed in the brackets.
[^abc]        Matches any character not enclosed in the brackets.
[a-z]         Matches any character within the range in the brackets.
x             Matches the character x where x is not a special character.
\x            Removes the meaning of special character x.
"x"           Removes the meaning of special character x.
xy            Matches the occurrence of regular expression x followed by the occurrence of regular expression y.
x|y           Matches either the regular expression x or the regular expression y.
^             Matches the beginning of a string.
$             Matches the end of a string.
(x)           Groups regular expressions.
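Several of the expressions in the table can be tried out with a short script. The following Python sketch pairs each pattern with one string that should match and one that should not; the sample strings are invented, and whole-string matching is an assumption. The "x" quoting form is left out because Python's re module does not support it.

```python
import re


def matches(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# Each entry: (pattern, text that matches, text that does not match)
EXAMPLES = [
    (r"h.t", "hot", "ht"),                                  # . is exactly one character
    (r"https?", "http", "httpss"),                          # x? is zero or one
    (r"a{2,3}", "aaa", "aaaa"),                             # x{n,m} is a bounded repeat
    (r"[^abc]", "d", "a"),                                  # negated character set
    (r"example\.com", "example.com", "exampleXcom"),        # escaped period
    (r"(ftp|gopher)://.*", "gopher://host/", "http://host/"),  # grouping and x|y
]

if __name__ == "__main__":
    for pattern, good, bad in EXAMPLES:
        print(pattern, matches(pattern, good), matches(pattern, bad))
```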

The following example illustrates how you can combine some of the regular expressions listed in Table 16–2.

[a-z]*://([^.:/]*[:/]|.*\.local\.com).*
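One way to read this expression: it accepts any protocol, followed either by a host name that contains no period (and is followed by a colon or slash) or by anything that contains .local.com. The following Python sketch checks that reading against a few invented URLs, assuming the pattern must match the whole URL; the name is_local is hypothetical.

```python
import re

# The combined example pattern from the text.
LOCAL_PATTERN = re.compile(r"[a-z]*://([^.:/]*[:/]|.*\.local\.com).*")


def is_local(url):
    """Return True if the whole URL matches the combined pattern."""
    return LOCAL_PATTERN.fullmatch(url) is not None


if __name__ == "__main__":
    for url in (
        "http://intranet/index.html",   # host without a period, then /
        "http://intranet:8080/",        # host without a period, then :
        "https://www.local.com/page",   # host in .local.com
        "http://www.example.com/",      # neither alternative applies
    ):
        print(url, is_local(url))
```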

As noted in Understanding Regular Expressions, the backslash can be used to escape, or remove the meaning of, special characters. Characters such as the period and question mark have special meanings and therefore must be escaped when they are used to represent themselves. The period, in particular, is found in many URLs. To remove the special meaning of the period in your regular expression, precede it with a backslash.
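The effect of escaping the period can be shown with a small comparison. The following Python sketch is illustrative only: the look-alike URL and the helper literal_host_pattern are made up, and re.escape is a Python convenience, not a Proxy Server feature.

```python
import re

UNESCAPED = r"http://home.example.com.*"    # each period matches any character
ESCAPED = r"http://home\.example\.com.*"    # each period matches only a period


def literal_host_pattern(host):
    """Build an http pattern in which the host name is matched literally."""
    return "http://" + re.escape(host) + ".*"


if __name__ == "__main__":
    real = "http://home.example.com/index.html"
    lookalike = "http://homeXexample.com/index.html"
    for pattern in (UNESCAPED, ESCAPED):
        print(pattern,
              bool(re.fullmatch(pattern, real)),
              bool(re.fullmatch(pattern, lookalike)))
    print(literal_host_pattern("home.example.com"))
```

The unescaped pattern accepts the look-alike host as well as the real one, which is usually not what is intended.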

Understanding Wildcard Patterns

You can create lists of wildcard patterns that enable you to specify which URLs can be accessed from your site. Wildcards can be in the form of regular expressions or shell expressions, depending on usage. As a general rule:

You can specify several URLs by using regular expression wildcard patterns. Wildcards enable you to filter by domain name or by any URL with a given word in the URL. For example, you might want to block access to URLs that contain the string “careers.” To do this, you could specify http://.*careers.* as the regular expression for the template.
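Before using a blocking pattern like this one, it can help to check what it does and does not cover. The following Python sketch tests the careers pattern against a few invented URLs. It assumes whole-URL, case-sensitive matching as in Python's re module, which may not be how Proxy Server compares URLs; the name is_blocked is hypothetical.

```python
import re

# The blocking pattern from the text.
CAREERS_PATTERN = re.compile(r"http://.*careers.*")


def is_blocked(url):
    """Return True if the whole URL matches the careers pattern."""
    return CAREERS_PATTERN.fullmatch(url) is not None


if __name__ == "__main__":
    for url in (
        "http://www.example.com/careers/index.html",  # word in the path
        "http://careers.example.com/",                # word in the host name
        "http://www.example.com/products/",           # word absent
        "https://www.example.com/careers/",           # not an http:// URL
    ):
        print(url, is_blocked(url))
```

In this sketch the pattern covers only URLs that begin with http://, so other protocols would need their own pattern.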