24 Configuring Vanity URLs

Starting with Oracle WebCenter Sites 11.1.1.8.0, you can create vanity URLs for all asset types. Vanity URLs are short, readable, and easy-to-manage URLs. They are usually made up of human-readable information, often including the name of, or other descriptive information about, the content of the web page that the vanity URL points to.

Vanity URLs do not replace the existing URLs in the product. They exist alongside the existing URLs, and either may be used to access the content. A vanity URL can be auto-generated or custom-created by a content contributor in the WebCenter Sites Contributor interface.

An example "normal" WebCenter Sites URL could be

http://www.example.com/cs/Satellite?c=Page&cid=3451344545&pagename=avisports/Page/HomeLayout 

where the vanity URL for the same asset could be

http://www.example.com/Running/Home

This shows how a vanity URL is useful: it is simpler to type (particularly on a mobile device) and easier for a user to read and recall later.

Vanity URLs consist of two parts. The first part is the WebRoot, which controls how the vanity URL is interpreted in different environments. The second part is either a URL Pattern or a Free Form entry. A URL Pattern is a rule defined for an entire asset type, while a Free Form entry is created individually by the content contributor for each asset.

24.1 WebRoots

The WebRoot controls how the vanity URL is interpreted. WebRoots are like assets: once configured, a WebRoot must be published to the destinations for it to work in the delivery system.

WebRoots come in two forms: absolute and relative. An absolute WebRoot must contain the entire URL prefix (including host and port information) and may optionally contain a path prefix; it is unique to every server. A relative WebRoot contains only path information and does not contain any information about the host or port. Both types are handled identically by WebCenter Sites; however, with a relative WebRoot only a single WebRoot is needed across multiple environments (for instance development, staging, and production). With an absolute WebRoot, each of these environments requires its own unique WebRoot.

To eliminate this limitation, the concept of a VirtualRoot is supported. A VirtualRoot requires setting an environment identifier in futuretense.ini so that WebCenter Sites can identify which virtual root is valid for the given environment; if the parameter is missing, the WebRoot is used. It is important to understand the types of WebRoots, as you may have reason to define both absolute and relative roots at the same time. Table 24-1 lists the advantages and disadvantages of these types of WebRoots.

Table 24-1 Advantages and Disadvantages of WebRoot Types

Absolute WebRoot

  Advantages:

  • Full URL is displayed to the Content Contributor

  Disadvantages:

  • A unique URL is required for every environment

  • Virtual roots require setting a WebCenter Sites property

  • Difficult to test vanity URLs prior to publishing

Relative WebRoot

  Advantages:

  • Single WebRoot works across all systems

  • Vanity URLs can be easily tested before publishing

  Disadvantages:

  • Content Contributors will not see the full URL; only the path is displayed

  • May require additional steps in rewriting the URL

Combination

  Advantages:

  • Full URL is displayed to the Content Contributor

  • Single WebRoot works across all systems

  • Vanity URLs can be easily tested before publishing

  Disadvantages:

  • More URLs are present and need to be stored and published

  • Content Contributors will see both relative and absolute WebRoot URLs. Because there are more URLs, this may be confusing to the Content Contributor.


Examples of the different types of WebRoots required for a URL

The following examples use the same vanity URL used in the introduction (http://www.example.com/Running/Home) and illustrate using and accessing it from management and production servers.

The following vanity URLs are used for each environment:

  • Production: http://www.example.com/Running/Home

  • Management: http://management.example.com/Running/Home

For more information on the path portion of the URL (Running/Home), see Section 24.2, "Generating Vanity URLs."

The server name used for the production server is http://www.example.com/ and the server name used for the management server is http://management.example.com/.

For an absolute WebRoot, a unique root URL is required for each environment; here the root URL is http://www.example.com. In futuretense.ini, set the sites.environment property to management. The site will be AVISports.

It is important to note that the environment value specified in the WebRoot for the virtual root URL and the value specified in the sites.environment property must be the same.

For a relative WebRoot, a single WebRoot is used in all environments. The root URL is /. The site will be AVISports.
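
The relationship between a WebRoot, its virtual roots, and the sites.environment property can be pictured with a short Python sketch. This is not WebCenter Sites code: the WebRoot class, the effective_root and vanity_url helpers, the host name "avisports", and the http:// scheme on the virtual root are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class WebRoot:
    """Hypothetical stand-in for a WebRoot definition (not a WebCenter Sites class)."""
    host_name: str
    root_url: str
    # Maps an environment identifier to the root URL used in that environment.
    virtual_roots: Dict[str, str] = field(default_factory=dict)


def effective_root(web_root: WebRoot, environment: Optional[str] = None) -> str:
    """Return the virtual root for the environment, or the Root URL if none matches."""
    if environment is not None and environment in web_root.virtual_roots:
        return web_root.virtual_roots[environment]
    return web_root.root_url


def vanity_url(web_root: WebRoot, path: str, environment: Optional[str] = None) -> str:
    """Join the effective root and the path portion of the vanity URL."""
    return effective_root(web_root, environment).rstrip("/") + "/" + path.lstrip("/")


production = WebRoot(
    host_name="avisports",  # assumed name, for illustration only
    root_url="http://www.example.com",
    virtual_roots={"management": "http://management.example.com"},
)

# sites.environment is not set: the Root URL is used.
print(vanity_url(production, "/Running/Home"))
# sites.environment=management: the matching virtual root is used.
print(vanity_url(production, "/Running/Home", environment="management"))
```

The sketch falls back to the Root URL whenever the environment is missing or has no matching entry, which mirrors the rule described above.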

To create a new WebRoot

  1. From the Admin tab, expand the WebRoots node. Double-click Add New.

    The WebRoot edit form will display, as in Figure 24-1, but with all fields blank.

    Figure 24-1 WebRoot Edit Form

  2. Enter a name in the Host Name field. This is the unique identifier for the WebRoot instance. This can be used in render tags to get a specific vanity URL of an asset.

    This value cannot be changed once the WebRoot is saved.

  3. Enter a URL in the Root URL field. The value for the Root URL can be either absolute or relative. For an absolute WebRoot, this is typically the URL used to render the asset in the delivery environment; for a relative WebRoot, it is /<Optional PATH Prefix>.

    For the Content Contributor, this is the URL used as a prefix when displaying the defined vanity URLs. For example:

    • Absolute Root URL using only the hostname

      • Production URL: http://www.example.com

      • Complete vanity URL: http://www.example.com/Running/Home

    • Absolute Root URL including both the hostname and prefix for the path

      • Production URL: http://www.example.com/Prefix

      • Complete vanity URL: http://www.example.com/Prefix/Running/Home

    • Absolute Root URL including the hostname for a mobile site

      • Production URL: http://www.example.mobi

      • Complete vanity URL: http://www.example.mobi/Running/Home

    • Relative Root URL with a path prefix

      • Production URL: /cs/example

      • Complete vanity URL: http://<Current Host Name>/cs/example/Running/Home

    • Relative Root URL without a path prefix

      • Production URL: /

      • Complete vanity URL: http://<Current Host Name>/Running/Home

  4. Enter one or more virtual URLs. The virtual URL has two parts:

    • Environment: The identifier for the other environment. Each environment must be identified by setting the property sites.environment=<name> in futuretense.ini.

    • Root URL: This is the URL that is used to render the asset in other environments such as editorial, staging, and so forth.

    When first displayed, there is only one row for adding an environment and root URL. Additional lines of Environment and Root URL fields are added by clicking the Add New button.

    The virtual URL entered must have an absolute Root URL. For example:

    • For a management URL of http://management.example.com/Running/Home enter:

      • Environment: management

      • Root URL: management.example.com

      • sites.environment=management in futuretense.ini

    • For a staging URL of http://staging.example.com/Running/Home enter:

      • Environment: staging

      • Root URL: staging.example.com

      • sites.environment=staging in futuretense.ini

    • For a development URL (including port) of http://dev.example.com:8080/Running/Home enter:

      • Environment: dev

      • Root URL: dev.example.com:8080

      • sites.environment=dev in futuretense.ini

    Note:

    The Virtual Root URL is not displayed in the Contributor UI. It is used only for rendering the web pages in different environments, and only when the Environment matches the value provided in the sites.environment property in futuretense.ini.

  5. Select one or more sites. The WebRoot can be enabled for specific sites using the Sites field.

    In the preceding examples we would set the Site to AVISports, since these URLs are unique to this site.

  6. Once all information is added, click the Save icon. The WebRoot is created.

Once a WebRoot is created for the first time, WebRoot will be available as an asset type in the Admin tab under the Asset Types node.
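
The Root URL examples in step 3 can be sketched in Python as follows. This is an illustrative model only, not product code: the is_absolute_root and complete_vanity_url helpers, the http scheme used for relative roots, and the localhost:8080 host are assumptions.

```python
from urllib.parse import urlsplit


def is_absolute_root(root_url: str) -> bool:
    """An absolute Root URL carries a scheme and host; a relative one is only a path."""
    parts = urlsplit(root_url)
    return bool(parts.scheme and parts.netloc)


def complete_vanity_url(root_url: str, path: str, current_host: str) -> str:
    """Build the complete vanity URL; relative roots are resolved against current_host."""
    prefix = root_url.rstrip("/")
    if not is_absolute_root(root_url):
        # Assumption: the current host is reached over plain http.
        prefix = "http://" + current_host + prefix
    return prefix + "/" + path.lstrip("/")


for root in ("http://www.example.com", "http://www.example.com/Prefix", "/cs/example", "/"):
    print(root, "->", complete_vanity_url(root, "Running/Home", "localhost:8080"))
```

With an absolute root the current host is ignored, while a relative root yields a URL on whichever host serves the request, which is why a single relative WebRoot can work across environments.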

To view a WebRoot

  1. From the Admin tab, expand the WebRoots node. Double-click one of the listed WebRoots.

    The WebRoot Content form will display, as in Figure 24-2, showing the saved information for that WebRoot.

Figure 24-2 WebRoot Content Form

To edit an existing WebRoot

  1. From the Admin tab, expand the WebRoots node. Double-click one of the listed WebRoots.

    The WebRoot Content form will display, as in Figure 24-2.

  2. Click the Edit icon to edit the WebRoot.

    The WebRoot Edit form will display, as in Figure 24-3.

    Figure 24-3 WebRoot Edit Form

  3. The stored information will display in the Host Name, Root URL, Virtual Root URL, and Sites fields.

    The Host Name field is the unique identifier for the WebRoot instance. This can be used in render tags to get a specific vanity URL of an asset.

    The Root URL field is typically the URL used to render the asset in the delivery environment. This is the URL that is displayed to the Content Contributor.

    The Virtual Root URL has two parts:

    • Environment: The identifier for the other environment. Each environment must be identified by setting the property sites.environment=<name> in futuretense.ini.

    • Root URL: This is the URL that is used to render the asset in other environments such as editorial, staging, and so forth.

    To add more virtual URLs, click the Add New button to add lines of Environment and Root URL fields.

    Select one or more sites. The WebRoot can be enabled for specific sites using the Sites field.

  4. Once all information is added, click the Save icon. The WebRoot is updated.

To approve and publish a WebRoot

A WebRoot can be approved for publishing in the same way as other assets: click the green checkmark icon when previewing the WebRoot.

WebRoots are not set as blocking assets. The administrator must approve and publish the WebRoot separately.

24.2 Generating Vanity URLs

Vanity URLs can be constructed in two ways. A vanity URL can be constructed manually, in a one-off manner, by Content Contributors (see the Oracle Fusion Middleware WebCenter Sites User's Guide for more information). Alternatively, a vanity URL can be generated automatically from a pattern, as described in this section.

24.2.1 Configuring Auto-Generated URLs

Vanity URLs based upon patterns are automatically generated for individual assets. Once a pattern is registered, the vanity URL is generated when the asset is saved for the first time. Patterns can be created for every asset type; however, these patterns are only valid for a single site. The same pattern must be created multiple times if multiple sites share the asset type.

When creating vanity URLs for assets, users can edit the default vanity URLs of the assets, and Oracle WebCenter Sites allows the creation of duplicate vanity URLs. An edited auto-generated URL is therefore changed to a user-generated URL, and users can decide whether to delete the URL, redirect it to another URL, and so on.

Note:

Changes to patterns are processed only when an asset is created or saved. If you modify a pattern or create a new one, it applies only to assets that are created or saved after the pattern is saved.

To configure an auto-generated vanity URL

  1. On the Admin tab, expand the Asset Types node, then expand the node for the specific asset type you are creating an auto-generated URL for. Under the specific asset type node, expand URL Pattern and double-click Add New.

    The URL Pattern Form displays.

    Figure 24-4 URL Pattern Form - Asset

    Figure 24-5 URL Pattern Form - Blob (Detail)

  2. Enter a name for the pattern in the Name field. This name is used only to reference the pattern and must be unique; it may be changed at any time.

  3. In the Pattern field, enter the rule that defines the vanity URL. Lists of valid attributes and functions which may be used are displayed on the right side of the screen. For more information on creating patterns, see Section 24.2.2, "Working with URL Patterns."

  4. In the Blob Headers field (available only if a blob is selected), enter the rule used to define the blob headers in the vanity URL.

  5. The Site is the site to which the pattern applies. Only sites for which the asset type is enabled are listed. Selecting the site also narrows the selections for the Host (WebRoot), Subtype, Template (asset only), and Wrapper (asset only).

    Note:

    Patterns remain defined even if the asset type is later disabled for that site; however, using these URLs might result in a broken page.

  6. The Host is the WebRoot to which the vanity URL is applied. Only WebRoots that are defined for this site or for "any" are displayed. The pattern will not work without a WebRoot.

  7. The Subtype is the subtype of the asset type which the vanity URL is applied to. The selection of Subtype further restricts the available items for Template and Wrapper.

    Note:

    The Subtype field for blobs does not show Any as an option for flex assets, and does not appear at all for basic assets.

  8. The Field is the blob to be rendered by this URL. This field appears only when a blob is selected. There may be multiple items available if the asset contains more than one blob entry.

    Selecting the Is Downloadable checkbox causes the browser to display a save dialog when the vanity URL is accessed, instead of attempting to display the blob object within the page.

  9. Device Group provides a list of possible device groups currently defined. This field appears only when an asset is selected. All device groups are formatted into lists depending upon the extension and regardless of their current state within the site. For more information on device groups, see the Oracle Fusion Middleware WebCenter Sites Developer's Guide.

  10. The Template is the template used for the URL. All templates that are valid for the asset type, subtype, and site are displayed, regardless of whether a mobile variation exists, so not every template may have a variation for the specified Device Group. It is up to the administrator to ensure that the proper variation exists when using a device group other than the default.

  11. The Wrapper will provide a list of all wrapper elements that are valid for the selected Template. If none exist then this field will be disabled.

  12. At this point you may click Save to save the pattern and continue working, or evaluate the URL pattern.

    The Evaluate URL Pattern section of the form allows you to see an example of what a pattern generated vanity URL will look like. This section may be used either before or after saving the pattern; however, it is recommended to evaluate the URL pattern before saving to validate your pattern.

    Note:

    If no asset of the given type exists, you will be unable to use this feature.

  13. The Asset drop-down allows you to select an asset currently defined for the pattern type. You may also use the "look ahead" feature to type in an asset's name. Once an asset is selected, WebCenter Sites will evaluate the created pattern and show how this asset's URL will appear.

    Remember that this is an example of the URL's final appearance. The URL is not valid until the pattern has been saved and the asset in question has also been edited and saved.

24.2.2 Working with URL Patterns

Automatic patterns are the heart of vanity URLs. Another component of vanity URLs allows contributors to create custom URLs for each asset (see the Oracle Fusion Middleware WebCenter Sites User's Guide for more information); however, for an administrator, pattern-generated URLs will likely make up the bulk of the vanity URLs present on a site. Patterns are built from a combination of paths and variables, where the variables are either fields from the asset or predefined functions.

Note:

All parts of the URL are URL-encoded before the save.

The actual variables which are present will differ based on the asset type selected, but all asset types will share some common variables. Table 24-2 lists the most common variables.

Table 24-2 Common Asset Type Variables

Variable Name    Variable Type
createdby        string
createddate      date
description      string
enddate          date
fwtags           array
id               long
locale           string
name             string
startdate        date
status           string
subtype          string
template         string
updatedby        string
updateddate      date


Clicking a variable appends it to the end of the Pattern field. The cursor position makes no difference; the new variable is always added as the last item.

Variables are specified by ${<Name>}. For example, to call the ID of an asset, use ${id}.

Variables of types string, int, and long can be used directly, without modification.

Variables of type array need to be specified such that only a single value is extracted. For example, to get the first tag defined for an asset, use ${fwtags[0]}, or use the function listAsPath. Array numbering begins with 0, and using an index outside the bounds of the array results in "null."

Variables of type date will need to be formatted; a function formatDate is provided for this.

Other types, such as url, may not be usable or may require the use of a function to convert to a form that is acceptable for generating a URL. Use the Evaluate URL Pattern option to assist you with these variables.

Variables are defined using standard Java types, so normal Java methods can also be called on a variable. For example, on the string variable name, ${name.toLowerCase()} converts the name to lower case.
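
The variable rules above can be modeled with a small Python sketch. It is a simplified stand-in for the Evaluate URL Pattern feature, not the product's evaluator: the evaluate function is hypothetical, the tag values are invented sample data, and only plain variables and array indexes are handled (Java method calls and functions are not).

```python
import re

# Matches ${name} and ${name[index]}.
_TOKEN = re.compile(r"\$\{(\w+)(?:\[(\d+)\])?\}")


def evaluate(pattern: str, asset: dict) -> str:
    """Substitute ${name} and ${name[i]} tokens with values from the asset."""
    def replace(match: re.Match) -> str:
        name, index = match.group(1), match.group(2)
        value = asset.get(name)
        if index is not None:
            try:
                value = value[int(index)]
            except (IndexError, TypeError):
                value = None  # out-of-bounds index
        return "null" if value is None else str(value)

    return _TOKEN.sub(replace, pattern)


# Sample asset; the fwtags values are invented for illustration.
asset = {"id": 1328196049037, "createdby": "fwadmin", "fwtags": ["sports", "running"]}

print(evaluate("/Article/${id}/${createdby}", asset))
print(evaluate("/Tag/${fwtags[0]}", asset))
print(evaluate("/Tag/${fwtags[5]}", asset))  # index outside the array bounds
```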

24.2.2.1 Functions

Functions are useful for converting and formatting data to include in the vanity URL. WebCenter Sites provides predefined functions that are useful when creating vanity URLs.

Clicking a function appends it to the end of the Pattern field. The cursor position makes no difference; the new function is always added as the last item.

Predefined functions

  • spaceToUnderscore(java.lang.String)

    Converts spaces to _. For example: Page 001 would become Page_001.

  • formatDate(java.util.Date, java.lang.String)

    Formats a date using the supplied format string. For example, if updateddate is "Thu Feb 28 12:57:12 UTC 2013", the formatDate function can display it as "2/28/2013". The format string should follow the pattern syntax of Java's SimpleDateFormat class.

  • spaceToDash(java.lang.String)

    Converts a space to -. For example, Page 001 would become Page-001.

  • listAsPath(java.util.List, int)

    Converts the first x items of an array to a list in path form. For instance, if an array is [1, 2, 3, 4, 5, 6], then listAsPath with x = 4 returns 1\2\3\4.

    Note that x, defined by int, is required.

  • getFileName(com.fatwire.assetapi.data.BlobObject)

    Gets the file name of a blob object as text. For instance, if the blob name is xyz.png, this returns the text xyz.png. Normally this is the very last parameter used, so that the correct file extension is present.

    This function cannot be used for binary fields of the Asset Maker asset.

  • property(java.lang.String, java.lang.String, java.lang.String)

    Reads a property from an ini file or a property file. This uses ics.GetProperty. The parameters for this method are: 1) the property name, 2) the default value if the property is not present, and 3) a comma-separated list of file names.

Functions are specified using the format ${f:<name>(<variables>)}, so for the spaceToUnderscore example given above the function would be written as

${f:spaceToUnderscore("Page 001")}

If you make a mistake or pass in an invalid parameter, then in the Evaluate URL Pattern section a null is returned for that part of the path. While only the outermost variable must contain ${}, all functions must begin with f:.
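
To make the behavior of a few predefined functions concrete, here is a rough Python analog. These are not the WebCenter Sites implementations: the function names are Python-style stand-ins, and the assumption that "2/28/2013" corresponds to a SimpleDateFormat string of "M/d/yyyy" is ours rather than the guide's.

```python
from datetime import datetime, timezone


def space_to_underscore(text: str) -> str:
    """Analog of spaceToUnderscore: every space becomes an underscore."""
    return text.replace(" ", "_")


def space_to_dash(text: str) -> str:
    """Analog of spaceToDash: every space becomes a dash."""
    return text.replace(" ", "-")


def format_date_mdy(value: datetime) -> str:
    """Analog of formatDate(value, "M/d/yyyy"): month and day without zero padding."""
    return f"{value.month}/{value.day}/{value.year}"


updated = datetime(2013, 2, 28, 12, 57, 12, tzinfo=timezone.utc)

print(space_to_underscore("Page 001"))
print(space_to_dash("Page 001"))
print(format_date_mdy(updated))
```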

24.2.2.2 Examples of Using Patterns

The examples are based on the AVISports Article asset type. The asset name is "Baker Likely to Stay With Stars."

Figure 24-6 URL Pattern Form - Asset

Note:

The first part of the URL is always the WebRoot. In the case of AVISports, this is the local system name and context root. The next part, Avi, is the filter prefix (for information on filters, see Section 24.7, "Using mod_rewrite with Vanity URLs"). Only the actual unique path is provided in these examples.

Given the asset "Baker Likely to Stay With Stars," there are a number of options available to generate a URL. Each example shows the pattern and the resulting URL.

  1. Start with a simple name pattern

    • /Article/${name}

    • /Article/Baker+Likely+to+Stay+With+Stars

  2. Convert the name to lower case

    • /Article/${name.toLowerCase()}

    • /Article/baker+likely+to+stay+with+stars

  3. Replace the spaces in the name with underscores

    • /Article/${f:spaceToUnderscore(name)}

    • /Article/Baker_Likely_to_Stay_With_Stars

  4. Use multiple variables

    • /Article/${id}/${createdby}

    • /Article/1328196049037/fwadmin

  5. Use a function to format a date

    • /Article/${f:formatDate(createddate,"yy-MM-dd")}

    • /Article/12-04-27
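
The first three results above can be reproduced with a short Python sketch. The "+" characters in the results are consistent with form-style URL encoding, so the sketch uses Python's quote_plus as an analog; that this matches the encoder WebCenter Sites uses internally is an assumption, and article_url is a hypothetical helper.

```python
from urllib.parse import quote_plus

name = "Baker Likely to Stay With Stars"


def article_url(segment: str) -> str:
    """URL-encode a single path segment and place it under /Article/."""
    return "/Article/" + quote_plus(segment)


print(article_url(name))                    # ${name}
print(article_url(name.lower()))            # ${name.toLowerCase()}
print(article_url(name.replace(" ", "_")))  # ${f:spaceToUnderscore(name)}
```

Replacing the spaces before encoding is what keeps "+" out of the final URL in the third example.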

24.2.2.3 Deleting Patterns

Patterns can be deleted via the navigation tree.

To delete a pattern

  1. From the Admin tab, expand Asset Types. Double-click the asset type containing the pattern.

    The selected asset type will expand.

  2. Double-click URL Pattern.

    Figure 24-7 Saved URL Patterns by Asset Type

    The currently saved patterns are displayed, and can be inspected, edited, or deleted like any other asset.

24.2.3 Publishing URL Patterns

Patterns are automatically published with the asset type. There is no special approval or publishing needed for URL patterns.

24.3 Using Vanity URL with System Tools

Using the System Tools node, an Admin can view all presently defined vanity URLs.

24.3.1 Viewing All Vanity URLs

To view created vanity URLs

  1. From the Admin tab, expand the System Tools node.

  2. Double-click URL.

    The URL Utility Form is displayed.

Because every URL defined (for all sites) is present in this list, you will need to filter it using the search box provided to locate the URLs of interest. From this form you can manually delete a vanity URL, check whether it is causing a problem, or confirm where a conflict originates. However, if the URL is based upon a pattern, it is automatically regenerated when the asset is next saved.

24.3.2 Resolving Vanity URL Conflicts Using System Tools

Vanity URLs are unique: there cannot be more than one with the same path (even across sites in WebCenter Sites).

For instance, if two sites each have the pattern /Page/${name} and both sites use the same name for a page (for instance, "Home"), then a conflict will occur.

Whichever page is saved first succeeds and creates a vanity URL; the second page fails to create one. When this occurs, it might be necessary to use the System Tools node to locate what is blocking the creation of the vanity URL and, if needed, remove the existing URL to allow the new one to be created. This issue can be mitigated by ensuring that the WebRoot is unique to a single site.
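
The first-save-wins behavior can be modeled with a minimal Python sketch. The VanityUrlRegistry class is hypothetical and is not how WebCenter Sites stores URLs; in particular, keying uniqueness on the (WebRoot, path) pair is an assumption inferred from the mitigation described above, and the asset labels are invented.

```python
from typing import Dict, Optional, Tuple


class VanityUrlRegistry:
    """Hypothetical in-memory model: at most one asset per (WebRoot, path) pair."""

    def __init__(self) -> None:
        self._urls: Dict[Tuple[str, str], str] = {}

    def register(self, web_root: str, path: str, asset: str) -> bool:
        """Return True if the URL was created, False if another asset already owns it."""
        key = (web_root, path)
        owner = self._urls.get(key)
        if owner is not None and owner != asset:
            return False
        self._urls[key] = asset
        return True

    def owner(self, web_root: str, path: str) -> Optional[str]:
        """Look up which asset is blocking a given URL, if any."""
        return self._urls.get((web_root, path))


registry = VanityUrlRegistry()
print(registry.register("/", "/Page/Home", "SiteA:Page:Home"))       # first save succeeds
print(registry.register("/", "/Page/Home", "SiteB:Page:Home"))       # same WebRoot and path: conflict
print(registry.register("/siteb", "/Page/Home", "SiteB:Page:Home"))  # site-specific WebRoot: no conflict
```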

24.4 Resolving Vanity URLs Using a Rewriter Filter

There are two ways to resolve a vanity URL. The first is by using a web server and rewriting the URL. The second is designed for instances where a web server is not present, as is often the case in development, and relies upon a filter.

WebCenter Sites contains a URL rewriter filter that is used, by default, only for sample sites. This rewriter filter can be configured to be used along with a WebRoot for any site. The filter intercepts any request whose path begins with a configured string and passes the request through to WebCenter Sites as a vanity URL.

In simple terms, it allows vanity URLs to operate without a rewriting rule on the web server. The major limitation of using this filter is that the context root, and then a unique prefix, must be given to ensure that WebCenter Sites knows which URLs to treat as vanity URLs.

Note:

"FSII" and "Avi" are the parameters pre-configured for URL rewrite filter in the web.xml when sample sites are installed.

To enable the rewriter filter

  1. Copy the following code into web.xml after the final existing </filter>. Then replace Avi with a unique string (for instance the site name), then restart WebCenter Sites.

    Note:

    Multiple SitePrefix values can be added by specifying a comma-separated list in the param-value element. For each SitePrefix, ensure that a WebRoot matching the prefix is created in WebCenter Sites, as shown in step 2.

    <filter>
      <filter-name>URLRewriteFilter</filter-name>
      <filter-class>COM.FutureTense.Servlet.URLRewriteFilter</filter-class>
      <init-param>
        <!-- Replace Avi with your own unique prefix, or a comma-separated list of prefixes -->
        <param-name>SitePrefix</param-name>
        <param-value>Avi</param-value>
      </init-param>
    </filter>
    <!-- Map the filter to all requests so that prefixed vanity URLs can be intercepted -->
    <filter-mapping>
      <filter-name>URLRewriteFilter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>
    
  2. Create a WebRoot inside WebCenter Sites that matches the prefix. For example, if using "Avi", you need a WebRoot beginning with /<contextroot>/Avi. If using a relative WebRoot, this is all that is necessary. If using an absolute WebRoot, the host name and port are required as well. Make sure to select the site for which this WebRoot is defined before saving.

  3. Once saved, create a single URL using the new WebRoot.

  4. Verify that the vanity URL works in a browser.

For example:

Assume a relative WebRoot of /Avi and a path of /Running/Home. The resulting URL using the filter would look like:

http://<Sites Host>:<Sites Port>/<Context Root>/Avi/Running/Home

Since a filter is used, the use of a prefix (specifically Avi in this example) is required both in the filter and in the WebRoot.
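
The prefix test that such a filter performs can be sketched in Python. This models only the idea described above; the actual matching logic of URLRewriteFilter is not documented here, so the match_site_prefix helper and its plain "starts with" comparison are assumptions.

```python
from typing import Optional


def match_site_prefix(request_path: str, context_root: str, site_prefixes: str) -> Optional[str]:
    """Return the first configured prefix that the request path begins with, else None.

    site_prefixes is a comma-separated list, as in the SitePrefix param-value.
    """
    for prefix in (item.strip() for item in site_prefixes.split(",")):
        if not prefix:
            continue
        base = context_root.rstrip("/") + "/" + prefix
        if request_path.startswith(base):
            return prefix
    return None


print(match_site_prefix("/cs/Avi/Running/Home", "/cs", "FSII,Avi"))  # treated as a vanity URL
print(match_site_prefix("/cs/Satellite", "/cs", "FSII,Avi"))         # left for normal processing
```

A request that matches no prefix returns None, which in this model means it is left untouched for normal processing.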

Note:

Vanity URLs support query parameters. While required parameters have been removed from vanity URLs, sometimes additional data must be passed as part of the URL. In such cases, the parameters need to be appended to the vanity URL. For instance, if passing in time=true, the URL above sent to WebCenter Sites would look like:

http://<Sites Host>:<Sites Port>/<Context root>/Avi/Running/Home?time=true

24.5 Vanity URL for Mobile Sites

In most cases, those using a mobile-specific site will want to use a different URL for mobile users. Not only do mobile users benefit from a shorter URL, but there is often a separate domain and different URLs. To set up a different domain and different URLs for the mobile pages of a website, follow these steps:

  1. Set a WebRoot for the mobile domain names. The Root URL of the WebRoot should contain the mobile domain name.

  2. Choose the correct host names when creating vanity URLs for mobile device groups. This ensures that the links are generated correctly for the mobile domain names.

  3. All other steps, including web server URL rewriting, are the same as described in Section 24.1, "WebRoots," Section 24.2, "Generating Vanity URLs," and Section 24.3, "Using Vanity URL with System Tools."

24.6 Using the Web Server with Vanity URLs

Using a web server with URL rewriting allows you to access vanity URLs in a similar way to the filter, and you also have all the control and power that URL rewriting entails.

When using a web server, all rewritten vanity URLs must be split into two sections before they are sent to WebCenter Sites. One section is the WebRoot; the other is the path.

For example, use the WebRoot http://www.example.com and a pattern that generates /Running/Home, giving the URL http://www.example.com/Running/Home. To pass this URL in from a web server, the URL must be broken up into the WebRoot and the path. These are then passed, separately, as query parameters to a special filter in WebCenter Sites. The vanity URL filter used to handle rewritten URLs is named "Sites", and is accessed at <context root>/Sites.

The two parts of the URL are passed as lookuphost and lookuppage. The lookuphost parameter refers to the WebRoot (the full server name, port, and prefix for an absolute WebRoot, or only the prefix for a relative WebRoot). The lookuppage parameter refers to the remainder of the path after the prefix. Using the above URL as an example, the URL passed to WebCenter Sites would need to be rewritten to look like:

http://<Sites Host>:<Sites Port>/<contextpath>/Sites?lookuppage=/Running/Home&lookuphost=http://www.example.com

Note:

Vanity URLs support query parameters. While required parameters have been removed from vanity URLs, sometimes extra data must be passed in as part of the URL. In such cases, the parameters need to be appended to the rewritten URL.

For instance, if passing in time=true, the URL above sent to WebCenter Sites would look like:

http://<Sites Host>:<Sites Port>/<contextpath>/Sites?lookuppage=/Running/Home&lookuphost=http://www.example.com&time=true
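
The split into lookuphost and lookuppage can be sketched in Python to show what a rewrite rule has to produce. This is an illustration of the transformation only, not a replacement for web server rules: the split_vanity_url and sites_url helpers are hypothetical, and http://sites.internal:7003/cs is an invented Sites address.

```python
from typing import List, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit


def split_vanity_url(url: str, web_root: str) -> Tuple[str, str, List[Tuple[str, str]]]:
    """Split a vanity URL into (lookuphost, lookuppage, extra query parameters)."""
    parts = urlsplit(url)
    prefix = web_root.rstrip("/")
    # An absolute WebRoot is compared against scheme://host[:port]/path,
    # a relative WebRoot against the path only.
    if "://" in web_root:
        target = f"{parts.scheme}://{parts.netloc}{parts.path}"
    else:
        target = parts.path
    if not target.startswith(prefix + "/"):
        raise ValueError(f"{url!r} does not fall under WebRoot {web_root!r}")
    return prefix, target[len(prefix):], parse_qsl(parts.query)


def sites_url(sites_base: str, url: str, web_root: str) -> str:
    """Build the URL passed to the Sites filter, keeping any extra query parameters."""
    lookuphost, lookuppage, extra = split_vanity_url(url, web_root)
    params = [("lookuppage", lookuppage), ("lookuphost", lookuphost)] + extra
    # Leave '/' and ':' unescaped so the result reads like the examples in this section.
    return f"{sites_base.rstrip('/')}/Sites?{urlencode(params, safe='/:')}"


print(sites_url(
    "http://sites.internal:7003/cs",  # invented Sites host, port, and context path
    "http://www.example.com/Running/Home?time=true",
    "http://www.example.com",
))
```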

24.7 Using mod_rewrite with Vanity URLs

Every web server and site is unique, so the following should be viewed as guidelines and examples of how to use mod_rewrite (which is used in OHS, Apache, and IBM HTTP Server to rewrite URLs). Your system will likely require modifications to the rules to operate correctly.

There are two ways to use mod_rewrite. It can be used either to transform the URL directly for the Sites filter, or in conjunction with the URL rewriter filter to perform the same action in a way that allows for much simpler mod_rewrite rules (likely closer to what existing sites have used previously to remove the context root). Examples of both methods are provided. To determine which is right for you, first decide whether the extra step (and hence cost) of having the Sites filter break apart the vanity URLs is worth the simplified rewrite rules.

Note:

This guide assumes that you are familiar with mod_rewrite. If you are not, start by reviewing the Oracle documentation located at:

http://docs.oracle.com/cd/B12037_01/server.101/b12255/confmods.htm#1009986

All subsequent steps are based on a manually compiled Apache 2.4.x. If you are using a version shipped with an operating system, the actual changes needed to enable and configure mod_rewrite may differ, but the rules and concepts will not.

This section contains the following topics:

24.7.1 Setting Up mod_rewrite

First enable mod_rewrite by editing httpd.conf and uncommenting the following line:

LoadModule rewrite_module modules/mod_rewrite.so

Second, enable the Rewrite engine, by adding this line to httpd.conf:

RewriteEngine On

Third, for debugging only, enable logging for mod_rewrite (this is a very costly operation, so use it only when debugging):

LogLevel alert rewrite:trace6

Save the changes to httpd.conf; mod_rewrite is now enabled (without any rules). Lastly, restart the web server and ensure that it starts without errors.

Note:

There are web pages that make it much easier to construct and test mod_rewrite rules. It is suggested that you use one of these sites when initially writing your rules.

Setup for Provided Examples

In order to provide some examples of mod_rewrite rules, we first need to define what the environment looks like. Below is a sample based upon the vanity URLs used in this guide; however, you should be able to use the same elements and just change out the values for your environment as a starting point.

The vanity URLs a customer is expected to see:

  • Production:

    http://www.example.com/Page/Surfing
    http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars
    http://www.example.com/Home
    http://www.example.com/Home?time=true
    
  • Development:

    http://dev.example.com:8080/Running/Home
    

Generic Sites information:

  • Production context root: /cs

  • Production rewrite filter: /AVISPORTS

  • Development context root: /csdev

  • Development rewrite filter: /Avi

Web Roots Defined for each Server:

  • Production - absolute: http://www.example.com/

  • Development - relative: /Avi

URL structure:

  • Production: http://www.example.com/Page/Surfing

    • WebRoot: http://www.example.com

    • Path: /Page/Surfing

  • Production: http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars

    • WebRoot: http://www.example.com

    • Path: /Article/Baker_Likely_to_Stay_With_Stars

  • Production: http://www.example.com/Home

    • WebRoot: http://www.example.com

    • Path: /Home

  • Production: http://www.example.com/Home?time=true

    • WebRoot: http://www.example.com

    • Path: /Home

    • Query Parameters: time=true

  • Development: http://dev.example.com:8080/Running/Home

    • WebRoot: /Avi

    • Path: /Running/Home
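The breakdown above can be reproduced for the production URLs with a short Python sketch. It assumes an absolute WebRoot made up of the scheme, host, and port; the function name split_vanity_url is hypothetical. For a relative WebRoot such as /Avi, the WebRoot cannot be derived from the browser URL and has to come from your configuration instead.

```python
from urllib.parse import urlsplit


def split_vanity_url(url):
    """Split a vanity URL into (WebRoot, path, query parameters).

    Assumes an absolute WebRoot made up of scheme, host, and optional port.
    """
    parts = urlsplit(url)
    webroot = f"{parts.scheme}://{parts.netloc}"
    return webroot, parts.path, parts.query


for url in (
    "http://www.example.com/Page/Surfing",
    "http://www.example.com/Home?time=true",
):
    print(split_vanity_url(url))
```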

There are two filters: the Sites vanity URL filter and the Sites rewriter filter. When the rewriter filter is used, it passes the data on to the vanity URL filter transparently.

When using the Sites vanity URL filter, the URLs are expected in this manner:

  • Production:

    • http://www.example.com/Page/Surfing

      lookuphost=http://www.example.com&lookuppage=/Page/Surfing
      
    • http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars

      lookuphost=http://www.example.com&lookuppage=/Article/Baker_Likely_to_Stay_With_Stars
      
    • http://www.example.com/Home

      lookuphost=http://www.example.com&lookuppage=/Home
      
    • http://www.example.com/Home?time=true

      lookuphost=http://www.example.com&lookuppage=/Home&time=true
      
  • Development:

    • http://dev.example.com:8080/Running/Home

      lookuphost=/cs/Avi&lookuppage=/Running/Home
      

When using the WebCenter Sites rewriter filter, the URLs are expected in this manner:

  • Production:

    • http://www.example.com/Page/Surfing

      /cs/AVISPORTS/Page/Surfing
      
    • http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars

      /cs/AVISPORTS/Article/Baker_Likely_to_Stay_With_Stars
      
    • http://www.example.com/Home

      /cs/AVISPORTS/Home
      
    • http://www.example.com/Home?time=true

      /cs/AVISPORTS/Home?time=true
      
  • Development:

    • http://dev.example.com:8080/Running/Home

      /csdev/Avi/Running/Home
      

24.7.2 Converting with mod_rewrite

After choosing some URLs and breaking them down into their component parts, the next step is to convert these URLs using mod_rewrite. A mod_rewrite rule consists of two parts: the first is a regular expression that is matched against the URL provided to the web server, and the second defines how the URL will appear when it is passed on. If the first part fails to match, the URL is passed on to the next rewrite rule; if no defined rule matches, the URL is passed on unchanged. Thus the order of the rules becomes important when more than one rule is present.

Note:

mod_rewrite is an optional component that you have previously enabled. If you create a mod_rewrite rule without actually enabling mod_rewrite, the rule is skipped: there are no errors and nothing is rewritten. For this reason, it is suggested that you keep debugging active during development and monitor the error.log for details related to mod_rewrite.

What is discussed in this section only covers a very small portion of what mod_rewrite can handle and for most sites a more complex rule is likely to be required. For more information, see the vendor's documentation.

The component parts of a mod_rewrite rule

This example will use the first URL for production: http://www.example.com/Page/Surfing

Create this rule for it:

RewriteRule ^(.*)$ /cs/Sites?lookuphost=http://www.example.com&lookuppage=$1 [L]

This rule has four component parts:

  1. RewriteRule — a statement that the following entry is a mod_rewrite rule

  2. ^(.*)$ — a regular expression, which breaks down as follows:

    • ^ — start of line indicator

    • (.*) — a match-all-text indicator; placing it in parentheses also saves that text to a variable

    • $ — end of line indicator

    This expression therefore matches all text between the start and end of the string and saves it to a variable.

  3. /cs/Sites?lookuphost=http://www.example.com&lookuppage=$1 — the WebCenter Sites context root, filter, and path for vanity URL, which breaks down further as

    • /cs/Sites — the Sites Context Root (cs) followed by the vanity URL filter (Sites)

    • ? — indicates that parameters coming after this are query parameters

    • lookuphost=http://www.example.com

      • lookuphost — provides the vanity URL filter with the name of the WebRoot

      • http://www.example.com — the WebRoot for the production server (it is an absolute WebRoot).

    • lookuppage=$1

      • lookuppage — the path portion of the vanity URL (in this case /Page/Surfing)

      • $1 — a replacement variable, for the first piece of text extracted from the input URL. In this case /Page/Surfing which was matched by the (.*)

  4. [L] — the "last" flag, which informs mod_rewrite that, when this rule matches, no further rules in the set are to be processed. The rules in these examples all end with this flag.
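To make the breakdown above concrete, the following Python sketch imitates how the pattern and the $1 replacement variable interact. It is only a simulation of the string handling, not of mod_rewrite itself, and the function name apply_rule is hypothetical.

```python
import re

# Pattern and substitution taken from the example rule above.
PATTERN = re.compile(r"^(.*)$")
SUBSTITUTION = "/cs/Sites?lookuphost=http://www.example.com&lookuppage=$1"


def apply_rule(url_path):
    """Return the rewritten URL, or the input unchanged if the pattern fails."""
    match = PATTERN.match(url_path)
    if match is None:
        return url_path  # no match: the URL is passed on unchanged
    # $1 is replaced with the text captured by the parentheses.
    return SUBSTITUTION.replace("$1", match.group(1))


print(apply_rule("/Page/Surfing"))
```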

24.7.3 Using the Sites Filter with mod_rewrite

The following rule will handle /Page/Surfing, but it will also handle any other page sent in, extracting everything after the host to the end of the line.

RewriteRule ^(.*)$ /cs/Sites?lookuphost=http://www.example.com&lookuppage=$1 [L]

So the rule will also handle:

http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars
http://www.example.com/Home

However, this rule will not handle

http://www.example.com/Home?time=true

as this URL has a query parameter being passed in along with the host and path. Because the rule's substitution supplies its own query string, the original query string (time=true) is discarded unless it is added back explicitly. To handle the query parameter, the mod_rewrite built-in variable %{QUERY_STRING} must be used. Thus the substitution in the previous rule:

/cs/Sites?lookuphost=http://www.example.com&lookuppage=$1

must be updated to:

/cs/Sites?lookuphost=http://www.example.com&lookuppage=$1&%{QUERY_STRING}

If no query parameters are passed in the URL, %{QUERY_STRING} is empty. Thus the updated rule works not only for the three previous URLs, but also for this URL.
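The effect of %{QUERY_STRING} can be illustrated with a small Python sketch that expands the substitution as plain text. It models only the text expansion, not how the web server processes the result, and the function name rewrite_with_query is hypothetical. Note that with an empty query string the expanded text simply ends in a trailing &.

```python
import re

SUBSTITUTION = (
    "/cs/Sites?lookuphost=http://www.example.com"
    "&lookuppage=$1&%{QUERY_STRING}"
)


def rewrite_with_query(url_path, query_string=""):
    """Expand $1 and %{QUERY_STRING} in the substitution as plain text."""
    match = re.match(r"^(.*)$", url_path)
    if match is None:
        return url_path  # no match: the URL is passed on unchanged
    target = SUBSTITUTION.replace("$1", match.group(1))
    # An empty query string leaves nothing after the final "&".
    return target.replace("%{QUERY_STRING}", query_string)


print(rewrite_with_query("/Home", "time=true"))
print(rewrite_with_query("/Home"))
```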

Next we look at the Development URL:

http://dev.example.com:8080/Running/Home

This must be passed in to the vanity URL filter as

/cs/Sites?lookuphost=/cs/Avi&lookuppage=/Running/Home

Since mod_rewrite does not match on host and port by default (special variables contain this information), the rewrite rule is very similar to the production rule:

RewriteRule ^(.*)$ /cs/Sites?lookuppage=$1&lookuphost=/cs/Avi [L]

In this case, because development uses a relative WebRoot, only the lookuphost value needs to change. The rest of the URL is handled in the same way, so the same pattern still applies.

24.7.4 Using the Rewriter Filter with mod_rewrite

When using the rewriter filter, URLs do not need to be broken down the way they are for the vanity URL filter. Instead, the Sites context root and the rewriter filter prefix are added to the URL. This is very similar to using WebCenter Sites prior to the introduction of vanity URLs, so it is likely that existing rules can be easily modified to support vanity URLs.

Using the example URL

http://www.example.com/Page/Surfing

Sites expects

/cs/AVISPORTS/Page/Surfing 

To handle this, create a new mod_rewrite rule:

RewriteRule ^(.*)$ /cs/AVISPORTS$1 [L]

This rule simply takes what is given and prefixes /cs/AVISPORTS to it. The same rule works for:

http://www.example.com/Article/Baker_Likely_to_Stay_With_Stars
http://www.example.com/Home

To handle

http://www.example.com/Home?time=true 

add the %{QUERY_STRING} parameter to the rule so that it will become:

RewriteRule ^(.*)$ /cs/AVISPORTS$1?%{QUERY_STRING} [L]
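The mapping that the rewriter filter rules are meant to produce can be sketched in Python as follows. This describes the intended result (the URLs listed under "When using the WebCenter Sites rewriter filter" in the setup), not how mod_rewrite evaluates the rule internally. The function name to_rewriter_url is hypothetical, and the default values are the production context root and filter prefix from the setup.

```python
def to_rewriter_url(url_path, query_string="",
                    context_root="/cs", filter_prefix="/AVISPORTS"):
    """Prefix the context root and rewriter filter prefix to a vanity path."""
    target = f"{context_root}{filter_prefix}{url_path}"
    # Query parameters, if any, are carried over unchanged.
    if query_string:
        target += f"?{query_string}"
    return target


print(to_rewriter_url("/Page/Surfing"))
print(to_rewriter_url("/Home", "time=true"))
# Development values from the setup: context root /csdev, filter prefix /Avi.
print(to_rewriter_url("/Running/Home",
                      context_root="/csdev", filter_prefix="/Avi"))
```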

24.7.5 Using Static Content with mod_rewrite

One of the biggest issues with using mod_rewrite and vanity URLs is accessing existing CSS and static images. These are often stored either on the application server or on the web server in a directory named after the site. As a result, the rewrite rules will often make them inaccessible.

Take the example of

http://www.example.com/Page/Surfing 

and assume that the CSS files are kept in a directory named AVISPORTS_STATIC.

To handle this, a second mod_rewrite rule is required. This rule must precede the former rule so that calls to this directory are not rewritten as vanity URLs.

This can be done as:

RewriteRule ^.*/AVISPORTS_STATIC/(.*)$ /cs/CSPerformance/$1 [L,PT]

With these two rules, the first ensures that all static pages and CSS are not converted to a vanity URL, and the second rule (which was constructed in Section 24.7.2, "Converting with mod_rewrite") sends the remaining URLs to the vanity URL filter.

RewriteRule ^.*/AVISPORTS_STATIC/(.*)$ /cs/CSPerformance/$1 [L,PT]
RewriteRule ^(.*)$ /cs/Sites?lookuphost=http://www.example.com&lookuppage=$1 [L]
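Why the order of these two rules matters can be seen in a simplified Python simulation in which the first matching rule wins, as it does when every rule carries the [L] flag. The function name first_match_rewrite and the file path css/main.css are hypothetical; the patterns and substitutions are the two rules above.

```python
import re

# The two rules above, in order: static content first, vanity URLs second.
RULES = [
    (re.compile(r"^.*/AVISPORTS_STATIC/(.*)$"), "/cs/CSPerformance/$1"),
    (re.compile(r"^(.*)$"),
     "/cs/Sites?lookuphost=http://www.example.com&lookuppage=$1"),
]


def first_match_rewrite(url_path, rules=RULES):
    """Apply the first rule whose pattern matches, then stop (like [L])."""
    for pattern, substitution in rules:
        match = pattern.match(url_path)
        if match:
            return substitution.replace("$1", match.group(1))
    return url_path  # no rule matched: the URL is passed on unchanged


print(first_match_rewrite("/AVISPORTS_STATIC/css/main.css"))
print(first_match_rewrite("/Page/Surfing"))
# With the order reversed, the catch-all rule captures the static request too.
print(first_match_rewrite("/AVISPORTS_STATIC/css/main.css", RULES[::-1]))
```

In this simulation, reversing the list sends the static request to the vanity URL filter, which is the situation the rule ordering is meant to prevent.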