Translating URLs from dynamic to static (or vice versa) requires complex parsing logic and pattern matching. Both the ItemLink servlet bean and the SEO jump servlet construct URLs using properties that specify the format of the URL and the type of visitor viewing the page.

An important aspect of URL recoding is the use of URL templates. These templates are Nucleus components that the ItemLink servlet bean and the jump servlet use when they construct URLs. URL templates include properties that specify the format of the URLs, the browser types supported, and how to parse requests.

The URL template classes consist of atg.repository.seo.UrlTemplate, which is an abstract base class, and its two subclasses, DirectUrlTemplate and IndirectUrlTemplate.

In addition, the atg.repository.seo package has a UrlTemplateMapper interface that is used by ItemLink to map repository item descriptors to URL templates. The package also includes a UrlTemplateMapperImpl implementation class for this interface.

Configuring URL Templates

The UrlTemplate base class has several key properties that are inherited by the DirectUrlTemplate and IndirectUrlTemplate subclasses. The following list summarizes these properties. Some of the properties are described in more detail in subsequent sections.

maxUrlLength
The maximum number of characters in a generated URL.

supportedBrowserTypes
List of browser types supported by this template. Each entry must match the name of an atg.servlet.BrowserType component. See Specifying Supported and Excluded Browser Types.

excludedBrowserTypes
List of browser types that are explicitly not supported by this template. Each entry must match the name of an atg.servlet.BrowserType instance. See Specifying Supported and Excluded Browser Types.

webAppRegistry
The web application registry that contains the context paths for registered web applications.

The IndirectUrlTemplate class has additional properties not found in the DirectUrlTemplate class. These properties are summarized in the following list. Note that these properties are used only by the SEO jump servlet, and not by the ItemLink servlet bean.

indirectRegex
The regular expression pattern the jump servlet uses to extract parameter values from static request URLs. See Using Regular Expression Groups.

regexElementList
An ordered list where each list element specifies the parameter type of the corresponding regular expression element in indirectRegex. See Using Regular Expression Groups.

forwardUrlTemplateFormat
The URL format used by the jump servlet to generate a dynamic URL for forwarding a static request URL. Like urlTemplateFormat, this property uses java.text.MessageFormat syntax, but with parameter names instead of parameter numbers as placeholders.

useUrlRedirect
If true, the jump servlet redirects the request to a dynamic URL rather than forwarding it. Default is false, which means that forwarding is used.

Specifying URL Formats

The urlTemplateFormat property of the DirectUrlTemplate and IndirectUrlTemplate classes is used to specify the format of the URLs generated by the ItemLink servlet bean. In addition, the urlTemplateFormat property of the IndirectUrlTemplate class is used by the jump servlet to determine how to interpret static request URLs created by the servlet bean.

The value of urlTemplateFormat should include placeholders that represent properties of repository items. ItemLink fills in these placeholders when it generates a URL. The jump servlet uses them to extract the property values from a static request URL.

The placeholder format is a parameter name (which typically represents a property of a repository item) inside curly braces. For example, a dynamic URL for displaying a product on an ATG Commerce site might be specified in a direct URL template like this:

urlTemplateFormat=\
  /catalog/product.jsp?prodId\={item.id}&catId\={item.parentCategory.id}

A dynamic URL generated using this format might look like this:

/catalog/product.jsp?prodId=prod1002&catId=cat234

The static URL equivalent in an indirect URL template might look like this:

urlTemplateFormat=/jump/product/{item.id}/{item.parentCategory.id}\
  /{item.displayName}/{item.parentCategory.displayName}

Note that this URL format includes the displayName properties of the repository item and its parent category, and also the repository IDs of these items. The displayName properties provide the text that a web spider can use for indexing. The repository IDs are included so that if an incoming request has this URL, the SEO jump servlet can extract the repository IDs and use them to fill in placeholders in the dynamic URL it generates. In addition, the URL begins with /jump to enable the jump servlet to detect it as a static URL (as described in Specifying Context Paths).

A static URL generated using this format might look like this:

/jump/product/prod1002/cat234/Q33+UltraMountain/Mountain+Bikes

Encoding Parameter Values

By default, the SEO components use URL encoding when they insert parameter values in placeholders. This ensures that special characters in repository item property values do not make the URL invalid. For example, the value of a displayName property will typically include spaces, which are not legal characters in URLs. Therefore, each space is encoded as a plus sign (+), which is a legal character.

In some cases, it is necessary to insert a parameter value unencoded. For example, some repository properties represent partial URL strings, and therefore need to be interpreted literally. To support this, the placeholder syntax allows you to explicitly specify whether to encode a parameter. For example:

{item.template.url,encode=false}

For parameters that should be encoded, you can explicitly specify encode=true; however, this is not necessary, because encode defaults to true.
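The following Java sketch illustrates the placeholder behavior described above. It is not the ATG implementation: the class name PlaceholderSketch and its methods are hypothetical, and it assumes that "URL encoding" means the form encoding performed by java.net.URLEncoder, which turns spaces into plus signs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of the atg.repository.seo package.
final class PlaceholderSketch {
    // Matches {name} or {name,encode=true} or {name,encode=false}.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([^{},]+)(?:,encode=(true|false))?\\}");

    // Replaces each placeholder in the format with the matching value.
    static String fill(String format, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(format);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for " + m.group(1));
            }
            // Encoding is on unless the placeholder says encode=false.
            boolean encode = !"false".equals(m.group(2));
            String text = encode ? urlEncode(value) : value;
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Assumption: UTF-8 form encoding, which encodes a space as '+'.
    static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With item.displayName set to "Q33 UltraMountain", filling {item.displayName} yields Q33+UltraMountain, while a partial URL such as /catalog/product.jsp passes through unchanged only when the placeholder specifies encode=false.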

Another way to specify that a parameter should not be encoded is to use square brackets rather than curly braces. For example:

[item.template.url]

Specifying Context Paths

When developing a site that uses URL recoding for SEO, you must be careful about whether the generated URLs should include the application’s context path. Dynamic URLs must include the context path (so that these URLs are properly interpreted by Oracle ATG Web Commerce’s request-handling pipeline). Static URLs do not need the context path (the URL is never actually interpreted by the pipeline), and it is better to omit it because it may interfere with the jump servlet’s ability to detect static URLs in requests. This is because the urlTemplateFormat property of an indirect URL template will typically start with a special string (such as /jump) that enables the jump servlet to detect these URLs. The jump servlet should then be configured to use URI-mapping to detect these URLs, as described in Configuring the SEO Jump Servlet.

Therefore, when ItemLink generates an indirect URL, it is undesirable for the application’s context path to be prepended to the URL. To avoid including the context path, links created with the <dsp:a> tag should use the href attribute, not the page attribute. The page attribute prepends the application’s context path to the generated URL, but the href attribute does not.

However, using the href attribute means that the context path will not automatically be prepended to the dynamic URLs generated by ItemLink. Also, since static URLs won’t have the context path, the jump servlet will not be able to include this information in the dynamic URLs it forwards inbound requests to. Therefore, you should include the context path when you configure the following:

  • The urlTemplateFormat property of each direct URL template

  • The forwardUrlTemplateFormat property of each indirect URL template

There are two ways you can specify the context path:

  • Explicitly include the context path when you set the property.

  • Specify the name of a registered web application. There must be a single colon character after the web application name to denote that it is a web application name.

Specifying the name of a registered web application rather than including the context path itself has the advantage that you don’t need to know what the context path actually is. Also, if the context path changes, you don’t need to update each URL template component. The main disadvantage is that you need to know what web application registry the web application is registered with, and set the webAppRegistry property of each URL template component to this value.

Note that for a multisite application that uses a path-based URL strategy, you should not configure URL templates to include the context path in generated URLs. See URL Recoding for Multisite Applications.

When generating a direct URL (either with ItemLink using a direct template, or the jump servlet using the forward URL in an indirect template), the following logic is used to determine the context path:

  1. If a web application name occurs in the first part of the URL with the format webAppName:restOfUrl, the web application is resolved using the web application registry specified in the webAppRegistry property of the template. The web application’s context path is then used to replace the webAppName placeholder.

  2. If there is a colon in the first part of the URL but no web application name before the colon, the context path of the default web application is used. The default web application is specified in the defaultWebApp property of the ItemLink servlet bean or of the jump servlet (the former if generating a direct URL for a page link, the latter if generating a forwarding URL for an inbound request).

  3. Otherwise, the context path is assumed to already be present.
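
The three rules above can be sketched in Java as follows. This is an illustration under stated assumptions, not the platform's code: ContextPathSketch is a hypothetical class, the web application registry is modeled as a plain map from web application name to context path, and "the first part of the URL" is taken to mean everything before the first slash.

```java
import java.util.Map;

// Hypothetical helper, for illustration only.
final class ContextPathSketch {
    // registry: web application name -> context path (stands in for webAppRegistry).
    // defaultContextPath: context path of the default web application (defaultWebApp).
    static String resolve(String url, Map<String, String> registry, String defaultContextPath) {
        int colon = url.indexOf(':');
        int slash = url.indexOf('/');
        // Assumption: only a colon that appears before the first slash marks
        // the webAppName:restOfUrl form.
        if (colon < 0 || (slash >= 0 && slash < colon)) {
            return url; // Rule 3: context path is assumed to be present already.
        }
        String webAppName = url.substring(0, colon);
        String restOfUrl = url.substring(colon + 1);
        if (webAppName.isEmpty()) {
            return defaultContextPath + restOfUrl; // Rule 2: default web application.
        }
        String contextPath = registry.get(webAppName);
        if (contextPath == null) {
            throw new IllegalArgumentException("Unregistered web application: " + webAppName);
        }
        return contextPath + restOfUrl; // Rule 1: named web application.
    }
}
```

For example, if a web application registered under the hypothetical name store has the context path /shop, then store:/catalog/product.jsp resolves to /shop/catalog/product.jsp, and :/catalog/product.jsp resolves against the default web application instead.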

Specifying Supported and Excluded Browser Types

Both the ItemLink servlet bean and the SEO jump servlet can be configured to use multiple URL templates. The template used for a given request is partly determined by examining the User-Agent header of the HTTP request and finding a template that supports this browser type.

The supportedBrowserTypes and excludedBrowserTypes properties of a URL template are mutually exclusive. You can configure an individual template to support a specific set of browser types, or to exclude a specific set of browser types, but not both. A typical configuration is to set excludedBrowserTypes to robot in direct URL templates, and set supportedBrowserTypes to robot in indirect URL templates. This will ensure that web spiders will see indirect URLs, and human visitors will see direct URLs.
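
The typical configuration described above can be modeled with a short Java sketch. The classes here are hypothetical and the matching rules are assumptions made for illustration: a request is represented by the set of browser type names it matched, a template with a supported list accepts only those types, and a template with an excluded list accepts everything else.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model, for illustration only; not the atg.repository.seo classes.
final class BrowserTypeSketch {
    static final class Template {
        final String name;
        final Set<String> supported;
        final Set<String> excluded;

        Template(String name, Set<String> supported, Set<String> excluded) {
            // The two properties are mutually exclusive.
            if (!supported.isEmpty() && !excluded.isEmpty()) {
                throw new IllegalArgumentException("Set supported or excluded types, not both");
            }
            this.name = name;
            this.supported = supported;
            this.excluded = excluded;
        }

        // Assumed rule: a supported list is an allow list, an excluded list is a deny list.
        boolean accepts(Set<String> requestBrowserTypes) {
            if (!supported.isEmpty()) {
                return !Collections.disjoint(supported, requestBrowserTypes);
            }
            return Collections.disjoint(excluded, requestBrowserTypes);
        }
    }

    // Returns the first template that accepts the request, or null if none does.
    static Template choose(List<Template> templates, Set<String> requestBrowserTypes) {
        for (Template t : templates) {
            if (t.accepts(requestBrowserTypes)) {
                return t;
            }
        }
        return null;
    }
}
```

With a direct template that excludes robot and an indirect template that supports robot, a request identified as robot selects the indirect template, and any other request selects the direct one.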

The supportedBrowserTypes or excludedBrowserTypes property is a list of components of class atg.servlet.BrowserType. (Note that to add a component to the list, you specify the name property of the component, rather than the Nucleus name of the component.) The Oracle ATG Web Commerce platform includes a number of BrowserType components, which are found in Nucleus at /atg/dynamo/servlet/pipeline/BrowserTypes. You can also create additional BrowserType components. For more information, see Customizing a Request-Handling Pipeline.

Using Regular Expression Groups

When a static URL is part of an incoming request, the SEO jump servlet parses the URL to extract parameter values, which it then uses to fill in placeholders in the dynamic URL it generates. To extract the parameter values, the servlet uses regular expression groups, which you specify using the indirectRegex property of the indirect URL template component.

For example, suppose you have a URL format that looks like this:

urlTemplateFormat=/jump/product/{item.id}/{item.parentCategory.id}\
  /{item.displayName}/{item.parentCategory.displayName}

The regular expression pattern for this format might be specified like this:

indirectRegex=/jump/product/([^/].*?)/([^/].*?)/([^/].*?)/([^/].*?)$

This pattern tells the jump servlet how to extract the parameter values from a static URL. In addition, the servlet needs information about how to interpret the parameters. Some parameters may be simple String values, while others may represent the ID of a repository item. If the parameter is a repository item ID, the servlet needs to determine the item type and the repository that contains the item.

Therefore, the indirect URL template also includes a regexElementList property for specifying each parameter type. This property is an ordered list where the first element specifies the parameter type of the first regular expression group, the second element specifies the parameter type of the second regular expression group, and so on.
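
To make the pairing of groups and list elements concrete, the following Java sketch applies the example pattern to a static URL with java.util.regex and names each captured group by position. RegexGroupSketch is a hypothetical class, and the parameter names passed in are only examples; how the jump servlet performs this step internally is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class RegexGroupSketch {
    // Returns parameter name -> captured value, in group order.
    // Returns an empty map if the URL does not match the pattern.
    static Map<String, String> extract(String regex, List<String> paramNames, String url) {
        Matcher m = Pattern.compile(regex).matcher(url);
        if (m.groupCount() != paramNames.size()) {
            throw new IllegalArgumentException("Expected one parameter name per regex group");
        }
        Map<String, String> params = new LinkedHashMap<>();
        if (m.find()) {
            for (int i = 0; i < paramNames.size(); i++) {
                // Group numbering starts at 1; group 0 is the whole match.
                params.put(paramNames.get(i), m.group(i + 1));
            }
        }
        return params;
    }
}
```

Applied to /jump/product/prod1002/cat234/Q33+UltraMountain/Mountain+Bikes, the four groups capture prod1002, cat234, Q33+UltraMountain, and Mountain+Bikes, in that order. The sketch does not decode the captured values.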

The syntax for each parameter type entry in the list is:

paramName | paramType [| additionalInfo]

The paramName is used to match the parameter with placeholders in the direct URL that the servlet forwards the request to.

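Valid values for paramType are id (the parameter is the ID of a repository item) and string (the parameter is a simple String value).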

The optional additionalInfo field can be used to specify additional details if paramType is id. (This field should be omitted if paramType is string.) The syntax of additionalInfo takes one of the following forms:

repositoryName:itemDescriptorName
itemDescriptorName
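
The entry syntax can be illustrated with a small parser. This Java sketch is hypothetical and only mirrors the syntax as documented here; in particular, rejecting an additionalInfo field on a string parameter and splitting on the last colon are assumptions of the sketch.

```java
// Hypothetical parser for one regexElementList entry, for illustration only.
final class RegexElementSketch {
    final String paramName;
    final String paramType;          // "id" or "string"
    final String repositoryName;     // null if not given
    final String itemDescriptorName; // null if not given

    private RegexElementSketch(String paramName, String paramType,
                               String repositoryName, String itemDescriptorName) {
        this.paramName = paramName;
        this.paramType = paramType;
        this.repositoryName = repositoryName;
        this.itemDescriptorName = itemDescriptorName;
    }

    // Parses "paramName | paramType [| additionalInfo]".
    static RegexElementSketch parse(String entry) {
        String[] fields = entry.split("\\|", -1);
        if (fields.length < 2 || fields.length > 3) {
            throw new IllegalArgumentException("Expected 2 or 3 fields: " + entry);
        }
        String name = fields[0].trim();
        String type = fields[1].trim();
        if (!type.equals("id") && !type.equals("string")) {
            throw new IllegalArgumentException("Unknown paramType: " + type);
        }
        String repository = null;
        String descriptor = null;
        if (fields.length == 3) {
            if (type.equals("string")) {
                throw new IllegalArgumentException("additionalInfo applies only to id");
            }
            String info = fields[2].trim();
            // Assumption: the last colon separates repository name from descriptor name.
            int colon = info.lastIndexOf(':');
            if (colon >= 0) {
                repository = info.substring(0, colon);
                descriptor = info.substring(colon + 1);
            } else {
                descriptor = info;
            }
        }
        return new RegexElementSketch(name, type, repository, descriptor);
    }
}
```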

The parameter type list for the regular expression pattern shown above would look similar to this:

item | id | /atg/commerce/catalog/ProductCatalog:product
parentCategory | id | /atg/commerce/catalog/ProductCatalog:category
displayName | string
parentCategoryDisplayName | string

Configuring URL Template Mappers

URL template mappers are used by the ItemLink servlet bean to map repository item descriptors to URL templates. The servlet bean has an itemDescriptorNameToMapperMap property that maps item descriptors to URL template mappers. For example:

itemDescriptorNameToMapperMap=\
    product=/atg/repository/seo/ProductTemplateMapper,\
    category=/atg/repository/seo/CategoryTemplateMapper

Each template mapper component has a templates property that specifies one or more templates to use for rendering static URLs, and a defaultTemplate property that specifies the template to use for rendering dynamic URLs. So, in this example, the product item descriptor is associated with the templates listed by the ProductTemplateMapper component, and the category item descriptor is associated with the templates listed by the CategoryTemplateMapper component. When ItemLink generates a link to a specific repository item, it uses this mapping to determine the URL template to use.
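
The lookup described above can be sketched in Java as a two-step process: find the mapper for the item descriptor, then pick a template from it. The class below is hypothetical, templates are represented only by name, and the selection order (first matching entry in templates, otherwise defaultTemplate) is an assumption of this sketch rather than documented behavior.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of a template mapper, for illustration only.
final class TemplateMapperSketch {
    final List<String> templates;  // templates for rendering static URLs
    final String defaultTemplate;  // template for rendering dynamic URLs

    TemplateMapperSketch(List<String> templates, String defaultTemplate) {
        this.templates = templates;
        this.defaultTemplate = defaultTemplate;
    }

    // Assumed rule: use the first listed template that supports the request,
    // and fall back to the default template otherwise.
    String select(Predicate<String> supportsRequest) {
        for (String template : templates) {
            if (supportsRequest.test(template)) {
                return template;
            }
        }
        return defaultTemplate;
    }

    // Mirrors itemDescriptorNameToMapperMap: item descriptor name -> mapper.
    static String templateFor(Map<String, TemplateMapperSketch> mappers,
                              String itemDescriptorName,
                              Predicate<String> supportsRequest) {
        TemplateMapperSketch mapper = mappers.get(itemDescriptorName);
        if (mapper == null) {
            throw new IllegalArgumentException("No mapper for " + itemDescriptorName);
        }
        return mapper.select(supportsRequest);
    }
}
```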


Copyright © 1997, 2012 Oracle and/or its affiliates. All rights reserved.
