Translating URLs from dynamic to static (or vice versa) requires complex parsing logic and pattern matching. Both the ItemLink
servlet bean and the SEO jump servlet construct URLs using properties that specify the format of the URL and the type of visitor viewing the page.
An important aspect of URL recoding is the use of URL templates. These templates are Nucleus components that the ItemLink
servlet bean and the jump servlet use when they construct URLs. URL templates include properties that specify the format of the URLs, the browser types supported, and how to parse requests.
The URL template classes consist of atg.repository.seo.UrlTemplate, which is an abstract base class, and its two subclasses:
atg.repository.seo.DirectUrlTemplate defines the format of the direct (dynamic) URLs created by the ItemLink servlet bean for human site visitors.
atg.repository.seo.IndirectUrlTemplate defines the format of the indirect (static) URLs created by the ItemLink servlet bean for web spiders. It is also used by the SEO jump servlet to determine how to translate these static URLs back to dynamic URLs.
In addition, the atg.repository.seo
package has a UrlTemplateMapper
interface that is used by ItemLink
to map repository item descriptors to URL templates. The package also includes a UrlTemplateMapperImpl
implementation class for this interface.
Configuring URL Templates
The UrlTemplate
base class has several key properties that are inherited by the DirectUrlTemplate
and IndirectUrlTemplate
subclasses. The following list summarizes these properties. Some of the properties are described in more detail in subsequent sections.
urlTemplateFormat
The URL format used by the ItemLink servlet bean to generate page links. The format is expressed in java.text.MessageFormat syntax, but uses parameter names instead of numbers as placeholders. See Specifying URL Formats.
maxUrlLength
The maximum number of characters in a generated URL.
supportedBrowserTypes
List of browser types supported by this template. Each entry must match the name of an atg.servlet.BrowserType component. See Specifying Supported and Excluded Browser Types.
excludedBrowserTypes
List of browser types that are explicitly not supported by this template. Each entry must match the name of an atg.servlet.BrowserType instance. See Specifying Supported and Excluded Browser Types.
webAppRegistry
The web application registry that contains the context paths for registered web applications.
The IndirectUrlTemplate
class has additional properties not found in the DirectUrlTemplate
class. These properties are summarized in the following list. Note that these properties are used only by the SEO jump servlet, and not by the ItemLink
servlet bean.
indirectRegex
The regular expression pattern the jump servlet uses to extract parameter values from static request URLs. See Using Regular Expression Groups.
regexElementList
An ordered list where each list element specifies the parameter type of the corresponding regular expression element in indirectRegex. See Using Regular Expression Groups.
forwardUrlTemplateFormat
The URL format used by the jump servlet to generate a dynamic URL for forwarding a static request URL. Like the urlTemplateFormat property, this is expressed using the same syntax as java.text.MessageFormat, but uses parameter names instead of parameter numbers as placeholders.
defaultUrl
The URL to use if a valid dynamic URL cannot be created because the repository item specified in the static URL does not exist. If the value of this property is not set explicitly, it defaults to /index.jsp.
useUrlRedirect
If true, the jump servlet redirects the request to a dynamic URL rather than forwarding it. Default is false, which means that forwarding is used.
Specifying URL Formats
The urlTemplateFormat
property of the DirectUrlTemplate
and IndirectUrlTemplate
classes is used to specify the format of the URLs generated by the ItemLink
servlet bean. In addition, the forwardUrlTemplateFormat
property of the IndirectUrlTemplate
class is used by the jump servlet to translate static URLs created by the servlet bean back to dynamic URLs.
The values of urlTemplateFormat
and forwardUrlTemplateFormat
should include placeholders that represent properties of repository items. ItemLink
fills in the urlTemplateFormat
placeholders when it generates a static or dynamic URL. The jump servlet fills in the forwardUrlTemplateFormat
placeholders to construct a dynamic URL.
The placeholder format is a parameter name (which typically represents a property of a repository item) inside curly braces. For example, a dynamic URL for displaying a product on a Core Commerce site might be specified in a direct URL template like this:
urlTemplateFormat=\
/catalog/product.jsp?prodId\={item.id}&catId\={item.parentCategory.id}
A dynamic URL generated using this format might look like this:
/catalog/product.jsp?prodId=prod1002&catId=cat234
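The named-placeholder substitution described above can be illustrated with a small sketch. This is not the ItemLink implementation; the class NamedPlaceholderSketch and its fill method are hypothetical names, and the sketch assumes the properties-file escapes (such as \=) have already been resolved and that the placeholder values have already been looked up from the repository item.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of atg.repository.seo.
class NamedPlaceholderSketch {
    // Matches {name}, where name contains no braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

    // Replaces each {name} in the format with the value mapped to that name.
    static String fill(String format, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(format);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder " + m.group(1));
            }
            // Copy the literal text before the placeholder, then the value.
            out.append(format, last, m.start()).append(value);
            last = m.end();
        }
        return out.append(format.substring(last)).toString();
    }

    public static void main(String[] args) {
        String format = "/catalog/product.jsp?prodId={item.id}&catId={item.parentCategory.id}";
        Map<String, String> values = Map.of(
            "item.id", "prod1002",
            "item.parentCategory.id", "cat234");
        // prints /catalog/product.jsp?prodId=prod1002&catId=cat234
        System.out.println(fill(format, values));
    }
}
```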
The static URL equivalent in an indirect URL template might look like this:
urlTemplateFormat=/jump/product/{item.id}/{item.parentCategory.id}\
/{item.displayName}/{item.parentCategory.displayName}
Note that this URL format includes the displayName
properties of the repository item and its parent category, and also the repository IDs of these items. The displayName
properties provide the text that a web spider can use for indexing. The repository IDs are included so that if an incoming request has this URL, the SEO jump servlet can extract the repository IDs and use them to fill in placeholders in the dynamic URL it generates. In addition, the URL begins with /jump
to enable the jump servlet to detect it as a static URL (as described in Specifying Context Paths).
A static URL generated using this format might look like this:
/jump/product/prod1002/cat234/Q33+UltraMountain/Mountain+Bikes
Encoding Parameter Values
By default, the SEO components use URL encoding when they insert parameter values in placeholders. This ensures that special characters in repository item property values do not make the URL invalid. For example, the value of a displayName
property will typically include spaces, which are not legal characters in URLs. Therefore, each space is encoded as a plus sign (+), which is a legal character.
In some cases, it is necessary to insert a parameter value un-encoded. For example, some repository properties represent partial URL strings, and therefore need to be interpreted literally. To support this, the placeholder syntax allows you to explicitly specify whether to encode a parameter. For example:
{item.template.url,encode=false}
For parameters that should be encoded, you can explicitly specify encode=true
; however, this is not necessary, because encode
defaults to true
.
Another way to specify that a parameter should not be encoded is to use square brackets rather than curly braces. For example:
[item.template.url]
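The effect of the three placeholder forms can be sketched as follows. This is an illustration only, assuming that the encoding applied is standard form-style URL encoding as provided by java.net.URLEncoder (which matches the plus-sign behavior described above); the class PlaceholderEncodingSketch and the sample values are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of atg.repository.seo.
class PlaceholderEncodingSketch {
    // Groups 1 and 2: {name} or {name,encode=true|false}. Group 3: [name].
    private static final Pattern PLACEHOLDER = Pattern.compile(
        "\\{([^{},]+)(?:,encode=(true|false))?\\}|\\[([^\\[\\]]+)\\]");

    static String fill(String format, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(format);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            boolean curly = m.group(1) != null;
            String name = curly ? m.group(1) : m.group(3);
            // Curly braces encode unless encode=false; square brackets never encode.
            boolean encode = curly && !"false".equals(m.group(2));
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder " + name);
            }
            out.append(format, last, m.start());
            out.append(encode ? URLEncoder.encode(value, StandardCharsets.UTF_8) : value);
            last = m.end();
        }
        return out.append(format.substring(last)).toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = Map.of(
            "item.displayName", "Q33 UltraMountain",
            "item.template.url", "/catalog/product.jsp");
        System.out.println(fill("{item.displayName}", values));               // Q33+UltraMountain
        System.out.println(fill("{item.template.url}", values));              // %2Fcatalog%2Fproduct.jsp
        System.out.println(fill("{item.template.url,encode=false}", values)); // /catalog/product.jsp
        System.out.println(fill("[item.template.url]", values));              // /catalog/product.jsp
    }
}
```

The second line of output shows why a partial URL needs the un-encoded form: with default encoding, its slashes are escaped and the value no longer works as a path.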
Specifying Context Paths
When developing a site that uses URL recoding for SEO, you must be careful about whether the generated URLs should include the application’s context path. Dynamic URLs must include the context path (so that these URLs are properly interpreted by the Oracle Commerce Platform’s request-handling pipeline). Static URLs do not need the context path (the URL is never actually interpreted by the pipeline), and it is better to omit it because it may interfere with the jump servlet’s ability to detect static URLs in requests. This is because the urlTemplateFormat
property of an indirect URL template will typically start with a special string (such as /jump
) that enables the jump servlet to detect these URLs. The jump servlet should then be configured to use URI-mapping to detect these URLs, as described in Configuring the SEO Jump Servlet.
Therefore, when ItemLink
generates an indirect URL, it is undesirable for the application’s context path to be prepended to the URL. To avoid including the context path, links created with the <dsp:a>
tag should use the href
attribute, not the page
attribute. The page
attribute prepends the application’s context path to the generated URL, but the href
attribute does not.
However, using the href
attribute means that the context path will not automatically be prepended to the dynamic URLs generated by ItemLink
. Also, since static URLs will not have the context path, the jump servlet will not be able to include this information in the dynamic URLs it forwards inbound requests to. Therefore, you should include the context path when you configure the following:
The urlTemplateFormat property of each direct URL template.
The forwardUrlTemplateFormat property of each indirect URL template.
There are two ways you can specify the context path:
Explicitly include the context path when you set the property.
Specify the name of a registered web application. There must be a single colon character after the web application name to denote that it is a web application name.
Specifying the name of a registered web application rather than including the context path itself has the advantage that you do not need to know what the context path actually is. Also, if the context path changes, you do not need to update each URL template component. The main disadvantage is that you need to know what web application registry the web application is registered with, and set the webAppRegistry
property of each URL template component to this value.
Note that for a multisite application that uses a path-based URL strategy, you should not configure URL templates to include the context path in generated URLs. See URL Recoding for Multisite Applications.
When generating a direct URL (either with ItemLink
using a direct template, or the jump servlet using the forward URL in an indirect template), the following logic is used to determine the context path:
If a web application name occurs in the first part of the URL with the format webAppName:restOfUrl, the web application is resolved using the web application registry specified in the webAppRegistry property of the template. The web application’s context path is then used to replace the webAppName placeholder.
If there is a colon in the first part of the URL but no web application name before the colon, the context path of the default web application is used. The default web application is specified in the defaultWebApp property of the ItemLink servlet bean or of the jump servlet (the former if generating a direct URL for a page link, the latter if generating a forwarding URL for an inbound request).
Otherwise, the context path is assumed to already be present.
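The three rules above can be sketched in a few lines of Java. This is a simplified illustration, not the platform's code: the web application registry is modeled as a plain map from web application name to context path, "the first part of the URL" is assumed to mean everything before the first slash, and the names ContextPathSketch, CommerceStore, and /store are hypothetical.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of atg.repository.seo.
class ContextPathSketch {
    // registry: web application name -> context path (stands in for webAppRegistry).
    // defaultWebApp: name to use when the URL starts with a bare colon.
    static String resolve(String url, Map<String, String> registry, String defaultWebApp) {
        int colon = url.indexOf(':');
        int slash = url.indexOf('/');
        // Assumption: the colon only counts if it appears before the first slash.
        boolean colonInFirstPart = colon >= 0 && (slash < 0 || colon < slash);
        if (!colonInFirstPart) {
            return url; // Rule 3: context path is assumed to be present already.
        }
        String webAppName = url.substring(0, colon);
        if (webAppName.isEmpty()) {
            webAppName = defaultWebApp; // Rule 2: bare colon means the default web app.
        }
        String contextPath = registry.get(webAppName); // Rule 1: look up by name.
        if (contextPath == null) {
            throw new IllegalArgumentException("Unknown web application: " + webAppName);
        }
        return contextPath + url.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> registry = Map.of("CommerceStore", "/store");
        System.out.println(resolve("CommerceStore:/catalog/product.jsp", registry, "CommerceStore"));
        System.out.println(resolve(":/catalog/product.jsp", registry, "CommerceStore"));
        System.out.println(resolve("/store/catalog/product.jsp", registry, "CommerceStore"));
        // All three calls print /store/catalog/product.jsp
    }
}
```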
Specifying Supported and Excluded Browser Types
Both the ItemLink
servlet bean and SEO jump servlet can be configured to use multiple URL templates. The actual template used for any given request is partly determined by examining the User-Agent
header of the HTTP request and finding a template that supports this browser type.
The supportedBrowserTypes
and excludedBrowserTypes
properties of a URL template are mutually exclusive. You can configure an individual template to support a specific set of browser types, or to exclude a specific set of browser types, but not both. A typical configuration is to set excludedBrowserTypes
to robot
in direct URL templates, and set supportedBrowserTypes
to robot
in indirect URL templates. This will ensure that web spiders will see indirect URLs, and human visitors will see direct URLs.
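The typical configuration above can be modeled with a short sketch. The selection rules here are assumptions made for illustration, not the documented behavior of ItemLink or the jump servlet: a template with supported types accepts only requests matching one of them, a template with excluded types accepts any request that matches none of them, and the first accepting template wins. The request's browser types are assumed to have been determined from the User-Agent header already. All class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical model for illustration; not part of atg.repository.seo.
class BrowserTypeSelectionSketch {
    static final class Template {
        final String name;
        final Set<String> supported;
        final Set<String> excluded;

        Template(String name, Set<String> supported, Set<String> excluded) {
            // The two properties are mutually exclusive on a single template.
            if (!supported.isEmpty() && !excluded.isEmpty()) {
                throw new IllegalArgumentException("Set supported or excluded types, not both");
            }
            this.name = name;
            this.supported = supported;
            this.excluded = excluded;
        }

        // Assumed rule: match on supported types if any are set, else reject excluded ones.
        boolean accepts(Set<String> requestBrowserTypes) {
            if (!supported.isEmpty()) {
                return requestBrowserTypes.stream().anyMatch(supported::contains);
            }
            return requestBrowserTypes.stream().noneMatch(excluded::contains);
        }
    }

    // Returns the first template that accepts the request's browser types.
    static Optional<Template> choose(List<Template> templates, Set<String> requestBrowserTypes) {
        return templates.stream().filter(t -> t.accepts(requestBrowserTypes)).findFirst();
    }

    public static void main(String[] args) {
        Template direct = new Template("direct", Set.of(), Set.of("robot"));
        Template indirect = new Template("indirect", Set.of("robot"), Set.of());
        List<Template> templates = List.of(direct, indirect);
        System.out.println(choose(templates, Set.of("robot")).get().name); // indirect
        System.out.println(choose(templates, Set.of()).get().name);        // direct
    }
}
```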
The supportedBrowserTypes
or excludedBrowserTypes
property is a list of components of class atg.servlet.BrowserType
. (Note that to add a component to the list, you specify the name
property of the component, rather than the Nucleus name of the component.) The Oracle Commerce Platform includes a number of BrowserType
components, which are found in Nucleus at /atg/dynamo/servlet/pipeline/BrowserTypes
. You can also create additional BrowserType
components. For more information, see Customizing a Request-Handling Pipeline.
Using Regular Expression Groups
When a static URL is part of an incoming request, the SEO jump servlet parses the URL to extract parameter values, which it then uses to fill in placeholders in the dynamic URL it generates. To extract the parameter values, the servlet uses regular expression groups, which you specify using the indirectRegex
property of the indirect URL component.
For example, suppose you have a URL format that looks like this:
urlTemplateFormat=/jump/product/{item.id}/{item.parentCategory.id}\
/{item.displayName}/{item.parentCategory.displayName}
The regular expression pattern for this format might be specified like this:
indirectRegex=/jump/product/([^/].*?)/([^/].*?)/([^/].*?)/([^/].*?)$
This pattern tells the jump servlet how to extract the parameter values from a static URL. In addition, the servlet needs information about how to interpret the parameters. Some parameters may be simple String values, while others may represent the ID of a repository item. If the parameter is a repository item ID, the servlet needs to determine the item type and the repository that contains the item.
Therefore the indirect URL template also includes a regexElementList
property for specifying each parameter type. This property is an ordered list where the first element specifies the parameter type of the first regular expression group, the second element specifies the parameter type of the second group, and so on.
The syntax for each parameter type entry in the list is:
paramName|paramType[|additionalInfo]
The paramName
is used to match the parameter with placeholders in the direct URL that the servlet forwards the request to.
Valid values for paramType
are:
string, which denotes a simple string
id, which denotes the ID of a repository item
The optional additionalInfo
field can be used to specify additional details if paramType
is id
. (This field should be omitted if paramType
is string
.) The syntax of additionalInfo
takes one of the following forms:
repositoryName:itemDescriptorName
itemDescriptorName
The parameter type list for the regular expression pattern shown above would look similar to this:
item | id | /atg/commerce/catalog/ProductCatalog:product
parentCategory | id | /atg/commerce/catalog/ProductCatalog:category
displayName | string
parentCategoryDisplayName | string
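To see how the pattern and the parameter type list line up, the following sketch applies the indirectRegex value shown above to the sample static URL and pairs each captured group with the corresponding list entry. It only extracts and labels the values; it does not look up repository items, and the class IndirectRegexSketch is a hypothetical name, not the jump servlet's implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of atg.repository.seo.
class IndirectRegexSketch {
    // Returns paramName -> captured value, in the order of the element list.
    static Map<String, String> extract(String regex, List<String> elementList, String url) {
        Matcher m = Pattern.compile(regex).matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException("URL does not match pattern: " + url);
        }
        if (m.groupCount() != elementList.size()) {
            throw new IllegalArgumentException("Group count and element list size differ");
        }
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < elementList.size(); i++) {
            // Each entry has the form paramName|paramType[|additionalInfo].
            String[] fields = elementList.get(i).split("\\|");
            String name = fields[0].trim();
            String type = fields[1].trim();
            if (!type.equals("string") && !type.equals("id")) {
                throw new IllegalArgumentException("Unknown parameter type: " + type);
            }
            params.put(name, m.group(i + 1)); // regex groups are numbered from 1
        }
        return params;
    }

    public static void main(String[] args) {
        String regex = "/jump/product/([^/].*?)/([^/].*?)/([^/].*?)/([^/].*?)$";
        List<String> elementList = List.of(
            "item | id | /atg/commerce/catalog/ProductCatalog:product",
            "parentCategory | id | /atg/commerce/catalog/ProductCatalog:category",
            "displayName | string",
            "parentCategoryDisplayName | string");
        String url = "/jump/product/prod1002/cat234/Q33+UltraMountain/Mountain+Bikes";
        // Prints one name=value pair per line, for example item=prod1002
        extract(regex, elementList, url).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```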
Configuring URL Template Mappers
URL template mappers are used by the ItemLink
servlet bean to map repository item descriptors to URL templates. The servlet bean has an itemDescriptorNameToMapperMap
property that maps item descriptors to URL template mappers. For example:
itemDescriptorNameToMapperMap=\
product=/atg/repository/seo/ProductTemplateMapper,\
category=/atg/repository/seo/CategoryTemplateMapper
Each template mapper component has a templates
property that specifies one or more templates to use for rendering static URLs, and a defaultTemplate
property that specifies the template to use for rendering dynamic URLs. So, in this example, the product
item descriptor is associated with the templates listed by the ProductTemplateMapper
component, and the category
item descriptor is associated with the templates listed by the CategoryTemplateMapper
component. When ItemLink
generates a link to a specific repository item, it uses this mapping to determine the URL template to use.