The URLs generated by web applications can create problems for the web spiders (also known as web robots) that Internet search engines such as Google use to create search indexes. These URLs typically include query parameters that the spider may not know how to interpret. In some cases, a spider will simply ignore a page whose URL includes query parameters.

Even if the spider does index the page, it may give the page lower weighting than desired, because the URL may not contain search terms that could increase the weighting. For example, consider a typical URL for an ATG Commerce site:

/mystore/product/product.jsp?prodId=prod1002&catId=cat234

This type of URL is sometimes referred to as “dynamic,” because the content of the page is dynamically generated based on the values of the query parameters.

Now consider a static URL for the same page:

/mystore/product/Q33+UltraMountain/Mountain+Bikes

A spider is more likely to index the page with the static URL, and when it does, it is likely to mark “Mountain Bikes” and “Q33 UltraMountain” as key search terms and weight the page heavily for them. As a result, when a user searches for one of these terms, this page is more likely to appear near the top of the search results. The dynamic URL may return the same page and content when it is clicked, but it is less likely to be weighted highly for these searches, and in some cases may not be indexed at all.

To address this issue, the ATG platform includes a Search Engine Optimization (SEO) feature that enables you to optimize your pages for indexing by web spiders without compromising the usability of the site for human visitors. The key to this feature is the ability to render URLs in different formats, depending on whether a page is accessed by a human visitor or a web spider. This is handled through the atg.repository.seo.ItemLink servlet bean, which uses the User-Agent header of the HTTP request to determine the type of visitor. If the visitor is a spider, the servlet bean renders a static URL that the spider can use for indexing; otherwise, it renders a standard ATG dynamic URL.
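The selection logic described above can be illustrated with a minimal Java sketch. It is not the ItemLink implementation: the class name SeoLinkSketch, the list of spider tokens, and the method signatures are all hypothetical, and the URL layouts are simply copied from the example URLs earlier in this section.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; this is NOT the ATG ItemLink servlet bean.
final class SeoLinkSketch {

    // Assumption: a hypothetical list of User-Agent fragments that identify spiders.
    private static final List<String> SPIDER_TOKENS =
            List.of("googlebot", "bingbot", "slurp");

    // Returns true if the User-Agent value contains one of the spider tokens.
    static boolean isSpider(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String token : SPIDER_TOKENS) {
            if (ua.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Renders a static URL for spiders and a dynamic URL for everyone else.
    static String renderProductUrl(String userAgent, String prodId, String catId,
                                   String productName, String categoryName) {
        if (isSpider(userAgent)) {
            // Static form: search terms appear in the path itself.
            return "/mystore/product/" + encode(productName) + "/" + encode(categoryName);
        }
        // Dynamic form: repository IDs are passed as query parameters.
        return "/mystore/product/product.jsp?prodId=" + encode(prodId)
                + "&catId=" + encode(catId);
    }

    // URL-encodes a value; spaces become '+', as in the example URLs.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

With these assumptions, a request whose User-Agent contains "Googlebot" would receive the static form of the link, while a typical browser User-Agent would receive the dynamic form with the prodId and catId query parameters.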

Of course, the ATG request-handling components cannot actually interpret these static URLs. Therefore, the SEO feature also requires a servlet (atg.repository.seo.JumpServlet) that reads incoming static URLs (for example, when a user clicks a link returned by a Google search) and translates them into their dynamic equivalents.
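The reverse translation can be sketched in the same way. The following Java example is not the JumpServlet implementation; it only illustrates the idea of mapping a static path back to a dynamic URL. The class name JumpSketch and the in-memory name-to-ID tables are hypothetical, and the sketch assumes that the two path segments are the URL-encoded product and category names, as in the example static URL shown earlier.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; this is NOT the ATG JumpServlet.
final class JumpSketch {

    // Assumption: hypothetical lookup tables standing in for a repository query.
    private static final Map<String, String> PRODUCT_IDS =
            Map.of("Q33 UltraMountain", "prod1002");
    private static final Map<String, String> CATEGORY_IDS =
            Map.of("Mountain Bikes", "cat234");

    private static final String PREFIX = "/mystore/product/";

    // Translates a static path into its dynamic equivalent, or returns
    // Optional.empty() if the path is not a recognized static URL.
    static Optional<String> toDynamicUrl(String staticPath) {
        if (staticPath == null || !staticPath.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String[] segments = staticPath.substring(PREFIX.length()).split("/");
        if (segments.length != 2) {
            return Optional.empty();
        }
        try {
            // '+' decodes to a space, so "Mountain+Bikes" becomes "Mountain Bikes".
            String prodId = PRODUCT_IDS.get(decode(segments[0]));
            String catId = CATEGORY_IDS.get(decode(segments[1]));
            if (prodId == null || catId == null) {
                return Optional.empty();
            }
            return Optional.of(PREFIX + "product.jsp?prodId=" + prodId + "&catId=" + catId);
        } catch (IllegalArgumentException malformedEscape) {
            // URLDecoder rejects malformed percent-escapes; treat the path as unknown.
            return Optional.empty();
        }
    }

    private static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }
}
```

Under these assumptions, the static path /mystore/product/Q33+UltraMountain/Mountain+Bikes would be translated to the dynamic URL shown at the start of this section, and any unrecognized path would yield an empty result so that the caller can fall back to normal request handling.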

 