Search bots crawl site pages and index them based on the content available in the HTML markup. Because Commerce Store Accelerator is a JavaScript-based single-page application, the full HTML markup for a given page is not available until the page’s JavaScript has been processed. To provide a search bot with the markup it needs, a Commerce Store Accelerator application pre-generates the HTML markup for its pages, using the PhantomJS headless browser on the server side to execute the JavaScript. The pre-generated content is stored in the SEORepository. Commerce Store Accelerator uses a servlet filter called SEOFilter to identify search bot requests, then retrieves the pre-generated HTML content from the repository and returns it to the search bot. If the requested page does not exist in the repository, its HTML content is generated, stored in the repository, and then served to the search bot.

The SEOFilter, which is of class atg.filter.dspjsp.SEOFilter, uses two components, /atg/repository/seo/UserAgentDetector and /atg/repository/seo/PhantomjsRenderer, to identify search bot requests and render the complete markup for the requested URL. Both are configured as init-params on the SEOFilter element in the application’s web.xml file.

The /atg/repository/seo/UserAgentDetector component is a request-scoped component of class atg.repository.seo.UserAgentDetector. It has a browserType property, set to /atg/dynamo/servlet/pipeline/BrowserTypes/Robot, that defines the pattern an incoming request must match in order to be considered a search bot request.
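
As a rough illustration of this matching step, the following sketch tests a request’s User-Agent header against a regular expression. The RobotMatcher class, the pattern, and the sample user-agent strings are assumptions made for illustration only. The actual patterns come from the BrowserTypes/Robot component configuration, and this is not the implementation of atg.repository.seo.UserAgentDetector.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the check a browser-type pattern performs.
final class RobotMatcher {
    // Assumed pattern; the real patterns are defined in the Robot browser type's configuration.
    private static final Pattern ROBOT_PATTERN =
            Pattern.compile("googlebot|bingbot|slurp|baiduspider", Pattern.CASE_INSENSITIVE);

    // Returns true when the User-Agent header value looks like a search bot.
    static boolean isSearchBot(String userAgent) {
        return userAgent != null && ROBOT_PATTERN.matcher(userAgent).find();
    }

    public static void main(String[] args) {
        String bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
        String browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36";
        System.out.println(isSearchBot(bot));     // prints true
        System.out.println(isSearchBot(browser)); // prints false
    }
}
```

Requests that do not match would simply continue through the normal single-page application flow.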

The /atg/repository/seo/PhantomjsRenderer component is of class atg.repository.seo.PhantomjsRenderer, which is an implementation of the atg.repository.seo.MarkupRenderer interface. The MarkupRenderer interface requires that all implementing classes have a getHtmlContent() method that gets the complete HTML markup for a page. The PhantomjsRenderer class’s implementation of this method uses PhantomJS to perform the rendering. The PhantomjsRenderer component has the following properties:

In order to boost performance, Commerce Store Accelerator pages are pre-rendered and their content is stored in the SEORepository. When a search bot requests content for a page, Commerce Store Accelerator attempts to retrieve the content from the SEORepository first. If the page does not exist in the repository, Commerce Store Accelerator uses the PhantomJS headless browser to render the full HTML markup for the page, stores that markup in the SEORepository for future reference, and then returns the markup to the search bot.
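
The lookup-then-render behavior described above follows a common cache-aside pattern, which might be sketched as follows. The SeoCacheSketch class is hypothetical: an in-memory map stands in for the SEORepository and a plain function stands in for the PhantomJS renderer. None of these names or signatures come from the product API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: a map plays the role of the SEORepository.
final class SeoCacheSketch {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    // Assumed stand-in for the renderer: takes a page URL, returns full HTML markup.
    private final Function<String, String> renderer;

    SeoCacheSketch(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    // Returns stored markup if present; otherwise renders, stores, and returns it.
    String markupFor(String url) {
        return store.computeIfAbsent(url, renderer);
    }

    public static void main(String[] args) {
        AtomicInteger renders = new AtomicInteger();
        SeoCacheSketch cache = new SeoCacheSketch(url -> {
            renders.incrementAndGet();
            return "<html><body>Rendered " + url + "</body></html>";
        });
        cache.markupFor("/home"); // not stored yet: rendered and stored
        cache.markupFor("/home"); // already stored: served without rendering
        System.out.println("renders = " + renders.get()); // prints renders = 1
    }
}
```

The second request for the same URL is answered from the store, which is where the performance benefit comes from.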

In order to populate the SEORepository with pre-generated pages, Commerce Store Accelerator uses two components. The /atg/endeca/sitemap/SiteLinksGenerator component, which is of class atg.endeca.sitemap.SiteLinksGenerator, generates a complete list of site links for the application. Then the /atg/repository/seo/SitemapPageCacheRenderer component, which is of class atg.repository.seo.SitemapPageCacheRenderer, invokes the PhantomJS headless browser for each link to generate the complete HTML markup and stores the pre-generated content in the SEORepository with the page URL as the key.
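
The two-step population process can be pictured with a minimal sketch like the one below. The PreRenderSketch class, the sample links, and the rendering function are all hypothetical. A real deployment relies on the SiteLinksGenerator and SitemapPageCacheRenderer components, whose APIs are not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of pre-populating a URL-keyed markup store.
final class PreRenderSketch {
    // Renders every link once and stores the markup with the page URL as the key.
    static Map<String, String> preRender(List<String> siteLinks,
                                         Function<String, String> renderer) {
        Map<String, String> store = new LinkedHashMap<>();
        for (String url : siteLinks) {
            store.put(url, renderer.apply(url));
        }
        return store;
    }

    public static void main(String[] args) {
        // Assumed sample links; in the product, the link list is generated for the whole site.
        List<String> links = Arrays.asList("/home", "/category/shoes", "/product/123");
        Map<String, String> store =
                preRender(links, url -> "<html><body>" + url + "</body></html>");
        System.out.println(store.keySet()); // prints [/home, /category/shoes, /product/123]
    }
}
```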


Copyright © 1997, 2016 Oracle and/or its affiliates. All rights reserved.