The site.xml file provides property values that override those in the global configuration file.
The default.xml file is the global default configuration for your Oracle Commerce Web Crawler and should not change often. Only one copy of this file is shipped with the product, and it is located in the workspace/conf/web-crawler/default directory.
The site.xml file is where you make the changes that override the default settings on a per-crawl basis. The properties that you can add to the site.xml file are the same ones that are in the default.xml file. A site.xml file is included in the workspace/conf/web-crawler/polite-crawl and workspace/conf/web-crawler/non-polite-crawl directories, but not in the workspace/conf/web-crawler/default directory.
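
The override relationship between the two files can be sketched in a few lines of Python. This is only an illustration of the layering idea, not the crawler's own loading code: the configuration/property/name/value layout and the property names example.fetch.delay and example.ignore.robots are assumptions made up for the example, and the file contents are inlined as strings rather than read from the workspace directories.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents. The <configuration>/<property> layout and the
# property names are assumptions for illustration, not taken from the product.
DEFAULT_XML = """
<configuration>
  <property><name>example.fetch.delay</name><value>1.0</value></property>
  <property><name>example.ignore.robots</name><value>false</value></property>
</configuration>
"""

SITE_XML = """
<configuration>
  <property><name>example.ignore.robots</name><value>true</value></property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict built from every <property> element."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def effective_config(default_xml, site_xml):
    """Start from the global defaults, then let per-crawl values override them."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged


config = effective_config(DEFAULT_XML, SITE_XML)
```

In this sketch, a property that appears only in the default document keeps its default value, while a property repeated in the site document takes the per-crawl value.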