You can set the URL normalization properties in the default.xml file.

URL normalization (also called URL canonicalization) is the process of modifying and standardizing URLs in a consistent manner. Its purpose is to transform a URL into a normalized, or canonical, form so that it is possible to determine whether two syntactically different URLs are equivalent.
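As a rough illustration of the idea, the following Python sketch applies a few common normalization steps (lowercasing the scheme and host, dropping the default port, resolving dot segments, and removing the fragment) and then compares the results. It is a generic example, not the Web Crawler's implementation: the function names `normalize_url` and `remove_dot_segments` and the choice of steps are assumptions made for this example, and the rules the Web Crawler actually applies are determined by its own configuration properties.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the resource (assumed set).
DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path: str) -> str:
    """Resolve "." and ".." segments in a URL path."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            # A trailing "." refers to the current directory; keep the slash.
            if is_last:
                out.append("")
        elif seg == "..":
            # Go up one level, but never above the root.
            if len(out) > 1:
                out.pop()
            if is_last:
                out.append("")
        else:
            out.append(seg)
    return "/".join(out)


def normalize_url(url: str) -> str:
    """Hypothetical helper: return a canonical form of an http(s) URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Note: this sketch ignores any user-info component of the URL.
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = remove_dot_segments(parts.path or "/")
    # The fragment is dropped because it is not sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


a = normalize_url("HTTP://Example.COM:80/a/./b/../c#top")
b = normalize_url("http://example.com/a/c")
print(a == b)  # the two URLs normalize to the same string
```

Two URLs that differ only in case, default port, dot segments, or fragment reduce to the same string, so a crawler that tracks normalized URLs would treat them as one resource.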

The Web Crawler performs URL normalization to avoid crawling the same resource more than once. By using the properties listed in the table, you can configure how the Web Crawler normalizes URLs.

