Crawl scoping properties

You configure crawl scoping in the default.xml file to control which URLs are crawled.

A crawl scope defines the conditions under which a URL is considered within the scope of a crawl. A URL is within the crawl scope if it should be fetched for that crawl.

Crawl scoping is applied before all other filters, including the regular expressions in the crawl-urlfilter.txt file and custom plugins. Because of this order, a URL that passes the crawl scope filter can still be filtered out by the crawl-urlfilter.txt file. However, a URL that is excluded by the crawl scope filter cannot be added back by the crawl-urlfilter.txt file.
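The filtering order described above can be illustrated with a minimal Python sketch. It is not the crawler's implementation: the function names, the same-host scope rule, and the exclude-pattern list are all hypothetical stand-ins chosen only to show that the scope check runs first and that later filters can only narrow the result.

```python
import re
from urllib.parse import urlsplit


def in_crawl_scope(url, seed_host):
    """Hypothetical scope check: accept only URLs on the seed's host."""
    return urlsplit(url).hostname == seed_host


def passes_url_filter(url, exclude_patterns):
    """Hypothetical stand-in for regex rules such as those in crawl-urlfilter.txt.

    A URL is rejected if it matches any exclude pattern.
    """
    return not any(re.search(pattern, url) for pattern in exclude_patterns)


def should_fetch(url, seed_host, exclude_patterns):
    # The scope check runs first. A URL rejected here never reaches the
    # later filters, so they cannot add it back.
    if not in_crawl_scope(url, seed_host):
        return False
    # A URL that is in scope can still be rejected by a later filter.
    return passes_url_filter(url, exclude_patterns)
```

For example, with a seed host of `example.com` and the single exclude pattern `\.pdf$`, `http://example.com/a.html` is fetched, `http://example.com/a.pdf` is in scope but removed by the regex filter, and `http://other.org/a.html` is dropped by the scope check regardless of the regex rules.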

The crawl scope properties are listed in the following table.

Property Name	Property Value
`crawlscope.mode`	`ANY`, `SAME_DOMAIN`, or `SAME_HOST` (default is `SAME_HOST`). Specifies the mode for crawl scoping.
`crawlscope.on-redirected-seed`	Boolean value (default is `true`). Specifies whether to filter a URL based on its seed or its redirected seed.
`crawlscope.top-level-domains.generic`	Space-delimited list of top-level domain names. Contains the generic top-level domain names. Do not modify this list, because changes may affect how domain names are retrieved.
`crawlscope.top-level-domains.additional`	Space-delimited list of top-level domain names (default is empty). Specifies additional top-level domain names that are pertinent to your crawls.
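The table does not spell out how each `crawlscope.mode` value is evaluated. The following Python sketch shows one plausible reading, and every detail in it is an assumption rather than documented behavior: `ANY` places no restriction on the host, `SAME_HOST` requires the same host name as the seed, and `SAME_DOMAIN` compares the label immediately before a known top-level domain plus that top-level domain. The names `domain_of` and `in_scope` and the sample top-level domain sets are hypothetical, and the sketch does not model `crawlscope.on-redirected-seed`.

```python
from urllib.parse import urlsplit

# Illustrative subsets only; these are not the product's real lists.
GENERIC_TLDS = {"com", "org", "net"}
ADDITIONAL_TLDS = {"de"}  # hypothetical additional top-level domain
KNOWN_TLDS = GENERIC_TLDS | ADDITIONAL_TLDS


def domain_of(host, tlds=KNOWN_TLDS):
    """Return the assumed domain of a host: one label plus a known TLD suffix."""
    labels = host.lower().split(".")
    # Try the longest suffix first so multi-label entries would win if present.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in tlds:
            return ".".join(labels[i - 1:])
    # No known top-level domain: fall back to the full host name.
    return host.lower()


def in_scope(url, seed_url, mode, tlds=KNOWN_TLDS):
    """Assumed semantics of the three crawlscope.mode values."""
    if mode == "ANY":
        return True
    host = urlsplit(url).hostname
    seed_host = urlsplit(seed_url).hostname
    if host is None or seed_host is None:
        return False
    if mode == "SAME_HOST":
        return host == seed_host
    if mode == "SAME_DOMAIN":
        return domain_of(host, tlds) == domain_of(seed_host, tlds)
    raise ValueError(f"unknown crawl scope mode: {mode}")
```

Under these assumptions, with the seed `http://www.example.com/`, the URL `http://news.example.com/` is out of scope for `SAME_HOST` but in scope for `SAME_DOMAIN`. The sketch also suggests why the top-level domain lists matter: a host whose top-level domain is missing from both lists falls back to a whole-host comparison.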