You configure how the URL filter plugins are handled in the default.xml file.

| Property Name | Property Value |
|---|---|
| urlfilter.regex.file | File name (default is crawl-urlfilter.txt). Specifies the file in the configuration directory containing regular expressions used by the urlfilter-regex (RegexURLFilter) plugin. |
| urlfilter.order | Space-delimited list of URL filter class names (default is empty). Specifies the order in which URL filters are applied. |
| urlfilter.filter-seeds | Boolean value (default is false). Specifies whether URL filtering should be applied to the seeds. |
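
As an illustration, the three properties in the table might be set as in the sketch below. It assumes that default.xml uses the Hadoop-style `<configuration>`/`<property>` layout common to Nutch-derived crawlers; check that against your own file. The two class names in `urlfilter.order` are hypothetical placeholders, not real plugin classes.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes a Hadoop-style property layout; verify against your default.xml. -->
<configuration>
  <!-- File in the configuration directory read by the urlfilter-regex plugin (default value shown). -->
  <property>
    <name>urlfilter.regex.file</name>
    <value>crawl-urlfilter.txt</value>
  </property>
  <!-- Space-delimited filter class names, applied left to right.
       Both names below are hypothetical placeholders. -->
  <property>
    <name>urlfilter.order</name>
    <value>com.example.filters.FirstFilter com.example.filters.SecondFilter</value>
  </property>
  <!-- Set to true to apply URL filtering to the seeds as well (the default is false). -->
  <property>
    <name>urlfilter.filter-seeds</name>
    <value>true</value>
  </property>
</configuration>
```

Leaving `urlfilter.order` empty keeps the default behavior, in which no explicit filter ordering is specified.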

Keep in mind that the crawl scope filter (if configured) is applied before all other filters, including the regular expressions in the regular expression file and any custom plugins. This means that once a URL has been filtered out by the crawl scope, it cannot be added back by expressions in that file.