The functions discussed in this chapter operate at the Enumerate stage. They control whether and how a robot gathers links from a given resource to use as starting points for further resource discovery.
The enumerate-urls function scans the resource and enumerates all URLs found in hypertext links. The results are used to spawn further resource discovery. You can specify a content-type to restrict the kind of URLs enumerated.
The enumerate-urls function takes the following parameters:
max: The maximum number of URLs to spawn from a given resource. This parameter is optional. If max is omitted, the default is 1024.
type: The content-type that restricts enumeration to URLs of that content-type. This parameter is optional. If type is omitted, the function enumerates all URLs.
The following example enumerates HTML URLs only. Because max is omitted, the default maximum of 1024 applies:
Enumerate fn=enumerate-urls type=text/html
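To spawn fewer URLs than the default, max can be given together with type. The following line is a sketch only: it assumes that max takes the same name=value form as type, and the value 100 is an arbitrary illustration rather than a recommended setting:
Enumerate fn=enumerate-urls max=100 type=text/html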