The following functions operate at the Enumerate stage. These functions control whether and how a robot gathers links from a given resource to use as starting points for further resource discovery.
The enumerate-urls function scans the resource and enumerates all URLs found in hypertext links. The results are used to spawn further resource discovery. You can specify a content-type to restrict the kind of URLs enumerated.
The max parameter sets the maximum number of URLs to spawn from a given resource. The default, if max is omitted, is 1024.
The type parameter is a content-type that restricts enumeration to those URLs that have the specified content-type. type is an optional property. If it is omitted, the function enumerates all URLs.
The following example enumerates HTML URLs only, up to the default maximum of 1024 (max is not specified):
Enumerate fn=enumerate-urls type=text/html
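The behavior described for enumerate-urls can be approximated by the short Python sketch below. It is an illustration only, not the robot's implementation: the names LinkCollector, enumerate_urls, max_urls, and content_type are hypothetical, and guessing the content-type from the URL's file extension is an assumption made here so that the example runs on its own.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import mimetypes


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def enumerate_urls(html, base_url, max_urls=1024, content_type=None):
    """Return up to max_urls absolute link URLs found in html.

    If content_type is given, keep only URLs whose type, guessed from
    the file extension (an assumption of this sketch), matches it.
    """
    collector = LinkCollector()
    collector.feed(html)
    results = []
    for href in collector.hrefs:
        if len(results) >= max_urls:
            break  # stop once the limit is reached
        url = urljoin(base_url, href)  # resolve relative links
        if content_type is not None:
            guessed, _ = mimetypes.guess_type(url)
            if guessed != content_type:
                continue
        results.append(url)
    return results
```

Calling enumerate_urls(page, base, content_type="text/html") mirrors the type=text/html setting in the example above, and leaving content_type unset mirrors omitting type.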
The enumerate-urls-from-text function scans a text resource, looking for strings that match the regular expression URL:.*. The function spawns robots to enumerate the URLs from these strings and generate further resource descriptions.
The max parameter sets the maximum number of URLs to spawn from a given resource. The default, if max is omitted, is 1024.
The following example enumerates URLs found in text resources, up to the default maximum of 1024:
Enumerate fn=enumerate-urls-from-text
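To make the URL:.* pattern concrete, the following Python sketch applies the same regular expression to a block of text and collects what follows each URL: marker. It is a rough illustration under stated assumptions: the function name enumerate_urls_from_text and the max_urls argument are hypothetical, and trimming surrounding whitespace from each match is a choice made here, not documented robot behavior.

```python
import re

# Same pattern as in the description, with a group around the part after "URL:".
# "." does not match a newline, so each match ends at the end of its line.
URL_LINE = re.compile(r"URL:(.*)")


def enumerate_urls_from_text(text, max_urls=1024):
    """Return up to max_urls strings that follow a "URL:" marker in text."""
    urls = []
    for match in URL_LINE.finditer(text):
        if len(urls) >= max_urls:
            break  # stop once the limit is reached
        candidate = match.group(1).strip()  # assumption: ignore padding
        if candidate:
            urls.append(candidate)
    return urls
```

For instance, a text resource containing the line "URL: http://example.com/a" would contribute http://example.com/a to the list of URLs to follow.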