If a Web server uses HTTP NTLM authentication to restrict access
to Web sites, you can specify authentication credentials that enable the Web
Crawler to access password-protected pages. The
http.auth.ntlm property sets the credentials to be
used by the HTTPClient for NTLM authentication.
Note: The Web Crawler supports only
Version 1 of the NTLM authentication scheme.
The credentials
must be specified in this format:
USERNAME1~~~PASSWORD1~~~HOST1~~~PORT1~~~REALM1~~~DOMAIN1|||USERNAME2~~~...
where:
- USERNAME is the
user ID to be sent to the server.
- PASSWORD is the
password for the user ID.
- HOST is a
specific host name to which the credentials apply (i.e., the host to be
crawled). Note that you cannot use the
ANY_HOST specifier.
- PORT is either
a specific host port or
ANY_PORT.
- REALM is either
a specific realm name on the host or
ANY_REALM.
- DOMAIN is
either a domain name or an IP address.
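Note that the triple-tilde delimiter (~~~)
must be used to separate the values within a set of credentials. As the
format above shows, the triple-pipe delimiter (|||) separates one set of
credentials from the next.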
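
Because the value is a single delimited string, it can be easier to assemble and check it programmatically than by hand. The following is a minimal Python sketch, not part of the Web Crawler itself, that builds and parses a value in the format described above. The names NtlmCredential, build_ntlm_property, and parse_ntlm_property, along with the sample user names, hosts, and domains, are hypothetical and exist only for illustration. The sketch also assumes that no field may contain either delimiter, since the description above mentions no escaping mechanism.

```python
from dataclasses import dataclass
from typing import List

FIELD_SEP = "~~~"  # separates the six values within one set of credentials
ENTRY_SEP = "|||"  # separates one set of credentials from the next


@dataclass(frozen=True)
class NtlmCredential:
    """One USERNAME~~~PASSWORD~~~HOST~~~PORT~~~REALM~~~DOMAIN entry (illustrative helper)."""
    username: str
    password: str
    host: str    # must be a specific host name; ANY_HOST is not allowed
    port: str    # a specific port number or "ANY_PORT"
    realm: str   # a specific realm name or "ANY_REALM"
    domain: str  # a domain name or an IP address

    def to_entry(self) -> str:
        if self.host == "ANY_HOST":
            raise ValueError("HOST must be a specific host name, not ANY_HOST")
        fields = [self.username, self.password, self.host,
                  self.port, self.realm, self.domain]
        for value in fields:
            # Assumption of this sketch: values containing a delimiter are rejected.
            if FIELD_SEP in value or ENTRY_SEP in value:
                raise ValueError(f"value contains a delimiter: {value!r}")
        return FIELD_SEP.join(fields)


def build_ntlm_property(credentials: List[NtlmCredential]) -> str:
    """Join several credential sets into one http.auth.ntlm value."""
    return ENTRY_SEP.join(c.to_entry() for c in credentials)


def parse_ntlm_property(value: str) -> List[NtlmCredential]:
    """Split an http.auth.ntlm value back into its credential sets."""
    credentials = []
    for entry in value.split(ENTRY_SEP):
        parts = entry.split(FIELD_SEP)
        if len(parts) != 6:
            raise ValueError(f"expected 6 fields, found {len(parts)} in {entry!r}")
        credentials.append(NtlmCredential(*parts))
    return credentials


if __name__ == "__main__":
    creds = [
        NtlmCredential("jsmith", "secret", "intranet.example.com",
                       "80", "ANY_REALM", "EXAMPLE"),
        NtlmCredential("crawler", "pa55word", "wiki.example.com",
                       "ANY_PORT", "ANY_REALM", "10.0.0.5"),
    ]
    print(build_ntlm_property(creds))
```

Running the script prints a single line containing both sets of credentials, which is the form the http.auth.ntlm property expects. Parsing the value back with parse_ntlm_property is a quick way to confirm that every set has exactly six fields before the crawl starts.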