If a Web server uses HTTP NTLM authentication to restrict access
to Web sites, you can specify authentication credentials that
enable the Web Crawler to access password-protected pages.
The http.auth.ntlm property sets the credentials to be used by the HTTPClient for NTLM authentication.
Note: The Web Crawler only supports Version 1 of the NTLM authentication scheme.
The credentials must be specified in this format:
USERNAME1~~~PASSWORD1~~~HOST1~~~PORT1~~~REALM1~~~DOMAIN1|||USERNAME2~~~...
where:
- USERNAME is the user ID to be sent to the server.
- PASSWORD is the password for the user ID.
- HOST is a specific host name to which the credentials apply (that is, the host to be crawled). Note that you cannot use the ANY_HOST specifier.
- PORT is either a specific host port or ANY_PORT.
- REALM is either a specific realm name on the host or ANY_REALM.
- DOMAIN is either a domain name or an IP address.
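Note that the triple-tilde delimiter (~~~) must be used to separate the values within a set of credentials. As the format line shows, a triple-pipe delimiter (|||) separates one set of credentials from the next.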
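
Because a misplaced delimiter is easy to miss in a long value, it can help to assemble the string programmatically. The following Python sketch is only an illustration and is not part of the Web Crawler: the names NtlmCredential and format_ntlm_credentials, the host name, and the account values are all hypothetical. It joins the six values of each credential set with ~~~, joins the sets with |||, and rejects the ANY_HOST specifier. The check for delimiter sequences inside a value is an added precaution assumed here, not a documented rule.

```python
from typing import NamedTuple

FIELD_SEP = "~~~"  # separates the six values within one credential set
ENTRY_SEP = "|||"  # separates one credential set from the next


class NtlmCredential(NamedTuple):
    """One credential set (hypothetical helper type, in the documented field order)."""
    username: str
    password: str
    host: str
    port: str    # a specific port, or "ANY_PORT"
    realm: str   # a specific realm name, or "ANY_REALM"
    domain: str  # a domain name or an IP address


def format_ntlm_credentials(credentials):
    """Build a value for http.auth.ntlm from a list of NtlmCredential tuples."""
    entries = []
    for cred in credentials:
        # The documentation states that ANY_HOST cannot be used for HOST.
        if cred.host == "ANY_HOST":
            raise ValueError("HOST must be a specific host name, not ANY_HOST")
        # Assumed precaution: a delimiter inside a value would make the
        # resulting string ambiguous.
        for value in cred:
            if FIELD_SEP in value or ENTRY_SEP in value:
                raise ValueError(f"value contains a delimiter sequence: {value!r}")
        entries.append(FIELD_SEP.join(cred))
    return ENTRY_SEP.join(entries)


# Example with made-up accounts and hosts.
example_value = format_ntlm_credentials([
    NtlmCredential("jsmith", "secret1", "intranet.example.com",
                   "80", "ANY_REALM", "EXAMPLEDOMAIN"),
    NtlmCredential("crawler", "secret2", "wiki.example.com",
                   "ANY_PORT", "ANY_REALM", "192.0.2.10"),
])
print(example_value)
```

With these made-up inputs, the script prints a single line in which the first set reads jsmith~~~secret1~~~intranet.example.com~~~80~~~ANY_REALM~~~EXAMPLEDOMAIN, followed by ||| and the second set. That line is the kind of string you would supply as the value of the http.auth.ntlm property.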