The format of the form-based authentication credentials file is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <credentials>
      <formCredentials>
        <authenticator>
          <className>authClass</className>
          <configuration>
            <siteUrlPattern>siteUrl</siteUrlPattern>
            <loginUrl>loginPageUrl</loginUrl>
            <actionUrl>actionUrl</actionUrl>
            <method>authMethodToUse</method>
            <preCrawlAuth>shouldPreAuth</preCrawlAuth>
            <parameters>
              <parameter>
                <name>paramName</name>
                <value>paramValue</value>
              </parameter>
            </parameters>
            <properties>
              <property>
                <name>propName</name>
                <value>propValue</value>
              </property>
            </properties>
          </configuration>
        </authenticator>
      </formCredentials>
    </credentials>

The meanings of the elements and attribute values are listed in the following table.
| Element | Meaning |
|---|---|
| `credentials` and `formCredentials` | Main opening elements. There can be only one set of these elements in the file. |
| `authenticator` | Defines one set of settings for the Authenticator plugin. The file may contain multiple `authenticator` elements. |
| `className` | The name of the class that handles authentication logic. The Web Crawler provides a default authenticator class. |
| `configuration` | Defines a set of credentials settings and properties. |
| `siteUrlPattern` | A regular expression that determines which sites will be authenticated (i.e., the Authenticator will be run only on those sites). |
| `loginUrl` | The URL where the actual login is done. |
| `actionUrl` | A full path to a URL that handles the logic for the GET/POST request, such as a CGI script. This field corresponds to the ACTION attribute of the form. Note that an action URL is often different from the login URL. |
| `method` | A value of either `GET` or `POST`. |
| `preCrawlAuth` | Boolean value. Indicates whether authentication is done before the crawl starts (a value of `true`). |
| `parameters` | Contains one or more sets of `parameter` elements. |
| `parameter` | Contains a `name` element and a `value` element. |
| `properties` | Contains one or more sets of `property` elements. |
| `property` | Contains a `name` element and a `value` element. |
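
As an illustration of how the template might be filled in, the following sketch shows a credentials file for a single site. Every value in it is an assumption made for the example: the class name `com.example.auth.MyFormAuthenticator`, the `www.example.com` URLs, the `username` and `password` parameter names, and the `exampleProperty` property are hypothetical placeholders, not values defined by the Web Crawler. Substitute the authenticator class, form field names, and properties that apply to your installation and to the login form being targeted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<credentials>
  <formCredentials>
    <authenticator>
      <!-- Hypothetical class name; use the authenticator class for your installation. -->
      <className>com.example.auth.MyFormAuthenticator</className>
      <configuration>
        <!-- Regular expression: authenticate only on URLs under this example host. -->
        <siteUrlPattern>http://www\.example\.com/.*</siteUrlPattern>
        <!-- Page that displays the login form (example URL). -->
        <loginUrl>http://www.example.com/login.html</loginUrl>
        <!-- Target of the form's ACTION attribute (example URL); differs from loginUrl here. -->
        <actionUrl>http://www.example.com/cgi-bin/login.cgi</actionUrl>
        <method>POST</method>
        <!-- Assumes that true means authentication is done before the crawl starts. -->
        <preCrawlAuth>true</preCrawlAuth>
        <parameters>
          <!-- Assumed form field names; they must match the fields of the real login form. -->
          <parameter>
            <name>username</name>
            <value>crawler_user</value>
          </parameter>
          <parameter>
            <name>password</name>
            <value>crawler_password</value>
          </parameter>
        </parameters>
        <properties>
          <!-- Hypothetical property, shown only to illustrate the name/value structure. -->
          <property>
            <name>exampleProperty</name>
            <value>exampleValue</value>
          </property>
        </properties>
      </configuration>
    </authenticator>
  </formCredentials>
</credentials>
```

In this sketch the login page and the action URL are deliberately different, which mirrors the note in the table that the form's ACTION target is often not the page that displays the form.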

