The format of the form-based authentication credentials file is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <credentials>
      <formCredentials>
        <authenticator>
          <className>authClass</className>
          <configuration>
            <siteUrlPattern>siteUrl</siteUrlPattern>
            <loginUrl>loginPageUrl</loginUrl>
            <actionUrl>actionUrl</actionUrl>
            <method>authMethodToUse</method>
            <preCrawlAuth>shouldPreAuth</preCrawlAuth>
            <parameters>
              <parameter>
                <name>paramName</name>
                <value>paramValue</value>
              </parameter>
            </parameters>
            <properties>
              <property>
                <name>propName</name>
                <value>propValue</value>
              </property>
            </properties>
          </configuration>
        </authenticator>
      </formCredentials>
    </credentials>

The meanings of the elements and attribute values are listed in the following table.
Element | Meaning |
---|---|
`<credentials>`, `<formCredentials>` | Main opening elements. There can be only one set of these elements in the file. |
`<authenticator>` | Defines one set of settings for the Authenticator plugin. The file will have multiple |
`<className>` | The name of the class that handles authentication logic. The Web Crawler default authenticator class is: |
`<configuration>` | Defines a set of credentials settings and properties. |
`<siteUrlPattern>` | A regular expression that determines which sites will be authenticated (i.e., the Authenticator will be run only on those sites). |
`<loginUrl>` | The URL where the actual login is done (such as |
`<actionUrl>` | A full path to a URL that handles the logic for the GET/POST request, such as a CGI script. This field corresponds to the ACTION attribute of the form. Note that an action URL is often different from the login URL. |
`<method>` | A value of either GET or POST. |
`<preCrawlAuth>` | Boolean value. Indicates whether authentication is done before the crawl starts (a value of |
`<parameters>` | Contains one or more sets of `<parameter>` elements. |
`<parameter>` | Contains a `<name>` element and a `<value>` element. |
`<properties>` | Contains one or more sets of `<property>` elements. |
`<property>` | Contains a `<name>` element and a `<value>` element. |
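
As an illustration of how the elements fit together, the following is a minimal sketch of a filled-in credentials file for a single site. Every value in it is an assumption made for the example and does not come from the product documentation: the class name `com.example.auth.MyFormAuthenticator`, the `www.example.com` URLs, the `username` and `password` parameter names, the `exampleProperty` property, and the literal `true` for the Boolean setting are all hypothetical placeholders. Substitute the authenticator class and the form field names that your own site actually uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<credentials>
  <formCredentials>
    <authenticator>
      <!-- Hypothetical class name; replace with your authenticator class. -->
      <className>com.example.auth.MyFormAuthenticator</className>
      <configuration>
        <!-- Regular expression: run the authenticator only on this site. -->
        <siteUrlPattern>http://www\.example\.com/.*</siteUrlPattern>
        <!-- Page that displays the login form (assumed URL). -->
        <loginUrl>http://www.example.com/login.html</loginUrl>
        <!-- Target of the form's ACTION attribute (assumed URL). -->
        <actionUrl>http://www.example.com/cgi-bin/login.cgi</actionUrl>
        <method>POST</method>
        <!-- Assumed Boolean literal requesting authentication before the crawl. -->
        <preCrawlAuth>true</preCrawlAuth>
        <parameters>
          <!-- One parameter per form field; names here are hypothetical. -->
          <parameter>
            <name>username</name>
            <value>crawluser</value>
          </parameter>
          <parameter>
            <name>password</name>
            <value>crawlpassword</value>
          </parameter>
        </parameters>
        <properties>
          <!-- Placeholder property; valid names depend on the authenticator. -->
          <property>
            <name>exampleProperty</name>
            <value>exampleValue</value>
          </property>
        </properties>
      </configuration>
    </authenticator>
  </formCredentials>
</credentials>
```

In this sketch the login URL and the action URL differ, which matches the common case noted in the table where the page showing the form is not the script that processes it.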