You can set the parser properties in the default.xml file.
Property Name | Property Value |
---|---|
parse.plugin.file | File name (default is parse-plugins.xml). Specifies the configuration file that defines the associations between content-types and parsers. |
parser.character.encoding.default | ISO code or other encoding representation (default is windows-1252). Specifies the character encoding to use when no other information is available. |
parser.html.impl | neko or tagsoup (default is neko). Specifies which HTML parser implementation to use: neko uses NekoHTML and tagsoup uses TagSoup. |
parser.html.form.use_action | Boolean value (default is false). If true, the HTML parser collects URLs from form action attributes. If false, form action attributes are ignored. Note: Setting this to true may lead to undesirable behavior, such as submitting empty forms during the next fetch cycle. |
Note that the NekoHTML parser is the default HTML parser.
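
As an illustration of how the properties in the table might be set, the following is a minimal sketch. It assumes the file uses the Hadoop-style `<configuration>`/`<property>` layout with `<name>` and `<value>` elements; that layout is an assumption and is not confirmed by this page, so check it against your own default.xml. The values shown (`tagsoup`, `utf-8`) are example choices, not recommendations.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the element layout below is assumed, not taken from this page. -->
<configuration>

  <!-- Use TagSoup instead of the default NekoHTML parser. -->
  <property>
    <name>parser.html.impl</name>
    <value>tagsoup</value>
  </property>

  <!-- Encoding to fall back on when no other information is available
       (the documented default is windows-1252). -->
  <property>
    <name>parser.character.encoding.default</name>
    <value>utf-8</value>
  </property>

  <!-- Leave form action URLs uncollected to avoid submitting empty forms
       during the next fetch cycle (this matches the documented default). -->
  <property>
    <name>parser.html.form.use_action</name>
    <value>false</value>
  </property>

</configuration>
```

Properties that are left out keep the defaults listed in the table, so in practice only the values you want to change need to appear.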