The file

The file sets the logging properties.

You can modify the file to change the properties for the log4j loggers.

Default log4j properties

The default file has this configuration:
# Logger for crawl metrics


Because the ConsoleAppender is the only appender configured, log messages are written to the console (standard output), not to a log file.

Logging to a file

You can change the default configuration so that messages are logged only to a file, or to both the console and a file. For example, you could change the default configuration to something similar to this:
# initialize root logger with level ERROR for stdout and fout
# set the log level for these components

# add a ConsoleAppender to the logger stdout to write to the console
# use a simple message format

# add a FileAppender to the logger fout
# create a log file
# use a more detailed message pattern

In the example, the FileAppender appends log events to a log file named crawl.log, which is created in the current working directory. The ConsoleAppender writes to the console using a simple pattern that prints only the message text, without the more verbose information (logging level, timestamp, and so on).
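
As a rough sketch of what such a configuration can look like, the fragment below wires the root logger to two appenders named stdout and fout. It assumes the log4j 1.x properties syntax. The component logger name com.example.crawler and both conversion patterns are placeholders chosen for illustration, not the values shipped in the default file.

```properties
# Sketch only: root logger at level ERROR, attached to the stdout and fout appenders
log4j.rootLogger=ERROR, stdout, fout

# Hypothetical component logger; substitute a logger name from your own file
log4j.logger.com.example.crawler=INFO

# ConsoleAppender named stdout: print only the message text
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%m%n

# FileAppender named fout: append to crawl.log in the current working directory
log4j.appender.fout=org.apache.log4j.FileAppender
log4j.appender.fout.File=crawl.log

# More detailed pattern (assumed): timestamp, level, thread, logger name, message
log4j.appender.fout.layout=org.apache.log4j.PatternLayout
log4j.appender.fout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n
```

To log only to the file, one option under the same assumptions is to drop stdout from the log4j.rootLogger line and leave the fout appender in place.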

In addition, you can change the component logging levels to any of these:
  • DEBUG designates fine-grained informational events that are most useful to debug a crawl configuration.
  • TRACE designates finer-grained informational events than DEBUG.
  • ERROR designates error events that might still allow the crawler to continue running.
  • FATAL designates very severe error events that will presumably lead the crawler to abort.
  • INFO designates informational messages that highlight the progress of the crawl at a coarse-grained level.
  • OFF has the highest possible rank and is intended to turn off logging.
  • WARN designates potentially harmful situations.
These levels allow you to monitor events of interest at the appropriate granularity without being overwhelmed by irrelevant messages. When you are initially setting up your crawl configuration, you might want to use the DEBUG level to see detailed messages, and then change to a less verbose level in production.
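
In log4j, the levels rank from least to most severe as TRACE, DEBUG, INFO, WARN, ERROR, FATAL, with OFF above all of them. A logger emits events at its configured level and at every level above it. The following is a minimal sketch of switching one component between a setup level and a production level. It assumes log4j 1.x properties syntax, and the logger name com.example.crawler.fetcher is hypothetical.

```properties
# While setting up a crawl: verbose output for one (hypothetical) component.
# DEBUG also lets INFO, WARN, ERROR, and FATAL events through.
log4j.logger.com.example.crawler.fetcher=DEBUG

# In production, replace the line above with a quieter level, for example:
# log4j.logger.com.example.crawler.fetcher=WARN
```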

Note that the default file contains a number of suggested component loggers that are commented out. To use any of these loggers, remove the leading comment character (#) from its line.
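
For illustration, enabling a commented-out logger might look like the sketch below. The logger name com.example.crawler.metrics is hypothetical and the syntax assumes log4j 1.x. Use the logger names that already appear in your copy of the file.

```properties
# Before: the suggested logger is commented out and has no effect
#log4j.logger.com.example.crawler.metrics=INFO

# After: the leading # is removed, so the logger is active at level INFO
log4j.logger.com.example.crawler.metrics=INFO
```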