DP logging overview

This topic provides an overview of the Data Processing (DP) log files.

Location of the log files

Each run of Data Processing produces one or more log files on each machine that is involved in the Data Processing job. The log files are in these locations:
  • On the client machine, the location of the log files is set by the log4j.appender.edpMain.Path property in the DP log4j.properties configuration file. The default location is the $BDD_HOME/logs/edp directory. These log files apply to workflows initiated by both Studio and the DP CLI. When the DP component starts, it also writes a start-up log here.
  • On the client machine, Studio workflows are also logged in the $BDD_DOMAIN/servers/<serverName>/logs/bdd-studio.log file.
  • On the Hadoop nodes, logs are generated by the Spark-on-YARN processes.
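
As a minimal sketch, the default client-side locations listed above can be resolved from the environment as follows. The function names are hypothetical, and the sketch assumes that BDD_HOME and BDD_DOMAIN are set as environment variables and that log4j.appender.edpMain.Path has not been changed from its default.

```python
import os
from pathlib import Path


def default_edp_log_dir(env=os.environ):
    """Return the default DP client log directory, $BDD_HOME/logs/edp.

    This does not read log4j.properties, so it is only valid while
    log4j.appender.edpMain.Path still has its default value.
    """
    return Path(env["BDD_HOME"]) / "logs" / "edp"


def studio_log_file(server_name, env=os.environ):
    """Return $BDD_DOMAIN/servers/<serverName>/logs/bdd-studio.log."""
    return Path(env["BDD_DOMAIN"]) / "servers" / server_name / "logs" / "bdd-studio.log"
```

Both functions raise KeyError if the corresponding variable is missing from the environment.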

Local log files

The Data Processing log files (in the $BDD_HOME/logs/edp directory) are named edp_*.log. The naming pattern is set in the DP log4j.properties configuration file.

The default naming pattern for each log file is:
edp_%timestamp_%unique.log
where:
  • %timestamp provides a timestamp in the format: yyyyMMddHHmmssSSS
  • %unique provides a unique identifier string
For example:
edp_20150728100110505_0bb9c1a2-ce73-4909-9de0-a10ec83bfd8b.log
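
The default naming pattern can be taken apart programmatically, for example to sort log files by creation time. The following is a sketch only: the function name and the regular expression are illustrative, and it assumes that the file name follows the default edp_%timestamp_%unique.log pattern exactly.

```python
import re
from datetime import datetime, timedelta

# edp_<17-digit yyyyMMddHHmmssSSS timestamp>_<unique string>.log
LOG_NAME = re.compile(r"^edp_(\d{17})_(.+)\.log$")


def parse_edp_log_name(name):
    """Return (timestamp, unique string) for a default-pattern log name.

    Returns None if the name does not match the default pattern or if
    the digits do not form a valid date and time.
    """
    match = LOG_NAME.match(name)
    if match is None:
        return None
    stamp, unique = match.groups()
    try:
        # The first 14 digits are yyyyMMddHHmmss; the last 3 are milliseconds.
        base = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
    except ValueError:
        return None
    return base + timedelta(milliseconds=int(stamp[14:])), unique
```

Applied to the example file name above, the function returns a timestamp of 2015-07-28 10:01:10.505 together with the string 0bb9c1a2-ce73-4909-9de0-a10ec83bfd8b.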

The log4j.appender.edpMain.MaxSegmentSize property sets the maximum size of a log file, which is 100MB by default. When a log file reaches the maximum size, logging rolls over to the next log file. By default, the main log file and its rollover files together use at most about 1GB of disk space.
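
To check how much disk space the DP logs currently use against these defaults, a small script along the following lines may help. It is a sketch under stated assumptions: the function name and constants are hypothetical, 1MB is taken as 1024 * 1024 bytes, and the script assumes that every file of interest matches edp_*.log, which may not hold for rollover files.

```python
from pathlib import Path

# Defaults stated above; the binary interpretation of MB and GB is an assumption.
MAX_SEGMENT_BYTES = 100 * 1024 * 1024    # MaxSegmentSize default: 100MB
TOTAL_BUDGET_BYTES = 1024 * 1024 * 1024  # about 1GB for main log plus rollovers


def edp_log_usage(log_dir):
    """Return ({file name: size in bytes}, total bytes) for edp_*.log files."""
    sizes = {p.name: p.stat().st_size for p in Path(log_dir).glob("edp_*.log")}
    return sizes, sum(sizes.values())


def near_segment_limit(sizes, fraction=0.9):
    """Return, sorted, the names of files at or above the given fraction of the segment size."""
    threshold = MAX_SEGMENT_BYTES * fraction
    return sorted(name for name, size in sizes.items() if size >= threshold)
```

With these values, roughly ten full-size segments fit within the total budget.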