28 Tuning Analytics Configuration Parameters

This chapter describes the configuration files that the silent installer updates when it installs Analytics, and provides guidelines for setting properties in those files. The files are global.xml, log4j.properties, and futuretense_xcel.ini.

28.1 Overview of Configuration Files

When installing Analytics, the silent installer sets the values of various properties to match the values that you specified in analytics-build.properties. The properties are stored in the following files:

  • global.xml

  • log4j.properties

  • futuretense_xcel.ini

Once Analytics is installed, you can tune various properties directly in their respective files, as necessary. Guidelines are available in the rest of this chapter.

Caution:

Installation directories (and related configurations) created by the silent installer must not be moved, renamed, or otherwise modified in any way.

28.2 global.xml

The Analytics silent installer modifies the global.xml file on each server where the file is stored and where the installer is executed. Once Analytics is installed, you can customize global.xml directly on its respective hosts. The customizable sections are described in the sections that follow.

28.2.1 Parameters Within <params> </params>

This section contains properties that define Analytics installation directories, system administrators' contact information, the handling of raw data, and data processing conditions.

Caution:

In Table 28-1, parameters marked with an asterisk (*) MUST NOT be reset.

Table 28-1 Analytics parameters in global.xml

Parameter Description Sample Value/ Format Host

swchart_instdir*

Absolute path to the installation directory of the Swiff Chart Generator.

UNIX:

/usr1/software/SwiffChart

Windows:

C:/Program Files/GlobFX/Swiff Chart Generator3

None

engine_instdir*

Absolute path to the directory into which the Analytics application was deployed.

UNIX:

/usr/share/tomcat/analytics

Windows:

C:/CS/tomcat5/webapps/analytics

None

href_cs*

Context URL of the WebCenter Sites interface.

http://analytics.yourcompany.com/analytics

None

report_instdir*

Absolute path to the deployed report configuration XML files.

UNIX:

/data/analytics/reports

Windows:

C:/CS/reports

None

forgotpassword

Email address for the Analytics administrator who is responsible for password recovery.

admin@yourcompany.com

Reporting Node

noaccount

Email address of the Analytics administrator who is responsible for creating accounts.

admin@yourcompany.com

Reporting Node

href_reporting*

URL of the Analytics Reporting application. This URL is required to link the Analytics Reporting application to the Analytics Administrator application.

../analytics/

None

href_admin*

URL of the Analytics Administrator application. This URL is required to link the Analytics Administrator application to the Analytics Reporting application.

../analyticsadmin/

None

href_help*

URL at which help relating to Analytics can be obtained.

http://www.oracle.com/technetwork/middleware/webcenter/sites/overview/index.html

None

encoding*

Character encoding to be used for decoding request parameters and encoding response parameters. This encoding should match the encoding of the application server.

utf8

None

hadoop.hdfs.defaultfs*

Location of the root directory under which raw data, output, and cache files are stored on the Hadoop file system.

hdfs://<hostname>:<port>/analytics

where:

<hostname> is the name of the master node

<port> is the NameNode port specified in the fs.default.name configuration parameter in hadoop-site.xml.

None

hadoop.local.cachedir*

Local path of the folder that stores the fileEnvObjects at job startup.

/usr/local/cache/

None

analytics.filtercurrentdata

Flag that specifies whether to skip processing data for the current day, week, month, and year.

Set this property to true if you wish to enable any of the properties listed in the row below (analytics.filtercurrentXXX).

Default value: false

Master Node

analytics.filtercurrentXXX

(where XXX stands for day, week, month, or year)

Flag that specifies whether to skip processing data for the current day/week/month/year. Processing for each day/week/month/year can be set individually by adding the following parameters and setting them to false:

analytics.filtercurrentday
analytics.filtercurrentweek
analytics.filtercurrentmonth
analytics.filtercurrentyear

Note: If you add one or more of the above parameters, make sure that analytics.filtercurrentdata (listed above) is set to true.

analytics.filtercurrentday=false

Master Node

admin.context*

URL of the Analytics Administrator application. Add a trailing slash (/).

http://<hostname>:<port>/analyticsadmin/

where:

<hostname> is the name of the Admin Server

<port> is the port of the application server

None

sensor.context*

URL of the sensor application. Add a trailing slash.

http://<hostname>:<port>/sensor/

where:

<hostname> is the name of the Data Capture server

<port> is the port of the application server

None

monitoring.registry.port*

Port on which to start the RMI service.

11199

None

href.hadoop.tasktracker*

URL to the Hadoop TaskTracker admin interface.

http://<hostname>:50030/

where:

<hostname> is the name of the master node

 
href.hadoop.filesystem.browser*

URL to the hdfs file system browser interface.

http://<hostname>:50070/

where:

<hostname> is the name of the master node

 
importer.sleeptime

Time interval (in minutes) after which hdfsagent will look for raw data to be copied from the local file system to hdfs.

10

Data Capture

NumberOfProcessorThreads

Number of jobs that run simultaneously. Each job is divided into tasks by Hadoop.

For high-volume systems, NumberOfProcessorThreads should be set to 1, so that only a single job can run at a given time. For low-volume systems or in demonstration scenarios, the value can be greater (3, 4, or 5).

1

Master Node


sensor.thresholdtime

Time interval (in minutes) after which the sensor will rotate the data.txt.tmp file (where incoming raw data is first written) to data.txt. Set a time interval such that no more than 5 to 10 GB of data will be processed during the interval.

If this parameter is omitted, the default threshold time (10 minutes) is used.

240

Data Capture


session.rotate.delay

Interval of time (in minutes) after midnight that raw session data is kept open.

The default is 360 minutes. This means that session data will be moved to HDFS 6 hours after midnight; session processing will then start.

360

Data Capture

scheduler.checkinterval

Specifies the frequency at which the scheduler will create new Hadoop jobs for fresh data.

The default value is 15 minutes.

15

Master Node

midnight.offset

Allows the system to derive the relative midnight used for file rotation. Relative midnight and session.rotate.delay determine when the daily cycle for capturing session data ends.

Default value: 0

Format: minutes

Data Capture

cs_enabled

Specifies whether buttons for navigating to the WebCenter Sites interface are enabled or disabled in the Analytics interface.

Default value: true

Reporting Application

archive.enabled

Specifies whether HDFS Agent archiving of raw data files is enabled. When this property is set to true, HDFS Agent will automatically create archives of raw analytics data on a periodic basis. The archive directory and start time are specified in the following properties: archive.output.dir and archive.start.time.

Default value: false

Data Capture

archive.output.dir

Path to the directory for storing archived data files. Must be a valid URI.

Format: directory path

Data Capture

archive.start.time

Start time for archiving. The archiving task will start at HH:mm on a daily basis.

Default value: 06:00

Format: HH:mm in 24-hour time format. HH ranges from 00–23; mm ranges from 00–59.

Data Capture

purgejobs.enabled

When this property is set to true, the system will automatically schedule cleanup jobs to remove subfolders and files after they have been successfully processed.

Default value: false

Master Node

notification.enabled

Indicates whether email notifications are enabled. Email notifications are sent when the availability of Analytics services changes.

Default value: false

Admin Node

mail.from

Email address from which notifications are sent.

Format: Email address

Admin Node


sensor.requestqueue.maxsize

Specifies CRITICAL condition for the Analytics Sensor.

This property specifies a threshold value that triggers a CRITICAL (red) condition when the sensor cannot respond quickly enough to the amount of raw data that it needs to record. When the threshold is reached or exceeded, the Analytics Sensor component is displayed in red.

The threshold value for this property is expressed as an object impression, i.e., a single invocation of the sensor servlet.

(The Analytics Sensor component is represented in the Overview panel of the Components tab of the Analytics Administration interface.)

Default value: 10000

Admin Node


sensor.requestqueue.warnsize

Specifies WARNING condition for the Analytics Sensor.

This property specifies a threshold value that triggers a WARNING (yellow) condition when the sensor cannot respond quickly enough to the amount of raw data that it needs to record. When the threshold is reached or exceeded, the Analytics Sensor component is displayed in yellow.

The threshold value for this property is expressed as an object impression, i.e., a single invocation of the sensor servlet.

(The Analytics Sensor component is represented in the Overview panel of the Components tab of the Analytics Administration interface.)

Default value: 3000

Admin Node

ipBlacklist

Stores IP addresses whose records will be ignored during the normalization process. This parameter's value is a comma-separated list of IP addresses (and/or fragments) to which the requesting IP address is compared. If the requesting address matches any of the listed values, it is dropped.

  • To ignore records from the listed IP addresses on all sites, enclose the ipBlacklist parameter within the following element:

<params host="default">
...
</params>

  • To ignore records from the listed IP addresses on specific sites only, then for each site enclose the ipBlacklist parameter within the following element:

<params host="siteName">
...
</params>

Format:

<param type="string"
name="ipBlacklist"
value="ip_address-1, ip_address-2, ..."/>

  • Sample IP address blacklist for all sites:

    <params host="default">
    <param type="string" name="ipBlacklist"
    value="123.45.67.89, 192.168.1.4"
    />
    </params>

  • Sample IP address blacklist for a specific site:

    <params host="CompanySite.com">
    <param type="string" name="ipBlacklist" value="123.45.67.89, 192.168.90.5"/>
    </params>
    

Data Capture
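Several rules stated in Table 28-1 can be checked mechanically before an edited global.xml is put back into service. The following Python sketch is illustrative only and is not part of Analytics. It assumes that every parameter is written in the <param type="..." name="..." value="..."/> form shown for ipBlacklist; the type values, host names, and ports in the sample are placeholders, and lint_params is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample. The type="string" values are placeholders: the source
# only documents type="string" for ipBlacklist. Hosts and ports are invented.
SAMPLE = """
<params host="default">
  <param type="string" name="analytics.filtercurrentdata" value="false"/>
  <param type="string" name="analytics.filtercurrentday" value="false"/>
  <param type="string" name="archive.start.time" value="24:00"/>
  <param type="string" name="admin.context" value="http://adminhost:8080/analyticsadmin"/>
  <param type="string" name="sensor.context" value="http://capturehost:8080/sensor/"/>
</params>
"""

HHMM = re.compile(r"(?:[01]\d|2[0-3]):[0-5]\d")  # HH 00-23, mm 00-59
PERIODS = ("day", "week", "month", "year")


def lint_params(xml_text):
    """Return a list of problems found in one <params> block."""
    root = ET.fromstring(xml_text)
    values = {p.get("name"): p.get("value", "") for p in root.iter("param")}
    problems = []

    # Per-period flags require analytics.filtercurrentdata to be true.
    per_period = [f"analytics.filtercurrent{x}" for x in PERIODS]
    if any(name in values for name in per_period):
        master = values.get("analytics.filtercurrentdata", "false")
        if master.strip().lower() != "true":
            problems.append("analytics.filtercurrentdata must be true")

    # archive.start.time must be HH:mm in 24-hour format.
    start = values.get("archive.start.time")
    if start is not None and not HHMM.fullmatch(start.strip()):
        problems.append("archive.start.time is not HH:mm")

    # admin.context and sensor.context need a trailing slash.
    for name in ("admin.context", "sensor.context"):
        url = values.get(name)
        if url is not None and not url.strip().endswith("/"):
            problems.append(f"{name} needs a trailing slash")

    return problems


if __name__ == "__main__":
    for problem in lint_params(SAMPLE):
        print(problem)
```

For the sample above, the sketch reports the missing analytics.filtercurrentdata=true setting, the out-of-range start time, and the missing trailing slash on admin.context. Parameters that are absent are not reported, because most of them have defaults.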


28.2.1.1 Database Connection Parameters

Typically, users require only one database connection. Custom reporting may require multiple connections. If you need to define your own JDBC resources or reference the existing JDBC connections via JNDI, use the following tag:

<connection
    name="<connection_name>"
    default="true"
    type="<jdbc_or_resource>"
    classname="<database_driver_classname>"
    url="<database_url>"
    user="<database user name>"
    password="<database password>" />

Table 28-2 describes database connection parameters.

Table 28-2 Database Connection Parameters

Parameter Description

name

Name of the connection.

Example: localhostDB

default

There must be exactly one connection marked with default="true"

type

Type of connection: jdbc (JDBC) or resource (JNDI)

resourcename

JNDI attribute; JNDI name

Note: Used only if type is set to resource

classname

JDBC driver class.

Example: oracle.jdbc.driver.OracleDriver

url

JDBC URL

user

JDBC attribute; database user name

password

JDBC attribute; database password


Example 28-1 Example JDBC:

<connection
    name="jdbcsample" 
    default="true" 
    type="jdbc" 
    classname="oracle.jdbc.driver.OracleDriver" 
    url="jdbc:oracle:thin:@dbserver:1521:sid" 
    user="analytics" 
    password="analytics" />

Example 28-2 Example JNDI:

<connection 
    name="conn1" 
    default="false" 
    type="resource" 
    resourcename="java:comp/env/jdbc/tadev" />
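The rules in Table 28-2 (exactly one default connection, and different attributes for jdbc and resource connections) can be checked with a short script. The Python sketch below is illustrative and not part of Analytics. The <connections> wrapper element is hypothetical and exists only so that the two sample connections parse as one document; treating every attribute shown in the tag as required is also an assumption.

```python
import xml.etree.ElementTree as ET

# The <connections> root is hypothetical; the source does not show the
# element that actually encloses <connection> in global.xml.
SAMPLE = """
<connections>
  <connection name="jdbcsample" default="true" type="jdbc"
      classname="oracle.jdbc.driver.OracleDriver"
      url="jdbc:oracle:thin:@dbserver:1521:sid"
      user="analytics" password="analytics" />
  <connection name="conn1" default="false" type="resource"
      resourcename="java:comp/env/jdbc/tadev" />
</connections>
"""

# Attributes expected for each connection type (assumed to be mandatory).
REQUIRED = {
    "jdbc": ("classname", "url", "user", "password"),
    "resource": ("resourcename",),
}


def check_connections(xml_text):
    """Return a list of problems found in the <connection> elements."""
    root = ET.fromstring(xml_text)
    connections = list(root.iter("connection"))
    problems = []

    # Exactly one connection may carry default="true".
    defaults = [c for c in connections if c.get("default") == "true"]
    if len(defaults) != 1:
        problems.append(
            f"expected exactly one default connection, found {len(defaults)}"
        )

    for conn in connections:
        name = conn.get("name", "<unnamed>")
        kind = conn.get("type")
        if kind not in REQUIRED:
            problems.append(f"{name}: type must be jdbc or resource")
            continue
        for attr in REQUIRED[kind]:
            if not conn.get(attr):
                problems.append(f"{name}: missing {attr}")
    return problems


if __name__ == "__main__":
    print(check_connections(SAMPLE) or "no problems found")
```

The sample combines Example 28-1 and Example 28-2, so it passes both checks. Marking both connections default="false" would produce a single "expected exactly one default connection" message.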

28.2.2 LFS Logwriter Implementation Parameters

The LFS logwriter implementation writes incoming raw data to the local file system. If you wish to change the root path (the location to which raw data will be written), use the following tag:

<logwriters>
    <logwriter type="LFS" name="LFS" rootpath="C:/analytics/sensorlocal" />
</logwriters>

Caution:

In Table 28-3, parameters marked with an asterisk (*) MUST NOT be reset.

Table 28-3 Logwriter Parameters

Parameter Description
type*

Type of logwriter.

Legal value: LFS

name*

Alias name of the logwriter.

Legal value: LFS

rootpath

Location, on the local file system, to which raw data will be written.

Examples:

UNIX:

rootpath="/analytics/sensor"

Windows:

rootpath="c:/analytics/sensor"
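After editing the logwriter tag, it can be useful to confirm that only rootpath changed. The following Python sketch, which is illustrative and not part of Analytics, reads a <logwriters> fragment, verifies that type and name still hold the legal value LFS, and returns the configured root path. The function name read_rootpath is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<logwriters>'
    '<logwriter type="LFS" name="LFS" rootpath="/analytics/sensor" />'
    '</logwriters>'
)


def read_rootpath(xml_text):
    """Return the LFS logwriter rootpath; raise ValueError on illegal values."""
    root = ET.fromstring(xml_text)
    writer = root.find("logwriter")
    if writer is None:
        raise ValueError("no <logwriter> element found")
    # type and name are marked MUST NOT be reset; the only legal value is LFS.
    for attr in ("type", "name"):
        if writer.get(attr) != "LFS":
            raise ValueError(f"{attr} must be LFS, got {writer.get(attr)!r}")
    rootpath = writer.get("rootpath", "").strip()
    if not rootpath:
        raise ValueError("rootpath is empty")
    return rootpath


if __name__ == "__main__":
    print(read_rootpath(SAMPLE))
```

The sketch does not test whether the directory exists or is writable on the Data Capture host; that check depends on the account under which the sensor runs.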

28.3 log4j.properties

The silent installer modifies the log4j.properties file on each server where the file is stored and the installer is executed. Once the Analytics installation is complete, you can customize properties directly in all or selected log4j.properties files, as shown in Table 28-4. (The log4j.properties file is located in <HADOOP_HOME>/conf.)

Table 28-4 Parameters in log4j.properties

Property Description Example / Format
log4j.rootLogger

Specify the log level and the appender of the root logger. Multiple appenders can be specified, separated by commas.

log4j.rootLogger=INFO, DaRoFiAppender

- or -

log4j.rootLogger=INFO, DaRoFiAppender, ConsoleAppender

log4j.category.com.fatwire.analytics

Specify the log level. The log levels, in decreasing order of severity, are:

  • FATAL – Severe errors that cause premature termination.

  • ERROR – Runtime errors, or unexpected conditions.

  • WARN – Other runtime situations that are undesirable or unexpected, but not necessarily "wrong."

  • INFO – Informative messages about the workflow and status of the application.

  • DEBUG – Various kinds of debug information.

  • TRACE – All logging information.

Note: In production mode, this property should be set to WARN.

INFO

log4j.appender.DaRoFiAppender

Specify the appender to be used for logging.

org.apache.log4j.DailyRollingFileAppender

log4j.appender.DaRoFiAppender.datePattern

Specify the date pattern in the following format:

'.'yyyy-MM-dd

log4j.appender.DaRoFiAppender.file

Specify the location of the log file, along with the name of the log file.

../logs/xxx.log

log4j.appender.DaRoFiAppender.layout

Specify the layout.

org.apache.log4j.PatternLayout

log4j.appender.DaRoFiAppender.layout.ConversionPattern

Specify the layout pattern.

%d{ISO8601}%-5p[%t]%c:%m%n
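The recommendation to run production systems at WARN can be verified by reading the deployed log4j.properties file. The Python sketch below is illustrative and not part of Analytics. It assumes simple key=value lines (line continuations and escapes of the full properties format are not handled), and the helper names are hypothetical.

```python
# Log levels in decreasing order of severity, as listed in Table 28-4.
LEVELS = ["FATAL", "ERROR", "WARN", "INFO", "DEBUG", "TRACE"]

SAMPLE = """\
# sample log4j.properties fragment
log4j.rootLogger=INFO, DaRoFiAppender, ConsoleAppender
log4j.category.com.fatwire.analytics=INFO
"""


def parse_properties(text):
    """Parse simple key=value lines; comments and blank lines are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def root_logger(props):
    """Split log4j.rootLogger into (level, [appender names])."""
    parts = [p.strip() for p in props["log4j.rootLogger"].split(",")]
    return parts[0], [p for p in parts[1:] if p]


def too_verbose_for_production(
    props, key="log4j.category.com.fatwire.analytics"
):
    """Return True if the configured level is more verbose than WARN."""
    level = props.get(key, "").split(",")[0].strip().upper()
    return level in LEVELS and LEVELS.index(level) > LEVELS.index("WARN")


if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print(root_logger(props))
    print(too_verbose_for_production(props))
```

With the sample fragment, the root logger resolves to level INFO with two appenders, and the com.fatwire.analytics category is flagged because INFO is more verbose than WARN.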

28.3.1 Setting Up Logging for the Hadoop Job Scheduler

Edit the log4j.properties file by adding the following parameters:

hadoop.root.logger=WARN,console, DRFA
hadoop.log.file=hadoop.log
log4j.rootLogger=${hadoop.root.logger}, DRFA, EventCounter

#
# Daily Rolling File Appender
#
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}

# Rollover at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd

# 30-day backup
log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout

# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n  

28.4 futuretense_xcel.ini

The silent installer modifies the futuretense_xcel.ini file of the WebCenter Sites application that resides on the machine where the silent installer is running. The purpose of modifying the file is to specify the location of the Analytics application and the authorized user.

Caution:

In Table 28-5, parameters marked with an asterisk (*) MUST NOT be reset.

Table 28-5 Analytics Properties in futuretense_xcel.ini

Property Description Example
analytics.datacaptureurl*

URL where the Analytics data capture servlet (sensor servlet) is running

http://<ipaddress>:<port>/sensor/statistic
analytics.enabled

Indicates whether Analytics is available.

Note: If set to false, this property disables data capture.

true
analytics.piurl*

URL where the Analytics performance indicator servlet is running. For information about the performance indicator, see the "Integrating Oracle WebCenter Sites: Analytics with Oracle WebCenter Sites" chapter in the Oracle Fusion Middleware WebCenter Sites: Analytics Administrator's Guide.

http://<ipaddress>:<port>/analytics/PI
analytics.reporturl*

URL where the generated report is displayed.

http://<ipaddress>:<port>/analytics/Report.do
analytics.user*

Pre-configured Analytics user who logs in to Analytics from WebCenter Sites.

csuser

Default in Analytics. Changing the name is not recommended.
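The properties in Table 28-5 can be read back from futuretense_xcel.ini to confirm that the installer wrote usable values. The following Python sketch is illustrative and not part of WebCenter Sites or Analytics. It assumes the file uses key=value lines, the IP address and port in the sample are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

URL_KEYS = (
    "analytics.datacaptureurl",
    "analytics.piurl",
    "analytics.reporturl",
)

# Placeholder address and port; not taken from a real installation.
SAMPLE = """\
analytics.datacaptureurl=http://192.0.2.10:8080/sensor/statistic
analytics.enabled=true
analytics.piurl=http://192.0.2.10:8080/analytics/PI
analytics.reporturl=http://192.0.2.10:8080/analytics/Report.do
analytics.user=csuser
"""


def read_analytics_properties(text):
    """Collect analytics.* entries from simple key=value lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("analytics.") and "=" in line:
            key, _, value = line.partition("=")
            # Java property files may escape ':' and '=' with a backslash.
            value = value.strip().replace("\\:", ":").replace("\\=", "=")
            props[key.strip()] = value
    return props


def check_analytics_properties(props):
    """Return a list of findings for the Table 28-5 properties."""
    problems = []
    if props.get("analytics.enabled", "").lower() != "true":
        problems.append("analytics.enabled is not true; data capture is disabled")
    for key in URL_KEYS:
        parts = urlsplit(props.get(key, ""))
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"{key} is missing or not an absolute URL")
    if props.get("analytics.user") != "csuser":
        problems.append("analytics.user differs from the default csuser")
    return problems


if __name__ == "__main__":
    props = read_analytics_properties(SAMPLE)
    print(check_analytics_properties(props) or "no problems found")
```

The sketch only reads the file. Properties marked with an asterisk in Table 28-5 should still not be changed by hand.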