Enabling report generation for assets. Because WebCenter Sites assets are specific to a WebCenter Sites installation, you must register their asset types with Analytics by assigning them to reports through the Analytics Administration interface. This enables Analytics to:
Recognize WebCenter Sites asset types
Configure report menu options in the "General Information" and "Content Information" report groups
Generate reports on assets of the registered asset types
The process of recording each visitor's clicks and the associated information: the date and time of each click, the assets that are clicked, the IP address from which the clicks are issued, the site being visited, and so on. The information is captured in real time by the sensor servlet and recorded in a data.txt.tmp file on the local file system (local to the Analytics data capture application). The sensor rotates the data.txt.tmp file to data.txt when either the threshold interval is reached (see the sensor.threshold property on sensor.thresholdtime) or the application server is restarted.
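The rotation rule described above can be illustrated with a minimal sketch. This is not the sensor's actual code: the class name, the `threshold_seconds` parameter, and the record format are assumptions made for illustration, and the sketch simply overwrites any earlier data.txt instead of modeling what the real sensor does with rotated files.

```python
import os
import time


class CaptureFile:
    """Hypothetical stand-in for the sensor's capture file handling."""

    def __init__(self, directory, threshold_seconds):
        self.tmp_path = os.path.join(directory, "data.txt.tmp")
        self.final_path = os.path.join(directory, "data.txt")
        self.threshold_seconds = threshold_seconds  # assumed threshold interval
        self.opened_at = time.monotonic()

    def record(self, line):
        # Append one click record to the temporary capture file.
        with open(self.tmp_path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")

    def rotate_if_due(self, now=None):
        # Rename data.txt.tmp to data.txt once the threshold interval has elapsed.
        now = time.monotonic() if now is None else now
        if now - self.opened_at < self.threshold_seconds:
            return False
        if not os.path.exists(self.tmp_path):
            return False
        os.replace(self.tmp_path, self.final_path)
        self.opened_at = now
        return True
```

Writing to a temporary name and renaming on rotation means that a downstream reader sees data.txt only after the sensor has finished writing it.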
Analytics can capture data on the usage of WebCenter Sites assets and on their visitors only if published pages are tagged for data capture. In the case of Engage assets, the assets themselves must be tagged for data capture.
Runs jobs in a parallel and distributed fashion in order to efficiently compute statistics on the raw data that is stored in the Hadoop Distributed File System.
Hadoop implements a computational paradigm named Map/Reduce, which divides a large computation into smaller fragments of work, each of which may be executed on any node in the cluster.
Map/Reduce requires a combination of jar files and classes, all of which are collected into a single jar file that is usually referred to as a "job" file. To execute a job, you submit it to a JobTracker. Hadoop Jobs then responds with the following actions:
Schedules and submits the jobs to
Processes raw data captured by the data capture application into statistical data and injects the statistics into the Analytics database.
(Hadoop provides a web interface to browse HDFS and to determine the status of the jobs.)
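To make the Map/Reduce paradigm concrete, here is a single-process sketch in plain Python that counts requests per asset. It does not use Hadoop, and the tab-separated record layout (timestamp, then asset ID) is an assumption made for illustration, not the format of the captured raw data.

```python
from collections import defaultdict


def map_click(record):
    # Map step: emit (asset_id, 1) for each raw click record.
    # Assumed record layout: "<timestamp>\t<asset_id>".
    asset_id = record.split("\t")[1]
    yield asset_id, 1


def reduce_counts(asset_id, counts):
    # Reduce step: sum all values emitted for one key.
    return asset_id, sum(counts)


def run_job(records):
    # Group mapped values by key, then reduce each group independently.
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_click(record):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())
```

Because each map call looks at one record and each reduce call looks at one key, the fragments of work are independent of one another, which is what allows a cluster to run them on any node.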
Hadoop jobs precalculate commonly requested site usage statistics (such as the average number of requests for a piece of content per unit time) in order to shorten report generation time. Statistical computation is typically resource-intensive and time-consuming. Therefore, it is performed in advance rather than on the fly each time a report is generated, so that the precalculated statistics are immediately available for retrieval into reports. Statistics include, for example:
Current information, such as today's total hits to each site, visiting countries, total number of visits from a given country, types of browsers, and average session duration.
Historical results, such as:
Daily, weekly, and monthly statistics: for example, the total number of requests for a given asset on a given site during a certain month in the reporting period.
Yearly statistics: a histogram in the performance indicator indicating the frequency with which certain assets were accessed during each week of the past year.
How long a Hadoop job runs depends on a number of factors, including site activity within the latest data capture time frame, the cumulative volume of captured data, and the configuration of the Analytics application. When data analysis is complete, the resulting statistics are available, at any time, for report generation.
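The idea of computing statistics ahead of time can be sketched as a roll-up of daily hit counts into monthly totals, followed by a constant-time lookup when a report is requested. The function names, key layout, and sample figures below are hypothetical and do not reflect the Analytics database schema.

```python
from collections import Counter
from datetime import date


def precompute_monthly(daily_hits):
    """Roll daily hit counts up into monthly totals ahead of report time."""
    monthly = Counter()
    for (site, asset_id, day), hits in daily_hits.items():
        monthly[(site, asset_id, day.year, day.month)] += hits
    return monthly


def monthly_requests(monthly, site, asset_id, year, month):
    # Report time: a single lookup, with no pass over the raw data.
    return monthly.get((site, asset_id, year, month), 0)


# Illustrative sample data only.
daily_hits = {
    ("siteA", "article-1", date(2011, 3, 1)): 40,
    ("siteA", "article-1", date(2011, 3, 2)): 60,
    ("siteA", "article-1", date(2011, 4, 1)): 5,
}
monthly = precompute_monthly(daily_hits)
print(monthly_requests(monthly, "siteA", "article-1", 2011, 3))
```

The expensive pass over the data happens once in `precompute_monthly`; each report request afterward costs only a dictionary lookup.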
Integrating Analytics with your WebCenter Sites system means enabling report generation for asset types and users on your online site. Integration involves registering CM sites, WebCenter Sites users, and asset types with Analytics, configuring the Pageview Object (through the "Page Views" Report), and granting users the appropriate permissions through membership in the appropriate user groups. The steps necessary to accomplish these tasks are described in Integrating Analytics with WebCenter Sites.
A search performed by a visitor using the site's built-in search engine. This search returns results from within the site's contents.
An Analytics construct. The subject of a report.
When storing and processing information, Analytics uses objects, whereas WebCenter Sites uses assets and asset types. To allow Analytics to recognize a WebCenter Sites asset type and track assets of that type, administrators define an Analytics object in terms of a WebCenter Sites asset type. They do so by configuring an Analytics report for the object and assigning the desired asset type to that object. The process of configuring a report defines the underlying asset.
Note: A special instance of an object is the Pageview Object, which administrators must configure (by configuring the "Page Views" Report) in order for reports in the "General Information" group to work.
The "Page Views" report supports multiple asset types.
A single invocation of the sensor servlet. For more information, see Object Impressions.
An Analytics construct. A group of one or more assets, whose asset types are enabled for tracking by the Analytics data capture application.
Asset types are enabled for tracking when they are defined in the Pageview Object and when published pages displaying those asset types are tagged with AddAnalyticsImgTag (data capture tag). For more information about tracking, see Data Capture.
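The two conditions for tracking can be expressed as a simple predicate. This is a sketch for illustration only: the function and parameter names are hypothetical, and the real check is performed by configuration in Analytics and by the presence of the tag on published pages, not by code like this.

```python
def is_tracked(asset_type, pageview_asset_types, page_is_tagged):
    """Return True only when both conditions from the definition hold.

    asset_type: name of a WebCenter Sites asset type (illustrative value).
    pageview_asset_types: asset types assigned to the Pageview Object.
    page_is_tagged: whether the published page carries the data capture tag.
    """
    return asset_type in pageview_asset_types and page_is_tagged


# Illustrative asset type names, assumed for this example.
pageview_asset_types = {"Article", "Page"}
print(is_tracked("Article", pageview_asset_types, page_is_tagged=True))
print(is_tracked("Article", pageview_asset_types, page_is_tagged=False))
```

Either condition alone is not enough: an asset type assigned to the Pageview Object produces no data if its pages are untagged, and a tagged page yields no tracked impressions for asset types that the object does not include.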
A default Analytics object that you configure through the "Page Views" report. The Pageview Object is the basis for the "Page Views," "Site Information," and "Clickstream" reports, and thus it should be assigned asset types whose assets make the most sense (from the marketing standpoint) to be included in these reports.
The Pageview Object can be assigned multiple asset types. The "Page Views" report will contain statistics on the usage of those asset types.
A report based on the Pageview Object. The "Page Views" report displays statistics on page view activity on your site.
Visitor activity data that has been processed by Hadoop Jobs into statistical data. When processing is complete, the data is injected into the Analytics database, where it is immediately available for the reports that users request from the Analytics reporting interface.
Unprocessed data describing visitor activity on the site, recorded during the Data Capture process and stored in the local file system for future processing. This is the data on which statistics are calculated by the Hadoop Jobs for display in reports. (This data cannot be directly used for report generation.)
Identifying a WebCenter Sites CM site to Analytics in order to enable Analytics to track visitor activity on that site.