8 Managing Content Tracker

Content Tracker and Content Tracker Reports are optional components that are automatically installed with Content Server. They are separate modules but, when enabled, work together to provide information about system usage. The information provided enables you to determine which content items are most frequently accessed and what content is most valuable to users or specific groups. Understanding the consumption patterns of your organization's content is essential to successful content management. This enables you to provide more appropriate, user-centric information more effectively.


8.1 Performance Optimization Functions

The current version of Content Tracker incorporates several optimization functions to ensure that information tracking processes are performed as efficiently as possible. These functions are implemented using applicable installation preference variables. Combined, the default values for these variables configure Content Tracker to function as efficiently as possible for use in high volume production environments.

Note:

By default, Content Tracker collects and records only content access event data. This excludes information gathering on non-content access events like searches as well as the collection and synthesis of user profile summaries. This configuration streamlines Content Tracker's functions and maximizes its overall performance.

However, during the Content Server installation process, you can optionally choose to enter alternate values for the various preference variables. If you prefer to initially accept the default values, you can manually change the values at a later time. When applicable, this is done by either editing the sct.cfg file (except for the preference variables) or using the update function in the Component Manager. See "Changing the Variable Settings for the Performance Optimization Functions".

The performance optimization functions include:

  • Content Access Only: This operating mode determines which types of information are collected. When enabled (the default), only content access events are recorded, which excludes content searches and user profile information. As a result, Content Tracker populates only the SctAccessLog table (see "Combined Output Table"). The corresponding installation preference variable is SctTrackContentAccessOnly.

    Note:

    By default, Content Tracker collects static URL access event details in event logs for the data reduction process, and services are logged in real time. However, only services that have event types for content access are logged.

    One exception: any service that causes an entry to be made in the DocHistory table is tracked. This happens regardless of whether the service (for example, DELETE_REV) is defined in the Content Tracker services table (see "Service Calls" and "Service Call Configuration").

    Note:

    The Services Tab is not displayed if Content Access Only mode is ON (the default).

    Note:

    If the Activity Snapshots feature is enabled, Content Tracker will modify the metadata fields regardless of how the Content Access Only operating mode is set. This means that the User Metadata tables will be populated during data reduction.

    Note:

    The user reports are only visible if the Content Access Only operating mode is disabled. See "Report Generation".

  • Exclude Columns: This is a list of columns that Content Tracker does not populate in the SctAccessLog table. By default, bulky and rarely used information is not collected, which reduces the size of the output table. The corresponding installation preference variable is SctDoNotPopulateAccessLogColumns.

  • Simplify User Agent: When enabled (the default), this function minimizes the information that is stored in the cs_userAgent column of the SctAccessLog table. The corresponding installation preference variable is SctSimplifyUserAgent.

  • Do Not Archive: When enabled (the default), this function ensures that the database tables contain the most current data and expired table rows are discarded rather than archived. The corresponding installation preference variable is SctDoNotArchive.

    Note:

    This optimization function is applicable to all Content Tracker database tables (the SctAccessLog table as well as the user metadata tables). By default, only the SctAccessLog table is populated but expired rows are not archived. However, if both the Content Access Only and Do Not Archive functions are disabled, all tables will be populated and their expired data archived.
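
The interaction between the Content Access Only and Do Not Archive settings can be summarized in a short sketch. The function below is a hypothetical illustration, not part of Content Tracker; its two arguments mirror SctTrackContentAccessOnly and SctDoNotArchive, and it deliberately ignores the Activity Snapshots feature.

```python
def tracking_behavior(content_access_only=True, do_not_archive=True):
    """Summarize which tables are populated and what happens to expired rows.

    Hypothetical helper: the arguments stand in for SctTrackContentAccessOnly
    and SctDoNotArchive, and the defaults mirror the documented defaults.
    """
    tables = ["SctAccessLog"]
    if not content_access_only:
        # With Content Access Only disabled, the user metadata tables are populated too.
        tables.append("user metadata tables")
    expired_rows = "discarded" if do_not_archive else "archived"
    return {"populated": tables, "expired_rows": expired_rows}


print(tracking_behavior())
print(tracking_behavior(content_access_only=False, do_not_archive=False))
```

The first call reflects the default configuration; the second reflects the case in which both functions are disabled, so all tables are populated and expired data is archived.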

8.2 About Content Tracker Components and Functions

Content Tracker monitors activity on your Content Server instance and records selected details of those activities. It then generates reports that may help you understand the ways in which your system is being used. This section gives a brief overview of Content Tracker and Content Tracker Reports functionality. It provides basic background on these components and is intended to prepare you for the more detailed information provided in "Operational Overview".


8.2.1 Content Tracker Summary

Content Tracker monitors your system and records information about various activities. This information is collected from various sources, then merged and written to a set of tables in your Content Server database. You can customize Content Tracker to change or expand the types of information it collects.

Note:

By default, Content Tracker collects and records only content access event data. This excludes information gathering on non-content access events like searches as well as the collection and synthesis of user profile summaries. However, during the Content Server installation process, the administrator can optionally choose to configure Content Tracker to collect all the information from the various categories. See "Performance Optimization Functions" for more information.

Content Tracker can monitor activity based on:

  • Content item accesses: Content Tracker gathers information about content item usage. The data is obtained from Web filter log files, the Content Server database, and other external applications such as portals and Web sites. Content item access data includes dates, times, content IDs and current metadata.

  • Content Server services: Content Tracker can track all services that return content, as well as services that handle search requests. However, by default, Content Tracker logs only the services that have content access event types. With a simple configuration change, Content Tracker can monitor any Content Server service, including custom services.

  • User accesses: Content Tracker can gather information about other non-content access events such as the collection and synthesis of user profile summaries. This data includes user names and user profile information.

8.2.2 Content Tracker Reports Summary

After Content Tracker extracts data and populates applicable database repository tables, the information is available for report generation. Content Tracker Reports enables you to:

  • Generate reports: Content Tracker Reports queries the tables created by Content Tracker and generates summary reports of various kinds of activities and the usage history of particular content items. The reports help you analyze specific groups of content or users based on metadata, file extensions, or user profiles. You can use the pre-defined reports that are provided, customize them to suit your installation, or use a compatible third-party reporting package.

  • Optimize content management practices: You can also use the reported data for content retention management. That is, depending on the access frequency of particular content items during specific time intervals, you may decide to archive or delete some of the items. Similarly, applications can use the data to provide portlets with the top content for particular types of users.

8.2.3 Data Recording Overview

Content Tracker records data from the following sources:

  • Web server filter: When content is requested via a static URL, the web server filter records certain details of the request and saves the information in one or more event log files. Event log files are organized according to the date on which the information was collected. The event log files are eventually used as input by the Content Tracker data reduction process.

  • Service handler filter: Content Tracker has a list of services that it monitors. By default, these include only services that return content. When one of these services is called, details of the service are copied and saved in the SctAccessLog table. You can change which services are monitored and which details are recorded.

  • Content Tracker logging service: Content Tracker supports a general purpose logging service that is a single-service call that can be used to log an event. It can be called directly via a URL, as an action in a service script, or from IdocScript.

  • Content Server database tables: When configured to collect and process user profile information, the Content Tracker data reduction process will query selected Content Server database tables. This is done primarily to obtain information about the names and accounts of users who were active during the reporting period.

  • Application API: Content Tracker provides an interface by which other components and applications can be registered for tracking, and can have information about their activities recorded. For example, this interface allows cooperating applications, such as Site Studio, to log event information in real time.

    Note:

    The Application API is included in the SctApplicationFilter.hda file. This interface is designed as a code to code call which does not involve a Content Server service. The Application API is not meant for general use. If you are building an application and are interested in using this interface, you should contact Consulting Services.

8.2.4 Data Reduction Overview

The data reduction process gathers and merges the data obtained from the data recording sources. Until this reduction process has finished, the data in the Content Tracker tables is incomplete. You will usually run the reduction once for each day's worth of data gathered. The reduction may be run manually, or may be scheduled to run automatically, usually during an off-peak period when the system load is light.

8.2.5 Data Reporting Overview

Content Tracker Reports provides a set of reports that answer commonly asked questions about Content Server activity and usage. For example, you can determine which managed objects have been accessed most frequently. Depending on how Content Tracker is configured, these reports can also indicate which searches are used most often, and which users have been most active.

These reports are available directly, via the Content Tracker Report Generator Main Page, and indirectly as an action on the Content Information page. The available categories of pre-defined report options are contingent on whether or not Content Tracker is configured to collect only content access events or all types of tracking information. The reports, the underlying queries, and the output formatting are available for customization.

8.2.6 Content Tracker Terminology

You should be familiar with the following terminology when using Content Tracker and Content Tracker Reports:

  • Data collection: Gathering content access information programmatically and writing the information to event log files.

  • Data reduction: Processing the information from data collection and merging it into a database table.

  • Data Engine Control Center: The applet interface that provides access to the user-controlled functions of the Data Engine. The Data Engine Control Center is used to enable, schedule, and monitor data collection. It is also used to collect and manage data about user activity and service calls.

  • Collection: Tab used to enable data collection.

  • Reduction: Tab used to stop and start data reduction (that is, merging data into database tables).

  • Schedule: Tab used to enable automatic data reduction.

  • Snapshot: Tab used to enable activity metrics. Also, the word snapshot is used to denote an information set that represents the world as it existed at a particular time in the past. It is historical information that is instantaneous in nature and specifies what user accessed a particular content item at a particular moment.

  • Services: Tab used to add, configure, and edit Content Server service calls to be logged. It is also used to define the specific event details that are to be logged for a given service.

  • Service definitions: The ResultSet structure in the service call configuration file (SctServiceFilter.hda) that contains entries to define each Content Server service call to be logged. The service definition ResultSet is named ServiceExtraInfo.

  • ServiceExtraInfo ResultSet: See Service definitions.

  • Service entry: The entry in the service definition ResultSet (ServiceExtraInfo) that defines a Content Server service call to be logged. The ServiceExtraInfo ResultSet contains one service entry for each service to be logged.

  • Field map: A secondary ResultSet in the service call configuration file (SctServiceFilter.hda) that defines the service call data and the specific location where the data is to be logged.

  • Top Content Items: Most frequently accessed content items in the system.

  • Content Dashboard: An HTML page that provides overview information about the access of a specific content item.

8.2.7 General Limitations

Content Tracker is supported on most hardware and networked configurations. There are, however, certain hardware and software combinations that require special consideration. Some known limitations include:

8.2.8 General Considerations

The following general considerations are applicable for the current version of the Content Tracker and Content Tracker Reports components:

  • Hanging browser:

    If Content Server terminates while the Data Engine Control Center is running, the browser can hang. To resolve this issue, close the hung browser window.

  • Local time vs. GMT:

    A configuration parameter enables you to use local time instead of Greenwich Mean Time (GMT) to record user access times:

    • SctUseGMT=true configures Content Tracker to use GMT.

    • SctUseGMT=false configures Content Tracker to use local time. This is the default setting.

    If you are performing a new installation of Content Tracker and use the default setting for SctUseGMT, user accesses will be recorded in local time. If you are upgrading from an earlier version of Content Tracker and use the default setting for SctUseGMT, there will be a one-time retreat (or advance, depending on your location) in access times. Also, if you use local time, the twice-yearly daylight saving time changes will cause discontinuities in recorded user access times (depending on your location).
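
The daylight saving time discontinuity can be illustrated with a small sketch. It assumes a hypothetical site that is four hours behind GMT in summer and five hours behind in winter, and it applies those offsets by hand; it does not reproduce how Content Tracker itself converts times.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets for a site that is UTC-4 in summer and UTC-5 in winter.
SUMMER = timezone(timedelta(hours=-4))
WINTER = timezone(timedelta(hours=-5))

# Two accesses exactly one hour apart in GMT, straddling the autumn clock change.
first_gmt = datetime(2023, 11, 5, 5, 30, tzinfo=timezone.utc)
second_gmt = first_gmt + timedelta(hours=1)

# Recorded in GMT (SctUseGMT=true): the timestamps stay strictly increasing.
print(first_gmt.isoformat(), second_gmt.isoformat())

# Recorded in local time (SctUseGMT=false): both events show the same wall-clock time.
first_local = first_gmt.astimezone(SUMMER).replace(tzinfo=None)
second_local = second_gmt.astimezone(WINTER).replace(tzinfo=None)
print(first_local, second_local)
```

In this sketch the two local timestamps are identical even though the accesses are an hour apart, which is the kind of discontinuity to expect in reports when local time is used.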

8.3 Operational Overview

Content Tracker captures information regarding the consumption patterns of content items. Activity information is collected daily and includes tracking content accessed from Content Server by end users directly or via external applications such as portals and Web sites. The information is gathered from Content Server Web filter log files, the Content Server database, and other external applications such as portals and Web sites.

Once the data is collected, Content Tracker combines, analyzes, and synthesizes the event information and loads the summarized activity into database tables. After reduction, this data becomes available for reporting purposes. You can use the Content Tracker Report Generator main page to produce reports that identify content usage trends. These reports help you understand how your system is being used, which supports more successful content management.


8.3.1 Data Collection and Processing

Depending on how Content Tracker is configured, it can collect event information such as dynamic and static content accesses and service calls. Several mechanisms are used to collect the data.

  • Service Handler Filter: Examines Content Server service requests and writes certain details from them directly to the SctAccessLog table in real time. Only services listed in the SctServiceFilter.hda file are logged.

  • Web Server Filter: Collects data values from static URLs and logs them in raw data files.

  • Content Tracker Logging Service: Used to log event information generated by a suitably configured application.


8.3.1.1 Standard Data Reduction Process

During the data reduction process, the static URL information is extracted from the raw data files (see "Content Tracker Event Logs") and combined with the service information already stored in the SctAccessLog table (see "Combined Output Table").

Note:

By default, Content Tracker collects and records data only for the SctAccessLog table. Although the user data output tables exist, Content Tracker does not populate them. See "Performance Optimization Functions".

Depending on how Content Tracker is configured, this reduction process can:

  • Combine access information for static URL content access with service details.

  • Summarize information about user accounts that were active during the reporting period. This information is rolled up and written to the Content Tracker's user metadata database tables. See "Data Output" for details.
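
As a rough illustration of the first step above, the following sketch merges static URL entries into a table that already holds service rows. The row contents are made up, and the in-memory list is only a stand-in for the SctAccessLog table; the actual reduction logic is not shown in this section.

```python
# Row already written in real time by the service handler filter (type "S").
access_log = [
    {"SctEntryType": "S", "cs_uriStem": "/cs/idcplg", "sc_scs_idcService": "GET_FILE"},
]

# Made-up static URL entries as they might be read from one raw event log.
raw_static_entries = [
    {"cs_uriStem": "/cs/groups/public/documents/doc/report.pdf"},
    {"cs_uriStem": "/cs/groups/public/documents/doc/plan.doc"},
]


def reduce_static_entries(table, raw_entries):
    """Append raw static URL entries to the combined table as type "W" rows."""
    for entry in raw_entries:
        table.append({"SctEntryType": "W", **entry})
    return table


reduce_static_entries(access_log, raw_static_entries)
for row in access_log:
    print(row["SctEntryType"], row["cs_uriStem"])
```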

Figure: data_process.gif (standard data reduction process, described in the surrounding text)

8.3.1.2 Data Reduction Process with Activity Metrics

Content Tracker provides the option to selectively generate search relevancy data and store it in custom metadata fields. The snapshot function enables you to choose which activity metrics to activate. The logged data provides content item usage information that indicates the popularity of content items.

Note:

By default, Content Tracker collects and records data only for the SctAccessLog table. Although the user data output tables exist, Content Tracker does not populate them unless the Snapshot function is activated. However, using the snapshot function will affect Content Tracker's performance. See "Performance Optimization Functions" for more information.

If you activate the snapshot function and activity metrics, the values in the custom metadata fields are updated following the reduction processing phase. When users access content items, the values of the applicable search relevance metadata fields change accordingly. Then, during the subsequent post-reduction step, Content Tracker uses applicable SQL queries to determine which content items were accessed during the reporting period.

Content Tracker updates the applicable database table metadata fields with the new values and initiates a re-indexing cycle. However, only the content items whose access count metadata values have changed are re-indexed. For more information about the snapshot function, the user interface screen, and activating the activity metrics, see "Snapshot Tab". For more information about the activity metrics' SQL queries and how to customize them, see "Activity Metrics SQL Queries".

The post-reduction processing step is necessary to:

  • Process and tabulate the activity metrics for each affected content item and load the data into the assigned custom metadata fields.

  • Initiate a re-indexing cycle on the content items with changed activity metrics values. This ensures that the data is part of the search index and, consequently, accessible for selecting and ordering search results.
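
The post-reduction step can be sketched with an in-memory SQLite database. The DocMetaSketch table and the xAccessCount field are hypothetical names chosen for this example, the rows are made up, and the queries are not the ones that Content Tracker ships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SctAccessLog (sc_scs_dID INTEGER, eventDate TEXT);
    -- Hypothetical metadata table and counter field, for illustration only.
    CREATE TABLE DocMetaSketch (dID INTEGER PRIMARY KEY, xAccessCount INTEGER);
    INSERT INTO DocMetaSketch VALUES (42, 3), (43, 0), (44, 7);
    INSERT INTO SctAccessLog VALUES
        (42, '2024-01-15 09:00:00'),
        (42, '2024-01-15 10:30:00'),
        (44, '2024-01-15 11:45:00');
""")

# Tabulate accesses per content item for the reporting period.
counts = conn.execute(
    "SELECT sc_scs_dID, COUNT(*) FROM SctAccessLog "
    "WHERE eventDate >= ? AND eventDate < ? "
    "GROUP BY sc_scs_dID ORDER BY sc_scs_dID",
    ("2024-01-15", "2024-01-16"),
).fetchall()

# Update only the items that were accessed; these are the ones to re-index.
to_reindex = []
for d_id, hits in counts:
    conn.execute(
        "UPDATE DocMetaSketch SET xAccessCount = xAccessCount + ? WHERE dID = ?",
        (hits, d_id),
    )
    to_reindex.append(d_id)

print(to_reindex)
print(conn.execute("SELECT * FROM DocMetaSketch ORDER BY dID").fetchall())
```

Item 43 was not accessed during the period, so its counter is unchanged and it is left out of the re-indexing list.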

Figure: data_process_sql.gif (data reduction process with activity metrics, described in the surrounding text)

8.3.2 Data Collection

Content Tracker data collection includes collecting information from static URL references along with Content Server service call events. Both types of data are recorded in a combined output table (SctAccessLog). However, service calls are inserted into the log in real time whereas the static URL information must first undergo the reduction process (either manual or scheduled).


8.3.2.1 Service Handler Filter

The Content Server service handler filter is the primary Content Tracker data collection mechanism. This filter makes it possible for Content Tracker to obtain information about dynamic content requests that come through the web server, and also about other types of Content Server activity, such as calls from applications. The service request details are obtained from the DataBinder that accompanies the service call, and the information is stored in the combined output table (SctAccessLog) in real time. For more information about the SctAccessLog table, see "Combined Output Table".

There is a user-modifiable configuration file that is used to determine which Content Server service calls are logged. This file (SctServiceFilter.hda) uses a ResultSet structure that includes one service definition entry for each service to be logged. If you are using the extended service logging function, the SctServiceFilter.hda file also contains field maps that correspond to various service definition entries, see "Services Tab". For more detailed information about configuring service calls using the service handler filter, see "Service Call Configuration".

The ResultSet included in the SctServiceFilter.hda file is named ServiceExtraInfo. This ResultSet contains one or more service entries that define the services to be logged. To support the extended service logging function, additional ResultSets are used. These are called field map ResultSets. Each service that will have additional data values tracked must have a corresponding field map ResultSet in the SctServiceFilter.hda file. Field map ResultSets define the data fields, locations, and database destination columns for the related service.
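
The relationship between service entries and field maps can be modeled as plain data structures. This is a conceptual sketch only: the class names, attribute names, and sample values are hypothetical, and the code does not reproduce the actual layout of the SctServiceFilter.hda file.

```python
from dataclasses import dataclass, field


@dataclass
class FieldMapRow:
    """One row of a field map: which value to read, where, and where to log it."""
    data_field: str          # name of the value in the service data
    location: str            # where the value is found (hypothetical label)
    destination_column: str  # database column that receives the value


@dataclass
class ServiceEntry:
    """One service entry in the ServiceExtraInfo ResultSet."""
    service_name: str
    field_map: list = field(default_factory=list)  # empty unless extended logging is used


# Illustrative contents only; these are not the shipped defaults.
service_extra_info = [
    ServiceEntry("GET_FILE"),
    ServiceEntry(
        "GET_SEARCH_RESULTS",
        field_map=[FieldMapRow("QueryText", "LocalData", "extField_1")],
    ),
]

logged_services = {entry.service_name for entry in service_extra_info}
print("GET_FILE" in logged_services)
print("DELETE_REV" in logged_services)
```

Only services with an entry in the list are logged by the filter, and only entries that carry a field map have additional data values tracked.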

8.3.2.2 Web Server Filter

Managed content that is retrieved via a static URL does not usually invoke a Content Server service. Therefore, the Content Tracker web server filter collects the access event details (static URL references) and records them in raw event logs (sctLog files). The information in these files requires an explicit reduction (either interactive or scheduled) before it is included in the combined output table (SctAccessLog) along with the service call data.

For more information about the sctLog files, see "Content Tracker Event Logs". For more information about the SctAccessLog table, see "Combined Output Table".

8.3.2.3 Content Tracker Logging Service

The Content Tracker logging service is a single-service call that may be called directly via a URL or as an action in a service script. It may also be called from IdocScript using the executeService() function. The calling application is responsible for setting any and all fields in the accompanying service DataBinder that need to be recorded, including the descriptive fields listed in the Content Tracker service filter configuration file (SctServiceFilter.hda). For more detailed information about configuring service calls using the Content Tracker logging service, see "Service Call Configuration".

Note:

There should be no duplication or conflicts between services logged via the service handler filter and those logged via the Content Tracker logging service. If a service is named in the Content Tracker service handler filter file, it is logged automatically, so there is no need for the Content Tracker logging service to log it as well. However, Content Tracker makes no attempt to prevent such duplication.
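
A direct URL call to the logging service might be assembled as in the following sketch. Several details are assumptions that you should confirm against your own installation: the base URL is a placeholder, SCT_LOG_EVENT is assumed as the service name (it is not stated in this section), and the field values are invented. Only the field names sctCallingProduct, sctEventType, and dDocName are taken from the column descriptions in this chapter.

```python
from urllib.parse import urlencode

# Assumptions: placeholder base URL and an assumed service name.
BASE_URL = "http://example.com/cs/idcplg"

params = {
    "IdcService": "SCT_LOG_EVENT",
    "sctCallingProduct": "MyPortal",  # hypothetical identifier
    "sctEventType": "PortletView",    # hypothetical identifier
    "dDocName": "REPORT_2024",        # hypothetical content ID
}

# The calling application must supply every field it wants recorded.
log_url = BASE_URL + "?" + urlencode(params)
print(log_url)
```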

8.3.3 Data Reduction

During Content Tracker data reduction, the static URL information captured by the web server filter is merged and written into the output table (SctAccessLog) alongside the service call data. At the time of the reduction and contingent on how Content Tracker is configured, the Content Tracker user metadata database tables are also updated with information collected from the static URL accesses and from the service call event records gathered during the reporting time period.


8.3.3.1 Content Tracker Event Logs

When the Content Tracker web server filter collects the access event details (static URL references), it records the information in raw event logs (sctLog files). The information in these files requires an explicit reduction (either interactive or scheduled) before it is included in the combined output table (SctAccessLog) along with the service call data.

Content Tracker supports multiple input files for different event log types and for configurations with more than one web server. For this reason, each web server filter instance uses a unique tag as a filename suffix for the Content Tracker event logs. The unique identification suffix consists of the web server host name plus the server port number. The reduction process searches for and merges multiple raw event logs named sctLog-yyyymmdd-<myhostmyport>.txt. The raw event logs are processed individually.
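
The naming pattern can be checked with a short parser. This is a sketch that assumes the pattern exactly as written above; the host and port tag in the sample filename is made up, and because the host name and port are concatenated without a delimiter, the sketch returns the tag as a single string.

```python
import re
from datetime import datetime

# Pattern for sctLog-yyyymmdd-<myhostmyport>.txt as described above.
LOG_NAME = re.compile(r"^sctLog-(\d{8})-(.+)\.txt$")


def parse_event_log_name(filename):
    """Return (date, web server tag) for a raw event log name, or None."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    log_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    return log_date, match.group(2)


# The host/port tag below is made up.
print(parse_event_log_name("sctLog-20240115-webhost80.txt"))
print(parse_event_log_name("reduction_ts-20240115.txt"))
```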


8.3.3.1.1 Recorded Usernames in Content Access Entries

Occasionally, in a raw event log entry, you may notice that Content Tracker does not capture a username for a content access event, even though the user is logged into Content Server. For example, a logged-in user performs a search, views the content information of an item, and clicks the Web location link. The raw event log entry records the access but not the username.

In this case, the item was accessed via a static URL request and, in general, the browser does not provide a username unless the web server asks it to send the user's credentials. In particular, if the item is public content, the web server will not ask the browser to send user credentials, and the user accessing the URL will be unknown.

If you want Content Tracker to record the username for every document access, then you will need to configure your system such that a user login is required for every content item access. To do this, you must ensure that your content is not accessible to the guest role. In other words, if your content is not public, the user's credentials will be required to access the items. This ensures that a username is recorded in the raw event log entry.

8.3.3.1.2 File Storage after Data Reduction

Depending on how Content Tracker is configured, when raw data log files in the "new" cycle are reduced, the Data Engine moves the data files into the following subdirectories:

  • The default number of data sets that the recent/ directory can hold is 60 sets (dates) of input data log files. When the number of data sets is exceeded, the oldest are moved to the archive/ directory.

    <cs_root>/data/contenttracker/data/recent/yyyymmdd/

  • By default, Content Tracker does not archive data. Instead, the expired rows are discarded to ensure optimal performance. However, if appropriately configured (see "Performance Optimization Functions"), Content Tracker uses the archive/ directory to hold all input data log files that have been moved out of the "recent" cycle.

    <cs_root>/data/contenttracker/data/archive/yyyymmdd/

When raw data files are reduced, another file (reduction_ts-yyyymmdd.txt) is generated as a time stamp file. For more detailed information about reduction cycle states for raw data file processing, see "Reduction Tab".
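
The movement of data sets between the "recent" and "archive" cycles can be sketched as follows. Plain lists stand in for the recent/ and archive/ directories, and the sketch assumes that expired sets are simply dropped when archiving is not configured; the function and its arguments are hypothetical.

```python
from datetime import date, timedelta

MAX_RECENT_SETS = 60  # documented default capacity of the recent/ directory


def rotate(recent, archive, new_set, do_not_archive=True):
    """Add a reduced data set (yyyymmdd string) and expire the oldest if needed.

    `recent` and `archive` are plain lists standing in for the directories.
    """
    recent.append(new_set)
    recent.sort()  # yyyymmdd strings sort chronologically
    while len(recent) > MAX_RECENT_SETS:
        oldest = recent.pop(0)
        if not do_not_archive:
            archive.append(oldest)  # otherwise the expired set is dropped
    return recent, archive


# Sixty consecutive daily sets, 20240101 through 20240229.
start = date(2024, 1, 1)
recent = [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(60)]
archive = []

rotate(recent, archive, "20240301", do_not_archive=False)
print(len(recent), archive)
```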

8.3.3.2 Combined Output Table

The SctAccessLog table contains entries for all static and dynamic content access event records. The table contains one row per event in the reporting period. The rows in the table are tagged according to type:

  • S indicates the records logged for service calls.

  • W identifies the records logged for static URL requests.
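
Because every row carries its type, reports can separate service calls from static URL requests with a simple query. The sketch below uses an in-memory SQLite table with made-up rows and only two of the table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SctAccessLog (SctEntryType TEXT, sc_scs_dID INTEGER)")

# Made-up sample rows: two service calls and three static URL requests.
conn.executemany(
    "INSERT INTO SctAccessLog VALUES (?, ?)",
    [("S", 42), ("S", 43), ("W", 42), ("W", 42), ("W", 44)],
)

# Count the rows of each entry type.
by_type = dict(
    conn.execute(
        "SELECT SctEntryType, COUNT(*) FROM SctAccessLog GROUP BY SctEntryType"
    ).fetchall()
)
print(by_type)
```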


8.3.3.2.1 File Types for Entries in the SctAccessLog

By default, Content Tracker does not log accesses to GIF, JPG, JS, CSS, CAB, and CLASS file types. This means that Web activity involving files of these types does not result in entries in the web server filter event log files. Consequently, entries for these file types are not included in the combined output table (SctAccessLog) after data reduction.

To change the logging status of these file types, the desired file types must be enabled in the sct.cfg file located in the IntradocDir/custom/ContentTracker/resources/ directory. To enable logging of these file types, adjust the default setting for the SctIgnoreFileTypes configuration variable (gif,jpg,js,css). The default setting excludes these file types. To include one or more of these file types, delete each desired file type from the list. To ensure that these changes take effect, it is necessary to restart the web server and Content Server.

For more detailed information about configuration variables and the sct.cfg file, see "Configuration Variables".
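
The effect of deleting a file type from the list can be sketched as follows. The helper functions are hypothetical and only imitate the extension check; the default value shown is the one quoted above, and the sample URL is made up.

```python
import os


def parse_ignore_list(value):
    """Split a comma-separated SctIgnoreFileTypes value into a set of extensions."""
    return {ext.strip().lower() for ext in value.split(",") if ext.strip()}


def is_logged(url_path, ignore_value):
    """Return True if an access to url_path is not filtered out by its extension."""
    extension = os.path.splitext(url_path)[1].lstrip(".").lower()
    return extension not in parse_ignore_list(ignore_value)


default_value = "gif,jpg,js,css"
edited_value = "gif,js,css"  # "jpg" deleted so that JPG accesses are logged

print(is_logged("/cs/images/logo.jpg", default_value))
print(is_logged("/cs/images/logo.jpg", edited_value))
```

With the default value the JPG access is filtered out; after "jpg" is deleted from the list, the same access is logged.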

8.3.3.2.2 URLs Collected for Entries in the SctAccessLog

The Content Tracker web server filter cannot distinguish between URLs for user content and those used by the Content Server user interface. Therefore, it is possible that references to UI objects, such as client.cab, may appear in the static access logs. To eliminate these false positives, you may define a list of directory roots that are to be ignored by the Content Tracker filter.

The list of directories is stored in the SctIgnoreDirectories configuration variable in the sct.cfg file located in the <cs_root>/data/contenttracker/config/ directory. This list will eliminate most if not all of the user interface object references. See "Configuration Variables".

You can manually change the contents of the SctIgnoreDirectories value to list all the directories whose content should be ignored. You may want to change the default value:

  • If you wish access to the UI objects to be logged along with user content.

  • If you have a different view of which directories should be logged and which should be excluded from the logs.
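
Filtering by directory root can be sketched in the same way. The separator character and the directory roots below are assumptions made for this example; check the value in your own sct.cfg file for the actual format.

```python
def parse_ignore_directories(value, separator=";"):
    """Split a directory-root list; the separator is an assumption of this sketch."""
    return [root.strip() for root in value.split(separator) if root.strip()]


def is_ignored(url_path, roots):
    """Return True if url_path lies under one of the ignored directory roots."""
    return any(url_path.startswith(root) for root in roots)


# Hypothetical directory roots for user interface objects.
roots = parse_ignore_directories("/cs/resources/;/cs/common/")

print(is_ignored("/cs/common/client.cab", roots))
print(is_ignored("/cs/groups/public/documents/report.pdf", roots))
```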

8.3.3.2.3 Information Collected for Entries in the SctAccessLog

The following table provides the information collected for each record in the SctAccessLog table.

Note:

By default, Content Tracker does not collect data to populate certain columns for bulky and rarely used items such as cs_referer and cs_cookie. This ensures optimal performance. See "Performance Optimization Functions" for more information.
Column Name Type / Size Column Definition
SctDateStamp datetime Local date when data collected - YYYYMMDD (contingent on customer location and time of day event occurs -- may differ from date recorded for eventDate); time set to 00:00:00

Data source: Internal

SctSequence int / 8 Sequence unique to entry type

Data source: Internal

SctEntryType char / 1 Entry type - "W" or "S"

Data source: Internal

eventDate datetime GMT time and date when request completed (date contingent on customer location and time of day event occurs -- may differ from date recorded for SctDateStamp)
SctParentSequence integer Sequence of outermost Service Event in tree, if any.
c_ip varchar / 15 IP of client
cs_username varchar / 255  
cs_method varchar / 10 "GET"
cs_uriStem varchar / 255 Stem of URI
cs_uriQuery varchar / [maxUrlLen] Query portion, e.g. "IdcService=GET_FILE&dID=42..."
cs_host varchar / 255 Content Server server name
cs_userAgent varchar / 255 Client User Agent Ident

By default, this column contains either "browser" or the suffix of any string beginning with "java:". This simplification ensures optimal performance for Content Tracker.

cs_cookie varchar / [maxUrlLen] Current cookie
cs_referer varchar / [maxUrlLen] URI leading to this request
sc_scs_dID int / 8 dID

Data source: from query or derived from URL (reverse lookup)

sc_scs_dUser varchar / 50 dUser

Data source: Service DataBinder "dUser"

sc_scs_idcService varchar / 255 Name of IdcService, e.g. GET_FILE

Data source: from query or Service DataBinder "IdcService"

sc_scs_dDocName varchar / 30 dDocName

Data source: from query or Service DataBinder "dDocName"

sc_scs_callingProduct varchar / 255 Arbitrary identifier

Data source: SctServiceFilter config file or Service DataBinder "sctCallingProduct"

sc_scs_eventType varchar / 255 Arbitrary identifier

Data source: SctServiceFilter config file or Service DataBinder "sctEventType"

sc_scs_status varchar / 10 Service execution status

Data source: Service DataBinder "StatusCode"

sc_scs_reference varchar / 255 "web", "native", "sdc_url"

Values indicate the rendition of the accessed file; "web" = converted file (PDF), "native" = actual original file, and "sdc_url" = HTML.

Data source: algorithmically from query parameters or ServiceFilter config file

comp_username varchar / 50 Computed username. If a Service, obtained from UserData Service Object or HTTP_INTERNETUSER or REMOTE_USER or dUser. If a static URL, obtained from auth-user or internetuser.
comp_validRef char / 1 "1" if the access was a Web reference (W), ispromptlogin and isaccessdenied are both NULL, and the static URL exists at reduction time; or if the access was a service call (S) and the sc_scs_status field is NULL.

"NULL" if the static URL did not exist at reduction time, or the user logon failed, or the logon succeeded but the user was not authorized to view the object.

Indicates whether the referenced object exists and is available to the requesting user.

sc_scs_isPrompt char / 1 "1" if true

Data source: Plugin immediateResponseEvent field "ispromptlogin"

sc_scs_isAccessDenied char / 1 "1" if true

Data source: Plugin immediateResponseEvent field "isaccessdenied"

sc_scs_inetUser varchar / 50 Internet user name (if security problem)

Data source: Plugin immediateResponseEvent field "internetuser"

sc_scs_authUser varchar / 50 Authorization user name (if security problem)

Data source: Plugin immediateResponseEvent field "auth-user"

sc_scs_inetPassword varchar / 8 Internet password (if security problem)

Data source: Plugin immediateResponseEvent field "internetpassword"

sc_scs_serviceMsg varchar / 255 Content Server service completion status

Data source: Service DataBinder "StatusMessage"

extField_1 through extField_10 varchar / 255 General purpose columns to use with the extended service tracking function. In the field map ResultSets, the DataBinder fields are mapped to these columns.
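
The comp_validRef rule in the table above can be read as a small decision function. The following Python sketch is illustrative only: the function name, the parameter names, and the use of None to stand for NULL are assumptions and are not part of Content Tracker.

```python
from typing import Optional


def comp_valid_ref(entry_type: str,
                   is_prompt_login: Optional[str],
                   is_access_denied: Optional[str],
                   static_url_exists: bool,
                   service_status: Optional[str]) -> Optional[str]:
    """Return "1" when the access counts as valid, otherwise None (NULL)."""
    if entry_type == "W":
        # Static URL access: no login prompt, no access denial, and the
        # static URL still exists at reduction time.
        if is_prompt_login is None and is_access_denied is None and static_url_exists:
            return "1"
        return None
    if entry_type == "S":
        # Service call: valid when sc_scs_status is NULL.
        return "1" if service_status is None else None
    raise ValueError(f"unknown entry type: {entry_type!r}")
```

For example, a static URL access that triggered a login prompt yields None, while a service call with a NULL status yields "1".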

8.3.4 Data Output

Content Tracker snapshots static URL accesses and logs service calls that are written in real time to the combined output table. Data reduction is necessary to process the static URL information and add it to the combined output table. Depending on how Content Tracker is configured, the data reduction process updates the related user metadata database tables with the information derived from processing the static URL data and the service call data.

Additionally, when Content Tracker is appropriately configured, the static and dynamic content access request information as well as all metadata fields are accessible for use in reports generated by the Content Tracker Reports component. The logged metadata includes content item and user metadata.

This section covers the following topics:

8.3.4.1 Content Item Metadata

Rather than collecting content item metadata information, Content Tracker uses the standard Content Server metadata tables for content item metadata. This means that Content Tracker reports necessarily reflect current content item metadata. Therefore, if content item metadata has changed since a content item was accessed, any generated reports will reflect the changed metadata.

8.3.4.2 User Metadata

When appropriately configured, Content Tracker will collect and synthesize user profile summaries during the data reduction process. Thus, Content Tracker user metadata database tables are updated with information collected about users that were active during the reporting time period. These tables retain historically accurate user metadata. The names of the user metadata tables are formed from the root, which indicates the class of information contained, and an "Sct" prefix to distinguish the Content Tracker tables from native Content Server tables.

Note:

By default, Content Tracker does not archive data. This means that Content Tracker does not move expired rows from the Primary tables to the Archive tables. Instead, the expired rows are discarded which ensures optimal performance. See "Performance Optimization Functions" for more information.

Two complete sets of user metadata database tables are created:

  • Primary

    The Primary tables, named SctUserInfo, etc., contain the output for reduction data in the "new" and "recent" cycles.

  • Archive

    The Archive tables, named SctUserInfoArchive, etc., contain output for reduction data in the "archive" cycle.

When a reduction data file is moved from "recent" to "archive," the associated table records are moved from the Primary table to the Archive table. This prevents excessive buildup of rows in the Primary tables, and ensures that queries performed against recent data will complete quickly. Rows in the Archive table will not be deleted. You may move them to alternate storage for your historical records, or you may delete them using any SQL query tool. For more information about the reduction process and data cycles, see "Reduction Tab".
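
The movement of rows from a Primary table to its Archive table can be pictured with the following sketch. It uses an in-memory SQLite database purely for illustration; the table and column names come from this section, but the function name, the date format, and the SQL statements are assumptions and do not show how Content Tracker itself performs the move.

```python
import sqlite3


def archive_user_info(conn: sqlite3.Connection, date_stamp: str) -> int:
    """Move SctUserInfo rows for one date stamp into SctUserInfoArchive."""
    with conn:  # copy and delete inside a single transaction
        conn.execute(
            "INSERT INTO SctUserInfoArchive "
            "SELECT * FROM SctUserInfo WHERE SctDateStamp = ?",
            (date_stamp,),
        )
        cur = conn.execute(
            "DELETE FROM SctUserInfo WHERE SctDateStamp = ?", (date_stamp,)
        )
    return cur.rowcount  # number of rows removed from the Primary table


conn = sqlite3.connect(":memory:")
for table in ("SctUserInfo", "SctUserInfoArchive"):
    conn.execute(
        f"CREATE TABLE {table} (SctDateStamp TEXT, dUserName TEXT, dUserType TEXT)"
    )
conn.executemany(
    "INSERT INTO SctUserInfo VALUES (?, ?, ?)",
    [("2024-01-01", "user1", "local"), ("2024-01-02", "user2", "local")],
)
moved = archive_user_info(conn, "2024-01-01")
primary_rows = conn.execute("SELECT COUNT(*) FROM SctUserInfo").fetchone()[0]
archive_rows = conn.execute("SELECT COUNT(*) FROM SctUserInfoArchive").fetchone()[0]
```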

Tip:

If you wish to delete all the rows in the Archive tables, you can simply delete the tables themselves. They will be recreated during the next Content Server restart.

Reports are not run against archive data. Therefore, any data that has been demoted from 'recent' to 'archive' will not be included in the generated reports.

This section covers the following topics:

8.3.4.2.1 SctAccounts Table

The SctAccounts table contains a list of all accounts. The SctAccounts table is organized using one line for each account.

Field Name Type / Size / Field Definition
SctDateStamp datetime

GMT day when data is collected

dDocAccount varchar / 30

Name of a Content Server account


8.3.4.2.2 SctGroups Table

The SctGroups table contains a list of all user groups current at time of reduction. The SctGroups table is organized using one line per content item group.

Field Name Type / Size / Field Definition
SctDateStamp datetime

GMT day when data is collected

dGroupName varchar / 30

Name of a content item group


8.3.4.2.3 SctUserAccounts Table

The SctUserAccounts table contains entries for all the users who are listed in the SctUserInfo table and who are assigned accounts that are defined in the current instance. A separate entry exists for each user-account combination.

There is a special situation in which the group and account information of a user is not determined by Content Tracker. This occurs in a proxied configuration that has multiple proxy instances. When the current instance is a proxy, the group information for an active user who is defined in a different proxy is replaced by a single placeholder line in SctUserGroups for that user. This line contains the username and a "-" placeholder for the group. If at least one account is defined in the current instance, a similar entry is created in SctUserAccounts for any user who is defined in a different proxy.

The SctUserAccounts table is organized using one line per Content Server user and user's account.

Field Name Type / Size / Field Definition
SctDateStamp datetime

GMT day when data is collected

dUserName varchar / 100

Name of the user. If local to a proxy instance, is prefixed by the Content Server relative URL, e.g. cs_2/user1

Account varchar / 30

Name of account to which user has access. Placeholder in proxy-to-proxy configuration when current proxy instance has at least one account.


8.3.4.2.4 SctUserGroups Table

The SctUserGroups table references only those users who logged on during the data collection period. If Content Tracker is running in a proxied Content Server configuration, only groups that are defined in the current instance are listed. For example, a user named "joe" is defined in the master instance and has access to groups "Public" and "Plastics" in the master instance. If "joe" logs on to a proxy instance and the group "Plastics" is not defined in the proxy, only the association between "joe" and "Public" will appear in SctUserGroups.
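
The "joe" example can be expressed as a simple filter. This is a minimal sketch; the function name and the data structures are hypothetical and only illustrate which user-group pairs would be listed.

```python
def visible_group_rows(user, user_groups, instance_groups):
    """Return (user, group) pairs for groups defined in the current instance."""
    return [(user, group) for group in user_groups if group in instance_groups]


# "joe" has access to Public and Plastics in the master instance,
# but the proxy instance defines only Public.
rows = visible_group_rows("joe", ["Public", "Plastics"], {"Public"})
```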

The SctUserGroups table is organized using one line for each user's group for each user that is active during the reporting period.

Field Name Type / Size / Field Definition
SctDateStamp datetime

GMT day when data is collected

dUserName varchar / 100

Name of the user. If local to a proxy instance, is prefixed by the Content Server relative URL, e.g. cs_2/user1

dGroupName varchar / 30

Name of group that the user has permission to access. No distinction is made for the type of access (R, RW, etc.).


8.3.4.2.5 SctUserInfo Table

The SctUserInfo table includes all users known to the current instance plus any additional users from a different instance who have logged on to the current instance during the data collection period. In a proxied configuration, users that are local to one instance are usually known (visible from the UserAdmin application) to other instances. (For this visibility to occur, Content Server instances must typically be restarted after local users have been added.) However, when a user is defined locally with the same name in two instances, only the local user is visible in each of these instances.

For example, the user "sysadmin" defined in the master is not the "sysadmin" user that appears in the UserAdmin application for a proxy. The proxy has its own "sysadmin" user who is defined locally. It is possible for these two different users to both log on during the same data collection period: the user from the master would log on as "sysadmin" and the user from the proxy would log on as something like "cs_2/sysadmin". In this case cs_2/ is the server relative URL that must be prepended to the proxy username. The SctUserInfo file generated for this period will contain separate entries for "sysadmin" and "cs_2/sysadmin".
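
The naming convention for proxy-local users can be sketched as follows. The helper name and its parameters are assumptions made for illustration.

```python
def qualified_user_name(user: str, relative_url: str = "") -> str:
    """Prefix a proxy-local user with the server relative URL, e.g. "cs_2/"."""
    prefix = relative_url.strip("/")
    return f"{prefix}/{user}" if prefix else user


master_user = qualified_user_name("sysadmin")          # defined in the master
proxy_user = qualified_user_name("sysadmin", "cs_2/")  # defined locally in the proxy
```

Because the two names differ, both users receive separate entries for the same data collection period.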

The SctUserInfo table is organized using one line per Content Server user.

Field Name Type / Size / Field Definition
SctDateStamp datetime

GMT day when data is collected

dUserName varchar / 100

Name of the user. If local to a proxy instance, is prefixed by the Content Server relative URL, e.g. cs_2/user1

dUserType varchar / 30

Type of the user. Placeholder if user has no type


8.3.4.3 Reduction Log Files

When data reduction is run, the Content Tracker Data Engine generates a summary results log file in <cs_root>/data/contenttracker/log/. The reduction log files are named using the format reduction-yyyymmdd.log. The reduction logs may be useful to help diagnose errors that have occurred during the data reduction process. For more information about the raw event log files and their corresponding reduction logs, see "Content Tracker Event Logs".
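
When scripting log inspection, the file name can be built from the documented pattern. The sketch below is an assumption-laden helper: the function name and the example root directory are hypothetical, and which date appears in the file name (the run date or the date of the reduced data) is not stated here.

```python
from datetime import date
from pathlib import PurePosixPath


def reduction_log_path(cs_root: str, day: date) -> PurePosixPath:
    """Build <cs_root>/data/contenttracker/log/reduction-yyyymmdd.log."""
    log_dir = PurePosixPath(cs_root) / "data" / "contenttracker" / "log"
    return log_dir / f"reduction-{day:%Y%m%d}.log"


path = reduction_log_path("/opt/cs", date(2024, 3, 5))
```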

8.3.5 Tracking Limitations

The current versions of Content Tracker and Content Tracker Reports have several known tracking limitations.

This section covers the following topics:

8.3.5.1 Tracking Limitations with Static URLs and WebDAV

Content Tracker is unable to guarantee exact access counts for content requested through static URLs or by WebDAV clients. The access counts determined by Content Tracker are generally correct, but there are specific, exceptional circumstances in which Tracker is unable to determine whether the content was actually delivered to the requesting user, or if it was, which specific revision of the content was delivered. This section describes situations that may result in incorrect access counts.

This section covers the following topics:

8.3.5.1.1 Missed Accesses for Content Repeatedly Requested via WebDAV

Scenario: User accesses a document via a WebDAV client, and then accesses the same document in the same manner later on. Only the first WebDAV request for the document is recorded. Access counts reported for such content will tend to be lower than actual.

Details: WebDAV clients typically use some form of object 'caching' to reduce the amount of network traffic. If a user requests a particular object, the client will first determine whether it already has a copy of the object in a local store. If it does not, the client will contact the server and negotiate a transfer. This transfer will be recorded as a COLLECTION_GET_FILE service request.

If the client already has a copy of the object, it will contact the server to determine whether the object has changed since the client local copy was obtained. If it has changed, then a new copy will be transferred and the COLLECTION_GET_FILE service details will be recorded.

If the client copy of the object is still current, then no transfer will take place, and the client will present the saved copy of the object to the user. In this case, the content access will not be counted even though the user appears to get a "new" copy of the original content.
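
The undercount can be illustrated with a toy model of a caching client. The class and attribute names are hypothetical, and real WebDAV clients differ in how they validate cached copies; the sketch only shows why repeated opens of an unchanged document are logged once.

```python
class WebDavClientSim:
    """Toy WebDAV client with a local cache keyed by document name."""

    def __init__(self):
        self.cache = {}           # document name -> revision held locally
        self.logged_requests = 0  # COLLECTION_GET_FILE requests seen by the server
        self.user_opens = 0       # times the user actually opened a document

    def open(self, doc: str, server_revision: int) -> None:
        self.user_opens += 1
        if self.cache.get(doc) != server_revision:
            # Missing or outdated local copy: a transfer takes place and is logged.
            self.cache[doc] = server_revision
            self.logged_requests += 1
        # Otherwise the cached copy is presented and nothing is logged.


client = WebDavClientSim()
client.open("xyzzy", 1)  # first request: transferred and logged
client.open("xyzzy", 1)  # unchanged: served from the cache, not logged
client.open("xyzzy", 2)  # revised on the server: transferred and logged
```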

8.3.5.1.2 False Positive for Access by Saved (stale) Static URL

Scenario: User saves a "Web Location" (URL) for a content file. The content is subsequently revised in such a way that the saved URL is no longer valid. The user then attempts to access the content via the (now stale) URL, and gets a "Page Cannot be Found" error (HTTP 404). Content Tracker may record this as a successful access even though the content was not actually delivered to the user. Access counts reported for such content will tend to be higher than actual.

Details: The "Web Location" of a content file is the means by which a user can access content via a "static URL". The specific file path in the URL is used in two, slightly different contexts: It is used by the web server to locate the content file in the Content Server repository, and it is also used by Content Tracker to determine the dID and dDocName of the content file during the data reduction process. The problem occurs when the content is revised in such a way that the web location for a given Content ID changes between the time the URL is saved and the time the access is attempted.

For example, if a Word document is checked in, and then revised to an XML equivalent, then the web location for the latest revision of the content will change from: DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy.doc

to: DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy.xml

where: "xyzzy" is the assigned Content ID.

The original revision is "renamed" as: DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy~1.doc

This means the original Web Location will no longer work as a static URL. The Content ID obtained from the original URL, however, will match the latest revision. Consequently, Content Tracker reports this as an access to Content ID "xyzzy", even though the web server was unable to deliver the requested file to the user.
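
The following sketch shows why a stale URL and the current URL resolve to the same Content ID. It assumes, for illustration only, that the Content ID is the file name without its extension; the function name is hypothetical and the actual lookup performed during data reduction is not described in this section.

```python
from pathlib import PurePosixPath


def content_id_from_web_location(url_path: str) -> str:
    """Return the file name without its extension (assumed to be the Content ID)."""
    return PurePosixPath(url_path).stem


stale = "DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy.doc"
current = "DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy.xml"
same_id = content_id_from_web_location(stale) == content_id_from_web_location(current)
```

Both paths yield "xyzzy", so an access attempted through the stale path is attributed to the latest revision even though the web server could not deliver the file.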

8.3.5.1.3 Wrong dID Reported for Access by Saved Static URL

Scenario: User accesses content via the "Web Location" (URL). The content is then revised before the Content Tracker data reduction operation is performed. The user will be reported as seeing the latest revision, not the one that s/he actually saw. Access counts reported for such content will tend to be attributed to a newer revision than actual. You can minimize this effect by scheduling or running Content Tracker data reductions on a regular basis.

Details: This is related to False Positive for Access by Saved (stale) Static URL, described above. That is, the web server uses the entire web location, (e.g. DomainHome/ucm/cs/groups/public/documents/adacct/xyzzy.doc), to locate and deliver the content, while Content Tracker uses only the ContentID portion to determine the dID and dDocName values. Moreover, Content Tracker makes this determination during data reduction, not at the time the access actually occurs. Consequently, Content Tracker will report the user as having seen the revision current at the time of the reduction, not the one that was current at the time of the access.

There are some implications of this that are not immediately obvious, such as when the group and/or security of the revision are changed from the original. For example, if a user accesses "Public" Revision 1 of a document through a static URL, and the document is subsequently revised to Revision 2 and changed to "Secure" before the Content Tracker data reduction takes place, Tracker will report that the user saw the Secure version. This may also occur when the content file type changes. If the user accesses an original .xml version, which is then superseded by an entirely different .doc before the data reduction is performed, Tracker will report that the user saw the .doc revision, not the actual .xml.

8.3.5.2 Tracking Limitations and Data Directory Protections

Content Tracker's web server filter runs in the authorization context of the user whose access request is being processed. In some cases, the owner of the request processing thread is a system account. In others, it is a requesting user or another type of non-system account used by the application. Regardless, the account associated with the requesting user/system determines the permissions that the web server filter is ultimately granted.

The web server filter collects access event details and records the information in raw event logs (sctlog files located in Content Tracker's data directory). If an sctlog file does not exist when the access event occurs, Content Tracker creates one and uses the default protection and authorization credentials of the user who owns the event thread. If the accessing user account has write permission to the data directory, the content access data is recorded in the sctlog files. Otherwise, the logging request fails and the access event details are not recorded.

To ensure that Content Tracker can properly record user access requests, the data directory must be configured to accept the account authorization credentials for all users. Granting world write permission (or the equivalent) is one method. It is recommended that the Content Tracker data directory allow unlimited write access to all possible users unless there are conditions that cannot accommodate this level of unrestricted access permissions.
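
A quick way to confirm that a given account can write to the data directory is to run a check such as the following while logged on as that account. This is a generic operating system check, not a Content Tracker utility; the function name is hypothetical, and the temporary directory stands in for the real data directory.

```python
import os
import tempfile


def can_log_to(data_dir: str) -> bool:
    """Return True if the current account can create files in data_dir."""
    return os.path.isdir(data_dir) and os.access(data_dir, os.W_OK | os.X_OK)


with tempfile.TemporaryDirectory() as d:
    writable = can_log_to(d)                            # existing, writable directory
    missing = can_log_to(os.path.join(d, "no-such-dir"))  # directory does not exist
```

Note that os.access evaluates permissions for the account running the script, so the result applies only to that account.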

8.3.5.3 Tracking Limitations with ExtranetLook Component

The ExtranetLook component allows customizations of cookie-based login forms and pages for anonymous-type users. The component uses a built-in web server plugin that monitors requests and determines whether or not a request is authenticated based on cookie settings. When a user requests access to a content item, Content Tracker must function within the authorization context of the user's account.

After collecting the access information, Content Tracker tries to record the event data in the sctlog file. If the user's account permissions allow access to Content Tracker's data directory, then the request activity is logged into the sctlog file. However, if the account does not have write authorization, the logging request will fail and the request activity is not recorded.

For more information about access permission into Content Tracker's data directory, refer to "Tracking Limitations and Data Directory Protections".

8.4 Data Tracking Functions

This section covers the following topics:

8.4.1 Data Reduction Features

During the data reduction process, all of the accumulated raw data for a given date is gathered and merged making the information complete. This section describes the primary concepts of data reduction.

Note:

By default, Content Tracker is configured for maximum efficiency. This is done using the Performance Optimization Functions that are set during the Content Server installation procedure. Therefore, some of the data reduction processes are directly affected by these variable settings.

This section covers the following topics:

8.4.1.1 Data Reduction Cycles

Reduced table data is moved from the primary tables to the corresponding archive tables when the associated raw data is moved from 'recent' to 'archive' status. The primary tables contain the output for reduction data in the 'new' and 'recent' cycles and the archive tables contain output for reduction data in the 'archive' cycle.

Raw data is demoted from 'new' to 'recent' when the data is reduced and it is more than one day old. Thus, the 'new' cycle indicates that the data is for the current day or is unreduced data from previous dates. The 'recent' cycle indicates that the data is from yesterday or earlier and has been reduced.

Raw data is demoted to 'archive' (and the corresponding rows in the SctAccessLog table are moved to the SctAccessLogArchive table) when the number of 'recent' sets reaches a configured threshold number and a reduction process is run, either manually or via the scheduler. For more information about configuring the threshold number for 'recent' sets, refer to the SctMaxRecentCount configuration variable in the sct.cfg file. See "Configuration Variables". If a reduction process is never run, the raw data remains in the 'recent' cycle indefinitely.
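
The cycle rules can be summarized in a short sketch. The function name and data structures are hypothetical, and two details are assumptions: data from any day before today is treated as old enough to become 'recent' once reduced, and the oldest reduced sets beyond the threshold are the ones demoted to 'archive'.

```python
from datetime import date


def assign_cycles(data_sets, today, max_recent_count):
    """Map each data set date to 'new', 'recent', or 'archive'.

    data_sets maps a date to True if that day's raw data has been reduced.
    max_recent_count plays the role of SctMaxRecentCount.
    """
    cycles = {}
    reduced_old = sorted(d for d, reduced in data_sets.items() if reduced and d < today)
    n_archive = max(0, len(reduced_old) - max_recent_count)
    for i, d in enumerate(reduced_old):
        # The oldest reduced sets overflow into 'archive'.
        cycles[d] = "archive" if i < n_archive else "recent"
    for d in data_sets:
        # Today's data and unreduced data from earlier dates stay 'new'.
        cycles.setdefault(d, "new")
    return cycles


sets = {
    date(2024, 1, 6): True,
    date(2024, 1, 7): True,
    date(2024, 1, 8): True,
    date(2024, 1, 9): False,
    date(2024, 1, 10): False,
}
cycles = assign_cycles(sets, today=date(2024, 1, 10), max_recent_count=2)
```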

8.4.1.2 Access Modes and Data Reduction

The way users access content items determines how those accesses are recorded in the SctAccessLog table. There are two basic user access modes: service accesses (viewing the actual native file) and static URL accesses (viewing the web location file). If content items are accessed via a service, the events are recorded in the SctAccessLog table in real time. In this case, the activity is recorded immediately and is not dependent on the reduction process.

However, if content items are accessed via static URLs, the web server filter records the events in a static log file. During the data reduction process, the static log files for a specified date are gathered and the data is moved into the SctAccessLog table. In this case, if data reduction is not performed for a given date, there will be no static URL records in the SctAccessLog and no evidence that these accesses ever occurred.

Thus, the difference in the way static and service accesses are processed has implications with regard to interval counts (see "Snapshot Tab"). For example, a user might access a content item twice on Saturday: once via the web location file (static access) and once via the native file (service access). The service access is recorded in the SctAccessLog table but the web location access is not.

Then, if Sunday's data is reduced, only the service access (not the static access) is included in the summaries of the short and long access count intervals. However, if Saturday's data is also reduced, then both the service and static accesses are recorded in the SctAccessLog table and, subsequently, included in the short and long access intervals.
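
The Saturday and Sunday example can be replayed with a toy model. The class and method names are hypothetical; the list and dictionary merely stand in for the SctAccessLog table and the static log files.

```python
from collections import defaultdict


class TrackerSim:
    """Toy model of how the two access modes reach the access log."""

    def __init__(self):
        self.access_log = []                  # stands in for SctAccessLog
        self.static_logs = defaultdict(list)  # raw static URL events per day

    def service_access(self, day, doc):
        # Service accesses are recorded in real time.
        self.access_log.append((day, "S", doc))

    def static_access(self, day, doc):
        # Static URL accesses wait in the raw log until that day is reduced.
        self.static_logs[day].append(doc)

    def reduce(self, day):
        for doc in self.static_logs.pop(day, []):
            self.access_log.append((day, "W", doc))


tracker = TrackerSim()
tracker.static_access("Sat", "doc1")
tracker.service_access("Sat", "doc1")
tracker.reduce("Sun")
after_sunday = list(tracker.access_log)    # only the service access
tracker.reduce("Sat")
after_saturday = list(tracker.access_log)  # service and static accesses
```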

8.4.1.3 Reduction Sequence for Event Logs

Generally, data sets are reduced in chronological order to ensure that the information included in generated reports is as current as possible. In particular, the order in which the raw data log files are reduced determines what specific user access data is logged and counted. During reduction, the SctAccessLog and user metadata database tables are modified with data from the raw data files.

If you are using the snapshot function to gather search relevance information, then the metadata fields associated with the activated activity metrics are also updated during data reduction. The activity metrics use custom metadata fields that are included in Content Server's DocMeta database table. For more information, see "Snapshot Tab".

The currentness of the information in the various database tables is dependent on the order in which you reduce the data sets. Content Tracker always changes the activity metrics values according to the applicable data in the reduction data set. Normally, data sets are reduced in calendar order to ensure that activity metrics will advance as expected. In fact, to ensure that data values are complete and current, you should perform data reduction on a daily basis.

If the data sets are reduced out of order, re-reducing the current or most recent data set will correct the counts. However, it is always preferable to consistently reduce data in calendar order.

The following scenarios show how the reduction sequence affects the stored data.

Scenario 1:

Depending on how content items are accessed, if activity on certain days (such as Saturdays and Sundays) is never reduced, then accesses that occur on those days might never be logged or counted (see "Access Modes and Data Reduction"). Similarly, if a content item is accessed on Tuesday and reductions are done for Monday and Wednesday, the Tuesday access might not be counted toward the last access of that content item.

Scenario 2:

If there was a significant increase in accesses in the last few days, and you reduce data from two weeks earlier, the long and short access metrics for content items will not reflect the recent activity. Instead, the interval values from two weeks earlier override today's values. Reducing the current or most recent data set will correct the counts.

The reduction order does not adversely affect the Last Access date. The reduction process only changes the Last Access date if the most recent access in the reduction data set is more recent than the current Last Access value in Content Server's DocMeta database table.

If you have reduced a recent data set and a particular content item had been accessed, the Last Access field is updated with the most recent access date in the reduction data set. If you then re-reduce an older data set, the older access date for this content item will not overwrite the current value.
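
The Last Access rule amounts to keeping the later of two dates. The following sketch uses a hypothetical function name and represents an unset field as None.

```python
from datetime import date
from typing import Optional


def updated_last_access(current: Optional[date], newest_in_set: date) -> date:
    """Return the Last Access value after reducing one data set."""
    if current is None or newest_in_set > current:
        return newest_in_set
    return current  # an older data set never overwrites a newer value


value = updated_last_access(None, date(2024, 3, 10))   # recent set reduced first
value = updated_last_access(value, date(2024, 2, 20))  # older set re-reduced later
```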

For more information about long and short activity metrics, see "Snapshot Tab" and the applicable check boxes and corresponding fields/intervals.

Scenario 3:

Reducing the data sets in an arbitrary order interferes with the demotion of "recent" data files to "archive" data files. The movement of the associated table records is based on age: archive tables are intended to store the "oldest" data. If the data sets are reduced in random order, it is not apparent which data is the oldest.

For more information about recent and archive data files, see "User Metadata", "Data Reduction Cycles", and the Cycle column on the "Reduction Tab".

8.4.1.4 Reduction Schedules

Reduction runs can be configured to run on a scheduled basis. It is commonplace to use the scheduler to periodically reduce the raw data. In this case, there would be a steady flow of raw data into the 'recent' and 'archive' repositories, and a similarly steady flow of reduced data from the primary tables to the archive tables. For additional information about raw data, data statuses and primary / archive tables, see the "Reduction Tab".

The following are key characteristics of the Content Tracker scheduling process:

  • If the Content Tracker Data Engine is disabled the day before a scheduled reduction run, no data is collected. If the Content Tracker Data Engine is enabled on the day of the scheduled reduction run, the scheduler will not run because no data is available.

  • Data reductions scheduled for a given day are performed on data collected during the previous day. The previous day is defined as the 24-hour period beginning and ending at midnight (system time).

    Note:

    Depending on various conditions of the system load, the following error may be issued if the scheduled reduction is set to run within a few minutes after midnight:
    <date_time>: Cannot reduce data. A request is in progress to delete raw data that was generated on this date.
    

    If this message is issued, try scheduling the reduction run 5 or 10 minutes later.

    Tip:

    To conserve CPU resources, reduction runs can be scheduled for early morning hours when the system load is generally the lowest.
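
The "previous day" covered by a scheduled run can be computed as shown in this sketch. The function name is hypothetical, and the datetime values are naive and assumed to be in system time.

```python
from datetime import datetime, time, timedelta


def previous_day_window(run_at: datetime):
    """Return (start, end) of the midnight-to-midnight day before run_at."""
    end = datetime.combine(run_at.date(), time.min)  # midnight starting the run day
    return end - timedelta(days=1), end


start, end = previous_day_window(datetime(2024, 3, 5, 2, 30))
```

A run at 02:30 on March 5 therefore covers data collected from midnight on March 4 up to midnight on March 5.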

8.4.2 Activity Snapshots

The activity snapshots feature captures user metadata that is relevant for each recorded content item access.

Note:

By default, Content Tracker collects and records only content access event data. This excludes information gathering on non-content access events like searches as well as the collection and synthesis of user profile summaries. As a result, only the SctAccessLog table is populated. Although the user data output tables exist, Content Tracker does not populate them unless the Snapshot function is activated. See "Performance Optimization Functions" for more information.

This section covers the following topics:

8.4.2.1 Search Relevance Metrics

When activated, the activity metrics and corresponding metadata fields provide search relevance information about user accesses of content items. An optional automatic load function allows you to update the last access activity metric to ensure that checked-in content items are appropriately timestamped.

Content Tracker fills the search relevance custom metadata fields with content item usage information that indicates the popularity of particular content items. This information includes the date of the most recent access and the number of accesses in two distinct time intervals.

Users can apply the information generated from these activity metrics functions in various ways. You can selectively use the activity metrics to subsequently order search results based on content item popularity. For example, you might want to order search results according to which content items have been recently viewed or the most viewed in the last week.

If the snapshot function is activated, the values in the search relevance metadata fields are updated during a post-reduction step. During this processing step, Content Tracker uses SQL queries to determine which content items have changed activity metrics values. Content Tracker updates the applicable database tables with the new values and initiates a re-indexing cycle. However, only the content items that have changed metadata values are re-indexed. See "Data Reduction Process with Activity Metrics".

The Snapshot tab enables you to activate the snapshot function and selectively enable each of the activity metrics. Each function that you activate must have a custom metadata field associated with it.

For more information, see "Enabling the Snapshot Function and the Activity Metrics Options" and "Linking Activity Metrics Functions to Search Relevance Metadata Fields".

Alternatively, you can manually update the applicable configuration variables in Content Tracker's sct.cfg file. See "Configuration and Customization".

8.4.2.2 Search Relevance Metadata Fields

Before you can link the activity metrics functions to custom metadata fields, the fields must already exist and must be of the correct type. The metadata field associated with the Last Access metric must be of type Date. The metadata fields associated with the Access Count metrics must be of type Integer. See "Creating the Search Relevance Metadata Fields".

When you create custom metadata fields to use in conjunction with the activity metrics, you have the option to enable them for the search index. If the custom metadata fields are indexed (and searchable), the access values stored in them are more efficiently accessed. That is, indexed fields are more useful for selecting and/or ordering search results by relevance.

Indexing is expensive, particularly when full text search is enabled. The disadvantage of indexed metadata fields is that when the values in the search relevance metadata fields change, the affected content items must be re-indexed to update their values in the database table. Therefore, on a large instance with many content item accesses, updating the search relevance fields will adversely affect performance.

Alternatively, you can disable the indexing function of the custom metadata fields. In this case, it is possible to search on and find values for non-indexed metadata fields, but the search is more expensive.

If re-indexing the affected content items degrades the performance too severely, you can optionally deactivate the Snapshot function. Unfortunately, this means that the activity metrics information will no longer be collected. As a result, you will be unable to order current search results by usage (for example, listing accessed content items in order of decreasing popularity).

8.4.3 Service Calls

Content Tracker enables you to log Content Server service calls along with data values relevant to the associated services. Every service that you want to be logged must have a service entry in the service call configuration file (SctServiceFilter.hda). In addition to the services that are logged, their corresponding field map ResultSets can optionally be included in the SctServiceFilter.hda file.

Note:

By default, Content Tracker only logs services that have event types for content access or services that cause an entry to be made in the DocHistory table. Although this ensures maximum performance, this means that some service events will not be logged. See "Performance Optimization Functions" for more information.

The service entries in the SctServiceFilter.hda file allow Content Tracker to gather event and usage information. The enabled services automatically log various general DataBinder fields, such as dUser and dDocName. Linking a field map ResultSet to a service entry enables you to use the extended service call tracking function. The field map ResultSet consists of a list of data field names, location names, and their associated general purpose table column names in the output database table (SctAccessLog).

The SctAccessLog table provides additional general purpose columns for use with the extended service call tracking function. You can fill these columns with any data values you feel are appropriate for the associated service call. When you list the data field names in the field map ResultSet, you must also list the location name that is the source of the data field, and the table column name where the data is logged. Because the extended service tracking function logs and tracks specific data for a specific service call, you can generate customized reports for access and usage information.

Caution:

In field map ResultSets, nothing prevents you from mapping data fields to existing, standard SctAccessLog table columns. The extended service mapping occurs after the standard field data values are collected. Consequently, you can override any of the standard table column fields.

For example, the service you are logging might carry a specific user name (such as MyUserName=john) in a data field. You could use the extended tracking function to override the contents of the sc_scs_dUser column. In this case, you list MyUserName as the data field, together with its location, and sc_scs_dUser as the table column in the field map ResultSet.

Therefore, it remains your responsibility to ensure that the data being logged is a reasonable fit with the SctAccessLog column type.

For more information about the SctAccessLog table and the general purpose columns, see "Combined Output Table". For more detailed information about the SctServiceFilter.hda file, the extended service call tracking function, and ResultSet configuration, see "Service Call Configuration".
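
The override behavior described in the Caution above can be pictured with a short JavaScript sketch. This is an illustrative model only, not Content Tracker code: the function name applyFieldMap, the object shapes, the location name LocalData, the data field MyCategory, and the sample values are all assumptions. Only sc_scs_dUser, extField_1, and the MyUserName example come from this document.

```javascript
// Illustrative model only: applyFieldMap and the object shapes are hypothetical,
// not part of Content Tracker.

// Copies each mapped data field into its target column AFTER the standard
// columns have been filled, so a mapping can overwrite a standard value.
function applyFieldMap(standardColumns, serviceData, fieldMap) {
  const row = { ...standardColumns };
  for (const { dataField, location, column } of fieldMap) {
    const source = serviceData[location] || {};
    if (Object.prototype.hasOwnProperty.call(source, dataField)) {
      row[column] = source[dataField];
    }
  }
  return row;
}

// Assumed sample data: "LocalData" stands in for a location name.
const standardColumns = { sc_scs_dUser: "sysadmin" };
const serviceData = { LocalData: { MyUserName: "john", MyCategory: "reports" } };
const fieldMap = [
  { dataField: "MyUserName", location: "LocalData", column: "sc_scs_dUser" },
  { dataField: "MyCategory", location: "LocalData", column: "extField_1" },
];

const mappedRow = applyFieldMap(standardColumns, serviceData, fieldMap);
console.log(mappedRow); // sc_scs_dUser now holds the mapped value, not the standard one
```

Because the mapping is applied last, the model makes the risk visible: a mapped value silently replaces whatever the standard collection step wrote to the same column.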

8.4.4 Web Beacon Objects

A web beacon is a managed object that facilitates specialized tracking support for indirect user accesses to web pages or other managed content. In earlier versions, Content Tracker was unable to gather data from cached pages and pages generated from cached services. The web beacon feature involves the use of client side embedded references that are invisible references to the managed beacon objects within Content Server. This system enables Content Tracker to record and count user access requests for managed content items that have been copied by an external entity for redistribution without obtaining content directly from Content Server.

Briefly described, the web beacon feature uses specially designed web beacon references that are encoded with pseudo query parameters and embedded into managed content items or web pages. When a user requests a tagged object, the browser encounters the web beacon reference while rendering the page. To resolve the reference, the browser requests the web beacon object from Content Server and, in doing so, transmits the encoded query string. Content Tracker interprets the query parameter values and can record and count the access to the cached item or web page.

Note:

The implementation requirements for the web beacon feature are highly contingent on the system configurations involved. There are many factors that cannot be addressed by Content Tracker documentation. All access records collected and processed by Content Tracker are, at best, an indication of general user activity and not exact counts.

8.4.4.1 Use Cases for Web Beacon Referencing

The web beacon feature enables you to track user accesses to managed content items or web sites that originated in Content Server but are not currently being served by Content Server. When users access cached web pages and content items, Content Server and Content Tracker are unaware that these requests ever happened. Therefore, without using web beacon referencing, Content Tracker will not record and count such requests.

When cached content is served to consumers, users perceive that the requested object was served by Content Server. In actuality, however, the managed content is provided using non-dynamic content delivery methods: it is served by a static website, a reverse proxy server, or even out of a file system. To ensure that this type of activity can be tracked, you will need to implement the web beacon feature.

8.4.4.1.1 Reverse Proxy Server Activity Tracking

The reverse proxy server is positioned between the users and Content Server. In this configuration, the reverse proxy server caches managed content items by making a copy of each requested object. The next time a user asks for the same document, the reverse proxy server delivers its own copy from the private cache. If the reverse proxy server does not already have the requested object in its cache, it must request a copy from Content Server.

Because the reverse proxy server delivers cached content, it does not directly interact with Content Server. In such cases, Content Tracker cannot detect these requests and does not track this type of user access activity. However, when you implement the web beacon feature, Content Tracker has the ability to gather user request data from cached pages.

Note:

Generally, a reverse proxy server configuration is used to improve web performance by caching or by providing controlled web access to applications and sites behind a firewall. The idea is to load balance and move copies of frequently accessed content to a web server where it is updated on a scheduled basis. This is an indirect delivery system where users work dynamically but content is delivered statically.

However, for the web beacon feature to work, each user access must include an additional, direct request to the managed beacon object that resides in Content Server. Although this adds overhead to normal requests, the web beacon object is very small and does not significantly interfere with the reverse proxy server's performance. Furthermore, it is only necessary to embed the web beacon references in objects that you specifically want to track.
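
The effect of caching on tracking can be illustrated with a toy JavaScript simulation. It is a sketch under simplifying assumptions, not a model of any real proxy: the function simulateRequests and the log entry format are hypothetical, and the web beacon object itself is assumed never to be cached.

```javascript
// Toy simulation; every name here is hypothetical.
// Returns the list of requests that actually reach Content Server.
function simulateRequests(requestedDocNames, withBeacon) {
  const proxyCache = new Set();
  const contentServerLog = [];
  for (const docName of requestedDocNames) {
    if (!proxyCache.has(docName)) {
      // Cache miss: the proxy fetches the item from Content Server once.
      contentServerLog.push(`content:${docName}`);
      proxyCache.add(docName);
    }
    if (withBeacon) {
      // The browser resolves the embedded web beacon reference on every view.
      contentServerLog.push(`beacon:${docName}`);
    }
  }
  return contentServerLog;
}

const views = ["doc1", "doc1", "doc1"];
console.log(simulateRequests(views, false)); // only the first view reaches Content Server
console.log(simulateRequests(views, true));  // every view produces a beacon request
```

Without the beacon, three views produce a single logged request. With the beacon, each view is visible to Content Server, and the first view appears twice (once as the content fetch and once as the beacon request).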

8.4.4.1.2 External Site Studio Web Site Activity Tracking

Site Studio is integrated with Content Server and enables users to create web sites that are stored and managed in the content server instance. When Site Studio and Content Server are located on the same server, Content Tracker is configured to automatically track the applicable user accesses. The gathered Site Studio activity data is then used in pre-defined reports. See "Site Studio Web Site Activity Reporting".

If your web site is intended for an external audience, you may decide to create a copy of the site and transfer it to another server. In addition to being viewed publicly, this solution also ensures that site development remains separate from the production site. In this arrangement, however, you would need to implement the web beacon feature to make sure that Content Tracker can collect and process user activity.

Note:

Currently, there are two modes of Site Studio integration with Content Tracker. One type is the existing built-in integration that automatically occurs when Site Studio is installed alongside Content Server. The other form uses the web beacon feature and Content Tracker regards Site Studio the same as any other web site generator.

8.4.4.2 Web Beacon Operational and Implementation Overview

The design point of Content Tracker is to record and count requests for objects that are currently managed by Content Server. The web beacon feature makes it possible to count requests for managed objects that have been copied by an external entity (such as a reverse proxy server or an external web site) for redistribution without involving Content Server. See "Reverse Proxy Server Activity Tracking" and "External Site Studio Web Site Activity Tracking".

The following list provides a brief overview of the web beacon feature's functionality and implementation requirements. It provides general background information and is intended to help prepare you for more detailed information provided in the subsequent sections.

  • Part 1 - Creating Web Beacon Objects:

    To begin implementing the web beacon feature, you will need to create a web beacon object. This is typically a very small object such as a 1x1 pixel transparent image. The web beacon object must then be checked into Content Server and added to Content Tracker's list of web beacon object names.

  • Part 2 - Creating Web Beacon References:

    The next implementation step involves creating the web beacon references to the checked in web beacon object and embedding them into cached HTML pages or managed content items. These references are composed of two parts. The first part is a URL reference to the web beacon object. The second part consists of extra information that is used to identify the requested content. This information is encoded as pseudo query parameters.

    When users access any managed content items or web pages that contain the embedded web beacon reference, the server (Site Studio or reverse proxy) delivers the cached item. However, to render the web page or content item, the browser must also resolve the web beacon reference. Thus, the browser requests the web beacon object from Content Server.

    This access request to Content Server carries with it the encoded information from the query parameters. The information in the query parameters is sufficient to enable Content Tracker to identify the tagged web page or managed object. Generally, the web browser ignores the query parameters long enough for the Content Tracker web server filter to copy and store them for data reduction.

  • Part 3 - Content Tracker's Reduction Processing for Web Beacon References:

    Finally, Content Tracker logs the web beacon reference to the beacon object in the usual way. See "Data Reduction". During data reduction, Content Tracker checks the dDocName of each referenced object against the list of registered web beacons. See "Web Beacon Objects". If the dDocName of the requested object is on the list, then the query parameters are processed in such a manner to ensure that the URL request is logged as a request for the tagged object (web page or managed content item) rather than the web beacon object.

8.4.4.3 Web Beacon Objects

When you implement the web beacon feature, you need to create one or more content items to use as the web beacon object(s). Typically, a web beacon object consists of a 1x1 pixel transparent image; anything that has low overhead and won't disrupt the page being rendered is suitable. The ideal web beacon object would have zero content. Although you can create multiple web beacon objects, most implementations will function fine using only one.

Note:

When you create a web beacon object, you must ensure that it is not a file type value included in the SctIgnoreFileType configuration variable. If the web beacon object file type is one of the listed values, Content Tracker will not collect the activity data for the associated cached content items or web pages. For more detailed information about the SctIgnoreFileType configuration variable, see "File Types for Entries in the SctAccessLog".

Next, your completed web beacon object(s) must be checked into Content Server. Then, you will need to update Content Tracker's SctWebBeaconIDList preference/configuration variable. During data reduction, Content Tracker checks the SctWebBeaconIDList settings to determine how the web beacon reference listings in the event logs should be processed. If the applicable web beacon object is listed, Content Tracker processes the data appropriately.

Note:

When Content Tracker is installed, you have the option to enter the dDocNames of web beacon objects into the SctWebBeaconIDList preference variable. However, if you don't know the names of the web beacon objects in advance, you can also manually change the SctWebBeaconIDList settings. This is done by using the update function in the Component Manager. See "Adding/Editing Web Beacon Object Names to the Web Beacon ID List".

8.4.4.4 Web Beacon References

After you have created and checked in your web beacon object(s), you need to create their corresponding web beacon reference(s). Generally, a single web beacon object will work in most systems because different query strings appended to the web beacon static URL make each reference unique. Additionally, each query parameter set consists of distinct combinations of variables that identify specific cached web pages or managed content items.

To ensure that the web beacon feature effectively tracks indirect user activity, the web beacon references must first be properly formatted and include the necessary information. Next, the references must be appropriately embedded into cached documents or web pages and correctly executed. Finally, Content Tracker must be able to accurately capture and store the applicable data for future reduction processing.

8.4.4.4.1 Format Structure for Web Beacon URL References

Web beacon URL references consist of two parts: the web beacon static URL used to access the web beacon object managed by Content Server and a pseudo query string with content item variables. The query string appended to the web beacon static URL is never actually executed. However, the set of query parameter values provides sufficient additional information for Content Tracker to be able to identify the associated managed object. See "Data Capture and Storage for Cached Managed Content Items".

Note:

When you create the web beacon references, you must ensure that the web beacon static URL within Content Server does not use a directory root that is included in the SctIgnoreDirectories configuration variable.

If the web beacon static URL root directory is one of the listed values, Content Tracker will not collect the activity data for the associated cached content items or web pages. For more detailed information about the SctIgnoreDirectories configuration variable, see "File Types for Entries in the SctAccessLog".
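
A quick pre-check of a web beacon static URL against both restrictions (file type and directory root) might look like the following JavaScript sketch. The helper isBeaconUrlTracked is hypothetical, the list values are sample assumptions rather than product defaults, and the matching rules (case-insensitive extension match, path prefix match) are also assumptions. Compare against your own SctIgnoreFileType and SctIgnoreDirectories settings.

```javascript
// Hypothetical helper: returns true when neither the file type nor the
// directory of the beacon URL appears in the supplied ignore lists.
function isBeaconUrlTracked(beaconUrl, ignoredFileTypes, ignoredDirectories) {
  const path = new URL(beaconUrl).pathname;
  const fileName = path.slice(path.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";

  const typeIgnored = ignoredFileTypes
    .map((type) => type.toLowerCase())
    .includes(extension);
  const directoryIgnored = ignoredDirectories.some((dir) => path.startsWith(dir));
  return !typeIgnored && !directoryIgnored;
}

// Sample values only (assumptions); substitute your configured lists.
const sampleIgnoredTypes = ["gif", "jpg", "js", "css"];
const sampleIgnoredDirs = ["/idc/resources/", "/idc/common/"];

console.log(
  isBeaconUrlTracked(
    "http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt",
    sampleIgnoredTypes,
    sampleIgnoredDirs
  )
);
```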

The query parameter set functions as a code that informs Content Tracker what the actual managed content item is that the user accessed. One of the query parameters is the item's dID. Including a unique set of query parameter values enables you to monitor indirect user access activity for managed objects that have been copied and cached.

The following examples illustrate general format structures associated with the web beacon feature. Combined, the examples demonstrate how you can use one web beacon object while creating an unbounded number of different query strings. The same web beacon object (dDocName = bcn_2.txt) is used in all of the examples. However, by varying the query parameters, the requests for this web beacon object can convey to Content Tracker a 'request' for any managed object in the repository.

These examples are based on the following assumptions:

Example 8-1 Web Beacon Request Without Query Parameters

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt

This example begins with a static web reference to the web beacon object. Although it is a legitimate direct access to the web beacon object, there are no appended query parameters. Therefore, Content Tracker will process this access event as a request for the web beacon object itself. See "Data Collection and Processing".

Example 8-2 Web Beacon Request for Tracking doc1

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc1&sct_dID=1234&...

This example also begins with the usual static web reference to this beacon object. However, it has a pseudo query string appended to it that contains an arbitrary number of query parameters. The values contained in these query parameters convey the information about the specific managed object (doc1) that the user has requested.

Example 8-3 Web Beacon Request for Tracking doc2

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc2&sct_Ext_2=WebSite4

This example is similar to Example 8-2. The parameter values provide information about the user requested content item (doc2). However, in this example, the query string includes another parameter to convey additional information about the tagged content item. In this case, the added parameter uses an extField column name. Accordingly, the value WebSite4 is copied into the extField_2 column of the SctAccessLog table. (The extField column substitution is optional and application dependent.)

Example 8-4 Web Beacon Request for Tracking doc3

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc2&sct_Ext_2=WebSite4&sct_Ext_8=SubscriptionClass6

This example modifies Example 8-3 by adding a second (although non-sequential) extField column name. In this case, WebSite4 is copied into the extField_2 column of the SctAccessLog table, and SubscriptionClass6 is copied into the extField_8 column. (Again, the extField column substitutions are optional and application dependent.)
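
The format structure shown in the examples above can be assembled programmatically. The following JavaScript sketch is one possible approach, not a Content Tracker API: the function buildBeaconReference and its argument shapes are hypothetical. It rejects commas and spaces in values because such characters are not allowed in query parameter values (see "Web Beacon Design Guidelines"), and it deliberately does not URL-encode values, mirroring the unencoded form used in the examples.

```javascript
// Sketch only: buildBeaconReference is a hypothetical helper.
// item:      { dDocName, dID } (either may be omitted)
// extFields: { n: value } where n selects the sct_Ext_n parameter
function buildBeaconReference(beaconUrl, item, extFields = {}) {
  const params = [];
  const add = (name, value) => {
    const text = String(value);
    if (/[,\s]/.test(text)) {
      throw new Error(`Commas and spaces are not allowed in ${name}: "${text}"`);
    }
    params.push(`${name}=${text}`);
  };

  if (item.dDocName !== undefined) add("sct_dDocName", item.dDocName);
  if (item.dID !== undefined) add("sct_dID", item.dID);
  for (const [n, value] of Object.entries(extFields)) {
    add(`sct_Ext_${n}`, value);
  }
  // With no parameters, the result is a plain request for the beacon itself.
  return params.length > 0 ? `${beaconUrl}?${params.join("&")}` : beaconUrl;
}

const beaconUrl =
  "http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt";
console.log(buildBeaconReference(beaconUrl, { dDocName: "doc2" }, { 2: "WebSite4" }));
```

Called with the values from Example 8-3, the sketch reproduces that example's URL. The same beacon URL can be reused for any number of tracked items by varying only the arguments.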

8.4.4.4.2 Placement and Retrieval Scheme for Web Beacon References

To properly facilitate the web beacon feature, the specially constructed web beacon references must be embedded in whatever managed object you want to track. Generally, web beacon references can be embedded in any HTML page. Users indirectly request access to the modified "managed" content items (via an external Site Studio web site or a reverse proxy server).

In turn, the browser encounters the web beacon reference while rendering the requested page. Each display of the managed object, regardless of how the object was obtained, causes the browser to request a copy of the web beacon object directly from Content Server. When the browser resolves the web beacon reference, Content Tracker captures the data that includes the web beacon reference along with the set of pseudo query parameters that identify the managed content item.

8.4.4.4.3 Data Capture and Storage for Cached Managed Content Items

Ordinarily, query parameters in static URLs serve no function for the web browser. However, when resolving the web beacon static URL, the browser ignores the appended query parameters long enough for Content Tracker's web server filter to record them. Although the pseudo query string is never executed, Content Tracker captures the query parameter values along with other data such as the client IP address and date/time stamp. Content Tracker records the data in web access event logs (sctlog files). See "Content Tracker Event Logs".

8.4.4.5 Reduction Processing for Web Beacon References

The web beacon feature enables Content Tracker to capture and record indirect user requests for managed content items. During data reduction, when these specially constructed web beacon references are processed, Content Tracker determines that the request was for a web beacon object rather than a regular content item. Content Tracker does this by comparing the web beacon's dDocName with the list of dDocNames in the SctWebBeaconIDList.

If there isn't a match in the list, or there are no query parameters appended to the web beacon reference, Content Tracker processes the access event normally. See "Data Reduction". If the web beacon's dDocName is identified, Content Tracker continues to process and interpret the associated URL query parameters. Ultimately, the data reduction process treats the web beacon access request as a request for the web page or content item.

During data reduction, Content Tracker completes the processing by parsing the query parameters and performing various value substitutions for fields that are ultimately written to the SctAccessLog. For more detailed information about the SctAccessLog table and individual data fields, see "Combined Output Table". The query parameter values are mapped to SctAccessLog fields as follows:

  • sct_dID replaces the web beacon object's dID

  • sct_dDocName replaces the web beacon object's dDocName

  • sct_uriStem replaces the web beacon object's URI stem (everything preceding the '?')

  • sct_uriQuery replaces the web beacon object's URI query portion (everything following the '?')

  • sct_Ext_n is copied directly into the SctAccessLog Extended Field n

Example 8-5 Data Reduction Processing for Query Parameter Values

/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=WW_1_21&sct_dID=42&sct_Ext_1=WillowStreetServer&sct_Ext_2=SubscriptionTypeA

After data reduction processing, Content Tracker records this web beacon type request in the SctAccessLog table as an access to WW_1_21 rather than to bcn_2.txt. Other data such as the username, time of access, client IP, etc., are derived from the HTTP request. Additionally, WillowStreetServer is copied into the extField_1 column of the SctAccessLog table, and SubscriptionTypeA is copied into the extField_2 column. (These last two field substitutions are optional and application dependent.)
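
The substitution rules listed above can be modeled in a few lines of JavaScript. This sketch is a hypothetical approximation for illustration, not Content Tracker's implementation: the function reduceBeaconRequest, the row object shape, and the beacon's dID value are assumptions. It also records the beacon's own dDocName in extField_10, which the design guidelines reserve for that purpose.

```javascript
// Hypothetical model of the substitution step during data reduction.
// beaconRow holds the values logged for the beacon object itself.
function reduceBeaconRequest(requestUri, beaconRow, beaconIdList) {
  const queryStart = requestUri.indexOf("?");
  // Not a registered beacon, or no query parameters: process normally.
  if (!beaconIdList.includes(beaconRow.dDocName) || queryStart < 0) {
    return { ...beaconRow };
  }

  const row = { ...beaconRow, extField_10: beaconRow.dDocName };
  for (const pair of requestUri.slice(queryStart + 1).split("&")) {
    const eq = pair.indexOf("=");
    if (eq < 0) continue;
    const name = pair.slice(0, eq);
    const value = pair.slice(eq + 1);
    const ext = /^sct_Ext_(\d+)$/.exec(name);

    if (name === "sct_dID") row.dID = value;
    else if (name === "sct_dDocName") row.dDocName = value;
    else if (name === "sct_uriStem") row.uriStem = value;
    else if (name === "sct_uriQuery") row.uriQuery = value;
    else if (ext) row[`extField_${ext[1]}`] = value;
  }
  return row;
}

// Values from Example 8-5; the beacon's dID ("77") is an assumed placeholder.
const exampleUri =
  "/idc/groups/public/documents/adacct/bcn_2.txt" +
  "?sct_dDocName=WW_1_21&sct_dID=42&sct_Ext_1=WillowStreetServer&sct_Ext_2=SubscriptionTypeA";
const exampleBeaconRow = {
  dID: "77",
  dDocName: "bcn_2.txt",
  uriStem: "/idc/groups/public/documents/adacct/bcn_2.txt",
};

console.log(reduceBeaconRequest(exampleUri, exampleBeaconRow, ["bcn_2.txt"]));
```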

8.4.4.6 General Implementation Considerations

To implement Content Tracker's web beacon feature, you will need to manually perform the following tasks:

  1. Create the web beacon object.

  2. Check it into Content Server.

  3. Update the SctWebBeaconIDList.

  4. Define the web beacon references.

  5. Embed them into the cached content items and/or web sites that you want to track.

When implementing the web beacon feature, there are multiple challenges, limitations, and guidelines that you need to consider.

8.4.4.6.1 Web Beacon Design Limitations

Given the types of challenges that are involved in implementing the web beacon feature, the following limitations need to be considered:

  • One of the challenges in implementing the web beacon feature involves determining the means by which the web beacon reference is attached to a tagged object. Also, there are situations where the requested object does not allow embedded references (for example, a PDF or Word document). In this case, the web beacon object must be requested directly from Content Server before the actual content item is requested.

  • There are numerous situations in which the web beacon feature will not work, such as with certain browser configurations. If the user has disabled cross-domain references in their browser, and the web page and the Content Server instance are in different domains, then the web beacon object is never requested from Content Server and the user access is not counted.

    For example:

    • If the static web site is located in the ABChost.XYZorg.com domain,

    • And the Content Server (from which this web site was built and which stores the web beacon object) is in the myhost.myorg.com domain,

    • And a user has disabled cross-domain references but is accessing documents from the ABChost.XYZorg.com domain (thus forbidding the browser to access any link other than ABChost.XYZorg.com),

    • Then, when the user's browser encounters an embedded web beacon reference and requests the web beacon object from the Content Server instance in the myhost.myorg.com domain, the request fails and the user's content item access from the ABChost.XYZorg.com domain is not counted.

  • The first time a managed content item is accessed via a reverse proxy server, it will be counted twice: once when the Content Server provides the item to the reverse proxy server, and a second time when the browser requests the web beacon object.

  • Depending on the specific configuration, it might be necessary to devise a method to prevent the reverse proxy server and external Site Studio from caching the web beacon object itself. Browsers also cache objects. If the web beacon object is served from any of these caches, the request never reaches Content Server, and Content Tracker cannot count the relevant content accesses.

    Tip:

    One way to avoid this situation would be to ensure each web beacon object request is unique. This could be done by appending a single-use query parameter to the web beacon reference that contains a random number. For example:

    dDocName=vvvv_1_21&FoolTheProxyServer=12345654321

    By changing the number on each request, you might fool the cache, along with the web server and the browser, into thinking that the request is new.
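
The tip above can be sketched in JavaScript as follows. The helper withSingleUseParameter is hypothetical; the parameter name FoolTheProxyServer is taken from the tip. Math.random does not guarantee uniqueness, so this only makes repeated values unlikely. Combining it with a timestamp would be one way to reduce collisions further.

```javascript
// Hypothetical helper: appends a single-use parameter to a web beacon reference
// so that caches are less likely to treat the request as one they have seen.
// The random source is injectable to make the function easy to test.
function withSingleUseParameter(beaconReference, random = Math.random) {
  const nonce = Math.floor(random() * 1e12);
  const separator = beaconReference.includes("?") ? "&" : "?";
  return `${beaconReference}${separator}FoolTheProxyServer=${nonce}`;
}

console.log(
  withSingleUseParameter(
    "http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc1"
  )
);
```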

8.4.4.6.2 Web Beacon Design Guidelines

Given the types of challenges that are involved in implementing the web beacon feature, the following guidelines need to be considered:

  • The sct_dDocName and sct_dID parameter values in the web beacon reference must resolve to an actual managed content item in the same Content Server instance that provides the requested web beacon object.

  • Using the ExtField columns in the SctAccessLog table is optional and entirely application dependent.

  • Use of ExtField_10 is reserved for the web beacon object's dDocName. This allows report writers a way to determine which web beacon object was used to signal the access to the actual managed content item.

  • Spelling and capitalization of the query parameter names must be exact.

  • Embedded commas or spaces in the query parameter values are not allowed.

  • Typically, both the dDocName and dID of a managed object are included in the web beacon reference, although a legitimate access request does not need to provide both. If any of the standard fields are missing, Content Tracker resolves the identification parameters as follows:

    • Given a dID, Content Tracker can determine the content item's dDocName.

    • Given a dDocName, Content Tracker can determine the content item's dID with the following provision:

      In this case, the dID will be the content item's most current revision. If the revision changes after the content item is cached, then the user will see the older version. However, Content Tracker counts this access request as a view of the most recent revision of the content item.

    • Given a proper URI Stem, Content Tracker can determine the content item's dDocName but will assume the dID of the most recent revision.

  • You must restart Content Server after you have made any changes to the web beacon list (SctWebBeaconIDList).

  • Do not create a web beacon object that uses a file type or is located in a directory that Content Tracker is configured to disregard. Content Tracker's web server filter separates out file types and directories listed in the SctIgnoreFileType and SctIgnoreDirectories configuration variables. See "Web Beacon Objects" and "Format Structure for Web Beacon URL References".

  • Content Tracker is unable to verify whether or not the cached content item was delivered.

  • Content Tracker performs normal folding of static URL accesses. If a user repeatedly requests the same content item and makes no intervening requests for another document, then Content Tracker assumes that the consecutive requests are the same document. In this case, these access requests are considered to be all one access request.

  • The query parameters can represent any managed object and need not necessarily be what the user is actually viewing.
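
The identification rules in the guidelines above (a dID determines the dDocName, and a dDocName alone resolves to the most current revision) can be modeled with a short JavaScript sketch. It is not Content Tracker's lookup logic: the function resolveIdentity and the in-memory revision list are hypothetical, and the sketch assumes that a higher dID denotes a more recent revision. The URI stem case is omitted.

```javascript
// Hypothetical in-memory stand-in for the repository's revision data.
const sampleRevisions = [
  { dID: 41, dDocName: "WW_1_21" },
  { dID: 42, dDocName: "WW_1_21" }, // assumed to be the most recent revision
  { dID: 50, dDocName: "BOPR" },
];

// Fills in whichever of dID / dDocName is missing; returns null if unresolvable.
function resolveIdentity({ dID, dDocName }, revisions) {
  if (dID !== undefined) {
    // Given a dID, the dDocName follows directly.
    const match = revisions.find((rev) => rev.dID === dID);
    return match ? { dID, dDocName: match.dDocName } : null;
  }
  if (dDocName !== undefined) {
    // Given only a dDocName, assume the most current revision.
    const ids = revisions
      .filter((rev) => rev.dDocName === dDocName)
      .map((rev) => rev.dID);
    return ids.length > 0 ? { dID: Math.max(...ids), dDocName } : null;
  }
  return null;
}

console.log(resolveIdentity({ dDocName: "WW_1_21" }, sampleRevisions));
console.log(resolveIdentity({ dID: 41 }, sampleRevisions));
```

The second rule is the one to watch: a cached page that still shows revision 41 would be counted as a view of revision 42 when only the dDocName is supplied.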

8.4.4.7 Examples of Web Beacon Embedding

There are many different embedding methods to use when implementing the Content Server web beacon feature. Each technique has advantages and disadvantages and one may be more appropriate for a particular situation than another. Due to the differences in system configurations, there isn't a single technique that is optimal in all circumstances.

This section describes three techniques:

Note:

Code fragment files for all of the examples are available and included in Content Tracker's documentation directory. These examples are intended to demonstrate general approaches to tracking accesses to managed content. For your convenience, the code fragments are provided as a starting point. However, they may need to be adapted to work with your specific application and network topology.

All of the examples below use the same web beacon object: WebBeacon.bmp. This graphic file is checked into the Content Server instance IFHE.comcast.net/idc/ with a dDocName of wb_bmp. Typically, the web beacon object is a 1 x 1 pixel transparent image, but can actually be almost anything.

8.4.4.7.1 Embedded HTML Example

The simplest, most direct use of a web beacon for tracking managed content access is to embed a reference to the beacon directly into the HTML source for the containing web page. When the requesting user's browser attempts to render the page, it will send a request to the Content Server instance where the web beacon object resides. The extra information appended to the beacon URL as a set of query parameters will be counted by Content Tracker as a reference to the web page, not the web beacon object.

In this example, the technique involves placing an image tag in the web page to be tracked. The src attribute of the image refers to the web beacon object (wb_bmp) which was checked into a Content Server instance named IFHE.comcast.net/idc/. When the user's browser loads the image from IFHE.comcast.net/idc/, the additional query information will be recorded, and ultimately interpreted as a reference to dDocName BOPR.

This approach is undoubtedly the simplest, but has the obvious disadvantage that the user's browser, and/or a reverse proxy server, might cache a copy of the web beacon object. In that case, no additional requests will be posted directly to the Content Server instance IFHE.comcast.net/idc/, and no additional accesses to any content tagged with this web beacon will be counted.

The HTML fragment for this method might be written as follows:

<!-- WebBeaconEmbeddedHtml.htm - Adjust the Web Beacon web location and managed object identifiers in the img src attribute, then paste into your web page -->
<img src="http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp?sct_dID=1&sct_dDocName=BOPR&sct_uriStem=http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf&sct_Ext_1=Sample_Html_Beacon_Access" width="21" height="21" />

8.4.4.7.2 Embedded Javascript Example

The cached web beacon problem described in the Embedded HTML Example can be overcome by using Javascript instead of HTML. Using the embedded Javascript method requires two script tags:

  • The cs_callWebBeacon function that issues the actual web beacon request.

  • An unnamed block that assigns context values to certain Javascript variables, and then calls the cs_callWebBeacon function.

This method offers some advantages over Embedded HTML. The identifying information for the managed content object is defined in a list of variables, which improves readability. Also, the web beacon request is made effectively unique by adding a random number to the pseudo query parameters.

However, there are also some disadvantages. There is more code to manage, the URL of the web beacon server is hardcoded in each web page, and the user's browser might not have Javascript enabled.

The Javascript fragment for this method might be written as follows:

// WebBeaconEmbeddedJavascript.js - Adjust the managed object and Web Beacon descriptors,
// and then paste this into your web page.
//

<script type="text/javascript" >

    var cs_obj_dID = "" ;
    var cs_obj_dDocName = "" ;
    var cs_obj_uriStem = "" ;
    var cs_extField_1 = "" ;
    var cs_extField_2 = "" ;
    var cs_extField_3 = "" ;
    var cs_extField_4 = "" ;
    var cs_extField_5 = "" ;
    var cs_extField_6 = "" ;
    var cs_extField_7 = "" ;
    var cs_extField_8 = "" ;
    var cs_extField_9 = "" ;
    var cs_beaconUrl = "" ;

    function cs_void( ) { return ; }

    function cs_callWebBeacon( ) {
        //
        var cs_imgSrc = "" ;
        var cs_inQry = false ;

        if ( cs_beaconUrl && cs_beaconUrl != "" ) {
            cs_imgSrc += cs_beaconUrl ;
        }

        if ( cs_obj_dID && cs_obj_dID != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dID=" + cs_obj_dID ;
        }

        if ( cs_obj_dDocName && cs_obj_dDocName != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dDocName=" + cs_obj_dDocName ;
        }
        if ( cs_obj_uriStem && cs_obj_uriStem != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_uriStem=" + cs_obj_uriStem ;
        }

        if ( cs_extField_1 && cs_extField_1 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_1=" + cs_extField_1 ;
        } 

        if ( cs_extField_2 && cs_extField_2 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_2=" + cs_extField_2 ;
        }

        // and so on for the remaining extended fields

        if ( cs_inQry ) {
            cs_imgSrc += "&" ;
        } else {
            cs_imgSrc += "?" ;
            cs_inQry = true ;
        }

        var dc = Math.round( Math.random( ) * 2147483647 ) ;
        cs_imgSrc += "sct_defeatCache=" + dc ;

        var wbImg = new Image( 1, 1 ) ;
        // Attach the handler before setting src so it is in place when the request starts.
        wbImg.onload = function( ) { cs_void( ) ; } ;
        wbImg.src = cs_imgSrc ;

    }

</script>

<script type="text/javascript">
    //
    var cs_obj_dID = "1" ;
    var cs_obj_dDocName = "BOPR" ;
    var cs_obj_uriStem = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf" ;
    var cs_extField_1 = "Sample_Javascript_Beacon_Access" ;
    var cs_beaconUrl = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp" ;

    cs_callWebBeacon( ) ;

</script>
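The repeated if blocks in cs_callWebBeacon all follow one pattern, so the query string could also be assembled in a loop that covers all nine extended fields. The following is a sketch under stated assumptions, not the shipped sample: cs_buildBeaconSrc and its parameters are hypothetical, and the parameter names sct_Ext_3 through sct_Ext_9 are inferred from the sct_Ext_1 and sct_Ext_2 pattern shown above.

```javascript
// Sketch only: a loop-based equivalent of the query-string logic in
// cs_callWebBeacon. cs_buildBeaconSrc and its parameters are hypothetical.
function cs_buildBeaconSrc( beaconUrl, fields, randomFn ) {
    var names = [ "sct_dID", "sct_dDocName", "sct_uriStem" ] ;
    for ( var i = 1 ; i <= 9 ; i++ ) {
        names.push( "sct_Ext_" + i ) ;  // assumed names for fields 3 to 9
    }

    var src = beaconUrl || "" ;
    var inQry = false ;

    // Appends one name=value pair, using "?" for the first and "&" afterward.
    function append( name, value ) {
        src += ( inQry ? "&" : "?" ) + name + "=" + value ;
        inQry = true ;
    }

    for ( var n = 0 ; n < names.length ; n++ ) {
        var value = fields[ names[ n ] ] ;
        // Skip unset or empty values, as the original if blocks do.
        if ( value && value != "" ) {
            append( names[ n ], value ) ;
        }
    }

    // A random value makes each request effectively unique so that
    // caches do not answer it. randomFn is optional and exists for testing.
    var rnd = randomFn || Math.random ;
    append( "sct_defeatCache", Math.round( rnd( ) * 2147483647 ) ) ;
    return src ;
}
```

In a page, the result would be assigned to the src property of the Image object in place of the hand-built cs_imgSrc string.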
8.4.4.7.3 Served Javascript Example

The hardcoded web beacon server problem described in the Embedded Javascript Example can be overcome by splitting the code into two fragments:

  • The Managed Code Fragment:

    In this example, the managed code fragment contains the cs_callWebBeacon function. It can be checked in and managed by a Content Server instance, either the instance which manages the web beacon, or some other. The src attribute contained in the in-page fragment refers to the managed code fragment and causes it to be dynamically loaded into the web page.

  • The In-Page Code Fragment:

    The in-page portion still consists of two <script> tags, but the first contains only a reference to the cs_callWebBeacon code instead of the code itself. The advantage for this is that changes to the cs_callWebBeacon function can be managed centrally instead of having to modify each and every tagged web page.

    This solution incurs the additional network overhead of loading the managed code into the web page on the user's browser. However, the need for web beacon assistance in tracking implies that the network environment includes an efficient reverse proxy server or other caching mechanism. The same cache that conceals managed object access will also minimize the impact of the code download.

Managed Code Fragment:

// WebBeaconServedJavascript_Checkin.js - Check this in to your Content Server, then fixup
// the Javascript include src attribute in WebBeaconManagedJavascriptIncludeSample.js
//
    var cs_obj_dID = "" ;
    var cs_obj_dDocName = "" ;
    var cs_obj_uriStem = "" ;
    var cs_extField_1 = "" ;
    var cs_extField_2 = "" ;
    var cs_extField_3 = "" ;
    var cs_extField_4 = "" ;
    var cs_extField_5 = "" ;
    var cs_extField_6 = "" ;
    var cs_extField_7 = "" ;
    var cs_extField_8 = "" ;
    var cs_extField_9 = "" ;
    var cs_beaconUrl = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp" ;

    function cs_void( ) { return ; }

    function cs_callWebBeacon( ) {
        //
        var cs_imgSrc = "" ;
        var cs_inQry = false ;

        if ( cs_beaconUrl && cs_beaconUrl != "" ) {
            cs_imgSrc += cs_beaconUrl ;
        }

        if ( cs_obj_dID && cs_obj_dID != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dID=" + cs_obj_dID ;
        }

        if ( cs_obj_dDocName && cs_obj_dDocName != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dDocName=" + cs_obj_dDocName ;
        }

        if ( cs_obj_uriStem && cs_obj_uriStem != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_uriStem=" + cs_obj_uriStem ;
        }

        if ( cs_extField_1 && cs_extField_1 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_1=" + cs_extField_1 ;
        }

        if ( cs_extField_2 && cs_extField_2 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_2=" + cs_extField_2 ;
        }

        // and so on for the remaining extended fields

        if ( cs_inQry ) {
            cs_imgSrc += "&" ;
        } else {
            cs_imgSrc += "?" ;
            cs_inQry = true ;
        }

        var dc = Math.round( Math.random( ) * 2147483647 ) ;
        cs_imgSrc += "sct_defeatCache=" + dc ;

        var wbImg = new Image( 1, 1 ) ;
        // Attach the handler before setting src so it is in place when the request starts.
        wbImg.onload = function( ) { cs_void( ) ; } ;
        wbImg.src = cs_imgSrc ;

    }

In-Page Code Fragment:

<script type="text/javascript" src="http://IFHE.comcast.net/idc/groups/public/documents/adacct/wbmjcs.js" >
</script>

<script type="text/javascript">
    //
    var cs_obj_dID = "1" ;
    var cs_obj_dDocName = "BOPR" ;
    var cs_obj_uriStem = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf" ;
    var cs_extField_1 = "Sample_Managed_Javascript_Beacon_Access" ;

    cs_callWebBeacon( ) ;

</script>
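Because the in-page fragment depends on a script served from another location, the call to cs_callWebBeacon fails with a script error if the managed fragment cannot be loaded. The following is a minimal sketch of a guard, assuming the page should simply skip tracking in that case. cs_tryCallWebBeacon is a hypothetical wrapper and is not part of the shipped samples.

```javascript
// Sketch only: call the beacon function only if the managed fragment loaded.
// cs_tryCallWebBeacon is a hypothetical wrapper, not part of Content Tracker.
function cs_tryCallWebBeacon( scope ) {
    if ( scope && typeof scope.cs_callWebBeacon === "function" ) {
        scope.cs_callWebBeacon( ) ;
        return true ;
    }
    // The managed script did not load; skip tracking rather than raise an error.
    return false ;
}
```

In the in-page fragment, cs_tryCallWebBeacon( window ) would replace the direct cs_callWebBeacon( ) call.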

8.4.5 Using Content Tracker

This section provides information and task procedures about Content Tracker functions. This section covers the following topics:

8.4.5.1 Changing the Variable Settings for the Performance Optimization Functions

You can change the values of the installation preference variables for the optimization functions using one of the following methods:

8.4.5.1.1 Changing Installation Preference Variables using Component Manager Advanced
  1. Log in to Content Server as an administrator.

  2. Select Admin Server from the Administration menu.

    The Content Admin Server page is displayed.

  3. Click the name of the Content Server instance whose installation preference variable settings will be changed.

    The Content Admin Server <instance_name> page is displayed.

  4. Click Component Manager Advanced.

    The advanced version of the Component Manager page is displayed.

  5. In the Update Component configuration field, select Content Tracker from the list.

  6. Click Update.

    The Update Component Configuration page is displayed.

  7. Select the installation preference variable that you want to update and enter the new setting.

  8. Click Update.

    Content Tracker is updated with the new setting, which takes effect immediately. You do not need to restart Content Server.

8.4.5.1.2 Changing Installation Preference Variables using the config.cfg File
  1. In a text editor, open the config.cfg file:

    IntradocDir/config/config.cfg

  2. Scroll to locate the installation preference variable that you want to update and change the value.

  3. Save and close the config.cfg file.

  4. Restart Content Server to apply the changes.

8.4.5.2 Accessing the Data Engine Control Center

The Data Engine Control Center is used to enable, schedule, and monitor data collection and reduction.

To access the Data Engine Control Center:

  1. Open the Content Tracker Administration page:

    Select the Administration tray, then Content Tracker Administration.

  2. Scroll down and click the Data Engine Control Center icon.

    The Content Tracker Data Engine Control Center interface is displayed.

8.4.5.3 Enabling or Disabling Data Collection

When data collection is enabled, Content Tracker logs web traffic activity on the Content Server. By default, the Enable Data Collection check box is selected on the Collection tab of the Data Engine Control Center. Selecting this check box enables data collection. Clearing this check box disables data collection.

To enable or disable data collection:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. On the Collection tab, select (to enable collection) or clear (to disable collection) the Enable Data Collection check box.

  3. Click OK.

    Note:

    After you click OK, do not immediately exit the applet. You must wait until the Updated Data Collection state confirmation message displays. Occasionally, this may take a few seconds. If you exit the applet before the confirmation message displays, the requested change(s) may not occur.
  4. After the Updated Data Collection state confirmation message displays, click OK.

  5. Restart the Content Server.

    Note:

    Look carefully at the text above the check box to determine whether data collection is enabled or disabled.

    When enabled, the text reads "Data collection is enabled..."

    When disabled, the text reads "Data collection is not enabled..."

8.4.5.4 Running Data Reduction Manually

To manually reduce data:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. On the Reduction tab, click (to highlight) the set of input data to reduce.

  3. Click the Reduce Data button.

    A confirmation dialog box is displayed.

  4. Click Yes to reduce the data.

    The Status will change from 'ready' to 'running,' and the Percent Done will display the progress of data reduction. When data reduction is complete, the time stamp will be displayed in When Finished, and the Cycle will display 'recent.'

    Note:

    If you choose to reduce the current date's data, the data will be reduced, but the Cycle will continue to display the data set as 'new.'

8.4.5.5 Setting Data Reduction to Run Automatically

To set data reduction to run automatically:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. On the Schedule tab, select the Scheduling Enabled check box.

  3. Select check boxes for the days when data reduction will occur.

  4. Select the hour and minute when data reduction will occur.

  5. Click OK.

    Note:

    After you click OK, do not immediately exit the applet. You must wait until the Updated reduction scheduling information confirmation message displays. Occasionally, this may take a few seconds. If you exit the applet before the confirmation message displays, the requested change(s) may not occur.
  6. After the Updated reduction scheduling information confirmation message displays, click OK.

    Data will be reduced automatically on the day(s) and time that you selected.

8.4.5.6 Deleting Data Files in Any Cycle

To delete data files in any cycle:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. On the Reduction tab, click (to highlight) the set of input data to delete.

  3. Click the Delete button.

    A confirmation dialog box is displayed.

  4. Click OK to delete the data.

    The selected data set is deleted and is no longer displayed in the window.

8.4.5.7 Deleting Data Files in 'Archive' Cycle

To delete data files in the archive cycle:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. On the Reduction tab, click the Delete Archive button.

    A confirmation dialog box is displayed.

  3. Click OK to delete the data.

    All data sets in the 'archive' cycle are deleted and are no longer displayed in the window.

8.4.5.8 Creating the Search Relevance Metadata Fields

Before you can implement the snapshot function, you must decide which custom metadata fields will be associated with each of the enabled activity metrics. Also, the custom metadata fields must already exist and must be of the correct type. Depending on which activity metrics you plan to enable, you must create one or more custom metadata fields using an applicable procedure.

This section covers the following topics:

8.4.5.8.1 Creating the Custom Metadata Field for the Last Access Metric

To create custom metadata fields to assign to the last access field:

  1. Open the Content Tracker Administration page:

    Select the Administration tray, then Content Tracker Administration.

  2. Click the Configuration Manager icon.

    The Configuration Manager interface is displayed.

  3. On the Information Fields tab, click Add.

    The Add Custom Info Field screen is displayed.

  4. Enter the name of the metadata field to be assigned to the Last Access metric. For example, LastAccess.

  5. Click OK.

    The Add Custom Info Field screen is displayed.

  6. Select Date from the Field Type menu.

    Normally, you do not need to enter a value in the Default Value field. However, if you do not enter a value for this field and there is no specified default value, then the Last Access field is not populated until a content item has been checked in and a data reduction has been run. Some applications, however, need the Last Access field to contain a valid value at all times. In this case, enter a value in the Default Value field that ensures the Last Access field is populated with the date and time of the content checkin. For more information, see "Populating the Last Access Field Using the Default Value".

    Field Type with a value of Date is the only required attribute for the last access custom metadata field. However, if you want the last access custom metadata field to be searchable, you must ensure that the Enable for Search Index check box is selected.

    Indexing this custom metadata field is optional, although indexing makes searches on this field more efficient. Furthermore, indexing allows you to query the accumulated search relevance statistics and generate useful data. For example, you can create a list of content items ordered by their popularity, etc.

    For more information about the advantages and disadvantages of indexing the search relevance metadata fields, see the "Snapshot Tab".

  7. Click OK.

    The custom metadata field is added to the Field Info list on the Information Fields tab.

  8. Click Update Database Design to validate the current database and add the custom metadata field to the system.

8.4.5.8.2 Creating the Custom Metadata Field for the Short and Long Access Metrics

To create custom metadata fields to assign to the short and long access fields:

  1. Open the Content Tracker Administration page:

    Select the Administration tray, then Content Tracker Administration.

  2. Click the Configuration Manager icon.

    The Configuration Manager interface is displayed.

  3. On the Information Fields tab, click Add.

    The Add Custom Info Field screen is displayed.

  4. Enter the name of the metadata field to be assigned to the Short or Long Access Count metric. For example, ShortAccess or LongAccess.

  5. Click OK.

    The Add Custom Info Field screen is displayed.

  6. Select Integer from the Field Type menu.

    Field Type with a value of Integer is the only required attribute for the Short and Long Access Count custom metadata fields. However, if you want the Short and Long Access Count custom metadata fields to be searchable, you must ensure that the Enable for Search Index check box is selected for both.

    Indexing these custom metadata fields is optional, although indexing makes searches on these fields more efficient. Furthermore, indexing allows you to query the accumulated search relevance statistics and generate useful data. For example, you can create a list of content items ordered by their popularity, etc.

    For more information about the advantages and disadvantages of indexing the search relevance metadata fields, see the "Snapshot Tab".

  7. Click OK.

    The custom metadata field is added to the Field Info list on the Information Fields tab.

  8. Click Update Database Design to validate the current database and add the custom metadata field to the system.

8.4.5.9 Enabling the Snapshot Function and the Activity Metrics Options

By default, the snapshot function and activity metrics are disabled. To use these optional features, you must first enable the snapshot post-processing function which activates the activity metrics choices. Then, you can selectively enable the desired activity metrics and assign their preselected custom metadata fields.

To enable the snapshot function and activate the activity metrics:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Snapshot tab.

  3. Select the Enable Snapshot post-processing check box.

    The snapshot function is enabled and the activity metrics options are activated.

  4. Click OK.

    A confirmation dialog box is displayed.

  5. Click OK.

    The snapshot state and Content Tracker's configuration file (sct.cfg) are updated.

    To verify the snapshot function and activity metrics have been enabled, you can access the Content Tracker's sct.cfg file in the following directory:

    <cs_root>/data/contenttracker/config/sct.cfg

    Optionally, you can manually enable the snapshot function and activate the activity metrics options. For more detailed information about the specific snapshot configuration variables and how to manually edit them, see "Configuration Variables" and "Manually Setting Content Tracker Configuration Variables", respectively.

8.4.5.10 Linking Activity Metrics Functions to Search Relevance Metadata Fields

After the activity metrics options have been activated, they must be individually selected to enable them. Enabling the activity metrics also activates their corresponding custom metadata fields.

To enable the activity metrics and activate their corresponding custom metadata fields:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Snapshot tab.

    The snapshot function must be enabled; otherwise the activity metrics options are not activated. See "Enabling the Snapshot Function and the Activity Metrics Options".

  3. Select one or more of the activity metric check boxes.

    Each selected activity metric is enabled and each corresponding custom metadata field is activated.

  4. In the Field field, enter the internal name of the custom metadata field to be linked to the activity metric. For example, xLastAccess, xShortAccess, or xLongAccess.

  5. For the Short and Long Access Counts, enter the applicable Interval amounts in days. For example, 7 days for the Short Access Count and 28 days for the Long Access Count.

  6. Click OK.

    A confirmation dialog box is displayed.

  7. Click OK.

    The snapshot state and Content Tracker's configuration file (sct.cfg) are updated.

    Content Tracker performs minimal error checking on the activity metrics field names. Be aware that the fields on the Snapshot tab are case-sensitive. It is important that all field values are spelled and capitalized correctly. For more information about the specific Content Tracker error checks for the snapshot function, see the "Snapshot Tab".

    To verify that the activity metrics are linked to the appropriate custom metadata fields, you can access the Content Tracker's sct.cfg file in the following directory:

    <cs_root>/data/contenttracker/config/sct.cfg

    Optionally, you can manually link the activity metrics to their respective custom metadata fields. For more detailed information about the specific activity metrics configuration variables and how to manually edit them, see "Configuration Variables" and "Manually Setting Content Tracker Configuration Variables", respectively.
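To make the two Interval settings concrete, the following sketch counts how many accesses of one content item fall within a short (7-day) and a long (28-day) window. It illustrates the idea of interval-based counts only. countAccessesInInterval is hypothetical, the sample dates are invented, and the code does not reproduce how Content Tracker computes these metrics during data reduction.

```javascript
// Sketch only: counts accesses that fall within the last intervalDays days.
// countAccessesInInterval is hypothetical and does not reproduce
// Content Tracker's reduction logic.
function countAccessesInInterval( accessTimes, now, intervalDays ) {
    var msPerDay = 24 * 60 * 60 * 1000 ;
    var cutoff = now.getTime( ) - intervalDays * msPerDay ;
    var count = 0 ;
    for ( var i = 0 ; i < accessTimes.length ; i++ ) {
        var t = accessTimes[ i ].getTime( ) ;
        // Count accesses newer than the cutoff and not in the future.
        if ( t > cutoff && t <= now.getTime( ) ) {
            count++ ;
        }
    }
    return count ;
}

// Invented sample data for one content item.
var now = new Date( Date.UTC( 2024, 0, 29 ) ) ;
var accesses = [
    new Date( Date.UTC( 2024, 0, 28 ) ),  // 1 day before now
    new Date( Date.UTC( 2024, 0, 25 ) ),  // 4 days before now
    new Date( Date.UTC( 2024, 0, 10 ) ),  // 19 days before now
    new Date( Date.UTC( 2023, 11, 1 ) )   // 59 days before now
] ;
var shortCount = countAccessesInInterval( accesses, now, 7 ) ;   // 2
var longCount = countAccessesInInterval( accesses, now, 28 ) ;   // 3
```

A content item that is popular now but was rarely used earlier shows a short count close to its long count, which is the kind of distinction the two intervals are meant to expose.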

8.4.5.11 Setting a Checkin Time Value for the Last Access Metadata Field

The Last Access Date field is normally updated by Content Tracker when a managed object is requested by a user and a data reduction is run. Therefore, the Last Access field in Content Server's DocMeta database table may be empty (NULL) until the next data reduction is run.

However, some applications require that the date and time of content checkin be recorded immediately in the Last Access field. To accommodate this requirement, the Last Access field must be populated with an appropriate date and time value. Content Tracker provides several methods to populate the Last Access field.

This section covers the following topics:

8.4.5.11.1 Populating the Last Access Field Using the Default Value

Normally, you do not need to enter a value in any field that has a Default value. However, if you do not enter a value for the Last Access field, and there is no specified default value, then the field is not populated when a content item is checked in. The checkin date or most recent access date is only recorded once a data reduction has been run.

To support the requirements for particular applications, you can use the Autoload option to backfill the Last Access field for existing content, see "Populating the Last Access Field Using the Autoload Option". For all future content item checkins, you can configure the Last Access custom metadata field by setting the Default Value field.

The value you enter must be a function or an expression that will cause the field to be populated with the date and time of content checkin. This ensures that the current date and time is automatically entered into the Last Access field.

To populate the Last Access field using the Default Value field:

  1. Open the Content Tracker Administration page:

    Select the Administration tray, then Content Tracker Administration.

  2. Click the Configuration Manager icon.

    The Configuration Manager interface is displayed.

  3. On the Information Fields tab, select the custom metadata field that you have linked to the Last Access metric and click Edit.

    The Edit Custom Info Field screen is displayed.

    Note:

    The Last Access custom metadata field must already exist. If not, you must create it and link it to the Last Access activity metric function. See "Creating the Custom Metadata Field for the Last Access Metric".
  4. In the Default Value field, enter an expression that will cause the field to be populated with the date and time of content checkin.

    For example, you could specify a default value of <$dateCurrent()$> to cause the Last Access field to be populated with the current checkin date and time.

  5. Click OK.

    The Last Access custom metadata field is updated.

  6. Backfill the Last Access field for existing content, see "Populating the Last Access Field Using the Autoload Option".

8.4.5.11.2 Populating the Last Access Field Using the Autoload Option

The Autoload option on the Snapshot tab allows you to retroactively replace NULL values in the Last Access field with the current date and time. The only DocMeta records that are affected using the Autoload option are those where the Last Access metadata field is empty (NULL).

To populate the Last Access field using the Autoload option:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Snapshot tab.

    The snapshot function must be enabled; otherwise the activity metrics options are not activated. See "Enabling the Snapshot Function and the Activity Metrics Options".

  3. Select the Enable Last Access updates check box.

  4. Link the Last Access metric to the applicable custom metadata field, see "Linking Activity Metrics Functions to Search Relevance Metadata Fields".

  5. Select the Autoload check box.

  6. Click OK.

    A confirmation dialog box is displayed and the current date and time are inserted into the applicable Last Access fields (those with NULL values) in Content Server's DocMeta database table.

    Tip:

    By default, the Autoload query sets the Last Access metadata field to the current date and time. However, you may customize the query to set the Last Access field to dCreateDate, dReleaseDate, or any other time that meets the needs of your application. See "Customizing the Autoload Option SQL Query".
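The rule that Autoload touches only empty values can be modeled on in-memory records as follows. This is an illustration of the documented behavior, not the Autoload query itself: autoloadLastAccess is hypothetical, and the records stand in for DocMeta rows.

```javascript
// Sketch only: models the Autoload rule on in-memory records.
// autoloadLastAccess is hypothetical; the real option runs a query
// against the DocMeta table.
function autoloadLastAccess( records, fieldName, now ) {
    var updated = 0 ;
    for ( var i = 0 ; i < records.length ; i++ ) {
        var value = records[ i ][ fieldName ] ;
        // Only empty (NULL) values are replaced; existing dates are kept.
        if ( value === null || value === undefined ) {
            records[ i ][ fieldName ] = now ;
            updated++ ;
        }
    }
    return updated ;
}
```

Passing a different value for now, such as a record's own creation date, corresponds to the customization described in the tip above.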
8.4.5.11.3 Populating the Last Access Field for Batchloads and Archives

To ensure proper retention of archived and batchloaded content, you must set the Last Access field date for the import/insert. Otherwise, the access date for these content items will be NULL, and retention based on this field will fail.

Note:

The Last Access date can be used in conjunction with Retention Manager to maintain retention schedules. Ensuring that this field is set properly during batchloads and archives is important for the success of the retention. Please consider carefully what date is most reflective of when the content was last accessed. For example, an import of 1998 data is probably better tagged with that date than the date you perform the import.

The name of the Last Access field is based on the name you specified in Configuration Manager, see "Creating the Custom Metadata Field for the Last Access Metric". For example, if you named the field LastAccess, use the internal name xLastAccess in the import/insert. Refer to the Enable Last Access updates check box and corresponding data field on the "Snapshot Tab".

To populate the Last Access field using Content Server's Batch Loader:

  1. Access the Batch Loader.

  2. Create a file record that establishes an appropriate Last Access date. The following is an example of an applicable file record:

    # This is a comment
    Action=insert
    dDocName=Sample1
    dDocType=ADACCT
    xLastAccess=5/1/1998
    dDocTitle=Batch Load record insert example
    dDocAuthor=sysadmin
    dSecurityGroup=Public
    primaryFile=links.doc
    dInDate=8/15/2001
    <<EOD>>
    
  3. Run the Batch Loader to process the file record.

    Note:

    Refer to the Oracle Fusion Middleware System Administrator's Guide for Content Server for more detailed information.
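When many records need a Last Access date, the file records could be generated by a script. The following is a minimal sketch that reproduces the layout of the example record above. buildBatchRecord and formatBatchDate are hypothetical helpers, and the M/D/YYYY date form is copied from the example; the date format that your system accepts may differ.

```javascript
// Sketch only: formats one Batch Loader file record from name/value pairs,
// following the layout of the example record. Both functions are hypothetical.
function buildBatchRecord( action, fields ) {
    var lines = [ "Action=" + action ] ;
    for ( var name in fields ) {
        if ( Object.prototype.hasOwnProperty.call( fields, name ) ) {
            lines.push( name + "=" + fields[ name ] ) ;
        }
    }
    // Each record ends with the end-of-data marker.
    lines.push( "<<EOD>>" ) ;
    return lines.join( "\n" ) ;
}

// Formats a Date as M/D/YYYY, the form used for xLastAccess in the example.
function formatBatchDate( d ) {
    return ( d.getMonth( ) + 1 ) + "/" + d.getDate( ) + "/" + d.getFullYear( ) ;
}

var record = buildBatchRecord( "insert", {
    dDocName: "Sample1",
    dDocType: "ADACCT",
    xLastAccess: formatBatchDate( new Date( 1998, 4, 1 ) ),
    dDocTitle: "Batch Load record insert example",
    dDocAuthor: "sysadmin",
    dSecurityGroup: "Public",
    primaryFile: "links.doc",
    dInDate: "8/15/2001"
} ) ;
```

The value of record matches the example file record without its comment line and can be written to a batch load file for the Batch Loader to process.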

8.4.5.12 Editing the Snapshot Configuration

To modify the current snapshot activity metrics settings:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Snapshot tab.

    The snapshot function must be enabled; otherwise the activity metrics options are not activated. See "Enabling the Snapshot Function and the Activity Metrics Options".

  3. Make the necessary changes in the activity metrics fields.

  4. Click OK.

    A confirmation dialog box is displayed.

  5. Click OK.

    The snapshot state and Content Tracker's configuration file (sct.cfg) are updated.

    Content Tracker performs minimal error checking on the activity metrics field names. Be aware that the fields on the Snapshot tab are case-sensitive. It is important that all field values are spelled and capitalized correctly. For more information about the specific Content Tracker error checks for the snapshot function, see the "Snapshot Tab".

    To verify the modified values of the snapshot and activity metrics configuration variables, you can access the Content Tracker's sct.cfg file in the following directory:

    <cs_root>/data/contenttracker/config/sct.cfg

    Optionally, you can manually edit the configuration settings for the snapshot activity metrics. For more detailed information about the specific activity metrics configuration variables and how to manually edit them, see "Configuration Variables" and "Manually Setting Content Tracker Configuration Variables", respectively.

8.4.5.13 Adding/Editing Service Entries

To add or edit a service:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Services tab.

  3. Click Add to create a new service entry.

    Or, select an existing service entry from the Service Name list and click Edit.

    The Extended Service Tracking screen is displayed. The fields are empty if you are adding a new service entry.

    If you are editing an existing service entry, the fields are populated with those values. In this case, the Service Name field is deactivated.

  4. Enter or modify the applicable field values (except in the Field Map field).

    If you want to link this service entry to a field map ResultSet, enter the applicable name in the Field Map field. Then, see the procedure for "Adding Field Map ResultSets and Linking Them to Service Entries".

  5. Click OK.

    A confirmation dialog box is displayed.

  6. Click OK.

    The Extended Service Tracking screen closes and the Services tab on the Data Engine Control Center is displayed.

    If you added a new service entry, it is included in the Services list. If you edited an existing service entry, the updated field values are included in the Services list.

    The services state and Content Tracker's SctServiceFilter.hda are updated.

    Content Tracker does not perform error checking (such as field type or spelling verification) for the extended services tracking function in the Data Engine Control Center. Errors are not generated until you perform a reduction. These fields are case-sensitive. Therefore, if you are adding new services or editing existing services, be careful to enter the proper service call names. Ensure that all field values are spelled and capitalized correctly.

    To verify that the service entry's values are added to the SctServiceFilter.hda file or that the existing service entry's values are properly modified, you can access Content Tracker's SctServiceFilter.hda file in the following directory:

    <cs_root>/data/contenttracker/config/SctServiceFilter.hda

    Optionally, you can manually add or edit services. For more detailed information about service entries in the SctServiceFilter.hda file and how to manually edit them, see "About the Service Call Configuration File" and "Manually Editing the SctServiceFilter.hda File", respectively.
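
    The verification described above can also be scripted. The following is a minimal Python sketch, not part of Content Tracker. It assumes that SctServiceFilter.hda uses the usual HDA layout, in which a ResultSet block starts with an @ResultSet line, gives the column count, lists the column names one per line, lists the row values one per line, and closes with @end. The ResultSet name, column names, and sample rows are illustrative assumptions; compare them with your own file before relying on the sketch.

```python
def parse_resultsets(hda_text):
    """Parse every @ResultSet block in HDA text into {name: [row_dict, ...]}.

    Assumed layout: '@ResultSet NAME', column count, one column name per
    line, then one value per line (row by row), then '@end'.
    """
    lines = hda_text.splitlines()
    resultsets = {}
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("@ResultSet "):
            name = line.split(None, 1)[1]
            ncols = int(lines[i + 1].strip())
            # A column line may carry extra tokens after the name; keep the name only.
            cols = [lines[i + 2 + k].split()[0] for k in range(ncols)]
            i += 2 + ncols
            values = []
            while i < len(lines) and lines[i].strip() != "@end":
                values.append(lines[i].strip())
                i += 1
            rows = [dict(zip(cols, values[r:r + ncols]))
                    for r in range(0, len(values) - ncols + 1, ncols)]
            resultsets[name] = rows
        i += 1
    return resultsets


def has_service(resultsets, service_name,
                rs_name="ServiceExtraInfo", col="sctServiceName"):
    """Case-sensitive check for a service entry (names are assumptions)."""
    return any(row.get(col) == service_name
               for row in resultsets.get(rs_name, []))


# Hypothetical file content; an empty string stands for an empty field.
SAMPLE = "\n".join([
    "@ResultSet ServiceExtraInfo",
    "5",
    "sctServiceName", "sctCallingProduct", "sctEventType",
    "sctReference", "sctFieldMap",
    "GET_FORM_FILE", "Core Server", "GetFormFile", "dDocName", "",
    "GET_SEARCH_RESULTS", "Core Server", "Search", "", "SearchFieldMap",
    "@end",
])

parsed = parse_resultsets(SAMPLE)
print(has_service(parsed, "GET_FORM_FILE"))   # exact spelling and case
print(has_service(parsed, "get_form_file"))   # wrong case is not a match
```

    Because the comparison is exact, the check fails for a misspelled or miscapitalized service name, which is the kind of error the Data Engine Control Center does not report.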

8.4.5.14 Adding Field Map ResultSets and Linking Them to Service Entries

To implement the extended service call tracking function, you need to link service entries to field map ResultSets in the SctServiceFilter.hda file.

To add a field map ResultSet and link it to a service entry:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Services tab.

  3. Select the desired service entry from the Service Name list.

    If you need to add a new service entry, refer to the procedure for "Adding/Editing Service Entries".

  4. Click Edit.

    The Extended Service Tracking screen is displayed and the fields are populated with the selected service entry's values. In this case, the Service Name field is deactivated. If necessary, you can edit this service entry's values now in addition to adding the field map ResultSet.

    If this service is already linked to a field map ResultSet, the name is listed in the Field Map field and one or more data field, location, and table column sets are listed in the Field Name, Field Location, and Column Name fields. If you want to edit or delete existing data field, location, and table column sets, see the procedure for "Editing Field Map ResultSets".

  5. If the selected service is already linked to a field map ResultSet, skip this step. However, if the selected service is not linked to a secondary ResultSet, the Field Map field is empty. Enter the name of the field map ResultSet.

  6. Click Add.

    The Field Map screen is displayed.

  7. Enter the appropriate values in the fields.

  8. Click OK.

    The Field Map screen closes and the values are added to the Field Name and Column Name fields.

    Note:

    If you need to add more than one data field, location and table column set, repeat Steps 6 through 8 as necessary.

  9. Click OK.

    A confirmation dialog box is displayed.

  10. Click OK.

    The Extended Service Tracking screen closes and the Services tab on the Data Engine Control Center is displayed.

    The services state and Content Tracker's SctServiceFilter.hda file are updated.

    Content Tracker does not perform error checking (such as field type or spelling verification) for the extended services tracking function in the Data Engine Control Center. Errors are not generated until you perform a reduction. These fields are case-sensitive. Therefore, if you are adding new field map ResultSets or editing existing field map ResultSets, be careful to enter the proper DataBinder field names and SctAccessLog table column names. Ensure that all field values are spelled and capitalized correctly.

    To verify that the field map ResultSet values are added to the service call configuration file or that the values are properly modified, you can access the Content Tracker's SctServiceFilter.hda file in the following directory:

    <cs_root>/data/contenttracker/config/SctServiceFilter.hda

    Optionally, you can manually add field map ResultSets and manually link them to service entries. For more detailed information about service entries and field map ResultSets in the SctServiceFilter.hda file and how to manually edit them, see "About the Service Call Configuration File" and "Manually Editing the SctServiceFilter.hda File", respectively.

8.4.5.15 Editing Field Map ResultSets

To edit a field map ResultSet:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Services tab.

  3. Select the desired service entry from the Service Name list.

  4. Click Edit.

    The Extended Service Tracking screen is displayed and the fields are populated with the selected service entry's values. In this case, the Service Name field is deactivated. If necessary, you can also edit the other field values for this service entry in addition to editing the field map ResultSet.

  5. To edit the Field Map ResultSet, you can either:

  6. Click OK.

    A confirmation dialog box is displayed.

  7. Click OK.

    The Extended Service Tracking screen closes and the Services tab on the Data Engine Control Center is displayed. The services state and Content Tracker's SctServiceFilter.hda file are updated.

    Content Tracker does not perform error checking (such as field type or spelling verification) for the extended services tracking function in the Data Engine Control Center. Errors are not generated until you perform a reduction. These fields are case-sensitive. Therefore, if you are editing field map ResultSets by adding one or more data field, location, and table column sets, be careful to enter the proper data field names, location names, and SctAccessLog table column names. Ensure that all field values are spelled and capitalized correctly.

    To verify the modified values of the data field, location, and table column sets in field map ResultSets, you can access the Content Tracker's SctServiceFilter.hda file in the following directory:

    <cs_root>/data/contenttracker/config/SctServiceFilter.hda

    Optionally, you can manually modify the values of the data field, location, and table column sets in field map ResultSets. For more detailed information about field map ResultSets in the SctServiceFilter.hda file and how to manually edit them, see "About the Service Call Configuration File" and "Manually Editing the SctServiceFilter.hda File", respectively.

8.4.5.16 Deleting Service Entries

To delete a service:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Services tab.

  3. In the Services list, select the service entry that you want to delete.

  4. Click Delete.

    You are asked to verify your request to delete service logging for this service entry.

  5. Click Yes.

    The selected service entry is deleted from the Services list and removed from the SctServiceFilter.hda file.

    To verify that the service entry has been deleted, you can access the Content Tracker's SctServiceFilter.hda file in the following directory:

    <cs_root>/data/contenttracker/config/SctServiceFilter.hda

    Optionally, you can manually delete specific service entries. For more information, see "Manually Editing the SctServiceFilter.hda File".

8.4.5.17 Deleting Field Map ResultSets

To delete a field map ResultSet:

  1. Open the Data Engine Control Center, see "Accessing the Data Engine Control Center".

  2. Click the Services tab.

  3. In the Services list, select the service entry that is linked to the field map ResultSet that you want to delete.

  4. Click Edit.

    The Extended Service Tracking screen is displayed and the fields are populated with the selected service entry's values.

  5. Remove the field map ResultSet name from the Field Map field.

  6. Select a data field, location, and table column set and click Delete.

    The data field, location, and table column set is removed from the list. Repeat this step for each data field, location, and table column set (as necessary).

  7. Click OK.

    The field map ResultSet is removed from the SctServiceFilter.hda file. It is no longer linked to the service entry.

    To verify that the field map ResultSet has been deleted, you can access the Content Tracker's SctServiceFilter.hda file in the following directory:

    <cs_root>/data/contenttracker/config/SctServiceFilter.hda

    Optionally, you can manually delete field map ResultSets. For more information, see "Manually Editing the SctServiceFilter.hda File".

8.4.5.18 Adding/Editing Web Beacon Object Names to the Web Beacon ID List

To set or edit the preference setting for the list of web beacon objects:

  1. Log in to Content Server as an administrator.

  2. Select Admin Server from the Administration menu.

    The Content Admin Server page is displayed.

  3. Click the name of the Content Server instance whose web beacon preference setting will be changed.

    The Content Admin Server <instance_name> page is displayed.

  4. Click Component Manager.

    The Component Manager page is displayed.

  5. In the Update Component configuration field, select Content Tracker from the list.

  6. Click Update.

    The Update Component Configuration page is displayed.

  7. In the SctWebBeaconIDList preference field, enter the applicable web beacon object dDocNames separated by commas.

  8. Click Update.

  9. Restart Content Server to apply the changes.
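
The value entered in the SctWebBeaconIDList preference field is a plain comma-separated list. As a minimal sketch, the following Python function builds such a value from a list of names. The helper name and the sample dDocNames are hypothetical, and stripping whitespace and dropping duplicates are assumptions about what is convenient rather than documented requirements.

```python
def build_beacon_id_list(ddoc_names):
    """Return a comma-separated string of unique, non-empty dDocNames.

    Order is preserved and surrounding whitespace is stripped. Letter case
    is left exactly as entered.
    """
    seen = set()
    cleaned = []
    for name in ddoc_names:
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    return ",".join(cleaned)


# Hypothetical web beacon object names.
print(build_beacon_id_list(["beacon_home ", "beacon_products", "beacon_home"]))
```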

8.5 Report Generation

Content Tracker Reports uses the captured and reduced data to generate reports that outline the usage history of particular pieces of content. You can use the pre-defined reports that are provided or create custom queries for the information to be tracked. Optionally, you can use any external commercial reporting tool. See "External Report Generator".

Note:

By default, Content Tracker is configured for maximum performance using several optimization functions (see "Performance Optimization Functions"). As such, Content Tracker collects and records only content access event data. This excludes information gathering on non-content access events like searches as well as the collection and synthesis of user profile summaries.

Because non-content event data is not gathered, various pre-defined report options are not displayed on the Content Tracker Report Generator main page. However, all of the pre-defined report options can be available if you reconfigure Content Tracker. This is done by changing the settings of the optimization functions (see "Changing the Variable Settings for the Performance Optimization Functions").

Note:

The user reports are only visible if the Content Access Only operating mode is disabled. See "Performance Optimization Functions".

Reports can be derived from a variety of criteria, including specific users, groups of users, and any set of content that can be defined by a query or group of metadata values. Based on the variables in the system (such as number of users, amount of content, metadata count, etc.), Content Tracker Reports enables hundreds of key metrics to be included in reports. Specialized reports enable you to understand and disclose which content is most relevant to users.

8.5.1 Oracle and DB2 Case Sensitivity

If Oracle or DB2 is used as the Content Server database, metadata values are case sensitive and care must be used when entering the content metadata values in the applicable query report criteria. As a result, depending on how a value is entered in the corresponding field, Content Tracker Reports may not return all possible matching files.

With either an Oracle or DB2 Content Server database, values must be entered exactly as they are entered in the Content Server. Therefore, depending on the lettering structure of the values in the Content Server, the values entered in the query metadata fields will need to be entered in all lowercase letters, all uppercase letters or mixed-case letters. Otherwise, Content Tracker Reports will not return all of the matching files.

For example, if the content type in the Oracle or DB2 Content Server database is AdAcc but the user enters it in the query field as adacc, ADACC, or Adacc, Content Tracker Reports will not return any results. In this case, the content type metadata value must be entered using mixed-case letters. This is true for all of the metadata fields in each of the pre-defined query reports.
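
The effect can be illustrated outside the database. The following Python sketch is only an analogy with made-up content type values: it compares strings exactly, as a case-sensitive database does, and separately lists stored values that differ from the entered value only by case. The second list is a quick way to see why a query returned nothing.

```python
def exact_matches(stored_values, entered):
    """Values that a case-sensitive comparison would match."""
    return [v for v in stored_values if v == entered]


def case_only_mismatches(stored_values, entered):
    """Stored values equal to the entry except for letter case."""
    return [v for v in stored_values
            if v != entered and v.casefold() == entered.casefold()]


# Hypothetical content types as stored in Content Server.
stored = ["AdAcc", "AdSales", "AdEng"]

for attempt in ("adacc", "ADACC", "Adacc", "AdAcc"):
    print(attempt,
          exact_matches(stored, attempt),
          case_only_mismatches(stored, attempt))
```

Only the last attempt, AdAcc, produces an exact match. The other three produce none, but each is flagged as a case-only mismatch of the stored value.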

8.5.2 Access Control Lists and Content Tracker Reports Secure Mode

The security checks preference variable (SctrEnableSecurityChecks) is set when you install the Content Tracker Reports component. Essentially, this preference variable enables you to select one of two security modes: secure and non-secure. The security checks preference provides the option to employ individual user role and account information to restrict the visibility of content item information in report results.

This means that you control which content items (and, subsequently, which metadata) users can see in their generated reports. Ideally, users should not be able to see anything through Content Tracker Reports that they couldn't find via a Content Server search. Therefore, if you select the secure mode, the information in any generated report will be filtered based on the user's role and account privileges.

However, if you have enabled Access Control Lists (ACLs) on your Content Server instance, the secure mode option in Content Tracker Reports does not work. During installation, you must leave the security checks preference check box blank. This means that on an ACL-based system, the secure mode must be disabled. In this case, it is possible for users other than a system administrator to see information about content items that they would not otherwise be authorized to access and view.

Note:

For more detailed information about the security checks installation preference and how it will affect the report queries and report results, see "Security Checks and Query Results".

8.5.3 Pre-Defined Reports

Content Tracker Reports provides many pre-defined report options that you can use to generate reports for the most commonly requested topics.

8.5.3.1 Default Report Format

Each report produced using the Content Tracker Report Generator main page has the same general format and visual layout. The following is the Top Content Items report that is selected by default when the Content Tracker Report Generator main page is accessed. The information provided by the reports is extracted from the reduced data in the SctAccessLog database table and other Content Server database tables, as necessary.

Only users that actually request and open a content item are included in the Content Tracker Report Generator's compiled results. The opened content item can be the web location file (the absolute path to the content item), an HTML version (by using Dynamic Converter), or the actual native file. Users that open only the Content Information page are not included in the tracked data.

There is generally a one-day delay from the time that a user accesses a content item until that information is included in the Content Tracker Report Generator's access history results. The information must first be accumulated by Content Tracker and then undergo a data reduction cycle. Thus, the content item access history results are derived from the reduced data in the SctAccessLog and other Content Server database tables. Manually reducing the data immediately updates the database tables and, subsequently, the generated query reports will also display the updated information. For more information about the reduction process, see the "Reduction Tab".

[Screen capture: sample_report.gif]

Field                           Description
Report Name field               The name of the selected query report.
Dates field                     The dates entered in the Start Date and End Date fields. If you did not enter specific dates, the default dates are used for the query.
Results table columns           Provide the relevant information for the selected report.
Printer-friendly Version link   Opens a new browser window and displays the report without the navigation trays.

8.5.3.2 Content Dashboard Feature

When a generated query report contains an active link to a specific content item, clicking the link displays the corresponding Content Dashboard. The content dashboard in the following screen capture shows that two versions of a particular content item were each accessed three times. In this view, the revision access results are shown individually.

[Screen capture: ctr_vers_sep.gif]

If you click the All Versions Together link on the Content Dashboard, the access results for both versions are combined.

[Screen capture: ctr_vers_togeth.gif]

8.5.3.3 Drill Down Report Feature

There are various levels of report results that are generated for each pre-defined report. Depending on the search criteria you enter on the Content Tracker Report Generator main page, the results are filtered accordingly. The top level reports are summary reports and provide very general information. You can use the links on the top level reports to drill down to more specific information.

[Screen capture: ctr_report_drlldwn_scrn.gif]

8.5.4 Custom Reports

In addition to the sample reports provided with Content Tracker Reports, you also have the option to create custom queries to track information. However, before you begin creating your custom report query, you should be aware of some issues that may affect how you design your query.

8.5.4.1 Custom Report Queries and Oracle

If you are using Oracle and aliases to display the column names in the generated report, you must add the aliases to the following file:

/shared/config/resources/upper_clmns_map.htm

Example:

If your column headers are:

  • Name

  • Access_Date_GMT

Then, you must add the following lines to the upper_clmns_map.htm file:

<tr>
<td>NAME</td>
<td>Name</td>
</tr>
<tr>
<td>ACCESS_DATE_GMT</td>
<td>Access_Date_GMT</td>
</tr>
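
For a report with many aliased columns, the rows can be generated rather than typed. This is a minimal Python sketch (the function name is hypothetical) that assumes, as in the example above, that each row pairs the uppercase form of the alias with the alias as it should be displayed.

```python
def alias_rows(aliases):
    """Build <tr> rows mapping each UPPERCASE column name to its display alias."""
    rows = []
    for alias in aliases:
        rows.append(
            "<tr>\n<td>{}</td>\n<td>{}</td>\n</tr>".format(alias.upper(), alias)
        )
    return "\n".join(rows)


print(alias_rows(["Name", "Access_Date_GMT"]))
```

Called with the two column headers from the example, the function prints the same eight lines that are listed above.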

8.5.4.2 Custom Report Queries and Extended Service Tracking

If you are using the extended service tracking function, you must be aware of what data values are written to specific columns in the SctAccessLog table before designing your SQL queries. In particular, you must be aware that the name of the service will always be logged to the sc_scs_idcService column. Therefore, you should include it as a qualifier in any query that uses the contents of the extended fields.
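
The following Python sketch illustrates the qualifier. It is an illustration under stated assumptions, not a Content Tracker query: an in-memory SQLite database stands in for the Content Server database, the extended column name extField_1 and the service names are assumed for the example, and only the SctAccessLog table name and the sc_scs_idcService column come from this guide. The sketch also assumes that different services may write unrelated values to the same extended column, which is why the unqualified count can mislead.

```python
import sqlite3

# In-memory stand-in for the SctAccessLog table (two columns only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SctAccessLog (sc_scs_idcService TEXT, extField_1 TEXT)"
)
conn.executemany(
    "INSERT INTO SctAccessLog VALUES (?, ?)",
    [
        ("GET_FORM_FILE", "form-A"),       # hypothetical service and value
        ("GET_FORM_FILE", "form-A"),
        ("GET_SEARCH_RESULTS", "form-A"),  # other service, same column
    ],
)


def count_by_extended_field(connection, service_name):
    """Count rows per extField_1 value, restricted to one service."""
    sql = (
        "SELECT extField_1, COUNT(*) FROM SctAccessLog "
        "WHERE sc_scs_idcService = ? "
        "GROUP BY extField_1 ORDER BY extField_1"
    )
    return connection.execute(sql, (service_name,)).fetchall()


def count_unqualified(connection):
    """Same count without the service qualifier, mixing all services."""
    sql = (
        "SELECT extField_1, COUNT(*) FROM SctAccessLog "
        "GROUP BY extField_1 ORDER BY extField_1"
    )
    return connection.execute(sql).fetchall()


print(count_by_extended_field(conn, "GET_FORM_FILE"))
print(count_unqualified(conn))
```

The qualified query counts only the rows logged by the named service, while the unqualified query also counts rows that another service wrote to the same column.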

For more information about the extended service tracking function, see "About the Service Call Configuration File" and the "Services Tab".

8.5.4.3 Custom Report Query Display Results

After you have successfully added the custom report query to the report query file, you can use it and view the resulting custom report link, generated custom report, and drill-down report:

Figure 8-1 Custom Report Link

Figure 8-2 Generated Custom Report

Figure 8-3 Drill-Down Report

8.5.5 External Report Generator

Commercial report generation tools can be used to produce basic text reports or more sophisticated graphics such as bar graphs or pie charts from the data collected by Content Tracker.

Note:

This guide assumes that users have a comprehensive working knowledge of or competent familiarity with the external reporting tool they are using to create custom reports. For this reason, this section is intentionally written to provide only very basic guidelines that can be applicable to most commercially available reporting products.

8.5.6 User Authentication/Authorization and Auditing

Content Tracker Reports provides an auditing feature that enables you to monitor unsuccessful attempts to access the system or permission-protected content items. Two reports are available that can help you analyze attempted security breaches, including failed user logons and unsuccessful attempts to access secure content items. This information is essential to safeguard system and content security as well as to maintain proper audit trails and records.

The available auditing reports include:

  • Authorization Failures by User

    This report provides access authorization denial information that includes user names and their IP addresses. Although these users have system access privileges, their role/account memberships may restrict them from accessing particular content items (such as access to payroll content).

  • Login Failures

    This report provides login/authentication failure information that includes user names and their IP addresses. The logged data does not distinguish between external, internal, and global users because, without a successful login, it is impossible to differentiate user types.

8.5.7 Site Studio Web Site Activity Reporting

If you are using Site Studio, then Content Tracker is automatically configured to track Site Studio activity. Content Tracker Reports uses the logged data to generate the pre-defined reports that summarize the Web site access results.

8.5.7.1 Main Page Site Studio Report Links

The Site Studio-specific Web access reports are included on the Content Tracker Report Generator main page if you have installed Site Studio.

[Screen capture: ctr_ss_mainpage_rpts.gif]

8.5.7.2 Site Studio Pre-Defined Reports

The Site Studio pre-defined reports use the default Content Tracker Reports formatting and provide drill-down report capabilities. The top level reports for both are summary reports that use Site ID and Accesses as their general criteria. The drill-down reports provide the relevant statistics.

  • Web Site Content Accesses

    This report is ID based at the top level; in subsequent drill-down reports, the results are listed by Content ID and Relative URL. The information shows what URLs are being used to access a Web site. However, there are cases where many different URLs will actually display the same page. Therefore, the results of this report also provide the total number of hits on the nodes, regardless of how the user got there.

    [Screen capture: ctr_ss_mainpg_rpts_site.gif]

  • Web Site Accesses by URL

    This report provides summaries of the Web site relative URLs and the relevant activity sums.

    [Screen capture: ctr_ss_mainpg_rpts_url.gif]

8.5.8 Security Checks and Query Results

During the installation process for Content Tracker Reports, you have the option to employ individual user role and account information to restrict the visibility of content item information in report results. This means that you control which content items (and, subsequently, which metadata) users can see in their generated reports. Ideally, users should not be able to see anything through Content Tracker Reports that they couldn't find via a Content Server search.

Caution:

If you have enabled Access Control Lists (ACLs) on your Content Server instance, the secure mode option in Content Tracker Reports does not work. For more information, see "Access Control Lists and Content Tracker Reports Secure Mode".

8.5.8.1 Security Checks Preference Variable

The security checks preference variable (SctrEnableSecurityChecks) is set when you install the Content Tracker Reports component. This preference variable enables you to select one of two security modes: secure and non-secure. The secure mode takes into account which user is running the report queries and the non-secure mode does not.

Note:

During installation, you decide which mode is used by selecting the security checks checkbox or leaving it blank. After installation, you also have the option to change the setting using the Component Manager, see "Changing the Security Checks Preference Setting".

8.5.8.1.1 Values for the Security Checks Preference Variable

The values for the security checks preference variable include:

  • SctrEnableSecurityChecks=True (selected checkbox) enables the security checks installation preference and configures Content Tracker Reports to operate in secure mode.

    In secure mode, the same security criteria (role and account qualifications) that are used to limit Content Server search results are also applied to the Content Tracker Report Generator's queries and the generated reports. Thus, it is possible that two different users running the Top Content Items report may see different results. See the secure mode example in "Security Mode Examples".

  • SctrEnableSecurityChecks=False (blank checkbox) disables the security checks installation preference and configures Content Tracker Reports to operate in non-secure mode. This is the default setting.

    In non-secure mode, the additional role and account criteria used to restrict Content Server search results are not applied to Content Tracker Report Generator's queries and the generated reports. Thus, it is possible for a user other than a system administrator to see information about content items that they would not be authorized to access and view. See the non-secure mode example in "Security Mode Examples".

8.5.8.1.2 Security Mode Examples

A user might have admin, contributor, guest, and sysmanager privileges (a semi-admin user) but lack the proper role/account membership to see a particular content item (such as the payroll report). The assigned privileges allow this user to access the Content Server Admin page and, therefore, the Content Tracker Report Generator main page. However, when this user performs a standard search in Content Server, the results page would not reveal that the payroll report exists.

If the security checks preference variable is enabled, Content Tracker Reports enforces the same role/account membership checks. Then, depending on the user requesting a specific report, the role/account matching activity determines what content item usage data is included.

As demonstrated in the following examples, the report results generated for a specific user (the semi-admin user described above) are contingent upon whether the preference variable is enabled or not.

  • Secure mode example:

    When the security checks preference is enabled, Content Tracker Reports is running in secure mode and checks for role/account matches. In this case, the semi-admin user is not entitled to retrieve and view confidential data. Due to the restrictions associated with this user's role/account privileges, the payroll content item remains completely invisible. The data is not included in report results and the user is unaware of its existence.

  • Non-secure mode example:

    When the security checks preference is disabled, Content Tracker Reports is running in non-secure mode and does not check for role/account matches. In this case, although the semi-admin user is not entitled to access or view the payroll report, some confidential information associated with the payroll content item can nevertheless be retrieved.

    At the very least, the user can discover the payroll report's existence and view some of its metadata. The danger in this situation depends on what kind of information the metadata contains. In some cases, even knowing the content item exists could be a serious breach of security.

    Note:

    This kind of security breach is not limited to semi-admin users. For example, a non-privileged user (that is, someone not ordinarily authorized to view a particular content item on a search results page) might gain access to the Content Tracker Report Generator main page. This could occur either by reaching the Admin page or by guessing a URL. In this case, the user would see a report containing some of the metadata describing the prohibited content item.

8.5.8.2 Report Queries and Security Modes

The contenttrackerreports_query.htm file contains all the Content Tracker Report Generator's queries that produce the pre-defined and custom reports. To support non-secure and secure modes, this file contains two sets of queries. One set takes into account which user is running the query (secure mode) and the other set does not (non-secure mode). The security checks preference setting determines which set of queries is used, see "Security Checks Preference Variable".

Note:

For localization support, the word "document" was changed to "content item" in the pre-defined report names. However, the corresponding report queries still include an abbreviation for the word document (doc). The report query names have not been changed in the contenttrackerreports_query.htm file.

For example, the "Top Content Items" report is one of the pre-defined reports listed on the Content Tracker Report Generator main page. The corresponding report queries in the contenttrackerreports_query.htm file use the pre-existing naming conventions:

qSctrTopDocs (non-secure version)

qSctrTopDocs_SEC (secure version)

8.5.8.2.1 Pre-Defined Reports and Security Modes

Almost all the pre-defined report queries have both secure and non-secure forms included in the contenttrackerreports_query.htm file. Generally, if the search results of a query can be affected by user role and account privileges, then secure variants of the non-secure queries are included. And, if the security checks preference variable is enabled, then the secure forms of queries take precedence and are executed instead of the corresponding non-secure queries.

It is not possible to selectively enable or disable the security checks preference variable for individual report queries. However, it is possible to manage secure and non-secure queries by customizing the contenttrackerreports_query.htm file. In effect, you can disable security checks (account matching) for a particular query by deleting or renaming the secure form of the query. Thus, if the security checks preference variable is enabled, but a secure form of a given query is not found in the contenttrackerreports_query.htm file, then the non-secure form of the query is used to generate the report.

For more information about using security checks for a particular pre-defined query, see "Enabling/Disabling Security Checks for Report Queries". For more information about collectively enabling or disabling security checks for all report queries, see "Changing the Security Checks Preference Setting".

8.5.8.2.2 Custom Reports and Security Modes

In addition to the pre-defined reports, you can create custom reports that are based on search queries tailored to your particular needs, and you can selectively implement security checks for them. That is, if you want security checks performed for your new custom report, include both the non-secure and secure forms of the query in the contenttrackerreports_query.htm file.

For example, you can add a custom report with both query forms. If the non-secure query name is qMyTopTwenty, then the secure query name would be qMyTopTwenty_SEC. If the security checks preference variable is enabled, the report is generated using the secure query (qMyTopTwenty_SEC). If the security checks preference variable is not enabled, the report is generated using the non-secure query (qMyTopTwenty).

Note:

The secure form of a custom query should follow the specific pattern of the existing secure queries in the contenttrackerreports_query.htm file. For more information, see "Creating Secure Report Queries".

For more information about using security checks for a particular custom query, see "Enabling/Disabling Security Checks for Report Queries". For more information about collectively enabling or disabling security checks for all report queries, see "Changing the Security Checks Preference Setting".

8.5.8.3 Security Mode Selection

To generate a requested report, Content Tracker Reports must select and execute the applicable non-secure or secure query.

8.5.8.3.1 Query Type Selection Process

Content Tracker Reports chooses a report query based on the following process:

  • When a user submits a report request, the name of that report query is fed to a dedicated Content Tracker Reports service.

  • The Content Tracker Reports service enforces the security checks setting as follows:

    • If the security checks preference is disabled:

      Content Tracker Reports is running in non-secure mode and does not perform role/account matching (user role and account privilege verification). The Content Tracker Reports service searches for the non-secure version of the query and uses it to generate the requested report. It is irrelevant whether there is a secure version of the report query.

      In non-secure mode, only non-secure queries are used to generate reports. As a result, all users see the same report results regardless of their individual role and account memberships.

    • If the security checks preference is enabled:

      Content Tracker Reports is running in secure mode and performs role/account matching (user role and account privilege verification).

      To begin processing:

      The Content Tracker Reports service appends the "_SEC" suffix to the submitted query name and searches the contenttrackerreports_query.htm file for this variant of the requested query.

      During the search:

      • If the secure form of the query is found, then it is used to generate the requested report.

        This means that the security checks to enforce role/account matching are performed and the query results are limited by the role and account privileges of the user requesting the report. Accordingly, different users may see different data results.

      • If the secure form of the query is not found, then the non-secure variant is used.

        This produces the same result as if the security checks preference were disabled: role/account permissions are not verified and the content item data is not filtered. Consequently, the results included in reports are identical for all users, and users without proper permissions may be able to view confidential information.
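
The selection process above can be summarized in a short sketch. The Python below is illustrative only: the function name select_report_query and the representation of the query file as a set of query names are assumptions made for this example, not part of Content Tracker Reports.

```python
def select_report_query(query_name, security_checks_enabled, defined_queries):
    """Return the name of the query that would be used for a report request.

    defined_queries stands in for the query names found in
    contenttrackerreports_query.htm (a hypothetical representation).
    """
    if security_checks_enabled:
        secure_name = query_name + "_SEC"
        if secure_name in defined_queries:
            # Secure form exists: results are filtered by role/account.
            return secure_name
    # Non-secure mode, or secure mode with no secure form available.
    return query_name


queries = {"qSctrUsersByType", "qSctrUsersByType_SEC", "qMyTopTwenty"}
print(select_report_query("qSctrUsersByType", True, queries))
print(select_report_query("qMyTopTwenty", True, queries))
print(select_report_query("qSctrUsersByType", False, queries))
```

The second call shows the fallback case: security checks are enabled, but because no qMyTopTwenty_SEC query is defined, the non-secure query is selected and the results are not filtered.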

8.5.8.3.2 Report Query Selection Example

When a user requests the User Type report:

  1. The report query name (qSctrUsersByType) is passed to the Content Tracker Reports service.

  2. The Content Tracker Reports service evaluates the request based on the security checks preference variable:

    1. If security checks are disabled (set to false), then the service finds the qSctrUsersByType query in the contenttrackerreports_query.htm file.

    2. If security checks are enabled (set to true), then the service adds a security suffix to the query name (qSctrUsersByType_SEC) and searches for this variant in the contenttrackerreports_query.htm file.

  3. Depending on the security checks status, Content Tracker Reports uses the applicable query to generate the Users by User Type report.

8.5.8.4 Customization for Report Query Security

In secure mode, Content Tracker Reports always gives priority to the secure forms of queries. This means that if a secure form of a query is found in the contenttrackerreports_query.htm file, then it is used to generate the report instead of the corresponding non-secure query.

It is not possible to selectively enable or disable the security checks preference variable for individual report queries. However, it is possible to manage secure and non-secure queries by customizing the contenttrackerreports_query.htm file. Depending on your security requirements for report data, you may optionally want to customize the report query file.

Customizing the report query file involves:

  • Selectively enabling or disabling security checks (account matching) for specific report queries.

  • Creating one or more non-secure custom report queries and, depending on the security requirements of the information, selectively including the corresponding secure version.

8.5.9 Using Content Tracker Reports

This section provides information and task procedures about Content Tracker Reports functions.

8.5.9.1 Generating Reports

To generate a pre-defined or custom report:

  1. Open the Content Tracker Report Generator main page by clicking the Content Tracker Reports link in the Administration tray.

  2. Select the radio button of the desired report type.

  3. Enter any desired search and filtering criteria in the applicable fields.

  4. Click Submit.

    The selected report type is displayed.

8.5.9.2 Accessing Drill Down Reports

To access one or more drill down reports:

  1. Generate a pre-defined or custom report. See "Generating Reports".

  2. After generating a pre-defined report, certain line item results contain an active drill down report link. Click on the desired link.

    The selected drill-down report is displayed.

    Note:

    Some reports contain multiple levels of drill down reports. For example, the Top Content Items report contains a DocName drill down report link. Clicking this link generates another report that displays the applicable content access details for the selected content item. In this report, two additional drill down reports are available: one for Accesses and another for Users.

8.5.9.3 Accessing Reports from the Information Page

The Access History Report for any content item can be generated from the Information page of that content item as follows:

  1. Search for a content item and click the associated Info icon.

    The Content Information page is displayed.

  2. Select View Access History Report from the Global Actions list.

    The most current Content Access Report for the content item is displayed.

  3. On the Content Access Report, click the live Accesses link.

    The most current Accesses by Day report for the content item is displayed.

  4. On the Content Access Report, click the live Users link.

    The most current Accesses by User report for the content item is displayed.

8.5.9.4 Viewing Access Results by Revision

By default, the access results for multiple versions of a single content item are displayed individually on the Content Dashboard. To see the separated access results view of the Content Dashboard report:

  1. Generate a content item-based query report from the Content Tracker Report Generator main page. See "Generating Reports". For example, select the Top Content option on the Content Tracker Report Generator Main Page to generate the applicable report.

  2. Select a content item from the results report and click the content identification number listed in the DocName column.

    The Content Dashboard for the selected content item is displayed. By default, this view shows the access results for each revision of the selected content item that was accessed. For more information, see "Content Dashboard Feature".

8.5.9.5 Viewing Access Results for All Versions Combined

To see the combined access results view of the content dashboard report:

  1. Generate a content item-based query report from the Content Tracker Report Generator main page. See "Generating Reports". For example, select the Top Content option on the Content Tracker Report Generator Main Page to generate the applicable report.

  2. Select a content item from the results report and click the content identification number listed in the DocName column.

    The Content Dashboard for the selected content item is displayed.

  3. Click the All Versions Together link.

    The resulting content dashboard view shows the combined access results for all versions.

8.5.9.6 Creating Custom Report Queries

This section provides an example that demonstrates how to create a non-secure custom report query. This particular query generates a report that lists users and their personal attributes. The data is derived from the Content Server's Users database table.

Note:

The example in this section uses a non-secure query. Therefore, the generated report results can be viewed by any user regardless of their role and account privileges. All reports are generated using either non-secure or secure queries, and the query selection depends on the security mode. For more detailed information about the optional security checks preference variable, see "Security Checks and Query Results". If you want to create a secure report query, see "Creating Secure Report Queries".

To create the custom users report:

  1. Design your SQL report query.

  2. Enter the custom report query into the query file of Content Tracker Reports:

    1. In a text editor, open the contenttrackerreports_query.htm file:

      IntradocDir/custom/ContentTrackerReports/resources/contenttrackerreports_query.htm

    2. Enter the custom report name, number of columns, and the source database table.

      For example, the following excerpt from the query file illustrates that the custom query report will extract the information from all columns in the Users database table.

      <tr>
          <td>qCustomUsers</td>
          <td>
          SELECT *
          FROM Users
          </td>
      </tr>
      
  3. Enter a link to the custom report in the Content Tracker Report Generator main page file:

    1. Open the following directory:

      IntradocDir/custom/ContentTrackerReports/templates

    2. In a text editor, open the following file:

      contenttrackerreports_main_page.htm

    3. Enter the attributes to display the link on the Content Tracker Report Generator main page.

      For example, the following excerpt from the main page file illustrates that the custom report link is presented as a selectable radio button and is listed as "Custom Users Report" on the page. See the custom report link in the section about "Custom Report Query Display Results".

      <h4 class=xuiSubheading>Custom Reports</h4>
      <table width=80% border=0>
          <tr>
          <td> <span class="tableEntry"><input type="radio" name="radiobutton" value="qCustomUsers">
          Custom Users Report </span></td>
          </tr>
      </table>
      
  4. Enter the formatting requirements in the template resource file of Content Tracker Reports:

    1. Open the following directory:

      IntradocDir/custom/ContentTrackerReports/resources

    2. In a text editor, open the following file:

      contenttrackerreports_template_resource.htm

      To view the resulting custom report format, see the generated custom report in the section about "Custom Report Query Display Results".

    3. Enter the display features to use for the generated custom report as well as any desired drill-down reports. See the drill-down report in the section about "Custom Report Query Display Results".

      For example, the following excerpt from the template resource file illustrates that in addition to the link listing, the report title is "Deanna's First Report" and a drill-down report is provided that is based on the content items seen by user report.

      <!-- Custom Template -->
      <@dynamichtml qCustomUsers_vars@>
          <$reportWidth = "100%"$>
          <$title = "<i>Content Access Report</i>"$>
          <$reportTitle="Deanna's First Report"$>
          <$column1Width="35%"$>
          <$column0Drill="qSctrDocsSeenByUser_Drill"$>
      <@end@>
      
  5. Restart the Content Server to apply the changes.

8.5.9.7 Changing the Security Checks Preference Setting

You can manually enable or disable the SctrEnableSecurityChecks preference setting:

  1. Log in to Content Server as an administrator.

  2. Select Admin Server from the Administration menu.

    The Content Admin Server page is displayed.

  3. Click the name of the Content Server instance whose security checks preference setting will be changed.

    The Content Admin Server <instance_name> page is displayed.

  4. Click Component Manager.

    The Component Manager page is displayed.

  5. In the Update Component configuration field, select Content Tracker Reports from the list.

  6. Click Update.

    The Update Component Configuration page is displayed.

  7. In the SctrEnableSecurityChecks preference field, enter the new setting (true or false).

  8. Click Update.

    Content Tracker Reports is updated with the new setting, which takes effect immediately. You do not need to restart Content Server.

8.5.9.8 Enabling/Disabling Security Checks for Report Queries

If the security checks preference variable is enabled, and a secure version of a query exists in the contenttrackerreports_query.htm file, then Content Tracker Reports will use the secure query to generate the requested report. However, you may decide that certain reports do not need to be generated using security checks. Accordingly, you can selectively disable the secure version of any report query.

To disable security checks (account matching) for particular report queries:

  1. In a text editor, open the contenttrackerreports_query.htm file:

    IntradocDir/custom/ContentTrackerReports/resources/contenttrackerreports_query.htm

  2. Locate the secure version of the query that you want to disable.

  3. Rename the query. For example, if you want to disable the qSctrUsersByType_SEC query, you can add the suffix "_disabled" to the query name:

    qSctrUsersByType_SEC_disabled
    

    Renaming the query ensures that the Content Tracker Reports service cannot find the secure query in the contenttrackerreports_query.htm file. Instead, the non-secure version (qSctrUsersByType) will be used.

    Note:

    Renaming a secure query is a temporary disabling solution. Later, if you decide that you prefer to use the secure version of a query, you can easily re-enable it by restoring its original name.

    Alternatively, you can delete the secure version of the query. However, if you subsequently reconsider, you would need to recreate the entire secure version of the query.

  4. Save and close the contenttrackerreports_query.htm file.

  5. Restart the Content Server to apply the changes.
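
The renaming in step 3 can also be scripted. The following Python is a minimal sketch, assuming that each query name appears as the complete content of a <td> cell, as in the qCustomUsers excerpt shown earlier. The function name disable_secure_query is hypothetical, and the layout of a real query file may differ, so review the file before and after running anything like this.

```python
import re


def disable_secure_query(query_file_text, secure_name, suffix="_disabled"):
    """Rename a secure query so a lookup by its original name fails.

    Returns the updated text and the number of cells that were renamed.
    Only cells whose whole content is secure_name are changed, so the
    non-secure query and already-disabled queries are left alone.
    """
    pattern = re.compile(r"(<td>\s*)" + re.escape(secure_name) + r"(\s*</td>)")
    return pattern.subn(
        lambda m: m.group(1) + secure_name + suffix + m.group(2),
        query_file_text,
    )


sample = (
    "<tr>\n    <td>qSctrUsersByType</td>\n    <td>SELECT ...</td>\n</tr>\n"
    "<tr>\n    <td>qSctrUsersByType_SEC</td>\n    <td>SELECT ...</td>\n</tr>\n"
)
updated, renamed = disable_secure_query(sample, "qSctrUsersByType_SEC")
print(renamed)
print(updated)
```

Running the function a second time on the updated text renames nothing, because the cell no longer contains the original secure name. To re-enable the query, restore the original name by hand or with the inverse substitution.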

8.5.9.9 Creating Secure Report Queries

For most of the pre-defined report queries, there are both non-secure and secure versions in the contenttrackerreports_query.htm file. Optionally, you can create a secure version for any query that does not currently have one. In particular, this includes any non-secure custom queries that you have added.

To create a secure version of a non-secure report query:

  1. In a text editor, open the contenttrackerreports_query.htm file:

    IntradocDir/custom/ContentTrackerReports/resources/contenttrackerreports_query.htm

  2. Locate the query for which you want to create a secure version. For consistency, you should add your secure query immediately following the corresponding non-secure version.

  3. Design your secure SQL report query. It might be helpful to review Step 2 in the procedure for "Creating Custom Report Queries".

  4. Adjust your query to ensure that it follows the pattern of the existing secure queries:

    1. In the FROM clause, include the Revisions table.

    2. In the WHERE clause, include the %SCTR_SECURITY_CLAUSE% token. This acts as a placeholder for the WHERE clause that the Content Tracker Reports service inserts.

    3. Complete the query following the established pattern in the existing secure queries.

      Figure 8-4 illustrates a typical report query pairing.

  5. Save and close the contenttrackerreports_query.htm file.

  6. Restart the Content Server to apply the changes.
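
The role of the %SCTR_SECURITY_CLAUSE% token in step 4 can be pictured as plain text substitution. The Python below is a sketch under stated assumptions: the sample query, the sample clause, and the function name expand_security_clause are all invented for illustration. The clause that the Content Tracker Reports service actually inserts depends on the requesting user's roles and accounts and is not reproduced here.

```python
SECURITY_TOKEN = "%SCTR_SECURITY_CLAUSE%"


def expand_security_clause(secure_query, security_clause):
    """Replace the placeholder token with a concrete WHERE-clause fragment."""
    if SECURITY_TOKEN not in secure_query:
        raise ValueError("secure query is missing " + SECURITY_TOKEN)
    return secure_query.replace(SECURITY_TOKEN, security_clause)


# Hypothetical secure query: Revisions in FROM, token in WHERE.
secure_query = "SELECT dDocName FROM Revisions WHERE %SCTR_SECURITY_CLAUSE%"

# Hypothetical clause; the real one is generated per user by the service.
clause = "(dSecurityGroup IN ('Public'))"

print(expand_security_clause(secure_query, clause))
```

A query that omits the token raises an error in this sketch, which mirrors the requirement that every secure query include the placeholder in its WHERE clause.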

Figure 8-4 Examples of Non-Secure and Secure Report Query Versions

8.5.9.10 Using an External Report Generator

To generate custom reports from an external reporting tool:

  1. Open the external reporting tool application.

  2. Set up an ODBC connection (if appropriate) to the Content Server database.

  3. Select the database tables that you want to use in your report.

  4. Link together the selected tables based on key IDs or fields that are common within the files. Ideally, each selected table could be linked using the same key ID or field if it is common to each table.

  5. Choose and integrate the desired fields from each table into the report form. In most cases, the fields can be selected, dragged, and dropped onto the form.

    In this step, you design the customized report. The specific fields that you select will display as columns on the final, basic text report that the external reporting application generates.

  6. Optionally, you may want to create custom parameters and/or criteria if the external reporting application supports these options.

    For example, one type of custom parameter would allow you to either have queried information hard-coded into the final report or use a prompt to obtain input directly from the end user. Additionally, creating specific sort criteria can strategically restrict and optimize the aggregate data included in the final report.

  7. Specify the sorting order of the selected fields and format the final report output.

  8. Preview the final report (optional).

  9. Check the report into a delivery mechanism.

    Generally, the final report can be formatted and delivered as web-viewable pages or as a printable file. The external reporting application can also use the data results to create attractive graphics such as bar graphs or pie charts.

    Additionally, the saved file can be imported into other products such as Microsoft Excel or Word.

8.6 Service Call Configuration

8.6.1 About the Service Call Configuration File

The Content Tracker service handler filter makes it possible to gather information about Content Server activity other than content requests. Service request details are collected by the service handler filter and stored in the SctAccessLog table in real time. The details are obtained from the DataBinder that accompanies the service call. For a Content Server service call to be logged, it must have an entry in the service call configuration file (SctServiceFilter.hda).

The SctServiceFilter.hda file is a user-modifiable configuration file that is used to limit the number of service calls that are logged. This enables you to selectively control which services will be logged. Additionally, you can optionally expand the data logging function for any service call included in the SctServiceFilter.hda file. That is, you can also log and track data values of specific DataBinder fields that are relevant to a particular service. See "Extended Service Call Tracking Function".

Service tracking is limited to top-level services that are called via the server socket port. Sub-services, or services that are called internally, cannot be tracked.

The purpose of the SctServiceFilter.hda file is to define which parts of Content Server are of particular interest to users. If a Content Server service is not listed in the SctServiceFilter.hda file, it is ignored by Content Tracker. Additionally, if a service is not listed in this file, it can only be logged by the Content Tracker logging service. See "About the Content Tracker Logging Service".

There are two ways to make changes to the SctServiceFilter.hda file. You can add new services and edit the existing service call parameters in the file from the Data Engine Control Center, see "Services Tab". Or, you can manually edit the SctServiceFilter.hda file, see "Manually Editing the SctServiceFilter.hda File".

Tip:

You can control the services that you want to log by including or excluding them from the SctServiceFilter.hda file. This is an effective method to control logging for particular services or for all services. Also, the extended service call tracking function enables you to customize the type of data that is logged for a specific service.

8.6.1.1 General Service Call Logging

Services listed in the SctServiceFilter.hda file are detected by the Content Tracker service handler filter and the values of selected data fields are captured. Content Tracker then logs the named service calls. The information, along with timestamps and related details, is written dynamically into the SctAccessLog table.

For each enabled service, Content Tracker automatically logs certain standard DataBinder fields, such as dUser and dDocName. Also, DataBinder fields associated with the extended service call tracking function are logged to the general purpose columns in the SctAccessLog table.

Data is inserted into the SctAccessLog table in real time using Content Tracker-specific services sequence numbers and a type designation of "S" for service. ("W" designations indicate static URL event types.) Manual and/or scheduled reductions are only required to process the static URL access information gathered by the web server filter. See "Web Server Filter" for details.

8.6.1.2 Extended Service Call Tracking Function

The extended service call tracking function enables you to log Content Server service calls and, optionally, supplement this information by also logging relevant data values from one or more additional DataBinder fields other than the standard DataBinder fields logged by each configured service call.

8.6.1.2.1 Service Call ResultSet Combinations

Each service that Content Tracker logs must have an entry in the ServiceExtraInfo ResultSet that is contained in the SctServiceFilter.hda file. Content Tracker automatically logs various standard DataBinder fields, such as dUser and dDocName. However, the service-related data logged by Content Tracker can be expanded by logging and tracking relevant data values from supplementary DataBinder fields.

The extended service call tracking function is implemented by linking the entries in the ServiceExtraInfo ResultSet to field map ResultSets. Each field map ResultSet contains one or more sets of data field names, the source location, and the destination table column name in the SctAccessLog table. This grouping allows you to select data fields that are relevant to the associated service call and have the data values logged into the specified column in the SctAccessLog table.

Since more than one expanded service can be logged using the extended tracking function, the contents of the general purpose columns in the SctAccessLog table cannot be properly interpreted without knowing which service is being logged. The service name is always logged in the sc_scs_idcService column. Your queries should match this column with the desired service name.

Caution:

In field map ResultSets, nothing prevents you from mapping data fields to existing, standard SctAccessLog table columns. The extended service mapping occurs after the standard field data values are collected. Consequently, you can override any of the standard table column fields.

For example, the service you are logging might carry a specific user name (such as, MyUserName=john) in a data field. You could use the extended tracking function to override the contents of the sc_scs_dUser column. In this case, you simply combine MyUserName and sc_scs_dUser as the data field, location, and table column set in the field map ResultSet.

Therefore, it remains your responsibility to ensure that the data being logged is a reasonable fit with the SctAccessLog column type.

For examples of linked service entries and ResultSets, see "Linked Service Entries and Field Map ResultSets". For more information about the contents of the SctAccessLog table and the general purpose columns that are intended to be mapped to data fields, see the "Combined Output Table". For more information about the service call user interface, see the "Services Tab".

8.6.1.2.2 General Purpose Columns in the Output Table

In the field map ResultSets for extended service tracking, you must map the DataBinder fields to columns in the SctAccessLog table. The general purpose columns (extField_1 through extField_10) are available for mapping. These columns may be filled with any data values you consider appropriate for logging and tracking for a particular service. It is recommended and expected that you use these columns to avoid overwriting the standard table columns.

Tip:

The name of the service will always be logged to the sc_scs_idcService column. Therefore, you should include it as a qualifier in any query that uses the contents of the extended fields. For more information about custom reports that include specific SQL queries involving SctAccessLog table columns, see "Creating Custom Report Queries".
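
To show why the service name qualifier matters, the following Python sketch builds a tiny in-memory stand-in for the SctAccessLog table with sqlite3 and queries the extended fields for one service only. The table here contains just four of the real columns, and every row value is invented for the example. It is not a copy of the actual schema or of any real log data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced, hypothetical version of SctAccessLog (only four columns).
conn.execute(
    "CREATE TABLE SctAccessLog ("
    "sc_scs_idcService TEXT, sc_scs_dUser TEXT, "
    "extField_1 TEXT, extField_2 TEXT)"
)
conn.executemany(
    "INSERT INTO SctAccessLog VALUES (?, ?, ?, ?)",
    [
        # For search services, extField_1 holds MiniSearchText.
        ("GET_SEARCH_RESULTS", "jsmith", "budget", "translated query 1"),
        # For GET_FILE, extField_1 holds RevisionSelectionMethod instead.
        ("GET_FILE", "jsmith", "latestReleased", "web"),
        ("GET_SEARCH_RESULTS", "mlee", "invoice", "translated query 2"),
    ],
)


def search_terms(connection):
    """Return (user, search text) pairs for GET_SEARCH_RESULTS rows only."""
    cursor = connection.execute(
        "SELECT sc_scs_dUser, extField_1 FROM SctAccessLog "
        "WHERE sc_scs_idcService = ? ORDER BY sc_scs_dUser",
        ("GET_SEARCH_RESULTS",),
    )
    return cursor.fetchall()


print(search_terms(conn))
```

Without the WHERE condition on sc_scs_idcService, the GET_FILE row would be returned as well and its extField_1 value would be misread as search text.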

8.6.1.3 Service Call Configuration File Contents

The initial contents of the service call configuration file (SctServiceFilter.hda) are the commonly used content access, search, and user authentication services native to Content Server. This file contains a ResultSet structure with one entry for each service to be logged. Optionally, to support the extended service call tracking function, this file may also include field map ResultSets that are linked to the service entries contained in the ServiceExtraInfo ResultSet.

You can add new entries and/or edit existing entries in the SctServiceFilter.hda file with the Services user interface accessed through the Data Engine Control Center. Or, you can optionally change entries in the file manually. See the "Services Tab" or "Manually Editing the SctServiceFilter.hda File".

Note:

You can review the set of initial services that Content Tracker logs into the SctAccessLog table by accessing the SctServiceFilter.hda file in the following directory:

<cs_root>/data/contenttracker/config/SctServiceFilter.hda

The following tables provide details of the service call configuration file result set schema. The values are copied directly to the corresponding columns in the SctAccessLog table.

ServiceExtraInfo ResultSet Contents:

Feature Description
Service Name (sctServiceName) The name of the service to be logged. For example, GET_FILE. If no row is present in the ResultSet for a given service, the service will not be logged.
Calling Product (sctCallingProduct) An arbitrary string. It is generally set to "Core Server" for all standard Content Server entries.
Event Type (sctEventType) An arbitrary string. It is generally set to "Content Access" for all standard Content Server entries.
Reference (sctReference) Used to set the sc_scs_reference field in the SctAccessLog table. If blank, the internal getReference logic is used.
Field Map (sctFieldMap) The name of the field map ResultSet that is added to the SctServiceFilter.hda file. This field is only required if you plan to use the extended service call tracking function. This function enables you to log DataBinder field information to one or more of the general purpose columns in the SctAccessLog table.

Field Map ResultSet Contents:

Feature Description
Field Map Link The name of the field map ResultSet.

To help you create your field map, a configuration variable can be set that writes out the service DataBinder object. This enables you to see what data is available at the time the event is recorded.

DataBinder Field (dataFieldName) The name of the DataBinder field whose data values are logged to a general purpose column in the SctAccessLog table. See also the Field Name field on the Field Map Screen.
Data Location (dataLocation) The section in the Content Server service DataBinder where the field to be logged is located. See also the Field Location field on the Field Map Screen.
Access Log Column (accessLogColumnName) The specific general purpose column in the SctAccessLog table where data values from a specified DataBinder field are logged. See also the Column Name field on the Field Map Screen.

The fields copied from the DataBinder and inserted into the SctAccessLog table include: dID, dDocName, IdcService, dUser, SctCallingProduct, SctEventType, and SctReference. If the values for the latter three fields are included in a service's entry in the SctServiceFilter.hda file, they will override the corresponding values in the data field.

There should be no duplication or conflicts between services logged via the service handler filter and those logged via the Content Tracker logging service. If a service is named in the Content Tracker service handler filter file, it is logged automatically, so there is no need for the Content Tracker logging service to log it as well.

Tip:

Adding desired service calls to the SctServiceFilter.hda file and using this method to log specific activity gives you the advantage of providing values for the CallingProduct, EventType, and Reference fields. The assigned values are copied directly to the corresponding columns in the SctAccessLog table.
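
The override behavior described in this section can be sketched as follows. This Python is a simplified model, not Content Tracker code: the function name fields_to_log and the dictionary representation of the DataBinder and the service entry are assumptions, the sample values are invented, and the internal getReference logic that applies when the reference is blank is not modeled.

```python
# DataBinder fields that this section says are copied into SctAccessLog.
STANDARD_FIELDS = (
    "dID", "dDocName", "IdcService", "dUser",
    "SctCallingProduct", "SctEventType", "SctReference",
)

# Service-entry column -> DataBinder field that it overrides.
OVERRIDES = {
    "sctCallingProduct": "SctCallingProduct",
    "sctEventType": "SctEventType",
    "sctReference": "SctReference",
}


def fields_to_log(binder, service_entry):
    """Combine DataBinder values with non-blank service-entry values."""
    record = {name: binder.get(name, "") for name in STANDARD_FIELDS}
    for entry_key, field in OVERRIDES.items():
        value = service_entry.get(entry_key, "")
        if value:  # blank entries leave the DataBinder value untouched
            record[field] = value
    return record


binder = {"dID": "42", "dDocName": "report_001", "IdcService": "GET_FILE",
          "dUser": "jsmith", "SctEventType": "Download"}
entry = {"sctServiceName": "GET_FILE", "sctCallingProduct": "Core Server",
         "sctEventType": "Content Access", "sctReference": ""}

record = fields_to_log(binder, entry)
print(record["SctCallingProduct"], "/", record["SctEventType"])
```

In this example the event type supplied in the DataBinder is replaced by the value from the service entry, while the blank reference leaves the DataBinder value (here, empty) in place.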

8.6.1.4 ResultSet Examples

The default SctServiceFilter.hda file includes various common service calls.

Note:

You can review the initial set of services that Content Tracker logs into the SctAccessLog table along with the service entries and field map ResultSets by accessing the SctServiceFilter.hda file in the following directory:

<cs_root>/data/contenttracker/config/SctServiceFilter.hda

For more detailed information about these services or any others that you may want to include in the service call configuration file, see the Oracle Fusion Middleware Services Reference Guide for Universal Content Management.

8.6.1.4.1 ServiceExtraInfo ResultSet Entries

The following list provides examples of several service entries contained in the SctServiceFilter.hda file's ServiceExtraInfo ResultSet.

  • GET_FILE_BY_NAME

    Core Server

    Content Access

  • GET_DYNAMIC_URL

    Core Server

    Content Access

  • GET_DYNAMIC_CONVERSION

    Core Server

    Content Access

  • GET_EXTERNAL_DYNAMIC_CONVERSION

    Core Server

    Content Access

  • GET_ARCHIVED_FILE

    Core Server

    Content Access

  • COLLECTION_GET_FILE

    Folders

    Content Access

8.6.1.4.2 Linked Service Entries and Field Map ResultSets

The following table lists several examples of service entries that are linked to field map ResultSets. These examples, or other similar ones, are included in the initial SctServiceFilter.hda file.

Service entry:

    GET_SEARCH_RESULTS
    Core Server
    Search

    SearchFieldMap

Field map ResultSet:

    @ResultSet SearchFieldMap
    3
    dataFieldName 6 255
    dataLocation 6 255
    accessLogColumnName 6 255
    MiniSearchText
    LocalData
    extField_1
    TranslatedQueryText
    LocalData
    extField_2
    IsSavedQuery
    LocalData
    extField_7
    @end

Service entry:

    PNE_GET_SEARCH_RESULTS
    Core Server
    Search

    SearchFieldMap

Field map ResultSet:

    @ResultSet SearchFieldMap
    3
    dataFieldName 6 255
    dataLocation 6 255
    accessLogColumnName 6 255
    MiniSearchText
    LocalData
    extField_1
    TranslatedQueryText
    LocalData
    extField_2
    IsSavedQuery
    LocalData
    extField_7
    @end

Service entry:

    GET_FILE
    Core Server
    Content Access

    GetFileFieldMap

Field map ResultSet:

    @ResultSet GetFileFieldMap
    3
    dataFieldName 6 255
    dataLocation 6 255
    accessLogColumnName 6 255
    RevisionSelectionMethod
    LocalData
    extField_1
    Rendition
    LocalData
    extField_2
    @end
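
The layout of a field map ResultSet can be read mechanically: a column count, one definition line per column, then row values listed one per line. The Python below is a minimal sketch of such a reader, written only from the examples above. The function name parse_result_set is hypothetical, and the sketch assumes that each column definition line starts with the column name followed by two numbers, as in the examples.

```python
def parse_result_set(text):
    """Parse one @ResultSet ... @end block into (name, list of row dicts)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines[0].startswith("@ResultSet ") or lines[-1] != "@end":
        raise ValueError("not a single @ResultSet block")
    name = lines[0].split(None, 1)[1]
    column_count = int(lines[1])
    # Keep only the column name from each definition line.
    columns = [line.split()[0] for line in lines[2:2 + column_count]]
    values = lines[2 + column_count:-1]
    if len(values) % column_count != 0:
        raise ValueError("row values do not divide evenly into columns")
    rows = [
        dict(zip(columns, values[i:i + column_count]))
        for i in range(0, len(values), column_count)
    ]
    return name, rows


get_file_map = """
@ResultSet GetFileFieldMap
3
dataFieldName 6 255
dataLocation 6 255
accessLogColumnName 6 255
RevisionSelectionMethod
LocalData
extField_1
Rendition
LocalData
extField_2
@end
"""

map_name, map_rows = parse_result_set(get_file_map)
print(map_name)
for row in map_rows:
    print(row["dataFieldName"], "->", row["accessLogColumnName"])
```

Reading the block this way makes the grouping explicit: every three value lines form one set of DataBinder field, location, and SctAccessLog column.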

8.6.2 About the Content Tracker Logging Service

The Content Tracker logging service is a single service call (SCT_LOG_EVENT) that allows an application to log a single event to the SctAccessLog table. The service may be called directly via a URL or as an action in a service script. It may also be called from IdocScript using the executeService() function. The calling application is responsible for setting any and all fields in the service DataBinder that are to be recorded, including the descriptive fields listed in the Content Tracker SctServiceFilter.hda configuration file.

The SCT_LOG_EVENT service copies information out of the service DataBinder. This data is inserted into the SctAccessLog table in real time using the Content Tracker specific services sequence numbers and a type designation of "S" for service. Manual and/or scheduled reductions are only required to process the static URL access information gathered by the web server filter. See "Web Server Filter".

Note:

There should be no duplication or conflicts between services logged via the service handler filter and those logged via the Content Tracker logging service. If a service is named in the Content Tracker service handler filter file, it is logged automatically, so there is no need for the Content Tracker logging service to log it as well. However, Content Tracker will make no attempt to prevent such duplication.
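
As an illustration of calling the logging service directly via a URL, the following Python sketch assembles a request URL. Only the service name SCT_LOG_EVENT comes from this section. The base URL is a placeholder, the function name build_log_event_url is hypothetical, and the parameter names and values shown are assumptions based on the DataBinder field names listed earlier, so confirm them against your own installation before relying on them.

```python
from urllib.parse import urlencode


def build_log_event_url(base_url, fields):
    """Build a URL that requests SCT_LOG_EVENT with the given fields."""
    params = {"IdcService": "SCT_LOG_EVENT"}
    params.update(fields)
    return base_url + "?" + urlencode(params)


url = build_log_event_url(
    "http://example.com/idc/idcplg",  # placeholder, not a real server
    {
        # Assumed parameter names; the calling application must set
        # every field that it wants recorded.
        "SctCallingProduct": "My App",
        "SctEventType": "Custom Event",
        "dDocName": "report_001",
    },
)
print(url)
```

The sketch only builds the URL. Sending the request, authenticating, and handling the response are left out.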

8.6.3 Managing Service Call Information

This section provides information and task procedures for mapping and logging data from Content Server services to the combined output database table (SctAccessLog).

8.6.3.1 Manually Editing the SctServiceFilter.hda File

To add or change entries in the SctServiceFilter.hda file:

  1. In a text editor, open the SctServiceFilter.hda file:

    <cs_root>/data/contenttracker/config/.../SctServiceFilter.hda

  2. Edit an existing entry or add a new service entry. For example, to add the GET_FORM_FILE service, add the following service entry to the ServiceExtraInfo ResultSet in the file:

    GET_FORM_FILE
    Threaded Discussion
    Content Access
    <optional_reference_value>
    <optional_field_map_link_value>
    

    where the optional_field_map_link_value is used if you are implementing the extended service call tracking function. In this case, you must also add or edit the corresponding field map ResultSet. Otherwise, if you are not implementing extended service tracking, skip Step 3.

  3. If you use extended service tracking, you must add or edit the corresponding field map ResultSet. For example, to add the SS_GET_PAGE service and track additional data field values, enter the following service entry and corresponding field map ResultSets to the file:

    Service Entry:

    SS_GET_PAGE
    Site Studio
    Web Hierarchy Access
    web
    SSGetPageFieldMap

    Field Map ResultSet:

    @ResultSet SSGetPageFieldMap
    3
    dataFieldName 6 255
    dataLocation 6 255
    accessLogColumnName 6 255
    <DataBinder_field_name>
    <data_field_location_name>
    <access_log_column_name>
    @end
    

    Note:

    Include as many sets of DataBinder field, location, and table column names as necessary.
  4. Save and close the file.

  5. Restart the Content Server to apply the new definitions.

    Note:

    Search request events are logged into the SctAccessLog table in real time and do not need to be reduced. Optionally, you can add or edit services with the user interface included in the Data Engine Control Center. For more information, see the "Data Engine Control Center" and the "Services Tab".
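
The entries described in the steps above can also be generated rather than typed by hand. The following Python sketch renders a service entry and a field map ResultSet in the line layout shown in this chapter. The helper names are hypothetical, and the layout is inferred from the examples here, so compare the output with your existing SctServiceFilter.hda file before pasting it in.

```python
def render_service_entry(service, product, event_type, reference="", field_map=""):
    """Render the five lines of a ServiceExtraInfo service entry."""
    return "\n".join([service, product, event_type, reference, field_map])


def render_field_map(name, rows):
    """Render a field map ResultSet with the three columns used in this chapter."""
    lines = [
        f"@ResultSet {name}",
        "3",
        "dataFieldName 6 255",
        "dataLocation 6 255",
        "accessLogColumnName 6 255",
    ]
    for data_field, location, column in rows:
        # One line per value: DataBinder field, location, SctAccessLog column.
        lines += [data_field, location, column]
    lines.append("@end")
    return "\n".join(lines)


entry = render_service_entry("GET_FILE", "Core Server", "Content Access",
                             field_map="GetFileFieldMap")
field_map = render_field_map("GetFileFieldMap", [
    ("RevisionSelectionMethod", "LocalData", "extField_1"),
    ("Rendition", "LocalData", "extField_2"),
])
print(entry)
print(field_map)
```

The sample values reproduce the GET_FILE entry and its GetFileFieldMap ResultSet, so the output can be checked line by line against that example.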

8.6.3.2 Setting Required DataBinder Fields to Call the Content Tracker Logging Service

The following table provides the SctAccessLog column names and the corresponding DataBinder fields that Content Tracker looks for when the Content Tracker logging service (SCT_LOG_EVENT) is called. When an application calls the Content Tracker logging service, the application is responsible for setting the necessary fields in the service DataBinder for Content Tracker to find. For more detailed information about the SctAccessLog fields, see the "Combined Output Table".

SctAccessLog Column Name    Service DataBinder LocalData Field
SctDateStamp                [computed]
SctSequence                 SctSequence
SctEntryType                "S"
eventDate                   [computed]
SctParentSequence           SctParentSequence
c_ip                        REMOTE_HOST
cs_username                 HTTP_INTERNETUSER
cs_method                   REQUEST_METHOD
cs_uriStem                  HTTP_CGIPATHROOT
cs_uriQuery                 QUERY_STRING
cs_host                     SERVER_NAME
cs_userAgent                HTTP_USER_AGENT
cs_cookie                   HTTP_COOKIE
cs_referer                  HTTP_REFERER
sc_scs_dID                  dID
sc_scs_dUser                dUser
sc_scs_idcService           IdcService (or SctIdcService)
sc_scs_dDocName             dDocName
sc_scs_callingProduct       sctCallingProduct
sc_scs_eventType            sctEventType
sc_scs_status               StatusCode
sc_scs_reference            sctReference (also ...)
comp_username               [computed - HTTP_INTERNETUSER or ...]
sc_scs_isPrompt             n/a
sc_scs_isAccessDenied       n/a
sc_scs_inetUser             n/a
sc_scs_authUser             n/a
sc_scs_inetPassword         n/a
sc_scs_serviceMsg           StatusMessage
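
To make the mapping in the table concrete, the following Python sketch models the service DataBinder LocalData as a plain dictionary and derives the row that would be recorded. It illustrates the table only and is not Content Tracker code. Computed and n/a columns are left out, and the precedence between IdcService and SctIdcService is an assumption, because the table does not state it.

```python
# SctAccessLog column -> LocalData field, copied from the table above.
COLUMN_TO_FIELD = {
    "SctSequence": "SctSequence",
    "SctParentSequence": "SctParentSequence",
    "c_ip": "REMOTE_HOST",
    "cs_username": "HTTP_INTERNETUSER",
    "cs_method": "REQUEST_METHOD",
    "cs_uriStem": "HTTP_CGIPATHROOT",
    "cs_uriQuery": "QUERY_STRING",
    "cs_host": "SERVER_NAME",
    "cs_userAgent": "HTTP_USER_AGENT",
    "cs_cookie": "HTTP_COOKIE",
    "cs_referer": "HTTP_REFERER",
    "sc_scs_dID": "dID",
    "sc_scs_dUser": "dUser",
    "sc_scs_dDocName": "dDocName",
    "sc_scs_callingProduct": "sctCallingProduct",
    "sc_scs_eventType": "sctEventType",
    "sc_scs_status": "StatusCode",
    "sc_scs_reference": "sctReference",
    "sc_scs_serviceMsg": "StatusMessage",
}


def build_access_log_row(local_data):
    """Map LocalData fields to SctAccessLog columns; unset fields become None."""
    row = {col: local_data.get(field) for col, field in COLUMN_TO_FIELD.items()}
    row["SctEntryType"] = "S"  # fixed type designation for service entries
    # Assumption: SctIdcService, when set, takes precedence over IdcService.
    row["sc_scs_idcService"] = (
        local_data.get("SctIdcService") or local_data.get("IdcService")
    )
    return row


row = build_access_log_row({
    "dID": "42",
    "HTTP_INTERNETUSER": "jdoe",
    "IdcService": "SCT_LOG_EVENT",
    "SctIdcService": "GET_FILE",
})
print(row["sc_scs_dID"], row["cs_username"], row["sc_scs_idcService"])
```

Any column whose field the application did not set comes back as None, which makes it easy to see what still needs to be supplied before the service is called.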

8.6.3.3 Calling the Content Tracker Logging Service from an Application

You can call the SCT_LOG_EVENT service from an application. This can be done by the application developer or by a user who is willing to modify the application's service scripts. The application can call SCT_LOG_EVENT from Java, or it can include calls to SCT_LOG_EVENT in a service script.

8.6.3.4 Calling the Content Tracker Logging Service from IdocScript

You can call the SCT_LOG_EVENT service indirectly from IdocScript, using the executeService() function. This is the same as calling the SCT_LOG_EVENT service from an application, except that the call is made from IdocScript rather than from the application's Java code. Content Tracker cannot distinguish whether the SCT_LOG_EVENT service is called from Java or from IdocScript.

8.7 Configuration and Customization

8.7.1 Configuration Variables

The following table lists the default values of the configuration settings used in the current version of Content Tracker. These configuration variables are contained in the Content Tracker configuration file:

<cs_root>/data/contenttracker/config/sct.cfg

Config. Setting Default Value Remarks
SctAutoTruncateDataStrings FALSE Used by: JAVA

Determines whether the reduction process will truncate data strings to fit into the corresponding table column.

SctComponentDir <cs_root>/data/contenttracker/ Used by: JAVA

Path to the directory where Content Tracker is installed.

SctDebugLogEnabled FALSE Used by: JAVA

Set TRUE to enable Java code execution trace. Used with SctDebugLogFilePath.

SctDebugLogFilePath <cs_root>/data/contenttracker/log/SCT_DEBUG_TRACE.log Used by: JAVA

Path to the file that receives the Java code execution trace. Used with SctDebugLogEnabled.

SctDebugServiceBinderDumpEnabled FALSE Used by: JAVA

Set TRUE to enable diagnostic output of Service DataBinder objects during Service logging.

SctExternalUserLogEnabled TRUE Used by: JAVA

Set TRUE to enable replication of External user account and role information to UserSecurityAttributes table.

SctFilterPluginLogDir <cs_root>/data/contenttracker/data/ Used by: filter plugin

Path to the directory where filter plugin will store the event logs.

SctIdcAuthExtraConfigParams

List of Content Tracker configuration parameters that are passed along to the filter plugin, merged programmatically into idcAuthExtraConfigParams by the Content Tracker startup filter.

SctIgnoreDirectories DomainHome/ucm/cs/resources/;DomainHome/ucm/cs/common/ Used by: filter plugin

Directs filter plugin to disregard URLs contained within the listed directory roots.

SctIgnoreFileTypes gif,jpg,js,css Used by: filter plugin

Directs filter plugin to disregard URLs with the listed filetypes.

SctLogDir <cs_root>/data/contenttracker/data/ Used by: JAVA

Path to the directory or directories where Content Tracker looks for the raw event logs - sctLog, etc. May be multi-valued, e.g. dir1;dir2;…;dirn.

SctLogEnabled TRUE Used by: filter plugin, JAVA

If False, directs service handler filters and web server filter plugin to ignore all events and create no logs. This is the Content Tracker Master On/Off switch.

SctLogSecurity TRUE Used by: filter plugin, JAVA

If true, directs filter plugin to record IMMEDIATE_RESPONSE_PAGE events in the sctSecurityLog event log, and the reduction process to read the event log.

SctMaxRecentCount 5 Used by: JAVA

Maximum number of days' worth of reduced data kept in the "Recent" state. Overflow from Recent is moved to Archive state.

SctMaxRereadTime 3600 Used by: JAVA

Maximum number of seconds that can occur between consecutive references by a particular user to a particular content item, e.g. a PDF file, and have the adjacent references be considered a single sustained access. Consecutive references which occur further apart in time count as separate accesses.

SctReductionAvailableDatesLookback 0 Used by: JAVA

Used with SctReductionRequireEventLogs to limit Available Dates range. Unit = Days. Zero = unlimited.

SctReductionLogDir <cs_root>/data/contenttracker/log/ Used by: JAVA

Path to the directory where the Content Tracker reduction logs are stored.

SctReductionRequireEventLogs TRUE Used by: JAVA

Used in Detached configurations. FALSE means proceed with Reduction even if no event logs are found.

SctScheduledReductionEnable TRUE Used by: JAVA

Used in Multi-JVM configurations to select which Content Server instance performs the reduction.

SctSnapshotEnable FALSE Used by: JAVA

Set TRUE to enable Snapshot functions. Set from Data Engine Control Center.

SctSnapshotLastAccessEnable FALSE Used by: JAVA

Set TRUE to enable Last Access Date Snapshot function. Set from Data Engine Control Center.

SctSnapshotLastAccessField [none] Used by: JAVA

Metadata field name for Last Access Date, e.g. xLastAccessDate. Set from Data Engine Control Center.

SctSnapshotLongCountEnable FALSE Used by: JAVA

Set TRUE to enable "Long" interval access count Snapshot function. Set from Data Engine Control Center.

SctSnapshotLongCountField [none] Used by: JAVA

Metadata field name for Long Interval Count, e.g. xAccessesInLast90Days. Set from Data Engine Control Center.

SctSnapshotLongCountInterval [none] Used by: JAVA

Number of days for "Long" Interval. Set from Data Engine Control Center.

SctSnapshotShortCountEnable FALSE Used by: JAVA

Set TRUE to enable "Short" interval access count Snapshot function. Set from Data Engine Control Center.

SctSnapshotShortCountField [none] Used by: JAVA

Metadata field name for Short Interval Count, e.g. xAccessesInLast10Days. Set from Data Engine Control Center.

SctSnapshotShortCountInterval [none] Used by: JAVA

Number of days for "Short" Interval. Set from Data Engine Control Center.

SctUseGMT FALSE Used by: filter plugin, JAVA

Set TRUE for logged event times to be converted to Coordinated Universal Time (UTC). FALSE uses local time.
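
The effect of SctMaxRereadTime in the table above is easiest to see with numbers. The following Python sketch counts accesses for one user and one content item from a list of reference times in seconds. It is an interpretation of the setting's description, not the reduction code itself, and it assumes that a gap exactly equal to the limit still counts as the same sustained access.

```python
def count_sustained_accesses(timestamps, max_reread_seconds=3600):
    """Count accesses, merging references that are close together in time.

    timestamps: reference times in seconds for one user and one content item.
    """
    count = 0
    previous = None
    for ts in sorted(timestamps):
        # A new access starts only when the gap exceeds the reread limit.
        if previous is None or ts - previous > max_reread_seconds:
            count += 1
        previous = ts
    return count


# References at 0 s, 30 min, and 60 min merge; the one at 150 min is separate.
print(count_sustained_accesses([0, 1800, 3600, 9000]))  # prints 2
```

Lowering the limit makes the counts more sensitive to repeated opens of the same item. Raising it folds long reading sessions into a single access.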


The following variables are not available in the sct.cfg file. You can access them only via the Component Manager:

Config. Setting Default Value Remarks
SctPostReductionExec [none] Used by: JAVA

Path to Post Reduction Executable (assumed to be in IntradocDir/custom/ContentTracker/bin/)

SctProxyNameMaxLength 50 Used by: JAVA

Maximum number of characters in the name of any Content Server proxy server in the configuration. Used to increase the size of user name fields in Content Tracker table creation.

SctUrlMaxLength 3000 Used by: JAVA

Maximum expected length (characters) for URL fields. Used to determine column widths when creating tables. There may be several such columns in a given table.

SctWebBeaconIDList [none] Used by: filter plugin

List of zero or more web beacon objects. Required to add the ability to feed data to Content Tracker using client-side tags. Enables Content Tracker to gather data from cached pages and pages generated from cached services.


8.7.2 Manually Setting Content Tracker Configuration Variables

To set or edit any of the Content Tracker configuration variables:

  1. In a text editor, open the sct.cfg file:

    <cs_root>/data/contenttracker/config/sct.cfg

  2. Locate the configuration variable to be edited.

  3. Enter the applicable value.

  4. Save and close the sct.cfg file.

  5. Restart Content Server to apply the changes.
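
The manual edit in the steps above can be scripted. The following Python sketch sets one variable in the text of an sct.cfg file, assuming the file holds one name=value pair per line (the form used in settings such as SctDebugServiceBinderDumpEnabled=True). The function name is hypothetical. Back up the file first, and restart Content Server afterward as described in the last step.

```python
def set_config_value(cfg_text, name, value):
    """Return cfg_text with name set to value, appending the line if absent."""
    lines = cfg_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match only lines of the form name=value for this exact name.
        if "=" in line and line.split("=", 1)[0].strip() == name:
            lines[i] = f"{name}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


original = "SctLogEnabled=TRUE\nSctUseGMT=FALSE\n"
print(set_config_value(original, "SctUseGMT", "TRUE"))
```

The function works on text rather than on a file path so that the result can be reviewed before it is written back.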

Optionally, you can add or edit the configuration variables for the activity metrics metadata fields with the user interface included in the Data Engine Control Center. These include the following variables:

  • SctSnapshotEnable

  • SctSnapshotLastAccessEnable

  • SctSnapshotLastAccessField

  • SctSnapshotLongCountEnable

  • SctSnapshotLongCountField

  • SctSnapshotLongCountInterval

  • SctSnapshotShortCountEnable

  • SctSnapshotShortCountField

  • SctSnapshotShortCountInterval

For more information about the user interface and the activity metrics functions, see the "Data Engine Control Center" and the "Snapshot Tab".

8.7.3 Activity Metrics SQL Queries

The snapshot feature enables you to log and track search relevance custom metadata fields. Content Tracker fills these fields with content item usage and access information that reflects the popularity of particular content items. The information includes the date of the most recent access and the number of accesses in two distinct time intervals. For more information about the snapshot feature, see the "Snapshot Tab".

If the snapshot feature and activity metrics are enabled, the values in the custom metadata fields are updated following the reduction processing phase. When users access content items, the values of the applicable search relevance metadata fields change accordingly. Subsequently, Content Tracker runs three SQL queries as a post-reduction processing step to determine which content items were accessed during the reporting period. For more information about the post-processing reduction step, see "Data Reduction Process with Activity Metrics".

8.7.3.1 Customizing the Activity Metrics SQL Queries

The SQL queries are available as a resource and can be customized to fulfill your specific needs. You may want to filter out certain information from the final tracking data. For example, you might want to exclude accesses by certain users from the tabulated results. The SQL queries are included in the SctQuery.htm file:

IntradocDir/custom/ContentTracker/resources/SctQuery.htm

Note:

In general, you should feel free to modify the WHERE clause in any of the SQL queries. However, it is recommended that you leave everything else as is.

The following SQL queries are used for the search relevance custom metadata fields:

8.7.3.1.1 qSctLastAccessDate

For the last access function, the qSctLastAccessDate SQL query uses the SctAccessLog table. It checks for all content item accesses on the reduction date and collects the latest timestamp for each dID. The parameter for the query is the reduction date. Dates may be reduced in any order, because the comparison test for the last access date signals a change only if the existing DocMeta value is older than the proposed new value.

For more information about the last access field, see the "Snapshot Tab".
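
The order-independence described for the last access date follows from the comparison rule alone. The following Python sketch models the DocMeta last access values as a dictionary keyed by dID and applies the rule. The function and variable names are hypothetical, and the sketch does not reproduce the actual query.

```python
from datetime import datetime


def apply_last_access(doc_meta, latest_by_did):
    """Update last access dates only where the stored value is missing or older.

    doc_meta: dID -> stored last access datetime (or None).
    latest_by_did: dID -> latest access timestamp found for one reduction date.
    """
    for d_id, proposed in latest_by_did.items():
        current = doc_meta.get(d_id)
        if current is None or current < proposed:
            doc_meta[d_id] = proposed
    return doc_meta


day1 = {"101": datetime(2024, 3, 1, 9, 0)}
day2 = {"101": datetime(2024, 3, 2, 17, 30)}

# Reducing the two dates in either order gives the same stored value.
in_order = apply_last_access(apply_last_access({}, day1), day2)
reversed_order = apply_last_access(apply_last_access({}, day2), day1)
print(in_order == reversed_order)
```

Because an older proposed value never overwrites a newer stored one, reducing an earlier date after a later one leaves the field unchanged.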

8.7.3.1.2 qSctAccessCountShort and qSctAccessCountLong

For the short and long access count functions, the qSctAccessCountShort and qSctAccessCountLong SQL queries are identical except for the "column name" for the count. They use the SctAccessLog table to calculate totals for all accesses for each dID across the time intervals specified (in days) for each. The parameters are the beginning and ending dates for the applicable rollups.

For more information about the short and long access count fields, see the "Snapshot Tab".
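
As a sketch of what an access count rollup with a customized WHERE clause might look like, the following Python example runs an illustrative query against an in-memory SQLite table. It is not the query shipped in SctQuery.htm. SQLite stands in for the Content Server database, the table holds only three SctAccessLog columns, dates are stored as ISO strings, and treating the ending date as exclusive is an assumption.

```python
import sqlite3


def access_counts(conn, begin_date, end_date, excluded_user=None):
    """Return {dID: count} for accesses in [begin_date, end_date)."""
    sql = (
        "SELECT sc_scs_dID, COUNT(*) FROM SctAccessLog "
        "WHERE eventDate >= ? AND eventDate < ? "
    )
    params = [begin_date, end_date]
    if excluded_user is not None:
        # Example WHERE-clause customization: skip accesses by one user.
        sql += "AND (sc_scs_dUser IS NULL OR sc_scs_dUser <> ?) "
        params.append(excluded_user)
    sql += "GROUP BY sc_scs_dID"
    return dict(conn.execute(sql, params).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SctAccessLog (sc_scs_dID INTEGER, eventDate TEXT, sc_scs_dUser TEXT)"
)
conn.executemany(
    "INSERT INTO SctAccessLog VALUES (?, ?, ?)",
    [
        (101, "2024-03-01", "jdoe"),
        (101, "2024-03-05", "asmith"),
        (102, "2024-03-06", "jdoe"),
        (102, "2024-03-07", "sysadmin"),
        (101, "2024-04-01", "jdoe"),
    ],
)
counts = access_counts(conn, "2024-03-01", "2024-03-11", excluded_user="sysadmin")
print(counts)
```

The short and long counts would differ only in the date range passed in and in the metadata field that receives the result.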

8.7.3.2 Customizing the Autoload Option SQL Query

The Autoload option on the Snapshot tab of the Data Engine Control Center enables you to backfill the Last Access field for all existing content. When invoked, Autoload runs the qSctLastAccessDateAutoload query which fills the empty (NULL) Last Access fields in Content Server's DocMeta database table with the current date and time.

However, the qSctLastAccessDateAutoload query is available as a resource and can be customized to fulfill your specific needs. For example, you may want to set the Last Access field to dCreateDate, dReleaseDate, or any other time that meets the requirements of your application. The qSctLastAccessDateAutoload query is included in the SctQuery.htm file:

IntradocDir/custom/ContentTracker/resources/SctQuery.htm

For more information about the last access field and the Autoload option, see the "Snapshot Tab".

8.7.4 External Users and Content Item Tracking

You have the option to control whether Content Tracker includes data about external user accesses in the applicable reports. These authenticated users are qualified based on their user roles and accounts. By default, the configuration parameter SctExternalUserLogEnabled is set to true (enabled). This allows Content Tracker to monitor external user logons and automatically propagate their role and account information to the UserSecurityAttributes table.

Regardless of whether the SctExternalUserLogEnabled configuration variable is enabled or disabled, all of the content item access information for external users is tracked and recorded. But when it is enabled, this variable ensures that this data is included in reports that explicitly correlate externally authenticated user names with their associated user roles and accounts. Specifically, the Top Content Items by User Role report and the Users by User Role report will include all of the content item access activity by external users. See the "Content Tracker Report Generator Main Page".

Note:

Optionally, you can manually disable the SctExternalUserLogEnabled configuration variable. If you do so, content item accesses by externally authenticated users are still included in the more general reports, such as Top Content Items, but are omitted from reports that use document access counts qualified by user role and account information.

To manually disable the SctExternalUserLogEnabled configuration variable, see "Manually Setting Content Tracker Configuration Variables".

8.8 Troubleshooting

Content Tracker has two execution trace mechanisms: the web server filter and the Java code. These are intended for diagnosing problems at customer installations and are not to be used in production.

8.8.1 Web Server Filter Debugging Support

The web server filter honors PLUGIN_DEBUG. Enable PLUGIN_DEBUG on the Content Server Filter Administration page and the Content Tracker web server filter will issue execution trace information. The trace is only meaningful to someone with access to the source. Customers with a problem are expected to enable PLUGIN_DEBUG, run the test scenario, and then send the log segments to Customer Service for evaluation. Otherwise, PLUGIN_DEBUG should be left turned off.

8.8.2 Setting the Debug Plugin

To set PLUGIN_DEBUG:

  1. In Content Server, click the Admin Applets link in the Administration tray.

    The Administration page is displayed.

  2. Click the Filter Administration icon or link.

    The Configure Web Server Filter page is displayed.

  3. Select the PLUGIN_DEBUG option check box.

  4. Click Update.

8.8.3 Java Code Debugging Support

You can use the System Audit functionality in Content Server for debugging support. See "System Audit Information" in the Oracle Fusion Middleware System Administrator's Guide for Content Server for more details. Add contenttracker to the Active Sections list. When the list is updated, the Content Tracker execution trace information appears with the other active sections.

8.8.4 DataBinder Dump Facility

8.8.4.1 Values for the DataBinder Dump Facility

The DataBinder dump facility is controlled by the SctDebugServiceBinderDumpEnabled configuration variable. Its values are:

  • SctDebugServiceBinderDumpEnabled=False prevents the Content Tracker service handler filter from writing out the DataBinder objects into dump files. This is the default value.

  • SctDebugServiceBinderDumpEnabled=True configures the Content Tracker service handler filter to write out the DataBinder objects into dump files. Consequently, you can use a dump file as a diagnostic aid when you are developing field maps for extended service logging. If you are creating field maps for services, the dump files enable you to see what data is available at the time the service events are recorded.

8.8.4.2 About DataBinder Object Dump Files

As soon as Content Tracker records a specific service in the log file, the contents of that service's DataBinder object are written to a serialized dump file. The contents of these files are useful for debugging when you are creating field maps to use the extended service call tracking function. These dump files allow you to see the available LocalData fields for the recorded service.

The Content Tracker service handler filter only creates dump files for DataBinder objects if the associated services are defined in the SctServiceFilter.hda file. For more information about this file, see "About the Service Call Configuration File".

Caution:

The dump files for DataBinder objects continue to accumulate until you manually delete them. Therefore, it is recommended that you enable the SctDebugServiceBinderDumpEnabled configuration variable only when necessary.
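
Because the dump files are never removed automatically, a small housekeeping script can help. The following Python sketch lists the .hda files in a dump directory that are older than a given number of days and deletes them in a separate step. The function names and the age threshold are hypothetical. Pass in the dump directory used by your installation.

```python
import os
import time


def find_old_dump_files(dump_dir, max_age_days, now=None):
    """Return paths of .hda files in dump_dir last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    old_files = []
    for name in sorted(os.listdir(dump_dir)):
        path = os.path.join(dump_dir, name)
        if (name.endswith(".hda") and os.path.isfile(path)
                and os.path.getmtime(path) < cutoff):
            old_files.append(path)
    return old_files


def delete_files(paths):
    """Delete the given files; call only after reviewing the list."""
    for path in paths:
        os.remove(path)
```

Finding and deleting are kept separate so that the list can be reviewed first, for example by calling find_old_dump_files(dump_dir, 7) and printing the result before passing it to delete_files.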

8.8.4.3 Location of the DataBinder Object Dump Files

The serialized DataBinder objects are written to:

IntradocDir/data/ContentTracker/DEBUG_BINDERDUMP/<dump_file_name>

8.8.4.4 Names of the DataBinder Object Dump Files

The DataBinder object dump files are text files, and their names consist of three parts, as follows:

<service_name>_<filter_function>_<serial_number>.hda

Where:

  • service_name is the name of the logged service (such as GET_FORM_FILE).

  • filter_function is one of the following:

    • End: Filter event 'onEndServiceRequestActions' - Normal end-of-service event.

    • EndSub: Filter event 'onEndScriptSubServiceActions' - Normal end-of-service event for a service called as a SubService.

    • Error: Filter event 'onServiceRequestError' - End of service where an error occurred. May happen in addition to End.

  • serial_number is the unique identification number assigned to the file. This enables Content Tracker to create more than one DataBinder object dump file for a given service.

Example:

GET_SEARCH_RESULTS_End_1845170235.hda
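
Service names usually contain underscores themselves, so a dump file name has to be split from the right. The following Python sketch takes a name apart into the three parts described above. The function name is hypothetical, and the set of filter functions is taken from the list in this section.

```python
FILTER_FUNCTIONS = {"End", "EndSub", "Error"}


def parse_dump_file_name(file_name):
    """Split <service_name>_<filter_function>_<serial_number>.hda into parts."""
    if not file_name.endswith(".hda"):
        raise ValueError(f"not a dump file name: {file_name}")
    stem = file_name[: -len(".hda")]
    parts = stem.rsplit("_", 2)  # split from the right: service names contain "_"
    if len(parts) != 3 or parts[1] not in FILTER_FUNCTIONS:
        raise ValueError(f"unexpected dump file name: {file_name}")
    service_name, filter_function, serial_number = parts
    return service_name, filter_function, serial_number


print(parse_dump_file_name("GET_SEARCH_RESULTS_End_1845170235.hda"))
# prints ('GET_SEARCH_RESULTS', 'End', '1845170235')
```

Grouping the parsed names by service_name is a quick way to find every dump produced for the service whose field map you are developing.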

8.8.5 Accessing the DataBinder Object Dump File

To access the DataBinder object dump file for a specific logged service:

  1. In a text editor, open the specific data binder file in the following directory:

    <cs_root>/data/contenttracker/DEBUG_BINDERDUMP/

  2. Review the contents.

    The dump files for DataBinder objects will continue to accumulate indefinitely. Therefore, it is recommended that you manually delete them when you are finished.
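
When reviewing a dump file for field map development, the LocalData fields are usually what you need. The following Python sketch pulls them out of the file text. It assumes the dump uses the common HDA layout in which LocalData appears as name=value lines between "@Properties LocalData" and "@end". Verify this against one of your own dump files, because the exact serialization is not described here. The sample text is invented for illustration.

```python
def read_local_data(hda_text):
    """Return the LocalData name/value pairs found in HDA-formatted text."""
    fields = {}
    in_local_data = False
    for line in hda_text.splitlines():
        stripped = line.strip()
        if stripped == "@Properties LocalData":
            in_local_data = True
        elif stripped == "@end":
            in_local_data = False
        elif in_local_data and "=" in stripped:
            name, value = stripped.split("=", 1)
            fields[name] = value
    return fields


# Invented sample content; a real dump file will contain many more fields.
sample = """@Properties LocalData
IdcService=GET_FILE
RevisionSelectionMethod=Latest
Rendition=Web
@end
@ResultSet Example
1
col 6 255
value=notLocalData
@end
"""
print(sorted(read_local_data(sample)))
```

The names returned are candidates for the dataFieldName column of a field map ResultSet, with LocalData as the dataLocation.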

8.8.6 Setting the Debugging Configuration Variables

See "Manually Setting Content Tracker Configuration Variables" for more detailed information about setting any of the Content Tracker configuration variables.