15 Customizing Content Tracker

Content Tracker, an optional component of Oracle WebCenter Content Server, is installed with Oracle WebCenter Content. When enabled, this component provides information about system usage, such as which content items are most frequently accessed and what content is most valuable to users or specific groups. You can customize Content Tracker to provide specific information about the consumption patterns of your organization's content.

For information about using Content Tracker with the default settings, see Oracle Fusion Middleware Managing Oracle WebCenter Content.

15.1 About Content Tracker

Content Tracker monitors activity on your Content Server instance and records selected details of those activities. This section provides an overview of Content Tracker functionality.

Content Tracker incorporates several optimization functions that are controlled by configuration variables. The default values for the variables set Content Tracker to function as efficiently as possible for use in high-volume production environments. For more information about Content Tracker configuration variables, see the Oracle Fusion Middleware Configuration Reference for Oracle WebCenter Content.

15.1.1 Content Tracker Accesses and Services

Content Tracker monitors a system and records information about various activities. It collects this information from several sources, merges it, and writes it to a set of tables in your Content Server database. Content Tracker can monitor activity based on these accesses and services:

  • Content item accesses: Information about content item usage

    Data is obtained from Web filter log files, the Content Server database, and other external applications, such as portals and websites. Content item access data includes dates, times, content IDs, and current metadata.

  • Content Server services: All services that return content, as well as services that handle search requests. By default, Content Tracker logs only the services that have content access event types, but you can change the configuration so that Content Tracker monitors any Content Server service, including custom services.

  • User accesses: Information about other non-content access events such as the collection and synthesis of user profile summaries. This data includes user names and user profile information.

15.1.2 Content Tracker Components and Functions

Content Tracker provides the SctDebugServiceBinderDumpEnabled debugging configuration variable that, if enabled, configures the service handler filter to write out the service DataBinder objects into dump files. These can be used as diagnostic tools when developing field map screens. The dump files enable you to see what data is available at the time the particular service events are recorded.

15.1.2.1 DataBinder Dump Facility

When Content Tracker records a specific service in the log file, the contents of that service's DataBinder object are written to a serialized dump file. The contents of these files are useful for debugging when creating field maps to use the extended service call tracking function. These dump files enable you to see the available LocalData fields for the recorded service.

The Content Tracker service handler filter only creates dump files for DataBinder objects if the associated services are defined in the SctServiceFilter.hda file.

Caution:

The dump files for DataBinder objects continue to accumulate until manually deleted. Use the SctDebugServiceBinderDumpEnabled configuration variable only as necessary.

15.1.2.1.1 Values for the DataBinder Dump Facility

The value of this configuration variable can be False or True:

  • SctDebugServiceBinderDumpEnabled=False prevents the Content Tracker service handler filter from writing out the DataBinder objects into dump files. This is the default value.

  • SctDebugServiceBinderDumpEnabled=True configures the Content Tracker service handler filter to write out the DataBinder objects into dump files. Use a dump file as a diagnostic aid when you are developing field maps for extended service logging. If creating field maps for services, the dump files enable you to see what data is available at the time the service events are recorded.

15.1.2.1.2 Location of the DataBinder Object Dump Files

The serialized DataBinder objects are written to a dump file:

IntradocDir/data/ContentTracker/DEBUG_BINDERDUMP/dump_file_name
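
Because dump files accumulate until they are deleted manually, it can help to review which ones are old before removing them. The following is a minimal sketch, not part of Content Tracker: the function name stale_dump_files and the seven-day threshold are assumptions chosen for illustration, and the directory path must be replaced with the real location for your instance.

```python
import time
from pathlib import Path


def stale_dump_files(dump_dir, max_age_days, now=None):
    """Return the .hda files in dump_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        path
        for path in Path(dump_dir).glob("*.hda")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical path: substitute the actual IntradocDir of your instance.
    dump_dir = "IntradocDir/data/ContentTracker/DEBUG_BINDERDUMP"
    for path in stale_dump_files(dump_dir, max_age_days=7):
        print(path)  # review the list before deleting anything
```

The sketch only lists candidates; deleting them is left as a deliberate manual step.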

15.1.2.1.3 Names of the DataBinder Object Dump Files

The DataBinder object dump files are text files, and their names consist of three parts, as follows:

service-name_filter-function_serial-number.hda

Where:

  • service-name is the name of the logged service, such as GET_FORM_FILE.

  • filter-function is one of the following values:

    • End: Filter Event 'on EndServiceRequestActions'

      This value is for a normal end-of-service event.

    • EndSub: Filter Event 'on EndScriptSubServiceActions'

      This value is for a normal end-of-service for a service called as a subservice.

    • Error: Filter Event 'on ServiceRequestError'

      This value is for an end-of-service in which an error occurred. It may occur in addition to an End filter event.

  • serial-number is the unique identification number assigned to the file.

    This number enables Content Tracker to create more than one DataBinder object dump file for a given service.

Example:

GET_SEARCH_RESULTS_End_1845170235.hda
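
When sorting through many dump files, the three-part naming scheme can be split programmatically. The sketch below is illustrative only: parse_dump_name and DumpName are hypothetical names, and it assumes that the serial number itself contains no underscore, so the name can be split from the right (service names such as GET_SEARCH_RESULTS contain underscores).

```python
from typing import NamedTuple

# Filter-function values listed in this section.
FILTER_FUNCTIONS = {"End", "EndSub", "Error"}


class DumpName(NamedTuple):
    service: str
    filter_function: str
    serial: str


def parse_dump_name(file_name):
    """Split 'service-name_filter-function_serial-number.hda' into its parts."""
    suffix = ".hda"
    if not file_name.endswith(suffix):
        raise ValueError(f"not a dump file name: {file_name!r}")
    stem = file_name[: -len(suffix)]
    # Split from the right so underscores inside the service name are kept.
    parts = stem.rsplit("_", 2)
    if len(parts) != 3 or parts[1] not in FILTER_FUNCTIONS:
        raise ValueError(f"unexpected dump file name: {file_name!r}")
    return DumpName(*parts)


if __name__ == "__main__":
    print(parse_dump_name("GET_SEARCH_RESULTS_End_1845170235.hda"))
```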

15.1.2.2 Performance Optimization

By default, Content Tracker collects and records only content access event data. It does not gather information about noncontent access events, such as searches or the collection and synthesis of user-profile summaries.

Content Tracker incorporates several optimization functions that are controlled by configuration variables. The default values for the variables are set for Content Tracker to function as efficiently as possible for use in high-volume production environments. You can set alternate values during installation or change the values later.

These performance variables are available:

  • SctTrackContentAccessOnly

    Content Access Only: This variable determines what types of information are collected. When enabled (the default), only content access events are recorded.

  • SctDoNotPopulateAccessLogColumns

    Exclude Columns: The value of this variable is a list of columns that Content Tracker does not populate in the SctAccessLog table. By default, bulky and rarely used information is not collected, which reduces the size of the output table.

  • SctSimplifyUserAgent

    Simplify User Agent: This variable minimizes the information that is stored in the cs_userAgent column of the SctAccessLog table.

  • SctDoNotArchive

    Do Not Archive: This variable ensures that all Content Tracker database tables contain the most current data and that expired table rows are discarded rather than archived. By default, only the SctAccessLog table is populated, but expired rows are not archived. If both SctTrackContentAccessOnly and SctDoNotArchive are disabled, however, all tables are populated and their expired data archived.

15.1.2.3 Installation Considerations

Set the SctUseGMT configuration parameter to true to use Greenwich Mean Time (GMT). It is set to false by default, which uses local time. When you upgrade from an earlier version of Content Tracker, there is a one-time retreat (or advance, depending on location) in access times. When local time is used, the biannual daylight saving time changes produce discontinuities in recorded user access times, depending on the location.
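
To see why local time can produce a discontinuity, consider two events recorded 30 minutes apart across an autumn clock change. The sketch below is a generic illustration, not Content Tracker code; the date and the UTC-4 and UTC-5 offsets are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets for illustration: UTC-4 before the autumn change, UTC-5 after it.
BEFORE_CHANGE = timezone(timedelta(hours=-4))
AFTER_CHANGE = timezone(timedelta(hours=-5))

# Two events, 30 minutes apart, on either side of the clock change.
first_utc = datetime(2023, 11, 5, 5, 45, tzinfo=timezone.utc)
second_utc = first_utc + timedelta(minutes=30)

# The same events expressed as naive local wall-clock times.
first_local = first_utc.astimezone(BEFORE_CHANGE).replace(tzinfo=None)
second_local = second_utc.astimezone(AFTER_CHANGE).replace(tzinfo=None)

if __name__ == "__main__":
    print("UTC order preserved:  ", first_utc < second_utc)
    print("Local order preserved:", first_local < second_local)
```

In UTC the second event follows the first, but in local wall-clock time the second event appears to precede it, which is the kind of discontinuity that logging in GMT avoids.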

15.2 Customizing Content Tracker with Configuration Variables

You can use configuration variables to customize Content Tracker.

15.2.1 About Configuration Variables

The following table lists the default values of the configuration settings used in the current version of Content Tracker. These configuration variables are contained in the Content Tracker configuration file:

cs_root/data/contenttracker/config/sct.cfg

Config. Setting Default Value Remarks
SctAutoTruncateDataStrings FALSE Used by: JAVA

Determines if the reduction process truncates data strings to fit into the corresponding table column.

SctComponentDir cs_root/data/contenttracker/ Used by: JAVA

Path to the directory where Content Tracker is installed.

SctDebugLogEnabled FALSE Used by: JAVA

Set to TRUE to enable Java code execution trace. Used with SctDebugLogFilePath.

SctDebugLogFilePath cs_root/data/contenttracker/log/SCT_DEBUG_TRACE.log Used by: JAVA

Directory for Java code execution trace. Used with SctDebugLogEnabled.

SctDebugServiceBinderDumpEnabled FALSE Used by: JAVA

Set to TRUE to enable diagnostic output of Service DataBinder objects during Service logging.

SctExternalUserLogEnabled TRUE Used by: JAVA

Set to TRUE to enable replication of the external user account and role information to the UserSecurityAttributes table.

SctFilterPluginLogDir cs_root/data/contenttracker/data/ Used by: filter plug-in

Path to the directory where the filter plug-in stores the event logs.

SctIdcAuthExtraConfigParams

[none]

List of Content Tracker configuration parameters passed to the filter plug-in, merged programmatically into idcAuthExtraConfigParams by the Content Tracker startup filter.

SctIgnoreDirectories DomainHome/ucm/cs/resources/;DomainHome/ucm/cs/common/ Used by: filter plug-in

Directs the filter plug-in to disregard URLs contained within the listed directory roots.

SctIgnoreFileTypes gif,jpg,js,css Used by: filter plug-in

Directs the filter plug-in to disregard URLs with the listed file types.

SctLogDir cs_root/data/contenttracker/data/ Used by: JAVA

Path to one or more directories where Content Tracker looks for the raw event logs - sctLog, and so on. This parameter can be multivalued, as dir1;dir2;...;dirn.

SctLogEnabled TRUE Used by: filter plug-in, JAVA

If FALSE, directs Service Handler filters and the web server filter plug-in to ignore all events and create no logs. This is the Content Tracker Master On/Off switch.

SctMaxRecentCount 5 Used by: JAVA

Maximum number of days worth of reduced data kept in the Recent state. Overflow from Recent is moved to the Archive state.

SctMaxRereadTime 3600 Used by: JAVA

Maximum number of seconds that can occur between consecutive references by a particular user to a particular content item, such as a PDF file, and have the adjacent references be considered a single sustained access. Consecutive references that occur further apart in time count as separate accesses.

SctReductionAvailableDatesLookback 0 Used by: JAVA

Used with SctReductionRequireEventLogs to limit the available dates range. Unit = days. Zero = unlimited.

SctReductionLogDir cs_root/data/contenttracker/log/ Used by: JAVA

Path to the directory where the Content Tracker Reduction logs are stored.

SctReductionRequireEventLogs TRUE Used by: JAVA

Used in Detached configurations. FALSE means proceed with Reduction even if no event logs are found.

SctScheduledReductionEnable TRUE Used by: JAVA

Used in multi-JVM configurations to select which Content Server instance performs the Reduction.

SctSnapshotEnable FALSE Used by: JAVA

Set to TRUE to enable Snapshot functions. Set from the Data Engine Control Center.

SctSnapshotLastAccessEnable FALSE Used by: JAVA

Set to TRUE to enable the Last Access Date Snapshot function. Set from the Data Engine Control Center.

SctSnapshotLastAccessField

[none]

Used by: JAVA

Metadata field name for Last Access Date; for example, xLastAccessDate. Set from the Data Engine Control Center.

SctSnapshotLongCountEnable FALSE Used by: JAVA

Set to TRUE to enable the Long Interval Access Count Snapshot function. Set from the Data Engine Control Center.

SctSnapshotLongCountField

[none]

Used by: JAVA

Metadata field name for Long Interval Count; for example, xAccessesInLast90Days. Set from the Data Engine Control Center.

SctSnapshotLongCountInterval

[none]

Used by: JAVA

Number of days for the Long Interval. Set from the Data Engine Control Center.

SctSnapshotShortCountEnable FALSE Used by: JAVA

Set to TRUE to enable the Short Interval Access Count Snapshot function. Set from the Data Engine Control Center.

SctSnapshotShortCountField

[none]

Used by: JAVA

Metadata field name for the Short Interval Count; for example, xAccessesInLast10Days. Set from the Data Engine Control Center.

SctSnapshotShortCountInterval

[none]

Used by: JAVA

Number of days for the Short Interval. Set from the Data Engine Control Center.

SctUseGMT FALSE Used by: filter plug-in, JAVA

Set to TRUE for logged event times to be converted to Coordinated Universal Time (UTC). FALSE uses local time.
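
The SctMaxRereadTime description in the table above can be read as a simple grouping rule over reference times. The following sketch illustrates that rule for one user and one content item; it is not Content Tracker's reduction code, the function name count_accesses is hypothetical, and treating a gap exactly equal to the limit as part of the same access is an assumption.

```python
def count_accesses(reference_times, max_reread_seconds=3600):
    """Count sustained accesses for one user and one content item.

    reference_times holds reference timestamps in seconds, in any order.
    A reference starts a new access only if it occurs more than
    max_reread_seconds after the previous reference.
    """
    count = 0
    previous = None
    for current in sorted(reference_times):
        if previous is None or current - previous > max_reread_seconds:
            count += 1
        previous = current
    return count


if __name__ == "__main__":
    # References at 0 s, 100 s, and 3700 s chain into one access;
    # the reference at 7400 s is 3700 s after the previous one, so it is separate.
    print(count_accesses([0, 100, 3700, 7400]))
```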

The following variables are not available in the sct.cfg file and are accessible only through the Component Manager.

Config. Setting Default Value Remarks
SctPostReductionExec

[none]

Used by: JAVA

Path to the Post Reduction Executable (assumed to be in IntradocDir/custom/ContentTracker/bin/)

SctProxyNameMaxLength 50 Used by: JAVA

Maximum number of characters in the name of any Content Server proxy server in the configuration. Used to increase the size of user-name fields in Content Tracker table creation.

SctUrlMaxLength 3000 Used by: JAVA

Maximum expected length (in characters) for URL fields. Used to determine column widths when creating tables. There can be several such columns in a given table.

SctWebBeaconIDList

[none]

Used by: filter plug-in

List of zero or more web-beacon objects. Required to add the ability to feed data to Content Tracker using client-side tags. Enables Content Tracker to gather data from cached pages and pages generated from cached services.

For more information about the Content Tracker configuration variables, see the Oracle Fusion Middleware Configuration Reference for Oracle WebCenter Content.

15.2.1.1 Access Control Lists and Secure Mode

During installation, leave the security checks preference check box blank. On an ACL-based system, secure mode must be disabled. When it is disabled, users other than a system administrator can see information about content items that they would not otherwise be authorized to access and view.

15.2.1.2 Values for the Security Checks Preference Variable

The security checks preference variable can have either of these values:

  • SctrEnableSecurityChecks=True enables the security checks installation preference.

    In secure mode, the same security criteria (role and account qualifications) used to limit Content Server search results are also applied to the generated reports. So, it is possible that two different users running the Top Content Items report might see different results.

  • SctrEnableSecurityChecks=False disables the security checks installation preference. This is the default setting.

    In nonsecure mode, the additional role and account criteria used to restrict Content Server search results are not applied to the generated reports. So, it is possible for a user other than a system administrator to see information about content items that the user would not be authorized to access and view.

15.2.1.3 File Types for Entries in the SctAccessLog

By default, Content Tracker does not log accesses to GIF, JPG, JS, CSS, CAB, and CLASS file types. Therefore, entries for these file types are not included in the combined output table after data reduction.

To log any of these file types, edit the sct.cfg file located in the IntradocDir/custom/ContentTracker/resources/ directory and change the SctIgnoreFileTypes configuration variable. Its default setting (gif,jpg,js,css) excludes the listed file types. To include a file type, delete it from the list. Restart the web server and Content Server for the changes to take effect.
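
The effect of editing the SctIgnoreFileTypes list can be pictured as an extension check on each requested URL. The sketch below only approximates that idea; it is not the filter plug-in's implementation, and the function names and sample URLs are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def parse_ignore_list(value):
    """Turn a comma-separated setting such as 'gif,jpg,js,css' into a set."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def is_ignored(url, ignore_file_types):
    """Return True if the URL path ends in one of the ignored file types."""
    path = urlsplit(url).path  # drops any query string or fragment
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return extension in ignore_file_types


if __name__ == "__main__":
    default_types = parse_ignore_list("gif,jpg,js,css")
    without_jpg = parse_ignore_list("gif,js,css")  # 'jpg' deleted from the list
    print(is_ignored("/cs/images/photo.jpg", default_types))  # ignored by default
    print(is_ignored("/cs/images/photo.jpg", without_jpg))    # now eligible for logging
```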

15.2.2 Setting Content Tracker Configuration Variables

To set or edit any of the Content Tracker configuration variables:

  1. In a text editor, open the sct.cfg file:

    cs_root/data/contenttracker/config/sct.cfg

  2. Locate the configuration variable to be edited.
  3. Enter the applicable value.
  4. Save and close the sct.cfg file.
  5. Restart Content Server to apply the changes.
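
If you prefer to script steps 2 through 4, the edit can be sketched as a small text transformation. This sketch assumes that sct.cfg stores one name=value pair per line, as in the SctDebugServiceBinderDumpEnabled=False form shown earlier; the function name set_cfg_value is hypothetical, and a restart of Content Server is still required afterward.

```python
def set_cfg_value(cfg_text, name, value):
    """Return cfg_text with name=value set.

    An existing 'name=...' line is replaced in place; otherwise the
    setting is appended at the end. Other lines are left unchanged.
    """
    lines = cfg_text.splitlines()
    new_line = f"{name}={value}"
    for index, line in enumerate(lines):
        key = line.split("=", 1)[0].strip()
        if "=" in line and key == name:
            lines[index] = new_line
            break
    else:
        # No existing entry was found, so add one.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = "SctLogEnabled=TRUE\nSctUseGMT=FALSE\n"
    print(set_cfg_value(original, "SctUseGMT", "TRUE"), end="")
```

Work on a backup copy of the file first, because the sketch does not preserve anything beyond the line-based structure it assumes.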

Add or edit the configuration variables for the activity metrics metadata fields with the user interface included in the Data Engine Control Center. These include the following variables:

  • SctSnapshotEnable

  • SctSnapshotLastAccessEnable

  • SctSnapshotLastAccessField

  • SctSnapshotLongCountEnable

  • SctSnapshotLongCountField

  • SctSnapshotLongCountInterval

  • SctSnapshotShortCountEnable

  • SctSnapshotShortCountField

  • SctSnapshotShortCountInterval

For more information about the user interface and the activity metrics functions, see Data Tracking Functions in Oracle Fusion Middleware Managing Oracle WebCenter Content.

15.2.3 Tracking External Users and Content Items

You can control whether Content Tracker includes data about external user accesses in the applicable reports. These authenticated users are qualified based on their user roles and accounts. By default, the configuration parameter SctExternalUserLogEnabled is set to true (enabled). This allows Content Tracker to monitor external user logins and automatically propagate their role and account information to the UserSecurityAttributes table.

Regardless of whether the SctExternalUserLogEnabled configuration variable is enabled or disabled, all of the content item access information for external users is tracked and recorded. But when it is enabled, this variable ensures that this data is included in reports that explicitly correlate externally authenticated user names with their associated user roles and accounts. Specifically, the Top Content Items by User Role report and the Users by User Role report include all of the content item access activity by external users. For more information, see Content Tracker Reports in Oracle Fusion Middleware Managing Oracle WebCenter Content.

Note:

For information about how to manually disable the SctExternalUserLogEnabled configuration variable, see Setting Content Tracker Configuration Variables.

15.3 Configuring Service Calls

You can configure service calls in the service call configuration file, configure the Content Tracker logging service to log events, and manage service call information.

15.3.1 About the Service Call Configuration File

The Content Tracker service handler filter makes it possible to gather information about Content Server activity other than content requests. Service request details are collected by the service handler filter and stored in the SctAccessLog table in real time. The details are obtained from the DataBinder that accompanies the service call. For a Content Server service call to be logged, it must have an entry in the service call configuration file (SctServiceFilter.hda).

The SctServiceFilter.hda file is a user-modifiable configuration file that is used to limit the number of logged service calls. This enables you to selectively control which services are logged. The data logging function for any service call included in the SctServiceFilter.hda file can also be expanded, to log and track data values of specific DataBinder fields relevant to a particular service. For more information, see Extended Service Call Tracking Function.

Service tracking is limited to top-level services called through the server socket port. Subservices, or services called internally, cannot be tracked.

The purpose of the SctServiceFilter.hda file is to define which parts of Content Server are of particular interest to users. If a Content Server service is not listed in the SctServiceFilter.hda file, it is ignored by Content Tracker. Additionally, if a service is not listed in this file, it can be logged only by the Content Tracker logging service. For more information, see About the Content Tracker Logging Service.

You can make changes to the SctServiceFilter.hda file in two ways:

  • Add new services and edit the existing service call parameters in the file from the Data Engine Control Center.
  • Manually edit the SctServiceFilter.hda file.

    For more information, see Manually Editing the SctServiceFilter.hda File.

Tip:

Control the services to log by including or excluding them from the SctServiceFilter.hda file. This is an effective method to control logging for particular services or for all services. Also, the extended service call tracking function enables customization of the type of data that is logged for a specific service.

15.3.1.1 General Service Call Logging

Services listed in the SctServiceFilter.hda file are detected by the Content Tracker service handler filter, and the values of selected data fields are captured. Content Tracker then logs the named service calls. This information, along with timestamps and related details, is written dynamically into the SctAccessLog table.

For each enabled service, Content Tracker automatically logs certain standard DataBinder fields, such as dUser and dDocName. Also, DataBinder fields associated with the extended service call tracking function are logged to the general purpose columns in the SctAccessLog table.

Data is inserted into the SctAccessLog table in real time using Content Tracker-specific services sequence numbers and a type designation of S for service. (A W designation indicates a static URL event type.) Manual reductions, scheduled reductions, or both are required only to process the static URL access information gathered by the web server filter.

15.3.1.2 Extended Service Call Tracking Function

The extended service call tracking function enables you to log Content Server service calls and to supplement this information by also logging relevant data values from one or more additional DataBinder fields, beyond the standard DataBinder fields logged by each configured service call.

15.3.1.2.1 Service Call ResultSet Combinations

Each service that Content Tracker logs must have an entry in the ServiceExtraInfo ResultSet that is contained in the SctServiceFilter.hda file. Content Tracker automatically logs various standard DataBinder fields, such as dUser and dDocName. However, the service-related data logged by Content Tracker can be expanded by logging and tracking relevant data values from supplementary DataBinder fields.

The extended service call tracking function is implemented by linking the entries in the ServiceExtraInfo ResultSet to field map ResultSets. Each field map ResultSet contains one or more sets of data field names, the source location, and the destination table column name in the SctAccessLog table. This grouping allows you to select data fields relevant to the associated service call and have the data values logged into the specified column in the SctAccessLog table.

Since more than one expanded service can be logged using the extended tracking function, the contents of the general purpose columns in the SctAccessLog table cannot be properly interpreted without knowing which service is being logged. The service name is always logged in the sc_scs_idcService column. Your queries should match this column with the desired service name.

Caution:

In field map ResultSets, you can map data fields to existing, standard SctAccessLog table columns. The extended service mapping occurs after the standard field data values are collected. You can override any of the standard table column fields.

For example, the service you are logging might carry a specific user name (such as MyUserName=john) in a data field. You could use the extended tracking function to override the contents of the sc_scs_dUser column. In this case, specify MyUserName as the data field, the section of the DataBinder where it resides as the location, and sc_scs_dUser as the table column in the field map ResultSet.

It is your responsibility to ensure that the data being logged is a reasonable fit with the SctAccessLog column type.

For examples of linked service entries and ResultSets, see Linked Service Entries and Field Map ResultSets. For more information about the contents of the SctAccessLog table and the general-purpose columns intended to be mapped to data fields, see Combined Output Table in Oracle Fusion Middleware Managing Oracle WebCenter Content.

15.3.1.2.2 General Purpose Columns in the Output Table

In the field map ResultSets for extended service tracking, map the DataBinder fields to columns in the SctAccessLog table. The general purpose columns (extField_1 through extField_10) are available for mapping. These columns may be filled with any data values you consider appropriate for logging and tracking for a particular service. It is recommended and expected that you use these columns to avoid overwriting the standard table columns.

Tip:

The name of the service is always logged to the sc_scs_idcService column. Include it as a qualifier in any query that uses the contents of the extended fields. For more information about custom reports that include specific SQL queries involving SctAccessLog table columns, see Report Creation Types in Oracle Fusion Middleware Managing Oracle WebCenter Content.
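
The following sketch shows why the service name qualifier matters when two services write different kinds of data to the same general purpose column. It uses an in-memory SQLite table as a stand-in for the real database; the reduced column set, the sample rows, and the function names are hypothetical, while the column names sc_scs_idcService, sc_scs_dUser, and extField_1 come from this chapter.

```python
import sqlite3


def build_demo_log():
    """Create a toy SctAccessLog table with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SctAccessLog ("
        "sc_scs_idcService TEXT, sc_scs_dUser TEXT, extField_1 TEXT)"
    )
    conn.executemany(
        "INSERT INTO SctAccessLog VALUES (?, ?, ?)",
        [
            # For a search service, extField_1 is assumed to hold the search text.
            ("GET_SEARCH_RESULTS", "jsmith", "quarterly report"),
            # For GET_FILE, extField_1 is assumed to hold a revision selection method.
            ("GET_FILE", "jsmith", "Latest"),
            ("GET_SEARCH_RESULTS", "mlee", "budget"),
        ],
    )
    return conn


def search_texts(conn):
    """Return (user, search text) pairs, qualified by service name."""
    cursor = conn.execute(
        "SELECT sc_scs_dUser, extField_1 FROM SctAccessLog "
        "WHERE sc_scs_idcService = ? ORDER BY sc_scs_dUser",
        ("GET_SEARCH_RESULTS",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for user, text in search_texts(build_demo_log()):
        print(user, text)
```

Without the WHERE clause, the GET_FILE row would be mixed in and its extField_1 value would be misread as search text.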

15.3.1.3 Service Call Configuration File Contents

The initial contents of the service call configuration file (SctServiceFilter.hda) are the commonly used content access, search, and user authentication services native to Content Server. This file contains a ResultSet structure with one entry for each service to be logged. To support the extended service call tracking function, this file may also include field map ResultSets linked to the service entries contained in the ServiceExtraInfo ResultSet.

Add new entries or edit existing entries, or both, in the SctServiceFilter.hda file with the Services user interface accessed through the Data Engine Control Center, or change entries in the file manually. For more information, see Manually Editing the SctServiceFilter.hda File.

Note:

To review the set of initial services that Content Tracker logs into the SctAccessLog table, see the SctServiceFilter.hda file:

cs_root/data/contenttracker/config/SctServiceFilter.hda

The following tables provide details of the service call configuration file ResultSet schema. The values are copied directly to the corresponding columns in the SctAccessLog table.

ServiceExtraInfo ResultSet Contents

Feature Description

Service Name (sctServiceName)

The name of the service to be logged. For example, GET_FILE. If no row is present in the ResultSet for a given service, the service is not logged.

Calling Product (sctCallingProduct)

An arbitrary string. It is generally set to "Core Server" for all standard Content Server entries.

Event Type (sctEventType)

An arbitrary string. It is generally set to "Content Access" for all standard Content Server entries.

Reference (sctReference)

Used to set the sc_scs_reference field in the SctAccessLog table. If blank, the internal getReference logic is used.

Field Map (sctFieldMap)

The name of the field map ResultSet that is added to the SctServiceFilter.hda file. This field is only required if the extended service call tracking function is used. This function enables the logging of DataBinder field information to one or more of the general purpose columns in the SctAccessLog table.

Field Map ResultSet Contents

Feature Description

Field Map Link

The name of the field map ResultSet.

You can set the SctDebugServiceBinderDumpEnabled configuration variable to write out the service DataBinder object. This enables you to see the data available at the time the event is recorded.

DataBinder Field (dataFieldName)

The name of the DataBinder field whose data values are logged to a general purpose column in the SctAccessLog table. See also the Field Name field on the Field Map screen.

Data Location (dataLocation)

The section in the Content Server service DataBinder where the field to be logged is located. See also the Field Location field on the Field Map screen.

Access Log Column (accessLogColumnName)

The specific general purpose column in the SctAccessLog table where data values from a specified DataBinder field are logged. See also the Column Name field on the Field Map screen.

These fields are copied from the DataBinder and inserted into the SctAccessLog table: dID, dDocName, IdcService, dUser, SctCallingProduct, SctEventType, and SctReference. If the values for the latter three fields are included in a service's entry in the SctServiceFilter.hda file, they override the corresponding values in the data field.

There should be no duplication or conflicts between services logged through the service handler filter and those logged through the Content Tracker logging service. If a service is named in the Content Tracker service handler filter file, then the service is automatically logged, so there is no need for the Content Tracker logging service to do it.

Note:

Adding desired service calls to the SctServiceFilter.hda file and using this method to log specific activity gives you the advantage of providing values for the CallingProduct, EventType, and Reference fields. The assigned values are copied directly to the corresponding columns in the SctAccessLog table.

15.3.1.4 ResultSet Examples

The default SctServiceFilter.hda file includes various common service calls.

Note:

To review the initial set of services that Content Tracker logs into the SctAccessLog table and the service entries and field map ResultSets, see the SctServiceFilter.hda file:

cs_root/data/contenttracker/config/SctServiceFilter.hda

For more detailed information about these services, see the Oracle Fusion Middleware Services Reference for Oracle WebCenter Content.

15.3.1.4.1 ServiceExtraInfo ResultSet Entries

The following list provides examples of several service entries contained in the SctServiceFilter.hda file's ServiceExtraInfo ResultSet.

  • GET_FILE_BY_NAME

    Core Server

    Content Access

  • GET_DYNAMIC_URL

    Core Server

    Content Access

  • GET_DYNAMIC_CONVERSION

    Core Server

    Content Access

  • GET_EXTERNAL_DYNAMIC_CONVERSION

    Core Server

    Content Access

  • GET_ARCHIVED_FILE

    Core Server

    Content Access

  • COLLECTION_GET_FILE

    Folders

    Content Access

15.3.1.4.2 Linked Service Entries and Field Map ResultSets

The following examples show several service entries linked to field map ResultSets. These examples, or other similar ones, are included in the initial SctServiceFilter.hda file.

Service entry:

GET_SEARCH_RESULTS
Core Server
Search

SearchFieldMap

Field map ResultSet:

@ResultSet SearchFieldMap
3
dataFieldName 6 255
dataLocation 6 255
accessLogColumnName 6 255
MiniSearchText
LocalData
extField_1
TranslatedQueryText
LocalData
extField_2
IsSavedQuery
LocalData
extField_7
@end

Service entry:

PNE_GET_SEARCH_RESULTS
Core Server
Search

SearchFieldMap

Field map ResultSet:

@ResultSet SearchFieldMap
3
dataFieldName 6 255
dataLocation 6 255
accessLogColumnName 6 255
MiniSearchText
LocalData
extField_1
TranslatedQueryText
LocalData
extField_2
IsSavedQuery
LocalData
extField_7
@end

Service entry:

GET_FILE
Core Server
Content Access

GetFileFieldMap

Field map ResultSet:

@ResultSet GetFileFieldMap
3
dataFieldName 6 255
dataLocation 6 255
accessLogColumnName 6 255
RevisionSelectionMethod
LocalData
extField_1
Rendition
LocalData
extField_2
@end
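
The field map ResultSets above follow a regular layout: a header line, a column count, one definition line per column, then the values row by row, closed by @end. A small reader for that layout might look like the sketch below. It is a simplified illustration rather than a full HDA parser; the function name parse_result_set is hypothetical, and the two numbers after each column name are simply ignored.

```python
def parse_result_set(text):
    """Parse one '@ResultSet <name> ... @end' block into (name, rows)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) < 3 or not lines[0].startswith("@ResultSet ") or lines[-1] != "@end":
        raise ValueError("expected a block from '@ResultSet <name>' to '@end'")
    name = lines[0].split(None, 1)[1]
    column_count = int(lines[1])
    if column_count <= 0:
        raise ValueError("column count must be positive")
    # Each definition line starts with the column name; the rest is ignored here.
    columns = [lines[2 + i].split()[0] for i in range(column_count)]
    values = lines[2 + column_count : -1]
    if len(values) % column_count:
        raise ValueError("value lines do not fill whole rows")
    rows = [
        dict(zip(columns, values[start : start + column_count]))
        for start in range(0, len(values), column_count)
    ]
    return name, rows


GET_FILE_FIELD_MAP = """\
@ResultSet GetFileFieldMap
3
dataFieldName 6 255
dataLocation 6 255
accessLogColumnName 6 255
RevisionSelectionMethod
LocalData
extField_1
Rendition
LocalData
extField_2
@end
"""

if __name__ == "__main__":
    name, rows = parse_result_set(GET_FILE_FIELD_MAP)
    for row in rows:
        print(name, row["dataFieldName"], "->", row["accessLogColumnName"])
```

Reading the file this way makes it easy to check which DataBinder field lands in which SctAccessLog column before writing queries against the general purpose columns.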

15.3.2 About the Content Tracker Logging Service

The Content Tracker logging service is a single service call (SCT_LOG_EVENT) that allows an application to log a single event to the SctAccessLog table. You can call this service directly through a URL, as an action in a service script, or from Idoc Script with the executeService() function. The calling application is responsible for setting any and all fields in the service DataBinder to be recorded, including the descriptive fields in the Content Tracker SctServiceFilter.hda configuration file.

The SCT_LOG_EVENT service copies information out of the service DataBinder. This data is inserted into the SctAccessLog table in real time using the Content Tracker-specific services sequence numbers and a type designation of "S" for service. Manual or scheduled reductions, or both, are required only to process the static URL access information gathered by the web server filter. For more information, see Data Reduction in Oracle Fusion Middleware Managing Oracle WebCenter Content.
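
When the service is called through a URL, the calling application supplies the fields as request parameters. The sketch below only assembles such a URL; the base URL is hypothetical, and the parameter names SctCallingProduct and SctEventType are taken from the DataBinder field names listed in this chapter, so verify them against your instance before relying on them.

```python
from urllib.parse import urlencode


def build_log_event_url(base_url, fields):
    """Build a URL that calls SCT_LOG_EVENT with the given DataBinder fields."""
    params = {"IdcService": "SCT_LOG_EVENT", **fields}
    # Make sure a caller-supplied IdcService cannot replace the service name.
    params["IdcService"] = "SCT_LOG_EVENT"
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and path; substitute the address of your Content Server.
    url = build_log_event_url(
        "http://cs.example.com/cs/idcplg",
        {"SctCallingProduct": "My App", "SctEventType": "Custom Event"},
    )
    print(url)
```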

Note:

Avoid duplication or conflicts between services logged through the service handler filter and those logged through the Content Tracker logging service. A service that is named in the Content Tracker service handler filter file is logged automatically, so there is no need to log it again through the Content Tracker logging service. However, Content Tracker makes no attempt to prevent such duplication.
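
As a rough illustration of calling the logging service directly through a URL, the following Python sketch assembles a request URL. It is a sketch under stated assumptions, not a confirmed recipe: the endpoint http://myhost.example.com/cs/idcplg and all field values are hypothetical, and the function name build_log_event_url is made up for this example. The DataBinder field names are taken from the table in Setting Required DataBinder Fields to Call the Content Tracker Logging Service. The sketch only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode


def build_log_event_url(base_url, fields):
    """Return a URL that requests SCT_LOG_EVENT with the given DataBinder fields."""
    params = {"IdcService": "SCT_LOG_EVENT"}
    params.update(fields)
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and values, for illustration only.
url = build_log_event_url(
    "http://myhost.example.com/cs/idcplg",
    {
        "sctCallingProduct": "My Application",
        "sctEventType": "Content Access",
        "sctReference": "web",
        "dDocName": "doc1",
        "dID": "1234",
    },
)
print(url)
```

Because the calling application is responsible for every field that is recorded, keeping the fields in one dictionary makes it easier to review what a given call logs.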

15.3.3 Managing Service Call Information

This section provides information and task procedures for mapping and logging data from Content Server services to the combined output database table (SctAccessLog).

15.3.3.1 Manually Editing the SctServiceFilter.hda File

To add or change entries in the SctServiceFilter.hda file:

  1. In a text editor, open the SctServiceFilter.hda file:

    cs_root/data/contenttracker/config/.../SctServiceFilter.hda

  2. Edit an existing entry or add a new service entry. For example, to add the GET_FORM_FILE service, add the following service entry to the ServiceExtraInfo ResultSet in the file:
    GET_FORM_FILE
    Threaded Discussion
    Content Access
    optional_reference_value
    optional_field_map_link_value
    

    Specify optional_field_map_link_value in the service entry when you are implementing the extended service call tracking function. In this case, add or edit the corresponding field map ResultSet.

  3. If you are using extended service tracking, add or edit the corresponding field map ResultSet. For example, to add the SS_GET_PAGE service and track additional data-field values, enter the following service entry and corresponding field map ResultSets into the file.
    Service Entry Field Map ResultSet
    SS_GET_PAGE
    Site Studio
    Web Hierarchy Access
    web
    SSGetPageFieldMap
    
    @ResultSet SSGetPageFieldMap
    3
    dataFieldName 6 255
    dataLocation 6 255
    accessLogColumnName 6 255
    DataBinder_field_name
    data_field_location_name
    access_log_column_name
    @end
    

    Note:

    Include as many sets of DataBinder field, location, and table column names as necessary.

  4. Save and close the file.
  5. Restart the Content Server to apply the new definitions.

    Note:

    Search request events are logged into the SctAccessLog table in real time and do not need to be reduced. You can also add or edit services with the user interface included in the Data Engine Control Center.
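
The text that step 2 and step 3 add to the file follows a fixed pattern, so it can be generated rather than typed. The following Python sketch is one possible helper and is not part of Content Tracker; the function names are hypothetical. It only produces text in the layout shown in the steps above. It does not open or modify SctServiceFilter.hda, and it performs no validation of service names or column names.

```python
def render_service_entry(service, product, event_type, reference="", field_map=""):
    """Render the five lines of one service entry; unused values stay blank."""
    return "\n".join([service, product, event_type, reference, field_map]) + "\n"


def render_field_map(name, mappings):
    """Render a field map ResultSet from (field, location, column) tuples."""
    lines = [
        "@ResultSet " + name,
        "3",
        "dataFieldName 6 255",
        "dataLocation 6 255",
        "accessLogColumnName 6 255",
    ]
    for field, location, column in mappings:
        lines.extend([field, location, column])
    lines.append("@end")
    return "\n".join(lines) + "\n"


# Reproduces the SS_GET_PAGE example, keeping the placeholder names from step 3.
print(render_service_entry(
    "SS_GET_PAGE", "Site Studio", "Web Hierarchy Access", "web", "SSGetPageFieldMap"
))
print(render_field_map(
    "SSGetPageFieldMap",
    [("DataBinder_field_name", "data_field_location_name", "access_log_column_name")],
))
```

Passing several tuples to render_field_map produces one data field, location, and table column set per tuple.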

15.3.3.2 Setting Required DataBinder Fields to Call the Content Tracker Logging Service

The following table provides the SctAccessLog column names and the corresponding DataBinder fields that Content Tracker looks for when the Content Tracker logging service (SCT_LOG_EVENT) is called. When an application calls the Content Tracker logging service, the application is responsible for setting the necessary fields in the service DataBinder for Content Tracker to find. For more detailed information about the SctAccessLog fields, see "Combined Output Table" in Oracle Fusion Middleware Managing Oracle WebCenter Content.

SctAccessLog Column Name    Service DataBinder LocalData Field
------------------------    ----------------------------------
SctDateStamp                [computed]
SctSequence                 SctSequence
SctEntryType                "S"
eventDate                   [computed]
SctParentSequence           SctParentSequence
c_ip                        REMOTE_HOST
cs_username                 HTTP_INTERNETUSER
cs_method                   REQUEST_METHOD
cs_uriStem                  HTTP_CGIPATHROOT
cs_uriQuery                 QUERY_STRING
cs_host                     SERVER_NAME
cs_userAgent                HTTP_USER_AGENT
cs_cookie                   HTTP_COOKIE
cs_referer                  HTTP_REFERER
sc_scs_dID                  dID
sc_scs_dUser                dUser
sc_scs_idcService           IdcService (or SctIdcService)
sc_scs_dDocName             dDocName
sc_scs_callingProduct       sctCallingProduct
sc_scs_eventType            sctEventType
sc_scs_status               StatusCode
sc_scs_reference            sctReference (also . . .)
comp_username               [computed - HTTP_INTERNETUSER or . . .]
sc_scs_isPrompt             n/a
sc_scs_isAccessDenied       n/a
sc_scs_inetUser             n/a
sc_scs_authUser             n/a
sc_scs_inetPassword         n/a
sc_scs_serviceMsg           StatusMessage
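
Before an application calls SCT_LOG_EVENT, it can be useful to preview which SctAccessLog columns its DataBinder fields would populate. The following Python sketch encodes only the direct pairs from the table above; computed and n/a columns are left out, and the alternatives that the table elides (such as the SctIdcService alternative and the truncated sctReference note) are not modeled. The function names are hypothetical and the code is an illustration, not Content Tracker logic.

```python
# Direct column-to-field pairs from the table; computed and n/a columns omitted.
COLUMN_TO_BINDER_FIELD = {
    "SctSequence": "SctSequence",
    "SctParentSequence": "SctParentSequence",
    "c_ip": "REMOTE_HOST",
    "cs_username": "HTTP_INTERNETUSER",
    "cs_method": "REQUEST_METHOD",
    "cs_uriStem": "HTTP_CGIPATHROOT",
    "cs_uriQuery": "QUERY_STRING",
    "cs_host": "SERVER_NAME",
    "cs_userAgent": "HTTP_USER_AGENT",
    "cs_cookie": "HTTP_COOKIE",
    "cs_referer": "HTTP_REFERER",
    "sc_scs_dID": "dID",
    "sc_scs_dUser": "dUser",
    "sc_scs_idcService": "IdcService",
    "sc_scs_dDocName": "dDocName",
    "sc_scs_callingProduct": "sctCallingProduct",
    "sc_scs_eventType": "sctEventType",
    "sc_scs_status": "StatusCode",
    "sc_scs_reference": "sctReference",
    "sc_scs_serviceMsg": "StatusMessage",
}


def preview_row(binder):
    """Return the column values that the given DataBinder fields would supply."""
    row = {"SctEntryType": "S"}  # entries logged by the service are typed "S"
    for column, field in COLUMN_TO_BINDER_FIELD.items():
        if field in binder:
            row[column] = binder[field]
    return row


def unset_fields(binder):
    """Return the table's DataBinder field names that the binder does not set."""
    return sorted({f for f in COLUMN_TO_BINDER_FIELD.values() if f not in binder})


binder = {"dID": "1234", "dDocName": "doc1", "sctEventType": "Content Access"}
print(preview_row(binder))
print(unset_fields(binder))
```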

15.3.3.3 Calling the Content Tracker Logging Service from an Application

You can call the SCT_LOG_EVENT service from an application. The call can be added by the application developer or by a user willing to modify the application service scripts. There are two ways to make the call:

  • The application can call SCT_LOG_EVENT from Java.
  • The application can include calls to SCT_LOG_EVENT in the service script.

15.3.3.4 Calling the Content Tracker Logging Service from Idoc Script

You can call the SCT_LOG_EVENT service indirectly from Idoc Script, using the executeService() function. This is the same as calling the SCT_LOG_EVENT service from an application except that the call is made from Idoc Script instead of the application Java code. Content Tracker cannot distinguish whether the SCT_LOG_EVENT service is called from Java or from Idoc Script.

15.3.4 Service Call Management and the User Interface

Content Tracker enables the logging of service calls with data values relevant to the associated services. Every service to be logged must have a service entry in the service call configuration file (SctServiceFilter.hda). In addition to the logged services, their corresponding field map ResultSets can be included in the SctServiceFilter.hda file.

By default, Content Tracker logs only services that have event types for content access or that cause an entry to be made in the DocHistory table. This ensures maximum performance, but it means that some service events are not logged.

The enabled services automatically log general DataBinder fields, such as dUser and dDocName. Linking a field map ResultSet to a service entry enables the use of the extended service call tracking function.

The SctAccessLog database table provides additional columns for use with the extended service call tracking function which can be filled with any data values appropriate for the associated service call. When listing the data field names in the field map ResultSet, also list the location name for the source of the data field and the table column name where the data is logged.

Caution:

In field map ResultSets, you can map data fields to existing, standard SctAccessLog table columns. The extended service mapping occurs after the standard field data values are collected. Therefore, any of the standard table column fields can be overwritten.

For example, the service you log might carry a specific user name (MyUserName=john) in a data field. You could use the extended tracking function to overwrite the contents of the sc_scs_dUser column. In this case, use MyUserName, the location of that field (for example, LocalData), and sc_scs_dUser as the data field, location, and table column set in the field map ResultSet.

It is your responsibility to ensure that the data being logged is a reasonable fit with the SctAccessLog column type.
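
The overwrite behavior described in this caution can be pictured with a small Python model. This is a simplified sketch, not Content Tracker code: the function apply_field_map and the sample values are hypothetical, and the DataBinder is modeled as a dictionary of sections. The only behavior taken from the text is that extended mappings are applied after the standard values are collected, so a mapping to a standard column replaces the standard value.

```python
def apply_field_map(row, binder, field_map):
    """Copy mapped DataBinder values into the row, after the standard values."""
    result = dict(row)
    for field, location, column in field_map:
        section = binder.get(location, {})
        if field in section:
            result[column] = section[field]  # replaces any standard value
    return result


# Standard values collected first (sample data).
standard_row = {"sc_scs_dUser": "sysadmin", "sc_scs_dDocName": "doc1"}
# The service carries MyUserName=john in its LocalData section.
binder = {"LocalData": {"MyUserName": "john"}}
# One data field, location, and table column set.
field_map = [("MyUserName", "LocalData", "sc_scs_dUser")]

logged = apply_field_map(standard_row, binder, field_map)
print(logged["sc_scs_dUser"])  # prints "john"
```

In this model the original sc_scs_dUser value is lost once the mapping is applied, which is why mappings to standard columns deserve a deliberate decision.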

15.3.4.1 Adding, Editing, or Deleting Service Entries

Follow these steps to add or edit a service:

  1. From the Main menu, choose Administration, then Content Tracker Administration, then Data Engine Control Center.

    The Data Engine Control Center opens.

  2. Click the Services tab.
  3. Click Add to create a new service entry, or choose an existing service entry from the Service Name list and click Edit.

    The Extended Services Tracking screen opens. The fields are empty when adding a new service entry. When editing an existing service entry, the fields are populated with values that can be edited.

  4. Enter or modify the applicable field values (except in the Field Map field).

    To link this service entry to a field map ResultSet, enter the applicable name in the Field Map field, and then link the field. For more information, see Linking Activity Metrics to Metadata Fields in Oracle Fusion Middleware Managing Oracle WebCenter Content.

  5. Click OK.

    A confirmation dialog box is displayed.

  6. Click OK.

    The Services tab is redisplayed with the new service or newly edited service in the Services list. The services state and the Content Tracker SctServiceFilter.hda file are updated.

    Content Tracker does not perform error checking (such as field type or spelling verification) for the extended services tracking function in the Data Engine Control Center. Errors are not generated until a reduction is done. These fields are case-sensitive. When adding new services or editing existing services, be careful to enter the proper service call names. Ensure that all field values are spelled and capitalized correctly.

To delete a service entry, open the Services tab as described in the previous steps, highlight the entry, and select Delete.

15.3.4.2 Adding, Editing, or Deleting Field Map ResultSets

To implement the extended service call tracking function, service entries must be linked to field map ResultSets in the SctServiceFilter.hda file.

Follow these steps to add a field map and link it:

  1. From the Main menu, choose Administration, then Content Tracker Administration, then Data Engine Control Center.

    The Data Engine Control Center opens.

  2. Click the Services tab.
  3. To add a new entry, follow the procedure in Adding, Editing, or Deleting Service Entries. Choose the service entry from the Service Name list.
  4. Click Edit.

    The Extended Services Tracking screen opens. If necessary, edit this service entry's values now in addition to adding the field map ResultSet.

    If the service is already linked to a field map ResultSet, the name is listed in the Field Map field and one or more data field, location, and table column sets are listed in the Field area.

  5. If the selected service is not linked to a secondary ResultSet, the Field Map field is empty. Enter the name of the field map ResultSet. If the selected service is already linked, skip this step.
  6. Click Add.

    The Field Map screen opens.

  7. Enter the appropriate values in the fields:
    • Field Name: The name of the data field in the service DataBinder whose data values are logged to a general purpose column in the SctAccessLog table.
    • Field Location: The section in the Content Server service DataBinder where the data field to be logged is located. You can use the following values:
      • LocalData (the default value)
      • Environment
      • BinderResultSet. This returns a comma-delimited string containing all values in the ResultSet. Size is restricted to 255 characters, allowing for commas and so on, so this value is useful only for small ResultSets.

        To accommodate more characters, enlarge or redefine the SctAccessLog table columns using standard database tools. For example, if you enlarge extField_3 to 2047 characters, the column can hold that much data. However, most databases have page-size limitations. In addition, SQL does not parse strings efficiently.

    • Column Name: The column in the SctAccessLog table where data values from a specified DataBinder field are logged.
  8. Click OK.

    The Field Map screen closes, and the values are added to the Field Name and Column Name fields.

  9. Click OK again.

    A confirmation dialog box opens.

  10. Click OK.

    The Services tab is redisplayed with the updated information.

Content Tracker does not perform error checking (such as field type or spelling verification) for the extended services tracking function in the Data Engine Control Center. Errors are not generated until a reduction is done. These fields are case-sensitive. When adding new field map ResultSets or editing existing field map ResultSets, be sure to enter the proper names and ensure that all field values are spelled and capitalized correctly.

To edit a field map, perform the previous steps, and edit the entries as needed.

To delete an entry, perform the previous steps, highlight a service entry, and select Delete.
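
To judge whether a ResultSet is small enough for the BinderResultSet location described in step 7, you can estimate the length of the comma-delimited string before you define the mapping. The following Python sketch is an estimate only. The function name is hypothetical, the row-by-row ordering of the values is an assumption, and the sketch does not show what Content Tracker does when the string exceeds the column size.

```python
def flatten_result_set(rows, limit=255):
    """Join every value of every row with commas and report whether it fits."""
    text = ",".join(str(value) for row in rows for value in row)
    return text, len(text) <= limit


# A small ResultSet fits in a default 255-character column.
text, fits = flatten_result_set([["doc1", "1234"], ["doc2", "1235"]])
print(text, fits)

# Three 100-character values plus two commas need 302 characters.
_, fits_large = flatten_result_set([["x" * 100, "y" * 100, "z" * 100]])
print(fits_large)
```

Passing a larger limit, such as 2047, models a column that has been enlarged with database tools.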

15.4 Customizing the Activity Metrics SQL Queries

The snapshot feature enables you to log and track search relevance custom metadata fields. Content Tracker fills these fields with content item usage and access information that reflects the popularity of particular content items. The information includes the date of the most recent access and the number of accesses in two distinct time intervals. For more information about the snapshot feature, see Activity Snapshots in Oracle Fusion Middleware Managing Oracle WebCenter Content.

If the snapshot feature and activity metrics are enabled, the values in the custom metadata fields are updated following the reduction processing phase. When users access content items, the values of the applicable search relevance metadata fields change accordingly. Content Tracker runs three SQL queries as a post-reduction processing step to determine which content items were accessed during the reporting period. For more information about the post-reduction processing step, see Data Reduction Process with Activity Metrics in Oracle Fusion Middleware Managing Oracle WebCenter Content.

The SQL queries are available as a resource and can be customized to filter information from the final tracking data. For example, you might want to exclude accesses by certain users in the tabulated results.

The SQL queries are included in the SctQuery.htm file:

IntradocDir/custom/ContentTracker/resources/SctQuery.htm

Note:

In general, the WHERE clause can be modified in any of the SQL queries. It is recommended that nothing else be modified.

The following SQL queries are used for the search relevance custom metadata fields:

  • qSctLastAccessDate: For the last access function, this query uses the SctAccessLog table. It checks for all content item accesses on the reduction date and collects the latest timestamp for each dID. The parameter for the query is the reduction date. In this case, dates may be reduced in random order because the comparison test for the last access date only signals a change if the existing DocMeta value is older than the proposed new value.

  • qSctAccessCountShort and qSctAccessCountLong: For the short and long access count functions, the qSctAccessCountShort and qSctAccessCountLong SQL queries are identical except for the "column name" for the count. They use the SctAccessLog table to calculate totals for all accesses for each dID across the time intervals specified (in days) for each. The parameters are the beginning and ending dates for the applicable rollups.
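
The kind of WHERE clause customization described above can be tried out safely on a scratch database. The following Python sketch uses the standard sqlite3 module with an in-memory table and invented sample rows. The SQL shown is not the text of qSctLastAccessDate from SctQuery.htm; it is a simplified stand-in that collects the latest timestamp for each dID on one reduction date and adds a condition that excludes one user. Treating SctDateStamp as the reduction date, and the reduced column list, are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SctAccessLog ("
    "SctDateStamp TEXT, eventDate TEXT, sc_scs_dID INTEGER, sc_scs_dUser TEXT)"
)
# Invented sample rows: (reduction date, access time, dID, user).
conn.executemany(
    "INSERT INTO SctAccessLog VALUES (?, ?, ?, ?)",
    [
        ("2024-03-01", "2024-03-01 09:15:00", 42, "jsmith"),
        ("2024-03-01", "2024-03-01 17:40:00", 42, "sysadmin"),
        ("2024-03-01", "2024-03-01 11:05:00", 57, "jsmith"),
        ("2024-03-02", "2024-03-02 08:00:00", 42, "jsmith"),
    ],
)

# Stand-in query: latest access per dID on one date, with one user excluded.
LAST_ACCESS_SQL = """
    SELECT sc_scs_dID, MAX(eventDate)
    FROM SctAccessLog
    WHERE SctDateStamp = ?
      AND sc_scs_dUser <> ?
    GROUP BY sc_scs_dID
    ORDER BY sc_scs_dID
"""

rows = conn.execute(LAST_ACCESS_SQL, ("2024-03-01", "sysadmin")).fetchall()
for did, last_access in rows:
    print(did, last_access)
```

With the extra condition in place, the 17:40 access by sysadmin no longer counts as the last access for dID 42, which shows how a WHERE clause change alters the tabulated result.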

15.4.1 Tracking Access to Content Items by External Users

You can control whether Content Tracker includes data about external user accesses in the applicable reports. These authenticated users are qualified based on their user roles and accounts. By default, the SctExternalUserLogEnabled configuration variable is set to true (enabled). This allows Content Tracker to monitor external user logins and automatically propagate their role and account information to the UserSecurityAttributes table.

Regardless of whether the SctExternalUserLogEnabled configuration variable is enabled or disabled, all of the content item access information for external users is tracked and recorded. But when it is enabled, this variable ensures that this data is included in reports that explicitly correlate externally authenticated user names with their associated user roles and accounts. Specifically, the Top Content Items by User Role report and the Users by User Role report include all of the content item access activity by external users. For more information, see Creating Custom Report Queries in Oracle Fusion Middleware Managing Oracle WebCenter Content.

Note:

To manually disable the SctExternalUserLogEnabled configuration variable, see Setting Content Tracker Configuration Variables.

15.5 Tracking Indirect Access to Content with Web Beacons

Note:

The implementation requirements for the web beacon feature are contingent on the system configurations involved. Not all of these factors can be addressed in this documentation. The access records collected and processed by Content Tracker indicate general user activity; they are not exact counts.

A web beacon is a managed object that facilitates specialized tracking support for indirect user accesses to web pages or other managed content. In earlier releases, Content Tracker was unable to gather data from cached pages and pages generated from cached services. When users accessed cached web pages and content items, Content Server and Content Tracker were unaware that these requests ever happened. Without using web beacon referencing, Content Tracker does not record and count such requests.

The web beacon feature uses client-side embedded references, which are invisible references to the managed beacon objects within Content Server. This enables Content Tracker to record and count user access requests for managed content items that have been copied by an external entity for redistribution without obtaining content directly from Content Server. For details about circumstances when this might be used, see Web Beacon Use Cases.

When cached content is served to consumers, users perceive that the requested object was served by Content Server. The managed content is actually provided using non-dynamic content delivery methods. In these situations, the managed content is served by a static website, a reverse proxy server, or out of a file system. The web beacon feature ensures that this type of activity can be tracked.

15.5.1 Web Beacon Use Cases

Two situations in particular may merit the use of the web beacon functionality: reverse proxy activity and externally hosted Site Studio websites.

In a reverse proxy scenario, the reverse proxy server is positioned between the users and Content Server. The reverse proxy server caches managed content items by making a copy of requested objects. The next time a user asks for the document, the reverse proxy server serves the copy from its private cache. If the reverse proxy server does not already have the object in its cache, it requests a copy from Content Server.

Because it is delivering cached content, the reverse proxy server does not directly interact with Content Server. Therefore, Content Tracker cannot detect these requests and does not track this type of user access activity.

A reverse proxy server is often used to improve web performance by caching or by providing controlled web access to applications and sites behind a firewall. Such a configuration provides load balancing by moving copies of frequently accessed content to a web server where it is updated on a scheduled basis.

For the web beacon feature to work, each user access includes an additional request to the managed beacon object in Content Server. This adds overhead to normal requests, but the web beacon object is very small and does not significantly interfere with the reverse proxy server's performance. Note that it is only necessary to embed the web beacon references in objects you specifically want to track.

Another usage scenario involves Site Studio, a product that is used to create websites which are stored and managed in Content Server. When Site Studio and Content Server are located on the same server, Content Tracker is configured to automatically track the applicable user accesses. The gathered Site Studio activity data is then used in predefined reports. For more information, see Site Studio Website Activity Reporting in Oracle Fusion Middleware Managing Oracle WebCenter Content.

Note:

Two modes of Site Studio integration are available with Content Tracker. One type is the existing built-in integration that automatically occurs when Site Studio is installed. This is typically used when a website is under construction and the web pages are managed in Content Server.

The other form uses the web beacon feature and Content Tracker regards Site Studio the same as any other website generator. This is typically used when a website is in production mode and content is no longer managed in Content Server.

If your website is intended for an external audience, you may decide to create a copy of the site and transfer it to another server. This arrangement makes the copy publicly viewable and keeps site development separate from the production site. In this arrangement, however, you must implement the web beacon feature to make sure that Content Tracker can collect and process user activity.

15.5.2 Web Beacon Overview

Content Tracker records and counts requests for objects managed by Content Server. The web beacon feature counts requests for managed objects copied by an external entity (such as a reverse proxy server or other functions not involving Content Server).

The following list provides a brief overview of the web beacon feature's functionality and implementation requirements.

  • To begin implementing the web beacon feature, create a Web Beacon object. This is usually a small object such as a 1x1 pixel transparent image. The object is then checked in and added to the Content Tracker list of web beacon object names.

  • Next, create the web beacon references to the checked-in web beacon object and embed them into cached HTML pages or managed content items. The first part of the reference is a URL reference to the web beacon object and the second part is identification information encoded as pseudo query parameters.

  • Content Tracker logs the web beacon reference to the beacon object and performs Reduction Processing for Web Beacon references. During data reduction, Content Tracker checks the dDocName value of each referenced object against the list of registered web beacons. If the dDocName value is on the list, the query parameters are processed in such a manner to ensure that the URL request is logged as a request for the tagged object (web page or managed content item) rather than the web beacon object.

15.5.3 Web Beacon Object

One or more content items must be created to use as the web beacon object or objects. These are usually a 1x1 pixel transparent image or anything with low overhead that won't disrupt the page being rendered. The ideal web beacon object has zero content. Multiple web beacon objects can be created but only one is required. Make sure the object is not a file type included in the SctIgnoreFileType configuration variable.

Check in the completed object, then update the Content Tracker SctWebBeaconIDList configuration variable. During data reduction, Content Tracker checks the SctWebBeaconIDList settings to determine how the web beacon reference listings in the event logs should be processed. If the applicable web beacon object is listed, Content Tracker processes the data appropriately. For details about configuration variables, see the Oracle Fusion Middleware Configuration Reference for Oracle WebCenter Content.

During installation, the dDocName values of web beacon objects can be entered into the SctWebBeaconIDList preference variable, or they can be added or edited later. Follow these steps to add or edit object names in the ID list:

  1. From the Administration tray or menu, choose Admin Server, then Component Manager.

    The Component Manager page opens.

  2. In the first paragraph, click advanced component manager.

    The Advanced Component Manager page opens.

  3. In the Update Component Configuration field, choose Content Tracker from the list.
  4. Click Update.

    The Update Component Configuration page opens.

  5. In the SctWebBeaconIDList preference field, enter the applicable web beacon object dDocName values, separated by commas.
  6. Click Update.
  7. Restart Content Server to apply the changes.

15.5.4 Web Beacon References

After creating and checking in the web beacon object(s), create their corresponding reference(s). A single web beacon object works in most systems because different query strings appended to the web beacon static URL make each reference unique. Each query parameter set also consists of distinct combinations of variables that identify specific cached web pages or managed content items.

15.5.4.1 Format Structure for URL References

Web beacon URL references consist of the web beacon static URL used to access the web beacon object managed by Content Server and a pseudo query string with content item variables.

When creating the references, make sure the web beacon static URL in Content Server does not use a directory root that is included in the SctIgnoreDirectories configuration variable. If the URL is one of the listed values, Content Tracker does not collect the activity data. For more information about the SctIgnoreDirectories configuration variable, see the Oracle Fusion Middleware Configuration Reference for Oracle WebCenter Content.

The query parameter set functions as a code that informs Content Tracker what the actual managed content item is that the user accessed. One of the query parameters is the item's dID. Including a unique set of query parameter values allows monitoring of indirect user access activity for managed objects that have been copied and cached. The query string is never actually executed but the query parameter values provide information for Content Tracker to be able to identify the associated managed object.

The following examples illustrate general format structures associated with the web beacon feature. The examples demonstrate how to use one web beacon object while creating an unbounded number of different query strings. The same web beacon object (dDocName = bcn_2.txt) is used in all of the examples. By varying the query parameters, the requests for this web beacon object can convey to Content Tracker a 'request' for any managed object in the repository.

These examples have the following assumptions:

  • The web beacon object (bcn_2.txt) is checked in and is included in the web beacon list (SctWebBeaconIDList).

  • The applicable web beacon references are embedded into the associated managed content items (doc1, doc2, and doc3).

  • To resolve the web beacon reference, the browser must request a copy of the web beacon object from Content Server.

  • The web beacon requests occur because users are indirectly requesting the related content items.

Example 15-1 Web Beacon Request Without Query Parameters

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt

This begins with a static web reference to the web beacon object. Although it is a legitimate direct access to the web beacon object, there are no appended query parameters. Content Tracker processes this access event as a request for the web beacon object itself.

Example 15-2 Web Beacon Request for Tracking doc1

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc1&sct_dID=1234&...

This also begins with the usual static web reference to this beacon object. It has a pseudo query string appended to it that contains an arbitrary number of query parameters. The values contained in these query parameters convey the information about the specific managed object (doc1) the user has requested.

Example 15-3 Web Beacon Request for Tracking doc2

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc2&sct_Ext_2=WebSite4

This is similar to Example 15-2. The parameter values provide information about the user-requested content item (doc2). In this example, the query string includes another parameter to convey additional information about the tagged item. The added parameter uses an extField column name. The value WebSite4 is copied into the extField_2 column of the SctAccessLog table. The extField column substitution is optional and application dependent.

Example 15-4 Web Beacon Request for Tracking doc3

http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=doc3&sct_Ext_2=WebSite4&sct_Ext_8=SubscriptionClass6

This example modifies Example 15-3 by adding a second (although non-sequential) extField column name. In this case, WebSite4 is copied into the extField_2 column of the SctAccessLog table, and SubscriptionClass6 is copied into the extField_8 column. The extField column substitutions are optional and application dependent.
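
References in the format shown in these examples can be generated programmatically. The following Python sketch is one possible helper, not an Oracle-supplied utility; the function name build_beacon_reference is hypothetical. It uses the parameter names shown in the examples (sct_dDocName, sct_dID, and sct_Ext_n) and rejects values that contain commas or spaces, because the guidelines for this feature state that such values are not allowed.

```python
from urllib.parse import urlencode


def build_beacon_reference(beacon_url, doc_name, doc_id=None, ext_fields=None):
    """Append pseudo query parameters for one tagged item to the beacon URL."""
    params = [("sct_dDocName", doc_name)]
    if doc_id is not None:
        params.append(("sct_dID", str(doc_id)))
    # ext_fields maps an extField number to its value, for example {2: "WebSite4"}.
    for number, value in sorted((ext_fields or {}).items()):
        params.append(("sct_Ext_%d" % number, value))
    for name, value in params:
        if "," in value or " " in value:
            raise ValueError("commas and spaces are not allowed in " + name)
    return beacon_url + "?" + urlencode(params)


BEACON_URL = "http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_2.txt"

# Same reference as Example 15-3.
print(build_beacon_reference(BEACON_URL, "doc2", ext_fields={2: "WebSite4"}))
```

Generating the references from one function keeps the spelling and capitalization of the parameter names consistent across every tagged page.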

15.5.4.2 Placement and Retrieval Scheme

The specially constructed web beacon references must be embedded in each managed object that you want to track. Web beacon references can be embedded in any HTML page. Users indirectly request access to the modified managed content items through an external Site Studio website or a reverse proxy server.

The browser encounters the web beacon reference while rendering the page. Each display of the managed object, regardless of how the object was obtained, causes the browser to request a copy of the web beacon object directly from Content Server. When the browser resolves the web beacon reference, Content Tracker captures the data that includes the web beacon reference with the set of pseudo query parameters that identify the managed content item.
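
One common way to make a browser request a small object while rendering a page is an image element. The following Python sketch builds such an element for a web beacon reference. It is an assumption-laden illustration: the function name beacon_img_tag is hypothetical, the beacon object is assumed to be an image, the file name bcn_img.gif in the sample URL is made up, and your pages may need a different embedding technique.

```python
from html import escape


def beacon_img_tag(reference_url):
    """Return an HTML img element that requests the web beacon reference."""
    # Ampersands in the query string must be escaped inside an HTML attribute.
    src = escape(reference_url, quote=True)
    return '<img src="{}" width="1" height="1" alt="">'.format(src)


# Hypothetical image beacon and tagged item, for illustration only.
reference = (
    "http://myhost.somewhere.else/idc/groups/public/documents/adacct/bcn_img.gif"
    "?sct_dDocName=doc1&sct_dID=1234"
)
print(beacon_img_tag(reference))
```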

15.5.4.3 Data Capture and Storage

Ordinarily, query parameters in static URLs serve no function for the web browser. But when resolving the web beacon static URL, the browser ignores the appended query parameters long enough for the Content Tracker web server filter plug-in to record them. Although the pseudo query string is never executed, Content Tracker captures the query parameter values with other data such as the client IP address and date-and-time stamp. Content Tracker records the data in web access event logs.

15.5.5 Reduction Processing for Web Beacon References

When these specially constructed web beacon references are processed during data reduction, Content Tracker compares the web beacon's dDocName value to the list of dDocName values in the SctWebBeaconIDList configuration variable to determine whether the request was for a web beacon object rather than a regular content item.

If there is no match or if no query parameters are appended to the web beacon reference, Content Tracker processes the access event normally. If the web beacon's dDocName value is on the list, Content Tracker interprets the associated URL query parameters, and the data reduction process treats the web beacon access request as a request for the tagged web page or content item.

During data reduction, Content Tracker completes the processing by parsing the query parameters and performing various value substitutions for fields ultimately written to the SctAccessLog. The query parameter values are mapped as follows:

  • sct_dID replaces the web beacon object's dID value.
  • sct_dDocName replaces the web beacon object's dDocName value.
  • sct_uriStem replaces the web beacon object's URI stem (everything preceding the question mark (?)).
  • sct_uriQuery replaces the web beacon object's URI query portion (everything following the question mark (?)).
  • sct_Ext_n is copied directly into the extField_n column of the SctAccessLog table.

Example 15-5 Data Reduction Processing for Query Parameter Values

/idc/groups/public/documents/adacct/bcn_2.txt?sct_dDocName=WW_1_21&sct_dID=42&sct_Ext_1=WillowStreetServer&sct_Ext_2=SubscriptionTypeA

After data reduction, Content Tracker records this web beacon type request in the SctAccessLog table as an access to WW_1_21 rather than to bcn_2.txt. Other data, such as the user name, time of access, and client IP, is derived from the HTTP request. Additionally, WillowStreetServer is copied into the extField_1 column of the SctAccessLog table, and SubscriptionTypeA is copied into the extField_2 column. (These last two field substitutions are optional and application dependent.)
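
The substitutions described in this section can be modeled in a few lines of Python. This is a simplified model for reasoning about the mapping, not Content Tracker's reduction code. The function name reduce_beacon_request is hypothetical, and taking the last path segment as the web beacon's dDocName is an assumption that matches the bcn_2.txt examples in this chapter. Storing the web beacon's dDocName in extField_10 follows the guideline that reserves that column for this purpose.

```python
from urllib.parse import parse_qsl, urlsplit

# Query parameters that replace standard SctAccessLog values.
SUBSTITUTIONS = {
    "sct_dID": "sc_scs_dID",
    "sct_dDocName": "sc_scs_dDocName",
    "sct_uriStem": "cs_uriStem",
    "sct_uriQuery": "cs_uriQuery",
}


def reduce_beacon_request(uri, beacon_ids):
    """Model how one logged request is rewritten during data reduction."""
    parts = urlsplit(uri)
    beacon_name = parts.path.rsplit("/", 1)[-1]  # assumed to be the dDocName
    record = {
        "sc_scs_dDocName": beacon_name,
        "cs_uriStem": parts.path,
        "cs_uriQuery": parts.query,
    }
    params = dict(parse_qsl(parts.query))
    if beacon_name not in beacon_ids or not params:
        return record  # no match or no parameters: a normal access event
    for name, value in params.items():
        if name in SUBSTITUTIONS:
            record[SUBSTITUTIONS[name]] = value
        elif name.startswith("sct_Ext_"):
            record["extField_" + name[len("sct_Ext_"):]] = value
    record["extField_10"] = beacon_name  # remember which beacon signaled the access
    return record


uri = (
    "/idc/groups/public/documents/adacct/bcn_2.txt"
    "?sct_dDocName=WW_1_21&sct_dID=42"
    "&sct_Ext_1=WillowStreetServer&sct_Ext_2=SubscriptionTypeA"
)
record = reduce_beacon_request(uri, beacon_ids={"bcn_2.txt"})
print(record["sc_scs_dDocName"], record["sc_scs_dID"])
```

Running the model on the request from Example 15-5 yields a record for WW_1_21 rather than for bcn_2.txt, which mirrors the outcome described above.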

15.5.6 Limitations and Guidelines

Perform the following tasks to implement Content Tracker's web beacon feature:

  1. Create the web beacon object.
  2. Check it in.
  3. Update the SctWebBeaconIDList.
  4. Define the web beacon references.
  5. Embed them into the cached content items, websites, or both that you want to track.

15.5.6.1 Limitations

The following limitations should be considered:

  • One difficulty is determining the means by which the web beacon reference is attached to a tagged object. There are situations where the requested object does not allow embedded references (for example, a PDF or Word document). In this case, the web beacon object must be requested directly from Content Server before the actual content item is requested.

  • The web beacon feature does not work in many situations, such as with certain browser configurations. If the user has disabled cross-domain references in the browser, and the web page and the Content Server instance are in different domains, the web beacon object is never requested from Content Server and the user access is not counted.

  • The first time a managed content item is accessed through a reverse proxy server, it is counted twice: once when the Content Server provides the item to the reverse proxy server, and a second time when the browser requests the web beacon object.

  • Depending on the specific configuration, it might be necessary to devise a method to prevent the reverse proxy server and external Site Studio from caching the web beacon object itself. Browsers also cache objects. If the web beacon object is cached, requests for it never reach Content Server, and Content Tracker cannot count the relevant content accesses. To avoid this, append a single-use query parameter that contains a random number to the web beacon reference, as in this example:

    dDocName=vvvv_1_21&FoolTheProxyServer=12345654321

    Because the number changes on each request, the cache, the web server, and the browser each treat the request as new.
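A minimal sketch of the single-use parameter technique follows. The function name addCacheBuster is hypothetical, and the parameter name FoolTheProxyServer is simply the one used in the example above; any name that Content Tracker does not interpret should serve the same purpose.

```javascript
// Appends a random, single-use pseudo query parameter to a web beacon
// reference so that caches do not see two identical requests.
function addCacheBuster(beaconRef) {
    // Use "?" if the reference has no query portion yet, otherwise "&".
    var separator = beaconRef.indexOf("?") < 0 ? "?" : "&";
    var nonce = Math.floor(Math.random() * 2147483647);
    return beaconRef + separator + "FoolTheProxyServer=" + nonce;
}
```

The call must run each time the page is rendered in the browser. A value generated once and written into a cached page would itself be cached and would defeat the purpose.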

15.5.6.2 Guidelines

The following guidelines should be considered:

  • The sct_dDocName and sct_dID parameter values in the web beacon reference must resolve to an actual managed content item in the same Content Server instance that provides the requested web beacon object.

  • Using the ExtField columns in the SctAccessLog table is optional and application dependent.

  • Use of ExtField_10 is reserved for the web beacon object's dDocName value. This allows report writers a way to determine which web beacon object was used to signal the access to the actual managed content item.

  • Spelling and capitalization of the query parameter names must be exact.

  • Embedded commas or spaces in the query parameter values are not allowed.

  • The dDocName and dID values of a managed object are usually both included in the web beacon reference, although a request does not need both to be considered a legitimate access request. If any of the standard fields are missing, Content Tracker resolves the identification parameters as follows:

    • Given a dID, Content Tracker can determine the content item's dDocName value.

    • Given a dDocName, Content Tracker can determine the content item's dID value. The dID used is that of the content item's most current revision. If the revision changes after the content item is cached, then the user sees the older version. However, Content Tracker counts this access request as a view of the most recent revision of the content item.

    • Given a proper URI Stem, Content Tracker can determine the content item's dDocName value but assumes the dID value of the most recent revision.

  • Restart Content Server after making changes to the web beacon list (SctWebBeaconIDList).

  • Do not create a web beacon object that uses a file type or is located in a directory that Content Tracker is configured to disregard.

  • Content Tracker is unable to verify if the cached content item was delivered.

  • Content Tracker performs normal folding of static URL accesses. If a user repeatedly requests the same content item and makes no intervening requests for another document, then Content Tracker assumes that the consecutive requests are for the same document. In this case, these access requests are counted as a single access request.

  • The query parameters can represent any managed object and need not necessarily be what the user is actually viewing.
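Several of the guidelines above (exact parameter names, no embedded commas or spaces, and at least one identifying value) can be checked before a web beacon reference is published. The following is a rough sketch under stated assumptions: the function name checkBeaconQuery is hypothetical, the list of extended field names is limited to sct_Ext_1 through sct_Ext_9 on the assumption that the tenth field is reserved, and only raw (not percent-encoded) commas and whitespace are detected.

```javascript
// Parameter names described in this section. sct_Ext_10 is left out on the
// assumption that ExtField_10 is reserved for the beacon's own dDocName.
var KNOWN_NAMES = ["sct_dID", "sct_dDocName", "sct_uriStem", "sct_uriQuery"];
for (var n = 1; n <= 9; n++) {
    KNOWN_NAMES.push("sct_Ext_" + n);
}

// Returns a list of problem descriptions; an empty list means no problem
// was detected by this (deliberately limited) check.
function checkBeaconQuery(query) {
    var problems = [];
    var hasIdentifier = false;
    var pairs = query.split("&");

    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf("=");
        var name = eq < 0 ? pairs[i] : pairs[i].substring(0, eq);
        var value = eq < 0 ? "" : pairs[i].substring(eq + 1);

        for (var k = 0; k < KNOWN_NAMES.length; k++) {
            if (name.toLowerCase() !== KNOWN_NAMES[k].toLowerCase()) { continue; }
            if (name !== KNOWN_NAMES[k]) {
                // Same letters, different capitalization.
                problems.push(name + ": expected exact name " + KNOWN_NAMES[k]);
            } else if (k < 3 && value !== "") {
                // sct_dID, sct_dDocName, or sct_uriStem was supplied.
                hasIdentifier = true;
            }
        }
        if (/[,\s]/.test(value)) {
            problems.push(name + ": value contains a comma or space");
        }
    }
    if (!hasIdentifier) {
        problems.push("no sct_dID, sct_dDocName, or sct_uriStem value supplied");
    }
    return problems;
}
```

Names that do not match a known parameter at all are ignored, so a cache-defeating pseudo parameter passes through. The sketch therefore catches capitalization errors but not arbitrary misspellings.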

15.5.7 Examples of Web Beacon Embedding

Several embedding methods can be used to implement the web beacon feature. Each technique has advantages and disadvantages and one may be more appropriate for a particular situation than another. Because of differences in system configurations, there is no optimal single technique.

All of the examples below use the following information:

  • WebBeacon.bmp web beacon object

  • Content Server instance IFHE.comcast.net/idc/

  • dDocName value wb_bmp

Code fragment files for all of the examples are included in Content Tracker's documentation directory. These examples are intended to demonstrate general approaches and are provided as a starting point. They will need to be adapted to work with your specific application and network topology.

15.5.7.1 Embedded HTML Example

The simplest, most direct use of a web beacon for tracking managed content access is to embed a reference to the beacon directly into the HTML source for the containing web page. When the requesting user's browser attempts to render the page, it sends a request to the instance where the web beacon object resides.

In this example, the technique places an image tag in the web page to be tracked. The src attribute of the image refers to the web beacon object (wb_bmp), which was checked into an instance. When the user's browser loads the image from the instance, the additional query information is recorded and ultimately interpreted as a reference to the dDocName BOPR.

This approach is simple but has the disadvantage that the user's browser, or a reverse proxy server, might cache a copy of the web beacon object. If that happens, no additional requests are posted directly to the instance, and no additional accesses to any content tagged with this web beacon are counted.

The HTML fragment for this method might be written as follows:

<!-- WebBeaconEmbeddedHtml.htm - Adjust the Web Beacon web location and managed object identifiers in the img src attribute, then paste into your web page -->
<img src="http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp?sct_dID=1&sct_dDocName=BOPR&sct_uriStem=http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf&sct_Ext_1=Sample_Html_Beacon_Access" width="21" height="21" />

15.5.7.2 Embedded JavaScript Example

The cached web beacon problem can be overcome by using JavaScript instead of HTML. The embedded JavaScript method requires two script tags:

  • The cs_callWebBeacon function that issues the actual web beacon request.

  • An unnamed block that assigns context values to certain JavaScript variables, then calls the cs_callWebBeacon function.

The identifying information for the managed content object is defined in a list of variables, which improves readability. The web beacon request is also made effectively unique by adding a random number to the pseudo query parameters.

The disadvantages are that there is more code to manage and that the URL of the web beacon server is hard coded in each web page. In addition, the user's browser might not have JavaScript enabled.

The JavaScript fragment for this method might be written as follows:

<!-- WebBeaconEmbeddedJavascript.js - Adjust the managed object and Web Beacon descriptors,
     then paste this into your web page. -->

<script type="text/javascript" >

    var cs_obj_dID = "" ;
    var cs_obj_dDocName = "" ;
    var cs_obj_uriStem = "" ;
    var cs_extField_1 = "" ;
    var cs_extField_2 = "" ;
    var cs_extField_3 = "" ;
    var cs_extField_4 = "" ;
    var cs_extField_5 = "" ;
    var cs_extField_6 = "" ;
    var cs_extField_7 = "" ;
    var cs_extField_8 = "" ;
    var cs_extField_9 = "" ;
    var cs_beaconUrl = "" ;

    function cs_void( ) { return ; }

    function cs_callWebBeacon( ) {
        //
        var cs_imgSrc = "" ;
        var cs_inQry = false ;

        if ( cs_beaconUrl && cs_beaconUrl != "" ) {
            cs_imgSrc += cs_beaconUrl ;
        }

        if ( cs_obj_dID && cs_obj_dID != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dID=" + cs_obj_dID ;
        }

        if ( cs_obj_dDocName && cs_obj_dDocName != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dDocName=" + cs_obj_dDocName ;
        }
        if ( cs_obj_uriStem && cs_obj_uriStem != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_uriStem=" + cs_obj_uriStem ;
        }

        if ( cs_extField_1 && cs_extField_1 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_1=" + cs_extField_1 ;
        } 

        if ( cs_extField_2 && cs_extField_2 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_2=" + cs_extField_2 ;
        } 

        // ... and so on for the remaining extended fields

        if ( cs_inQry ) {
            cs_imgSrc += "&" ;
        } else {
            cs_imgSrc += "?" ;
            cs_inQry = true ;
        }

        var dc = Math.round( Math.random( ) * 2147483647 ) ;
        cs_imgSrc += "sct_defeatCache=" + dc ;

        var wbImg = new Image( 1, 1 ) ;
        wbImg.src = cs_imgSrc ;
        wbImg.onload = function( ) { cs_void( ) ; }

    }

</script>

<script type="text/javascript">
    //
    var cs_obj_dID = "1" ;
    var cs_obj_dDocName = "BOPR" ;
    var cs_obj_uriStem = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf" ;
    var cs_extField_1 = "Sample_Javascript_Beacon_Access" ;
    var cs_beaconUrl = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp" ;

    cs_callWebBeacon( ) ;

</script>
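The repeated blocks in cs_callWebBeacon differ only in the parameter name and the variable they read, so the same URL can be assembled with a loop. The following is a compact sketch of that idea, not a replacement supplied with Content Tracker. The function names cs_buildBeaconSrc and cs_sendBeacon are hypothetical; the parameter names and the sct_defeatCache pseudo parameter are taken from the listing above.

```javascript
// Builds the web beacon request URL from a map of query parameter names to
// values. Empty values are skipped, as in the listing above.
function cs_buildBeaconSrc(beaconUrl, fields) {
    var parts = [];
    for (var name in fields) {
        if (Object.prototype.hasOwnProperty.call(fields, name) && fields[name]) {
            parts.push(name + "=" + fields[name]);
        }
    }
    // A random number makes each request effectively unique, which keeps
    // browsers and proxies from answering it out of their caches.
    parts.push("sct_defeatCache=" + Math.round(Math.random() * 2147483647));
    return beaconUrl + "?" + parts.join("&");
}

// Issues the request by loading the beacon as an image (browser only).
function cs_sendBeacon(beaconUrl, fields) {
    var wbImg = new Image(1, 1);
    wbImg.src = cs_buildBeaconSrc(beaconUrl, fields);
}
```

In a page, the call might look like cs_sendBeacon("http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp", { sct_dID: "1", sct_dDocName: "BOPR" }). Separating URL construction from the image request also makes the construction testable outside a browser.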

15.5.7.3 Served JavaScript Example

The hard-coded web beacon server problem described in the Embedded JavaScript Example can be overcome by splitting the code into two fragments:

  • The managed code fragment contains the cs_callWebBeacon function. It can be checked in and managed by a Content Server instance, either the instance that manages the web beacon or some other instance. The src attribute contained in the in-page code fragment refers to the managed code fragment and causes it to be dynamically loaded into the web page.

  • The in-page code fragment still consists of two <script> tags, but the first contains only a reference to the cs_callWebBeacon code instead of the code itself. The advantage of this is that changes to the cs_callWebBeacon function can be managed centrally instead of requiring a change to every tagged web page.

    This solution incurs the additional network overhead of loading the managed code into the web page on the user's browser. However, the need for a web beacon to assist tracking implies that the network environment includes an efficient reverse proxy server or other caching mechanism. The same cache that conceals managed object access also minimizes the impact of the code download.

Managed Code Fragment

// WebBeaconServedJavascript_Checkin.js - Check this in to your Content Server, then fix up
// the JavaScript include src attribute in WebBeaconManagedJavascriptIncludeSample.js
//
    var cs_obj_dID = "" ;
    var cs_obj_dDocName = "" ;
    var cs_obj_uriStem = "" ;
    var cs_extField_1 = "" ;
    var cs_extField_2 = "" ;
    var cs_extField_3 = "" ;
    var cs_extField_4 = "" ;
    var cs_extField_5 = "" ;
    var cs_extField_6 = "" ;
    var cs_extField_7 = "" ;
    var cs_extField_8 = "" ;
    var cs_extField_9 = "" ;
    var cs_beaconUrl = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/wb_bmp.bmp" ;

    function cs_void( ) { return ; }

    function cs_callWebBeacon( ) {
        //
        var cs_imgSrc = "" ;
        var cs_inQry = false ;

        if ( cs_beaconUrl && cs_beaconUrl != "" ) {
            cs_imgSrc += cs_beaconUrl ;
        }

        if ( cs_obj_dID && cs_obj_dID != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dID=" + cs_obj_dID ;
        }

        if ( cs_obj_dDocName && cs_obj_dDocName != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_dDocName=" + cs_obj_dDocName ;
        }

        if ( cs_obj_uriStem && cs_obj_uriStem != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_uriStem=" + cs_obj_uriStem ;
        }

        if ( cs_extField_1 && cs_extField_1 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_1=" + cs_extField_1 ;
        }

        if ( cs_extField_2 && cs_extField_2 != "" ) {
            if ( cs_inQry ) {
                cs_imgSrc += "&" ;
            } else {
                cs_imgSrc += "?" ;
                cs_inQry = true ;
            }
            cs_imgSrc += "sct_Ext_2=" + cs_extField_2 ;
        }

        // ... and so on for the remaining extended fields

        if ( cs_inQry ) {
            cs_imgSrc += "&" ;
        } else {
            cs_imgSrc += "?" ;
            cs_inQry = true ;
        }

        var dc = Math.round( Math.random( ) * 2147483647 ) ;
        cs_imgSrc += "sct_defeatCache=" + dc ;

        var wbImg = new Image( 1, 1 ) ;
        wbImg.src = cs_imgSrc ;
        wbImg.onload = function( ) { cs_void( ) ; }

    }

In-Page Code Fragment

<script type="text/javascript" src="http://IFHE.comcast.net/idc/groups/public/documents/adacct/wbmjcs.js" >
</script>

<script type="text/javascript">
    //
    var cs_obj_dID = "1" ;
    var cs_obj_dDocName = "BOPR" ;
    var cs_obj_uriStem = "http://IFHE.comcast.net/idc/groups/public/documents/adacct/bopr.pdf" ;
    var cs_extField_1 = "Sample_Managed_Javascript_Beacon_Access" ;

    cs_callWebBeacon( ) ;

</script>