For each type of log file, there is a corresponding data loader component. These components are all of class atg.reporting.datawarehouse.loader.Loader, and are located in the /atg/reporting/datawarehouse/loaders/ Nucleus folder. The loader components are:

The following table summarizes the key properties of the loader components. Some of these properties are described in more detail below the table, and a sample configuration appears after the table:

Property

Description

charset

The character set used for the log files. The default value is null, which means the loader expects the log files to be encoded with the character set specified by the JVM file.encoding system property. Typically you should not need to change the value of this property.

defaultRoot

The directory path to the location of the log files. The default is logs/. See Specifying the Location of Log Files for more information.

noLoadSleepMillis

The number of milliseconds to wait if, during the loading process, the loader determines that there are no log files currently ready to be loaded. After the interval specified by this property, the loader will check whether any new log files are ready to be loaded. This process is repeated until the loader shuts down. The default is 600000 (ten minutes). See Specifying the Loading Schedule for more information.

loadStatusListeners

The components to use for archiving log files when a successful load completion event is detected. The following listener components are provided:

LogFileDeleteListener—Deletes the log file.

LogFileMoveListener—Moves the log file to the destination directory you specify (can be used to rename the file at the same time).

LogFileGZipListener—Leaves the file in its original location and creates a gzip-compressed (.gz) copy of it. This is the default component.

You can write your own component based on LogStatusListener if you require other archiving behavior.

pipelineDriver

The Nucleus pathname of the pipeline driver component that the loader passes log file lines to for processing. The default value is different for each loader component. See Pipeline Drivers and Processors for more information.

queueName

A String that identifies the queue from which the loader reads log files. Each loader component has a different default value for this property. For example, the default value of the queueName property of the OrderSubmitLoader component is atg.reporting.submitOrder; for the SegmentLoader component, it is atg.reporting.segmentUpdate. You should not need to change the value of this property.

runSchedule

A CalendarSchedule that specifies the schedule for starting up the loader. See Specifying the Loading Schedule for more information.

skipRecordOnError

If set to true, when the data loader encounters an error while loading a log file, it skips the record that caused the error and moves on to the next record.

Caution: A record that causes a loading error can indicate a serious problem, and should be investigated. Under normal circumstances, this flag should be false.

stopSchedule

A CalendarSchedule that specifies the schedule for shutting down the loader. See Specifying the Loading Schedule for more information.

transactionBatchSize

Specifies the number of lines in a log file to process as a single transaction. The default is 100.
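As an illustration, the following properties file for the OrderSubmitLoader component sets several of these properties explicitly. This is a sketch: the values shown match the defaults documented above, and the schedule-related properties are omitted (see Specifying the Loading Schedule).

# /atg/reporting/datawarehouse/loaders/OrderSubmitLoader.properties
# These values match the documented defaults; adjust them for your environment.
defaultRoot=logs/
noLoadSleepMillis=600000
transactionBatchSize=100
skipRecordOnError=false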

Note: Data loading for Commerce Search information is done by the existing Search loading components; see Search Loader Components in the ATG Search Administration Guide.

Specifying the Loading Schedule

The Loader class implements the Schedulable interface, which enables it to be run automatically on a specified schedule. By default, each loader component is configured to start up once a day and run for a period of several hours. When a loader first starts up, it checks the corresponding queue to see if there are entries for any log files. If so, the loader claims an entry and begins to process the file. When it finishes processing the file, the loader checks the queue again, and if there are any more entries, claims another entry and processes that file. This process continues until there are no more entries in the queue.

If there are no entries in the queue, the loader sleeps for a specified period of time, and then checks again. If at this point the queue is still empty, the loader sleeps again. This process continues until a specified time, at which point the loader stops checking and waits until the next scheduled start time.

The schedule for starting up the loader is specified by the component’s runSchedule property. This property is a CalendarSchedule object that is set by default to run the loader once each day. The schedule for shutting down the loader is specified by the component’s stopSchedule property. This property is also a CalendarSchedule object, and is set by default to shut down the loader once each day, several hours after the component starts up. You can change these settings to run the loader on a different schedule.

The period of time (in milliseconds) that the loader sleeps if there are no entries in the queue is set by the component’s noLoadSleepMillis property. The default value of this property is 600000 (equivalent to 10 minutes).
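For example, to run a loader between 1:00 a.m. and 5:00 a.m. each day and have it re-check an empty queue every ten minutes, you might use settings like the following. The calendar strings are a sketch that assumes the CalendarSchedule syntax described in the ATG Platform Programming Guide (wildcards for all months, dates, and weekdays, followed by the hour and minute); check that guide for the exact field order.

# Start the loader at 1:00 a.m. every day (assumed CalendarSchedule syntax)
runSchedule=calendar * * * * 1 0
# Stop the loader at 5:00 a.m. every day
stopSchedule=calendar * * * * 5 0
# Sleep ten minutes between checks of an empty queue (the default)
noLoadSleepMillis=600000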

For more information about scheduler components and properties, and the syntax used to set those properties, see the Core Dynamo Services chapter of the ATG Platform Programming Guide.

Specifying the Location of Log Files

The data loaders may be running on a different machine (or group of machines) from the production site that creates the log files. The machines therefore typically use a file-sharing mechanism such as NFS, and may require different pathnames to access the same files. For example, the directory /u1/logfiles/ on the production environment might be accessed as /nfs/livesite/u1/logfiles/ on the loader environment.

To make it easier to share files, the loggers and loaders always specify files relative to a root location, rather than using absolute paths. You configure each component to point to this root location by setting the component’s defaultRoot property to the correct pathname for the machine the component is running on. In the example above, you would set defaultRoot for the loggers to /u1/logfiles/, and set defaultRoot for the loaders to /nfs/livesite/u1/logfiles/.
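In properties-file terms, the example above corresponds to settings like these:

# On the production machines, in each logger component's properties file:
defaultRoot=/u1/logfiles/

# On the loader machines, in each loader component's properties file:
defaultRoot=/nfs/livesite/u1/logfiles/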

Loading Existing Order Data from a Repository

If you have existing order data in a repository and you want to load it into the Data Warehouse for reporting, you can use the OrderRepositoryLoader to do so. This component treats the data in your order repository as if it were a log file and loads it accordingly.

This component uses the same SubmitOrder pipeline chain as the OrderSubmitLoader, but is managed by the OrderRepositoryPipelineDriver component.

To use the OrderRepositoryLoader, first start the OrderRepositoryPipelineDriver component. You can either start this component in the ACC, or start it in the Dynamo Admin UI by entering the following URL in a browser:

http://host:port/dyn/admin/nucleus/atg/reporting/datawarehouse/loaders/OrderRepositoryPipelineDriver

In the Admin UI, go to /atg/reporting/datawarehouse/loaders/OrderRepositoryPipelineDriver. In the text field, enter an RQL statement corresponding to a query against the order repository. Check the Preview box if you want to see how many records will be retrieved before actually processing the orders.
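For example, to retrieve all submitted orders, you might enter an RQL statement such as the following. This sketch assumes your order repository uses the standard state property of the order item descriptor; adjust the query to match your repository definition.

state = "submitted"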

The OrderRepositoryPipelineDriver includes the following configurable properties:

When you click Submit, the query is issued against the repository. For each result, the component creates a new pipeline parameter object and sends that object down the pipeline, just as if it were a line item in a log file.

The Dynamo Admin UI keeps a count of how many records have been processed out of the total retrieved.

Note: The following data is not included for orders in the repository, and has the values indicated:

These fields are not populated in the Data Warehouse when loading existing orders. This may affect the conversion rate as it appears in reports, because that calculation relies on the session ID.

Configuring the Data Warehouse Time Zone

In addition to the data loaders for orders, site visits, and so on, there are other loaders whose purpose is to add static, rarely changing, or site-independent information to the Data Warehouse. One such loader is the TimeRepositoryLoader, which populates the ARF_TIME_YEAR, ARF_TIME_MONTH, and related tables (see the ATG Data Warehouse Guide for information).

The TimeRepositoryLoader is run by the TimeRepositoryLoaderJob, a scheduled service that runs every two weeks by default. When the job runs, the loader populates the next two weeks' worth of days into the appropriate tables. For example, if the loader runs on December 20, it loads the next fourteen days, two weeks, one month, and one year into the ARF_TIME_DAY, ARF_TIME_WEEK, ARF_TIME_MONTH, and ARF_TIME_YEAR Data Warehouse tables, respectively.
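If you need the job to run on a different cycle, you can change its schedule. The following sketch assumes the TimeRepositoryLoaderJob lives in the loaders Nucleus folder and exposes the standard schedule property of ATG scheduled services, using the relative-schedule syntax from the ATG Platform Programming Guide:

# /atg/reporting/datawarehouse/loaders/TimeRepositoryLoaderJob.properties (path assumed)
# Run the job every two weeks (the documented default frequency)
schedule=every 14 days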

The TimeRepositoryLoader includes a datetimezoneID property, which identifies the time zone to use when loading this data. The default is UTC, meaning that when an order is loaded into the warehouse, the date and time of the order's placement are converted from the values used in the order repository to their UTC equivalents, and the UTC times are used in reports.
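To use a different time zone, you can override this property. The following sketch assumes the component accepts standard Java time zone IDs and lives in the loaders Nucleus folder:

# /atg/reporting/datawarehouse/loaders/TimeRepositoryLoader.properties (path assumed)
# Load time data in US Eastern time instead of the UTC default
datetimezoneID=America/New_York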