For each type of log file, there is a corresponding data loader component. These components are all of class atg.reporting.datawarehouse.loader.Loader, and are located in the /atg/reporting/datawarehouse/loaders/ Nucleus folder. The loader components are:
- OrderSubmitLoader
- ProductCatalogLoader
- SegmentLoader
- SiteVisitLoader
- UserUpdateLoader
The following table summarizes the key properties of the loader components. Some of these properties are described in more detail below the table:
| Property | Description |
|---|---|
| | The character set used for the log files. The default value is null, which means the loader expects the log files to be encoded with the character set specified by the JVM |
| defaultRoot | The directory path to the location of the log files. See Specifying the Location of Log Files for more information. |
| noLoadSleepMillis | The number of milliseconds to wait if, during the loading process, the loader determines that there are no log files currently ready to be loaded. After the interval specified by this property, the loader checks whether any new log files are ready to be loaded. This process is repeated until the loader shuts down. The default is 600000 (10 minutes). |
| | The component to use for archiving log files when a successful load completion event is detected. You can write your own component based on |
| | The Nucleus pathname of the pipeline driver component that the loader passes log file lines to for processing. The default value is different for each loader component. See Pipeline Drivers and Processors for more information. |
| | A String that identifies the queue from which the loader reads log files. Each loader component has a different default value for this property. For example, the default value of the |
| | A |
| | If set to Caution: A record that causes a loading error can indicate a serious problem, and should be investigated. Under normal circumstances, this flag should be |
| | A |
| | Specifies the number of lines in a log file to process as a single transaction. Default is |
Note: Data loading for Commerce Search information is done by the existing Search loading components; see Search Loader Components in the ATG Search Administration Guide.
Specifying the Loading Schedule
The Loader class implements the Schedulable interface, which enables it to be run automatically on a specified schedule. By default, each loader component is configured to start up once a day and run for a period of several hours. When a loader first starts up, it checks the corresponding queue to see if there are entries for any log files. If so, the loader claims an entry and begins to process the file. When it finishes processing the file, the loader checks the queue again, and if there are any more entries, claims another entry and processes that file. This process continues until there are no more entries in the queue.
If there are no entries in the queue, the loader sleeps for a specified period of time, and then checks again. If at this point the queue is still empty, the loader sleeps again. This process continues until a specified time, at which point the loader stops checking and waits until the next scheduled start time.
The schedule for starting up the loader is specified by the component’s runSchedule property. This property is a CalendarSchedule object that is set by default to run the loader once each day. The schedule for shutting down the loader is specified by the component’s stopSchedule property. This property is also a CalendarSchedule object, and is set by default to shut down the loader once each day, several hours after the component starts up. You can change these settings to run the loader on a different schedule.
The period of time (in milliseconds) that the loader sleeps if there are no entries in the queue is set by the component’s noLoadSleepMillis property. The default value of this property is 600000 (equivalent to 10 minutes).
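The claim, process, and sleep cycle described above can be sketched in plain Java. This is only an illustration of the polling behavior, not the actual Loader implementation: the class PollingLoaderSketch, its in-memory queue, and the stopAtMillis parameter are hypothetical stand-ins for the real queue and for the stopSchedule property.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration of the polling cycle; this is NOT the ATG Loader class.
class PollingLoaderSketch {
    private final Queue<String> queue;     // stands in for the loader's queue of log file entries
    private final long noLoadSleepMillis;  // sleep interval used when the queue is empty
    final List<String> processed = new ArrayList<>();

    PollingLoaderSketch(Queue<String> queue, long noLoadSleepMillis) {
        this.queue = queue;
        this.noLoadSleepMillis = noLoadSleepMillis;
    }

    // Polls the queue until the (hypothetical) stop time is reached.
    void run(long stopAtMillis) {
        while (System.currentTimeMillis() < stopAtMillis) {
            String entry = queue.poll();       // claim the next entry, if there is one
            if (entry != null) {
                processed.add(entry);          // "process" the file, then check the queue again
            } else {
                try {
                    Thread.sleep(noLoadSleepMillis); // queue is empty: sleep, then re-check
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        Queue<String> q = new ArrayDeque<>(List.of("orders-1.log", "orders-2.log"));
        // 10 ms is used here so the demo finishes quickly; the documented default is 600000.
        PollingLoaderSketch loader = new PollingLoaderSketch(q, 10);
        loader.run(System.currentTimeMillis() + 50);
        System.out.println(loader.processed);
    }
}
```

In this sketch a shorter sleep interval only changes how soon a newly queued file is noticed; it does not change when the loop stops.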
For more information about scheduler components and properties, and the syntax used to set those properties, see the Core Dynamo Services chapter of the ATG Platform Programming Guide.
Specifying the Location of Log Files
The data loaders may be running on a different machine (or group of machines) from the production site that creates the log files. Therefore the machines will typically use some file-sharing mechanism such as NFS, and thus may require different pathnames to access the same files. For example, the directory /u1/logfiles/ on the production environment might be accessed as /nfs/livesite/u1/logfiles/ on the loader environment.
To make it easier to share files, the loggers and loaders always specify files relative to a root location, rather than using absolute paths. You configure each component to point to this root location by setting the component’s defaultRoot property to the correct pathname for the machine the component is running on. In the example above, you would set defaultRoot for the loggers to /u1/logfiles/, and set defaultRoot for the loaders to /nfs/livesite/u1/logfiles/.
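As a rough illustration of why relative paths make sharing easier, the following Java sketch resolves the same relative file name against the two defaultRoot values from the example. The class DefaultRootSketch and the relative file name are hypothetical; how the loggers and loaders actually combine the root and the file name is not shown in this guide.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: one relative path, two machine-specific roots.
class DefaultRootSketch {
    // Joins a machine-specific root with a file name that is relative to that root.
    static Path resolve(String defaultRoot, String relativeLogFile) {
        return Paths.get(defaultRoot).resolve(relativeLogFile).normalize();
    }

    public static void main(String[] args) {
        String relative = "orders/order_001.data"; // hypothetical relative file name

        // The logger and the loader refer to the same file through different roots.
        System.out.println(resolve("/u1/logfiles/", relative));
        System.out.println(resolve("/nfs/livesite/u1/logfiles/", relative));
    }
}
```

Because only the root differs between machines, the relative file name can be passed from logger to loader unchanged.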
Loading Existing Order Data from a Repository
If you have existing order data in a repository and you want to load it into the Data Warehouse for reporting, you can use the OrderRepositoryLoader to do so. This component treats the data in your order repository as if it were a log file and loads it accordingly.
This component uses the same SubmitOrder pipeline chain as the OrderSubmitLoader, but is managed by the OrderRepositoryPipelineDriver component.
To use the OrderRepositoryLoader, first start the OrderRepositoryPipelineDriver component. You can either start this component in the ACC, or enter the following browser path to start it in the Dynamo Admin UI:
http://host:port/dyn/admin/nucleus/atg/reporting/datawarehouse/loaders/OrderRepositoryPipelineDriver
In the Admin UI, go to /atg/reporting/datawarehouse/loaders/OrderRepositoryPipelineDriver. In the text field, enter an RQL statement corresponding to a query against the order repository. Check the Preview box if you want to see how many records will be retrieved before actually processing the orders.
The OrderRepositoryPipelineDriver includes the following configurable properties:
- skipRecordOnError: If set to true, the component skips any records for which processing results in an error. The default is false.
- errorDataListener: Set this property to a component that implements the atg.service.datacollection.DataListener interface. If skipRecordOnError is true, the driver notifies this component when an error occurs. The record that produced the error is added to the DataListener.
When you click Submit, the query is issued against the repository. For each result, the component creates a new pipeline parameter object and sends that object down the pipeline, just as if it were a line item in a log file.
The Dynamo Admin UI keeps a count of how many records have been processed out of the total retrieved.
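The interaction between skipRecordOnError and errorDataListener can be pictured with a small Java sketch. Everything here is hypothetical: SkipOnErrorSketch is not an ATG class, a java.util.function.Consumer stands in for both the pipeline and the DataListener, and rethrowing the error when the flag is false is an assumption, since this guide does not describe what the driver does in that case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of skip-on-error processing; not the OrderRepositoryPipelineDriver.
class SkipOnErrorSketch {
    // Sends each record down the "pipeline" and returns how many succeeded.
    static <T> int processAll(List<T> records, Consumer<T> pipeline,
                              boolean skipRecordOnError, Consumer<T> errorListener) {
        int processed = 0;
        for (T record : records) {
            try {
                pipeline.accept(record);
                processed++;
            } catch (RuntimeException e) {
                if (!skipRecordOnError) {
                    throw e; // assumption: without the flag, the error ends the run
                }
                if (errorListener != null) {
                    errorListener.accept(record); // hand the failing record to the listener
                }
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        Consumer<String> pipeline = r -> {
            if (r.startsWith("bad")) throw new IllegalStateException("cannot load " + r);
        };
        int ok = processAll(List.of("order1", "bad2", "order3"), pipeline, true, failed::add);
        System.out.println(ok + " processed, failed records: " + failed);
    }
}
```

Collecting the failing records in a listener keeps them available for later investigation instead of silently dropping them.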
Note: The following data is not included for orders in the repository, and has the values indicated:
- Session ID—Null 
- Site visit—Unspecified 
- Segment cluster ID—Unspecified
These fields are not populated in the Data Warehouse when loading existing orders. This may affect the conversion rate as it appears in reports, because that calculation relies on the session ID.
Configuring the Data Warehouse Time Zone
In addition to data loaders for orders, site visits, etc., there are other loaders whose purpose is to add information that is static, rarely-changing, or not site-dependent to the Data Warehouse. One such loader is the TimeRepositoryLoader, which populates the ARF_TIME_YEAR, ARF_TIME_MONTH, etc. tables (see the ATG Data Warehouse Guide for information).
The TimeRepositoryLoader is run by the TimeRepositoryLoaderJob, a scheduled service that runs every two weeks by default. When the job runs, the loader populates the next two weeks' worth of days into the appropriate tables. For example, if the loader runs on December 20, it loads the next fourteen days, two weeks, one month, and one year into the ARF_TIME_DAY, ARF_TIME_WEEK, ARF_TIME_MONTH, and ARF_TIME_YEAR Data Warehouse tables respectively.
The TimeRepositoryLoader includes a datetimezoneID property, which identifies the time zone to use when loading this data. The default is UTC, meaning that, if an order is loaded into the warehouse, the date and time of that order’s placement are converted from the values used in the order repository to the UTC equivalent, and the UTC times are used in reports.
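To make the effect of the conversion concrete, the following Java sketch converts an order placement time to its UTC equivalent using the standard java.time API. It only illustrates what the conversion means for a reported date; it is not how the loader itself is implemented, and the class name, the source time zone (America/New_York), and the sample timestamp are assumptions chosen for the example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical illustration of converting an order timestamp to the warehouse time zone.
class UtcConversionSketch {
    // Interprets orderTime in the source zone and re-expresses the same instant in the warehouse zone.
    static ZonedDateTime toWarehouseZone(LocalDateTime orderTime, String sourceZoneId, String warehouseZoneId) {
        return orderTime.atZone(ZoneId.of(sourceZoneId))
                        .withZoneSameInstant(ZoneId.of(warehouseZoneId));
    }

    public static void main(String[] args) {
        // An order placed at 9:30 PM on December 20 in New York (UTC-5 in winter) ...
        LocalDateTime placed = LocalDateTime.of(2010, 12, 20, 21, 30);
        ZonedDateTime inWarehouse = toWarehouseZone(placed, "America/New_York", "UTC");
        // ... falls on December 21 at 2:30 AM when expressed in UTC.
        System.out.println(inWarehouse.toLocalDateTime());
    }
}
```

As the example shows, an order placed late in the evening local time can land on the following calendar day in UTC-based reports.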

