4.2 BD Ingestion Flat File Data Load
The loading process receives, transforms, and loads Market, Business, and Reference data that alert detection and assessment investigation processing requires. After loading the base tables, the Oracle client’s job scheduling system invokes BD datamaps to derive and aggregate data.
Overview
All DIS datamaps in the Behavior Detection Flat File Interface for which staging representation is marked as Yes are applicable for Flat File loading. For more information, see Behavior Detection Flat File Interface.
Using Behavior Detection Datamaps
The Behavior Detection (BD) datamap takes the data from the flat files, enhances it, and then loads it into a target database table (FSDM).
To load data in the FSDM using Flat Files, follow these steps:
- Place the ASCII.dat flat files in the <OFSAAI Installed Directory>/bdf/inbox directory.
- Configure the DIS.source parameter to FILE. For more information on configuring other parameters, see Appendix D - Managing Data.
Configure the DIS.Source parameter to FILE-EXT for loading flat files through the external table. To load the flat files using the external table, the ext_tab_dir_path variable must also be set to the inbox directory, and the database UNIX account must have read and write privileges to it.
- Execute the Account datamap which loads into the Account (ACCT) table:
<OFSAAI Installed Directory>/bdf/scripts/execute.sh Account
Note:
If there are any errors in loading, refer to the <OFSAAI Installed Directory>/bdf/logs path.
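The placement and permission requirements in the steps above can be checked before a datamap is executed. The following is a minimal sketch, not a utility shipped with the product: the function name preflight_inbox is hypothetical, and the check covers only what the steps state (the inbox directory exists, is readable and writable, and contains at least one .dat file).

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the product.

# Return 0 when the inbox is usable, otherwise print a reason and return 1.
preflight_inbox() {
    inbox="$1"
    if [ ! -d "$inbox" ]; then
        echo "inbox not found: $inbox" >&2
        return 1
    fi
    # -r and -w test the account running this script, which may differ
    # from the database UNIX account that needs access for FILE-EXT loads.
    if [ ! -r "$inbox" ] || [ ! -w "$inbox" ]; then
        echo "inbox is not readable and writable: $inbox" >&2
        return 1
    fi
    for f in "$inbox"/*.dat; do
        # The glob stays literal when nothing matches, so test each hit.
        if [ -f "$f" ]; then
            return 0
        fi
    done
    echo "no .dat files in: $inbox" >&2
    return 1
}
```

A possible call, with OFSAA_HOME as a hypothetical variable standing in for <OFSAAI Installed Directory>, would be: preflight_inbox "$OFSAA_HOME/bdf/inbox" && "$OFSAA_HOME/bdf/scripts/execute.sh" Account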
Using Pre-processing and Loading
The pre-processor component (runDP) uses XML configuration files in the /config/datamaps directory to verify that the format of the incoming Oracle client data is correct and to validate its content, specifically:
- Error-checking of input data
- Assigning sequence IDs to records
- Resolving cross-references to reference data
- Checking for missing records
- Flagging data for insertion or update
Note:
The Pre-processor addresses only those files that match the naming conventions that the DIS describes and whose file names contain date and batch name portions that match the current data processing date and batch. Oracle clients must supply only the file types required by the solution sets in their implementation.
To load data in the FSDM using Pre-processing and Loading, follow these steps:
- Place the ASCII.dat flat files in the <OFSAAI Installed Directory>/ingestion_manager/inbox directory. The component then performs data validation and prepares the data for further processing.
- Execute runDP and runDL using the following sample scripts:
  - For runDP:
  <OFSAAI Installed Directory>/ingestion_manager/scripts/runDP.sh AccessEvents
  - For runDL:
  <OFSAAI Installed Directory>/ingestion_manager/scripts/runDL.sh AccessEvents
For more information on the directory structure, see Appendix D - Managing Data.
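Because runDL loads what runDP has prepared, a scheduler job may want to run the two scripts in sequence and stop if pre-processing fails. The following is a minimal sketch under stated assumptions: the wrapper name run_dp_dl is hypothetical, and it assumes that runDP.sh and runDL.sh exit with a non-zero status on failure, which should be verified for the installation in question.

```shell
#!/bin/sh
# Hypothetical wrapper; not part of the product.
# Assumption: runDP.sh and runDL.sh exit non-zero when they fail.

# Run pre-processing, then loading, for a single data type.
run_dp_dl() {
    scripts_dir="$1"   # for example <OFSAAI Installed Directory>/ingestion_manager/scripts
    data_type="$2"     # for example AccessEvents
    if ! "$scripts_dir/runDP.sh" "$data_type"; then
        echo "runDP failed for $data_type; skipping runDL" >&2
        return 1
    fi
    # The wrapper's exit status is that of runDL.sh.
    "$scripts_dir/runDL.sh" "$data_type"
}
```

For the sample data type used above, the call would take the scripts directory as its first argument and AccessEvents as its second.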
Configuring RunDP/RunDL
For flat files, Behavior Detection receives firm data in ASCII.dat flat files, which an Oracle client's data extraction process places in the /inbox directory.
Ways of Data Loading
- Full Refresh Data Loading: For full refresh data loading, the existing data is first truncated and then the new data is inserted. For example, suppose five records are loaded on Day 1. If new data is required on Day 2 based on the business keys defined in the DIS files, a full refresh data load can be done.
To do a full refresh data load, set load.fullrefresh to true in the <OFSAAI Installed Directory>/bdf/config/BDF.xml file. For more information, see BDF.xml Configuration Parameters.
A full refresh data load takes less time than an incremental load, but the complete data must be provided every time.
- Incremental (Delta) Data Loading: For incremental data loading, the following can be done:
  - Data can be merged
  - Existing data can be updated
  - New data can be inserted
To do an incremental data load, set load.fullrefresh to false in the <OFSAAI Installed Directory>/bdf/config/BDF.xml file. For more information, see BDF.xml Configuration Parameters.
Note:
An incremental data load takes more time than a full refresh data load, but the complete data does not need to be provided every time. Only updated or new data is required.
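The difference between the two loading modes can be illustrated on plain text files. The following sketch is an illustration only and does not show how the BD datamaps are implemented: it assumes pipe-delimited records whose first field is the business key, and the function names full_refresh and incremental_load are hypothetical.

```shell
#!/bin/sh
# Illustration only: models the two load modes on plain text files.
# Assumption: records are pipe-delimited and field 1 is the business key.

# Full refresh: discard the target contents, then insert the incoming rows.
full_refresh() {
    target="$1"
    incoming="$2"
    cp "$incoming" "$target"
}

# Incremental: update rows whose key exists, insert rows with new keys,
# and keep existing rows that the incoming file does not mention.
incremental_load() {
    target="$1"
    incoming="$2"
    awk -F'|' '
        FILENAME == ARGV[1] {            # incoming file: remember rows by key
            if (!($1 in latest)) order[++n] = $1
            latest[$1] = $0
            next
        }
        ($1 in latest) {                 # existing key: emit the updated row
            print latest[$1]
            used[$1] = 1
            next
        }
        { print }                        # key absent from incoming: keep row
        END {
            for (i = 1; i <= n; i++)     # new keys: append in incoming order
                if (!(order[i] in used)) print latest[order[i]]
        }
    ' "$incoming" "$target" > "$target.tmp" && mv "$target.tmp" "$target"
}
```

With a target holding keys A1 and A2 and an incoming file holding A2 (changed) and A3 (new), incremental_load leaves A1 untouched, updates A2, and appends A3, whereas full_refresh leaves only A2 and A3. This is why a full refresh needs the complete data set on every run.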