22 Developing a Custom Loader for OAAM Offline

This chapter describes the overall data loader framework for OAAM Offline:

Basic framework and the default implementation
How to override the default functionality

This document assumes that you are familiar with the concepts of OAAM Offline.

22.1 Base Framework

A custom loader is required only if the data from sources other than a database, data other than login, or complex data is needed for the OAAM Offline task.

22.1.1 Overview

The OAAM Offline custom loader consists of the following key parts:

loadable object
data source
loader
run modes

Figure 22-1 Basic Framework of a Custom Loader

The loadable object represents an individual data record. The data source represents the entire store of data records and the loader processes the records. There are two types of run mode: load and playback. The run modes encapsulate the differences between loading a Session Set and running a Session Set.

22.1.2 Important Classes

Table 22-1 provides a summary of the different data loader classes.

Table 22-1 Data Loader Classes

Class	Description
RunMode	There are two basic types of RunMode: load and playback. Load run modes are responsible for importing session set data into the OAAM Offline system, and the playback run mode is responsible for processing preloaded session set data. Each run mode is responsible for constructing data source and loader. An additional responsibility is determining how to start where a previous job ended, in the cases of recurring schedules of autoincrementing session sets or paused and resumed run sessions. `AbstractLoadRunMode` and `AbstractPlaybackRunMod`e each have a factory method named `getInstance()`. These methods verify if the default run modes have been overridden.
RiskAnalyzerDataSource	The `RiskAnalyzerDataSource` is responsible for acquiring the data and iterating through it. `RiskAnalyzerDataSource` has two abstract implementors: `AbstractJDBCRiskAnalyzerDataSource` and `AbstractTextFile-RiskAnalyzerDataSource`. The `AbstractJDBCRiskAnalyzerDataSource` implements the base functionality for iterating through a JDBC result set, and the `AbstractTextFileRiskAnalyzerDataSource` implements the base functionality for iterating through a text file.
AbstractTransactionRecord	The `AbstractTransactionRecord` class only contains the state and behavior required to manage the overall risk analysis process. Subclasses will add additional state and behavior to satisfy client requirements.
AbstractRiskAnalyzerLoader	The `AbstractRiskAnalyzer`Loader is the base implementation of ObjectLoader for the Risk Analyzer process. It provides basic exception handling, but otherwise leaves the implementation up to its subclasses.

22.1.3 General Framework Execution

The following pseudocode shows the general framework execution.

AbstractRiskAnalyzerLoader loader = runMode.buildObjectLoader();
RiskAnalyzerDataSource dataSource = runMode.acquireDataSource();
try{
   while (dataSource.hasMoreRecords()) {
      AbstractTransactionRecord eachRecord = dataSource.nextRecord();
      loader.process(eachRecord);
   }
} finally {
   dataSource.close();
}

22.2 Default Implementation

The default implementation for the Risk Analyzer data loader framework works as follows:

Load mode: When in load mode, it uses any database as a data source, it expects login data, and it performs device fingerprinting.

Playback mode: When in playback mode, it uses the VCRYPT_TRACKER_USERNODE_LOGS and V_FPRINTS tables as its data source, and it runs each record through all active models.

22.2.1 Default Load Implementation

The default load implementation is summarized below.

Figure 22-2 Default Load Implementation

The default load implementation is shown.

Table 22-2 Default Implementation

Components	Description
LoadRunMode	The default `LoadRunMode` class instantiates a `DatabaseRiskAnalyzerDatasource` as its data source and a `AuthFingerprintLoader` as its loader.
DatabaseRiskAnalyzerDatasource	The `DatabaseRiskAnalyzerDatasource` creates `LoginRecords` from a JDBC data source. It uses a set of configuration properties to tell it how to connect to the JDBC data source and to tell it how to build a `LoginRecord` from the tables and fields in the remote database. The default values for these properties map to the tables in an OAAM database.
LoginRecord	The login record contains all of the available fields required to call the methods for device fingerprinting on the TrackerAPIUtil class.
AuthFingerprintLoader	The `AuthFingerprintLoader` uses the data in the `LoginRecord` to simulate a login. This causes the system to perform device fingerprinting, run device identification time rules, and store the user node log and fingerprint data in the OAAM Offline database.

22.2.2 Default Playback Implementation

The default playback implementation is summarized below.

Figure 22-3 Default Playback Implementation

The default playback implementation is shown.

Table 22-3 Default Playback Implementation

Components	Description
PlaybackRunMode	The default P`laybackRunMode` class instantiates a `UserNodeLogsRiskAnalyzerDataSource` as its data source and a `RunRulesLoader` as its loader.
UserNodeLogsRiskAnalyzerDatasource	The `UserNodeLogsRiskAnalyzerDatasource` creates `LoginRecords` from the `VCRYPT_TRACKER_USERNODE_LOGS` and `V_FPRINTS` tables in the OAAM Offline database.
LoginRecord	The login record contains all of the fields required to call the methods for rules processing on the `TrackerAPIUtil` class.
RunRulesLoader	The `RunRulesLoader` processes pre-auth rules on all `LoginRecords`, and processes post-auth rules on all `LoginRecords` with a successful authentication status.

22.3 Implementation Details: Overriding the Loader or Playback Behavior

There are several cases that would require the default behavior to be overridden. You would need to override the default loading behavior to load data from a source other than a database or to load transactional data into the system. You would need to override the default playback behavior if you needed to perform a procedure other than rules processing.

Figure 22-4 Overriding the Loader or Playback Behavior

Override behavior for the custom loader is shown.

22.4 Implement RiskAnalyzerDataSource

If you are loading login data from a data source other than a JDBC database, or if you are loading transactional data, then you will need to create your own subclass of RiskAnalyzerDataSource. There are a couple of ways to do this: extending AbstractJDBCRiskAnalyzerDataSource or extending AbstractRiskAnalyzerDataSource.

22.4.1 Extending AbstractJDBCRiskAnalyzerDataSource

This is the appropriate choice if you are loading any sort of data through a JDBC connection. It includes default behavior for opening a JCBC connection, issuing a subclass specified SQL query to build a JDBC result set, and querying the database for a count of the total number of records.

There are three abstract methods that you must implement.

buildBaseSelect() returns the SQL query you will use to read the data. It should not include any order by statement. The superclass will use your implementation of getOrderByField() to add the order by statement.
getOrderByField() returns the name of the database field that your query should be sorted on. This is usually the date field.
buildNextRecord() turns one or more records from the JDBC result set into your loadable data record.

There are protected fields in the superclass available for your use, and you will need them when you implement the abstract methods. The most important is resultSet, which refers to your JDBC result set. When hasMoreRecords() has been called and returns true, you are guaranteed that resultSet is in a valid state and pointing at the current record. In addition, when you implement buildNextRecord(), you can safely assume that resultSet is in a valid state and pointing at the current record.

Other fields you might need to know about are connection and controller. connection refers to your JDBC to the remote database. controller is an instance of RiskAnalyzer and contains context information about your current OAAM Offline job.

Other methods that you can override if the default behavior is not what you need are buildConnection(), buildSelectCountStatement(), getTotalNumberToProcess(), and buildSelectStatement().

You would override buildConnection() if you wanted to change how you instantiate the remote JDBC connection.

You would override buildSelectCountStatement() if you wanted to change the SQL used to count the number of records to be read in.

You would override getTotalNumberToProcess() if you wanted to replace the algorithm that returns the number of records to be read in. You would only do this if overriding buildSelectCountStatement() was not enough to give you the behavior you need.

Finally, you would override buildSelectStatement() if you wanted to make changes to the SQL used to read the records from the remote databases, such as changing how the order by clause is applied.

22.4.2 Extending AbstractRiskAnalyzerDataSource

If AbstractJDBCRiskAnalyzerDataSource is is not appropriate, then you will need to extend AbstractRiskAnalyzerDataSource instead. For example, if you are reading from a binary file or if you are implementing a data source for a custom playback mode and using TopLink to read from the OAAM Offline database.

The constructor should put your class into a state so that you are ready to iterate through the data. There are four abstract methods you will have to implement.

getTotalNumberToProcess() will return the total number of records in the data source that satisfy the conditions that define a given Session Set.

hasMoreRecords() will return true if there are more records to be processed, and will move any sort of record pointer to the next available record if required. There is a flag named nextRecordIsReady that is necessary for signaling here. The superclass sets this flag to false when it has made use of the next available record. Your implementation of hasMoreRecords() should check the value of the nextRecordIsReady flag, move the pointer to the next record only if the flag's value is false, and change the flag's value to true when you successfully move the pointer to a new record. If you are following this paradigm, then if your implementation of hasMoreRecords() is called while nextRecordIsReady is true, then you should return true without changing the state of any record pointers.

buildNextRecord() will return a new instance of the required subclass of AbstractTransactionRecord.

close() is called when you have finished processing all of the records. Any required clean-up should be performed here.

Loading from a Text File

If a file based custom loader has to be used, extend the AbstractRiskAnalyzerDataSource and implement the custom class by seeing what AbstractTextFileRiskAnalyzerDataSource does and copying the code from AbstractTextFileRiskAnalyzerDataSource.

22.4.3 Extending AbstractRiskAnalyzerDataSource

If neither AbstractJDBCRiskAnalyzerDataSource nor AbstractTextFile-RiskAnalyzerDataSource is appropriate, then you will need to extend AbstractRiskAnalyzerDataSource instead. You might find yourself in this situation if you are reading from a binary file or if you are implementing a data source for a custom playback mode and using TopLink to read from the OAAM Offline database.

The constructor should put your class into a state so that you are ready to iterate through the data. There are four abstract methods you will have to implement.

getTotalNumberToProcess() will return the total number of records in the data source that satisfy the conditions that define a given Session Set.

hasMoreRecords() will return true if there are more records to be processed, and will move any sort of record pointer to the next available record if required. There is a flag named nextRecordIsReady that should be used for signaling here. The superclass sets this flag to false when it has made use of the next available record. Your implementation of hasMoreRecords() should check the value of the nextRecordIsReady flag, move the pointer to the next record only if the flag's value is false, and change the flag's value to true when you successfully move the pointer to a new record. If you are following this paradigm, then if your implementation of hasMoreRecords() is called while nextRecordIsReady is true, then you should return true without changing the state of any record pointers.

buildNextRecord() will return a new instance of the required subclass of AbstractTransactionRecord.

close() is called when you have finished processing all of the records. Any required clean-up should be performed here.

22.5 Implement RunMode

If you have created any customized classes for the load or playback behavior, you are required to create a customized subclass of AbstractLoadLoginsRunMode, AbstractLoadTransactionsRunMode, or PlaybackRunMode, depending on your requirements.

The most important RunMode methods are acquireDataSource and buildObjectLoader.

acquireDataSource(RiskAnalyzer) returns an instance of the RiskAnalyzerDataSource required to run your process. The RiskAnalyzer parameter contains context information that the RunMode can use to instantiate the data source object.

buildObjectLoader(RiskAnalyzer) returns an instance of the AbstractRiskAnalyzerLoader required to run your process. The RiskAnalyzer parameter contains context information that the RunMode can use to instantiate the object loader.

When implementing RunMode, it is critical that your object loader and data source are compatible, meaning that the data source you return produces the specific type of loadable object that your object loader expects.

The chooseStartDateRange(VCryptDataAccessMgr, RunSession) method is used to determine the start date range for your OAAM Offline job. All of your implementors of RunMode have a default implementation of this method. The default behavior is as follows. If this is the first time the job has run, you return the start date from the run session's session set if any, or an arbitrary date guaranteed to be earlier than the earliest date in your data source if your session set has no begin date. If this is a resumed job, then you determine, in an implementation specific way, which record you must start from when the job is resumed.

22.5.1 Extending AbstractLoadLoginsRunMode

This is the appropriate choice if you are loading login data, and you need a custom data source. You must implement the acquireDataSource(RiskAnalyzer) method, and return a new instance of your custom data source. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.

AbstractLoadLoginsRunMode implements the logic to determine the login date at which to resume as follows. The superclass method retrieveLowerBoundDateFromQuery calls an abstract method buildQueryToRetrieveLowerBound, which returns a BharosaDBQuery. The implementation of buildQueryToRetrieveLowerBound in this class selects the most recent VCryptTrackerUserNodeLog.createTime.

Depending on your requirements, you might need to override that behavior. You could override buildQueryToRetrieveLowerBound to add additional criteria to the query or replace the entire query. The only requirement is that the query return a single Date type result. You could instead override the retrieveLowerBoundDateFromQuery or chooseStartDateRange methods, to replace or extend the algorithm.

22.5.2 Extending AbstractLoadTransactionsRunMode

This is the appropriate choice if you are loading transactional data, because you will need a custom data source. You must implement the acquireDataSource(RiskAnalyzer) method, and return a new instance of your custom data source. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.

AbstractLoadTransactionsRunMode implements the logic to determine the login date at which to resume as follows. The superclass method retrieveLowerBoundDateFromQuery calls an abstract method buildQueryToRetrieveLowerBound, which returns a BharosaDBQuery. The implementation of buildQueryToRetrieveLowerBound in this class selects the most recent VTransactionLog.createTime.

22.5.3 Extending PlaybackRunMode

This is the appropriate choice if you have requirements that make it necessary to replace the default playback data source or processing behavior. There are no abstract methods to be implemented, but you can override superclass methods to fulfill your requirements.

If you need a custom data source, you can override acquireDataSource(RiskAnalyzer) to return it. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.

PlaybackRunMode implements the logic to determine the login date at which to resume as follows. The chooseStartDateRange method picks the most recent date out of the following choices, the session set's start date if not null, the run session's last processed date if not null, and arbitrary date guaranteed to be earlier than the earliest date in your data source. The third option will only be chosen if the first two are null.