This chapter describes the overall data loader framework for OAAM Offline:
Basic framework and the default implementation
How to override the default functionality
This document assumes that you are familiar with the concepts of OAAM Offline.
A custom loader is required only if the OAAM Offline task needs data from a source other than a database, data other than login data, or complex data.
The OAAM Offline custom loader consists of the following key parts:
loadable object
data source
loader
run modes
The loadable object represents an individual data record. The data source represents the entire store of data records and the loader processes the records. There are two types of run mode: load and playback. The run modes encapsulate the differences between loading a Session Set and running a Session Set.
Table 22-1 provides a summary of the different data loader classes.
Table 22-1 Data Loader Classes
Class | Description |
---|---|
RunMode | There are two basic types of RunMode: load and playback. Load run modes are responsible for importing session set data into the OAAM Offline system, and the playback run mode is responsible for processing preloaded session set data. Each run mode is responsible for constructing the data source and the loader. An additional responsibility is determining where to start when a previous job ended, in the cases of recurring schedules of autoincrementing session sets or paused and resumed run sessions. |
RiskAnalyzerDataSource | The |
AbstractTransactionRecord | The |
AbstractRiskAnalyzerLoader | The |
The following pseudocode shows the general framework execution.
AbstractRiskAnalyzerLoader loader = runMode.buildObjectLoader();
RiskAnalyzerDataSource dataSource = runMode.acquireDataSource();
try {
    while (dataSource.hasMoreRecords()) {
        AbstractTransactionRecord eachRecord = dataSource.nextRecord();
        loader.process(eachRecord);
    }
} finally {
    dataSource.close();
}
The default implementation for the Risk Analyzer data loader framework works as follows:
Load mode: When in load mode, it uses any database as a data source, it expects login data, and it performs device fingerprinting.
Playback mode: When in playback mode, it uses the VCRYPT_TRACKER_USERNODE_LOGS and V_FPRINTS tables as its data source, and it runs each record through all active models.
The default load implementation is summarized below.
Table 22-2 Default Implementation
Components | Description |
---|---|
LoadRunMode | The default |
DatabaseRiskAnalyzerDatasource | The |
LoginRecord | The login record contains all of the available fields required to call the methods for device fingerprinting on the TrackerAPIUtil class. |
AuthFingerprintLoader | The |
The default playback implementation is summarized below.
Table 22-3 Default Playback Implementation
Components | Description |
---|---|
PlaybackRunMode | The default P |
UserNodeLogsRiskAnalyzerDatasource | The |
LoginRecord | The login record contains all of the fields required to call the methods for rules processing on the |
RunRulesLoader | The |
There are several cases that would require the default behavior to be overridden. You would need to override the default loading behavior to load data from a source other than a database or to load transactional data into the system. You would need to override the default playback behavior if you needed to perform a procedure other than rules processing.
If you are loading login data from a data source other than a JDBC database, or if you are loading transactional data, then you will need to create your own subclass of RiskAnalyzerDataSource. There are two ways to do this: extending AbstractJDBCRiskAnalyzerDataSource or extending AbstractRiskAnalyzerDataSource.
Extending AbstractJDBCRiskAnalyzerDataSource is the appropriate choice if you are loading any sort of data through a JDBC connection. It includes default behavior for opening a JDBC connection, issuing a subclass-specified SQL query to build a JDBC result set, and querying the database for a count of the total number of records.
There are three abstract methods that you must implement.
buildBaseSelect() returns the SQL query you will use to read the data. It should not include any order by statement. The superclass will use your implementation of getOrderByField() to add the order by statement.
getOrderByField() returns the name of the database field that your query should be sorted on. This is usually the date field.
buildNextRecord() turns one or more records from the JDBC result set into your loadable data record.
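The division of labor between buildBaseSelect() and getOrderByField() can be sketched as follows. This is a minimal illustration, not the OAAM implementation: SketchJdbcDataSource is a hypothetical stand-in for the superclass, the way it appends the order by clause is an assumption, and the table and column names are invented. It leaves out buildNextRecord() and all connection handling.

```java
// Hypothetical stand-in for the superclass; NOT the real AbstractJDBCRiskAnalyzerDataSource.
abstract class SketchJdbcDataSource {
    /** Subclass supplies the SELECT statement without any ORDER BY clause. */
    public abstract String buildBaseSelect();

    /** Subclass names the column to sort on, usually the date column. */
    public abstract String getOrderByField();

    /** Assumed composition: the superclass appends the ORDER BY clause itself. */
    public String buildSelectStatement() {
        return buildBaseSelect() + " ORDER BY " + getOrderByField();
    }
}

// Example subclass; the table and column names are invented for illustration.
class SketchLoginDataSource extends SketchJdbcDataSource {
    @Override
    public String buildBaseSelect() {
        return "SELECT login_id, user_id, login_time FROM custom_logins";
    }

    @Override
    public String getOrderByField() {
        return "login_time";
    }
}

class SelectSketchDemo {
    public static void main(String[] args) {
        // Prints the base query with the ORDER BY clause appended by the superclass.
        System.out.println(new SketchLoginDataSource().buildSelectStatement());
    }
}
```

Keeping the sort column out of the base query is what lets the superclass control ordering in one place, which matters when a job has to resume from a known date.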
There are protected fields in the superclass available for your use, and you will need them when you implement the abstract methods. The most important is resultSet, which refers to your JDBC result set. When hasMoreRecords() has been called and returns true, you are guaranteed that resultSet is in a valid state and pointing at the current record. The same guarantee holds when you implement buildNextRecord().
Other fields you might need to know about are connection and controller. connection refers to your JDBC connection to the remote database. controller is an instance of RiskAnalyzer and contains context information about your current OAAM Offline job.
Other methods that you can override if the default behavior is not what you need are buildConnection(), buildSelectCountStatement(), getTotalNumberToProcess(), and buildSelectStatement().
You would override buildConnection() if you wanted to change how you instantiate the remote JDBC connection.
You would override buildSelectCountStatement() if you wanted to change the SQL used to count the number of records to be read in.
You would override getTotalNumberToProcess() if you wanted to replace the algorithm that returns the number of records to be read in. You would only do this if overriding buildSelectCountStatement() was not enough to give you the behavior you need.
Finally, you would override buildSelectStatement() if you wanted to make changes to the SQL used to read the records from the remote database, such as changing how the order by clause is applied.
If you need a file-based custom loader, extend AbstractRiskAnalyzerDataSource and model your custom class on AbstractTextFileRiskAnalyzerDataSource, reusing its code where appropriate.
If neither AbstractJDBCRiskAnalyzerDataSource nor AbstractTextFileRiskAnalyzerDataSource is appropriate, then you will need to extend AbstractRiskAnalyzerDataSource instead. You might find yourself in this situation if you are reading from a binary file or if you are implementing a data source for a custom playback mode and using TopLink to read from the OAAM Offline database.
The constructor should put your class into a state so that you are ready to iterate through the data. There are four abstract methods you will have to implement.
getTotalNumberToProcess() returns the total number of records in the data source that satisfy the conditions that define a given Session Set.
hasMoreRecords() returns true if there are more records to be processed, and moves the record pointer to the next available record if required. A flag named nextRecordIsReady should be used for signaling here. The superclass sets this flag to false when it has made use of the next available record. Your implementation of hasMoreRecords() should check the value of the nextRecordIsReady flag, move the pointer to the next record only if the flag's value is false, and change the flag's value to true when you successfully move the pointer to a new record. Under this paradigm, if hasMoreRecords() is called while nextRecordIsReady is true, return true without changing the state of any record pointers.
buildNextRecord() returns a new instance of the required subclass of AbstractTransactionRecord.
close() is called when you have finished processing all of the records. Any required clean-up should be performed here.
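The nextRecordIsReady protocol is easier to follow in code. The sketch below uses an in-memory list in place of a real data store. SketchDataSource is a hypothetical stand-in for the superclass, written only from the description above: the point at which it clears the flag, the method visibility, and the generic record type are assumptions, and the real AbstractRiskAnalyzerDataSource may differ.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the superclass; the real class may differ.
abstract class SketchDataSource<R> {
    /** True while a fetched record is waiting to be consumed. */
    protected boolean nextRecordIsReady = false;

    public abstract int getTotalNumberToProcess();
    public abstract boolean hasMoreRecords();
    protected abstract R buildNextRecord();
    public abstract void close();

    /** Consumes the current record, then clears the flag (assumed superclass behavior). */
    public R nextRecord() {
        if (!hasMoreRecords()) {
            throw new NoSuchElementException("no more records");
        }
        R record = buildNextRecord();
        nextRecordIsReady = false;
        return record;
    }
}

class ListBackedDataSource extends SketchDataSource<String> {
    private final List<String> rows;
    private final Iterator<String> cursor;
    private String current;   // the record the pointer currently rests on
    int pointerMoves = 0;     // counts pointer moves, for demonstration only

    ListBackedDataSource(List<String> rows) {
        this.rows = rows;
        this.cursor = rows.iterator();
    }

    @Override
    public int getTotalNumberToProcess() {
        return rows.size();
    }

    @Override
    public boolean hasMoreRecords() {
        if (nextRecordIsReady) {
            return true;          // a record is already waiting: do not move the pointer
        }
        if (!cursor.hasNext()) {
            return false;         // data source exhausted
        }
        current = cursor.next();  // move the pointer exactly once
        pointerMoves++;
        nextRecordIsReady = true;
        return true;
    }

    @Override
    protected String buildNextRecord() {
        return current;
    }

    @Override
    public void close() {
        current = null;           // release anything still held
    }
}

class FlagProtocolDemo {
    public static void main(String[] args) {
        ListBackedDataSource source = new ListBackedDataSource(Arrays.asList("r1", "r2"));
        try {
            while (source.hasMoreRecords()) {
                System.out.println(source.nextRecord());
            }
        } finally {
            source.close();
        }
    }
}
```

Because nextRecord() in this sketch calls hasMoreRecords() again, the flag is what prevents a record from being skipped when the framework loop has already called hasMoreRecords() once.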
If you have created any customized classes for the load or playback behavior, you are required to create a customized subclass of AbstractLoadLoginsRunMode, AbstractLoadTransactionsRunMode, or PlaybackRunMode, depending on your requirements.
The most important RunMode methods are acquireDataSource and buildObjectLoader.
acquireDataSource(RiskAnalyzer) returns an instance of the RiskAnalyzerDataSource required to run your process. The RiskAnalyzer parameter contains context information that the RunMode can use to instantiate the data source object.
buildObjectLoader(RiskAnalyzer) returns an instance of the AbstractRiskAnalyzerLoader required to run your process. The RiskAnalyzer parameter contains context information that the RunMode can use to instantiate the object loader.
When implementing RunMode, it is critical that your object loader and data source are compatible, meaning that the data source you return produces the specific type of loadable object that your object loader expects.
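One way to picture the compatibility requirement is to tie both factory methods to a single record type. The sketch below does this with Java generics, so that a mismatched pair fails at compile time. Every type in it is a hypothetical stand-in invented for illustration. The OAAM classes are not described here as generic, and the real methods take a RiskAnalyzer parameter, which the sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// All types below are hypothetical stand-ins, not the OAAM classes.
interface SketchSource<R> {
    boolean hasMoreRecords();
    R nextRecord();
    void close();
}

interface SketchLoader<R> {
    void process(R record);
}

/** Both factory methods share the record type R, so a mismatch cannot compile. */
interface SketchRunMode<R> {
    SketchSource<R> acquireDataSource();
    SketchLoader<R> buildObjectLoader();
}

class SketchLoginRecord {
    final String userId;

    SketchLoginRecord(String userId) {
        this.userId = userId;
    }
}

class SketchLoginRunMode implements SketchRunMode<SketchLoginRecord> {
    final List<String> processed = new ArrayList<>();
    boolean closed = false;

    @Override
    public SketchSource<SketchLoginRecord> acquireDataSource() {
        // Invented sample data standing in for a real data store.
        final Iterator<String> ids = Arrays.asList("alice", "bob").iterator();
        return new SketchSource<SketchLoginRecord>() {
            public boolean hasMoreRecords() { return ids.hasNext(); }
            public SketchLoginRecord nextRecord() { return new SketchLoginRecord(ids.next()); }
            public void close() { closed = true; }
        };
    }

    @Override
    public SketchLoader<SketchLoginRecord> buildObjectLoader() {
        // The loader here only records the user ID of each login it is given.
        return record -> processed.add(record.userId);
    }
}

class SketchDriver {
    /** Mirrors the framework loop: build the loader, acquire the source, process, close. */
    static <R> void run(SketchRunMode<R> mode) {
        SketchLoader<R> loader = mode.buildObjectLoader();
        SketchSource<R> source = mode.acquireDataSource();
        try {
            while (source.hasMoreRecords()) {
                loader.process(source.nextRecord());
            }
        } finally {
            source.close();
        }
    }
}

class RunModeSketchDemo {
    public static void main(String[] args) {
        SketchLoginRunMode mode = new SketchLoginRunMode();
        SketchDriver.run(mode);
        System.out.println(mode.processed);
    }
}
```

In a non-generic API the same guarantee rests on the implementor, so a loader that casts each record to its expected subclass will fail at run time if the run mode pairs it with the wrong data source.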
The chooseStartDateRange(VCryptDataAccessMgr, RunSession) method determines the start date range for your OAAM Offline job. All implementors of RunMode have a default implementation of this method. The default behavior is as follows. If this is the first time the job has run, the method returns the start date from the run session's session set, if any, or, if the session set has no begin date, an arbitrary date guaranteed to be earlier than the earliest date in your data source. If this is a resumed job, the method determines, in an implementation-specific way, which record to start from when the job is resumed.
Extending AbstractLoadLoginsRunMode is the appropriate choice if you are loading login data and you need a custom data source. You must implement the acquireDataSource(RiskAnalyzer) method and return a new instance of your custom data source. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.
AbstractLoadLoginsRunMode implements the logic to determine the login date at which to resume as follows. The superclass method retrieveLowerBoundDateFromQuery calls an abstract method buildQueryToRetrieveLowerBound, which returns a BharosaDBQuery. The implementation of buildQueryToRetrieveLowerBound in this class selects the most recent VCryptTrackerUserNodeLog.createTime.
Depending on your requirements, you might need to override that behavior. You could override buildQueryToRetrieveLowerBound to add additional criteria to the query or replace the entire query. The only requirement is that the query return a single Date type result. You could instead override the retrieveLowerBoundDateFromQuery or chooseStartDateRange methods, to replace or extend the algorithm.
Extending AbstractLoadTransactionsRunMode is the appropriate choice if you are loading transactional data, because you will need a custom data source. You must implement the acquireDataSource(RiskAnalyzer) method and return a new instance of your custom data source. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.
AbstractLoadTransactionsRunMode implements the logic to determine the date at which to resume as follows. The superclass method retrieveLowerBoundDateFromQuery calls an abstract method buildQueryToRetrieveLowerBound, which returns a BharosaDBQuery. The implementation of buildQueryToRetrieveLowerBound in this class selects the most recent VTransactionLog.createTime.
Depending on your requirements, you might need to override that behavior. You could override buildQueryToRetrieveLowerBound to add additional criteria to the query or replace the entire query. The only requirement is that the query return a single Date type result. You could instead override the retrieveLowerBoundDateFromQuery or chooseStartDateRange methods, to replace or extend the algorithm.
Extending PlaybackRunMode is the appropriate choice if you have requirements that make it necessary to replace the default playback data source or processing behavior. There are no abstract methods to be implemented, but you can override superclass methods to fulfill your requirements.
If you need a custom data source, you can override acquireDataSource(RiskAnalyzer) to return it. If you need a custom implementation of AbstractRiskAnalyzerLoader, you can override buildObjectLoader(RiskAnalyzer) to return it.
PlaybackRunMode implements the logic to determine the login date at which to resume as follows. The chooseStartDateRange method picks the most recent of the following dates: the session set's start date, if not null; the run session's last processed date, if not null; and an arbitrary date guaranteed to be earlier than the earliest date in your data source. The third option is chosen only if the first two are null.
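The selection rule for the playback start date can be written out as a small function. This is a sketch of the rule as described, not the PlaybackRunMode source: the class name, the method signature, and the use of the epoch as the arbitrary early date are all assumptions made for illustration.

```java
import java.util.Date;

class PlaybackStartSketch {
    /** Assumed floor date; stands in for "earlier than the earliest date in your data source". */
    static final Date EARLIEST = new Date(0L);

    /**
     * Returns the later of the two candidate dates, ignoring nulls,
     * and falls back to EARLIEST only when both candidates are null.
     */
    static Date choose(Date sessionSetStart, Date lastProcessed) {
        Date best = null;
        for (Date candidate : new Date[] { sessionSetStart, lastProcessed }) {
            if (candidate != null && (best == null || candidate.after(best))) {
                best = candidate;
            }
        }
        return best != null ? best : EARLIEST;
    }

    public static void main(String[] args) {
        Date start = new Date(1_000_000L);
        Date resumed = new Date(2_000_000L);
        // A resumed job continues from the last processed date when it is the later one.
        System.out.println(choose(start, resumed).equals(resumed));
    }
}
```

Taking the later of the two dates means a resumed job does not reprocess records that fall between the session set's start date and the point where the previous run stopped.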