Configure the Batch Processor

The Batch Processor is invoked in one of the following ways, depending on whether the Java or the .NET version is being used:

java -jar determinations-batch.jar <command line parameters>

Determinations.Batch.exe <command line parameters>

Command line configuration

The following is a list and description of each of the Batch Processor's command line parameters:

--rulebase <rulebase path>

Specifies the rulebase to be used for the batch processor.

--csv <folder>

Specifies the folder in which the csv data files are located. This parameter must be provided if the --database parameter is not used.

--delimiter <character>

Identifies the value delimiter to be used when reading and writing CSV files. Defaults to a single comma (,) character. This parameter will be ignored if the batch processor is not reading from or writing to CSV files.

As white space characters cannot be passed easily as command line parameters, special values of \t (tab) and \s (space) can be used to specify a tab or space character as the delimiter; for example: --delimiter \t.

--coverage <coverage file>

Outputs a coverage file that can be imported into Oracle Policy Modeling's Analyze Coverage File feature.

--database <db-connection-string>

Specifies the connection string of the database to be used as the source of input data; for example, jdbc:oracle:thin:user/password@localhost:1521/example. This parameter must be provided if the --csv parameter is not used.

--dboutput

Writes the results of the batch run back to the database. This parameter can only be used if the input came from the database (--database option).

--userid <db-userid>

Specifies the user id for a database connection. This parameter can only be used when the userid is not provided in the connection string (--database option).

--password <db-password>

Specifies the password for a database connection. This parameter can only be used when the password is not provided in the connection string (--database option).

--dbprovider <provider-name>

Specifies the provider invariant name for a .NET database connection.

--driver <driver-name>

Specifies the name of the database driver to be used to connect to the database specified by the --database parameter; for example, oracle.jdbc.OracleDriver. This parameter will be ignored if the --database parameter is not included.

--driversrc <path>

Specifies the full path of the external resource containing the database driver identified by the --driver parameter; for example, the name of a jar file. This parameter will be ignored if the --database parameter is not included.
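
As an illustration, the database parameters above map onto the <database> element of the XML configuration file (see XML file configuration). The following is a minimal sketch rather than a definitive configuration: the connection string, driver name and rulebase name are the examples used elsewhere in this document, while the driversrc path is a hypothetical placeholder. The userid and password sub-elements are left out because the credentials are already part of the connection string.

```xml
<configuration>
    <options>
        <rulebase>SocialServicesScreening.zip</rulebase>
        <database>
            <!-- same value as the database parameter example above -->
            <url>jdbc:oracle:thin:user/password@localhost:1521/example</url>
            <!-- same value as the driver parameter example above -->
            <driver>oracle.jdbc.OracleDriver</driver>
            <!-- hypothetical path to the jar file that contains the driver -->
            <driversrc>./lib/ojdbc.jar</driversrc>
        </database>
        <!-- write results back to the database, like the dboutput parameter -->
        <output type="db" />
    </options>
</configuration>
```

Database input also requires data mappings, because there are no zero-configuration conventions for database tables.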

--base <name>

Specifies the 'base' table that represents the cases. Multiple csv files represent a database, but one must be identified as the one corresponding to cases. This parameter is optional if there is only a single csv file, or there is a csv file named 'global'; otherwise it is mandatory.

--processors <number>

Specifies the number of processors to use for the batch processor; default value is the number of processors available.

--blocksize <number>

Specifies the number of cases included in each data block read or updated.

--output <folder>

Specifies the folder to which any input or output attributes are written in csv format. If included, the output folder must not be the same as the data folder specified by the --csv parameter.

--limit <number>

For database input only, sets a limit on the number of cases processed by the batch processor. This can be useful when operating on a large data set without wanting to process every case; for example, when verifying that the configuration is correct.

--export <folder>

Exports cases as saved sessions into the specified folder. The --limit parameter is handy if you wish to limit the number of cases to be exported.

--exporttsc <filename>

Exports cases into a single .tsc test case file, suitable for adding to an Oracle Policy Modeling project. The extension '.tsc' is appended to the file name if it is missing; for example, --exporttsc c:\temp\my_test.tsc.

--config <filename>

Specifies the xml configuration file that is used for mapping (non-zero configuration).

--version

Displays the version of the Batch Processor.

XML file configuration

As well as specifying options on the command line (see Command line configuration above) you can also specify options in a Batch Processor configuration file.

When the Batch Processor starts, it looks for a file in the current working directory called config.xml and if this file is found, it will read in the configuration from this file.

Set options in the XML configuration file

All options that can be set on the command line can also be set in the <options> section of the configuration file; note that if an option is found on the command line and in the config file, then the command line overrides the configuration file setting.
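
A small sketch of this precedence rule, using the limit option: assuming the configuration file below is in use and the Batch Processor is started with --limit 10 on the command line, the run would be limited to 10 cases rather than 1000.

```xml
<configuration>
    <options>
        <!-- used only when no limit is given on the command line -->
        <limit>1000</limit>
    </options>
</configuration>
```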

The following options can be set. Each entry gives the element name, a description and an example.

base

Name of the 'base' table that represents the cases; equivalent to --base on the command line.

<base>tablename</base>

processors

The number of worker processors to start; equivalent to --processors on the command line.

<processors>2</processors>

limit

Limits the number of rows to process; equivalent to --limit on the command line.

<limit>1000</limit>

rulebase

The rulebase to use; equivalent to --rulebase on the command line.

<rulebase>SocialServicesScreening.zip</rulebase>

csv

The csv directory to get input from; equivalent to --csv on the command line.

<csv>./data/csv</csv>

delimiter

The value delimiter to be used when reading from or writing to CSV files; equivalent to --delimiter on the command line.

Special values of \t (tab) and \s (space) can be used to specify a tab or space character as the delimiter respectively.

<delimiter>\t</delimiter>

blocksize

Specifies the number of cases included in each data block read or updated; equivalent to --blocksize on the command line.

<blocksize>800</blocksize>

database

The definition for a database source; this element has sub-elements which are equivalent to the --database, --driver, --driversrc, --userid, --password and --dbprovider options on the command line.

<database>
    <url>http://localhost/db:8001</url>
    <driver></driver>
    <driversrc></driversrc>
    <userid></userid>
    <password></password>
</database>

output

The output location. The "type" attribute indicates the type of output (defaults to "csv"). Equivalent to the --output, --export, --exporttsc, --dboutput and --coverage options on the command line.

  • If the type is "db", output is written back to the database. No value is expected here.
  • If the type is "csv", the value is the directory where the csv files with outcomes will be written.
  • If the type is "coverage", "export" or "exporttsc", the value is the location to which the coverage file, saved sessions or test case file, respectively, is written.

<output type="csv">./data/out/csv</output>
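
Put together, an <options> section for a run over csv input might look like the sketch below. It only reuses the example values given for each option above; that these particular options are combined in one file, and the order in which they appear, are assumptions made for illustration.

```xml
<configuration>
    <options>
        <!-- rulebase and input location -->
        <rulebase>SocialServicesScreening.zip</rulebase>
        <csv>./data/csv</csv>
        <!-- input files are tab-delimited -->
        <delimiter>\t</delimiter>
        <!-- the table that represents the cases -->
        <base>tablename</base>
        <!-- processing controls -->
        <processors>2</processors>
        <blocksize>800</blocksize>
        <limit>1000</limit>
        <!-- csv output; must differ from the csv input folder -->
        <output type="csv">./data/out/csv</output>
    </options>
</configuration>
```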

Data mapping in the XML configuration file

Mappings are used to map csv and database structures to Oracle Policy Automation data structures: boolean format, entities, relationships and attributes.

If the input data is csv files, much of the mapping from csv data to Oracle Policy Automation data may be done automatically (see Zero-configuration conventions for CSV input). Specifying data mappings can be used to enhance or change the default mappings of csv data.

If the input data is database tables, mapping information must be specified, as there are no zero configuration conventions for database input.

Specify the global boolean format

The global boolean format defines the format for boolean values read from and written to a csv or database data source.

The boolean-format element must include the following attributes:

  1. The xml attribute true-value defines the value for true when reading from, or writing to, the data source.
  2. The xml attribute false-value defines the value for false when reading from, or writing to, the data source.

Example global boolean mapping

<mappings>
    <boolean-format true-value="" false-value="" />
    <!-- global entity mapping -->
    <mapping entity="global" table="global" primary-key="#">
        <!-- entity attributes and relationships -->
    </mapping>
    <!-- other entity mappings -->
</mappings>
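
The example above leaves both values empty. As a filled-in sketch, assuming a data source that stores booleans as the letters Y and N (hypothetical values chosen only for illustration), the format could be declared like this:

```xml
<mappings>
    <!-- "Y" and "N" are hypothetical; use whatever the data source actually stores -->
    <boolean-format true-value="Y" false-value="N" />
    <mapping entity="global" table="global" primary-key="#">
        <!-- entity attributes and relationships -->
    </mapping>
</mappings>
```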

Specify an entity mapping

Specifying an entity mapping is done as follows:

  1. The xml attribute entity is used to specify the entity on the rulebase. In this case, it refers to an entity called customer in the rulebase.
  2. The xml attribute table is used to define the source table. This states the source is either from a csv file called customer.csv or a database table called customer.
  3. The optional xml attribute output-table can be used when writing to a database, to identify an alternate table into which the output attributes of the entity are to be inserted.

    The output table must have a primary key field matching the configured primary-key attribute, and attributes matching all output attributes identified in the entity mapping.

    The output table should be empty before the batch processor is run to prevent primary key collisions. This attribute will be ignored when writing to CSV output.

  4. The xml attribute primary-key is used to define the primary key for the source. For a database source, this is where you specify the primary key of your table. Note that by default, the primary key for a csv source is the '#' column in the csv file.
  5. The optional xml attribute primary-key-type can be used to specify whether the source table has a text or integer primary key. If the attribute is not provided, the batch processor will assume an integer key.

  6. The optional xml attribute primary-key-auto can be used to specify whether primary key values for the underlying table are automatically generated by the data source. If the attribute has been set to "true" the Batch Processor will not include a primary key value when attempting to insert a row into the table. If the attribute has not been provided, the Batch Processor will use the default value of "false", and will attempt to generate the primary key values itself.

    This attribute will be ignored when writing to CSV output, or if this is not an inferred entity; see Batch Processor output for more information.

Example entity mapping

<mapping entity="customer" table="customer" output-table="customer_out"
         primary-key="customer_id" primary-key-type="text" primary-key-auto="false">
    <!-- entity attributes and relationships -->
</mapping>
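
For comparison, here is a sketch of the same mapping with every optional attribute left out, so that the defaults described above apply. The entity, table and key names are carried over from the example; whether an integer key suits the customer table is an assumption.

```xml
<mappings>
    <!-- no primary-key-type: the Batch Processor assumes an integer key -->
    <!-- no primary-key-auto: the default value of "false" is used -->
    <!-- no output-table: no alternate table is named for output attributes -->
    <mapping entity="customer" table="customer" primary-key="customer_id">
        <!-- entity attributes and relationships -->
    </mapping>
</mappings>
```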

Specify an attribute mapping

Attribute mappings are contained within an entity mapping. Each attribute element specifies the mapping for one rulebase attribute.

  1. 'name' is used to define the name of the attribute on the rulebase.
  2. 'field' is used to define the source field (for example, column in the csv file).
  3. The optional 'output' is used to identify the field as an output field. If not included the value will default to "false" and the field will not be used as output.
  4. When writing csv output, the optional 'csv-output-field' can be used to change the column name in the output.

IMPORTANT:

If a field was specified as an output field in the csv via parentheses '(' and ')', but is specified again in the configuration XML file without the output="true" flag, it will not be an output field. The information specified in the configuration file supersedes the CSV information if present in both places.

Example attribute mappings

<mapping entity="customer" table="customer" primary-key="#">
    <attribute name="income" field="income" />
    <attribute name="result" field="result" output="true" />
    <attribute name="result" field="result" csv-output-field="newcolumnname" />
</mapping>
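
As a further sketch, the output flag and the renamed output column can be written on a single attribute element. This assumes that 'output' and 'csv-output-field' may be combined on one element, which the descriptions above suggest but do not state outright.

```xml
<mapping entity="customer" table="customer" primary-key="#">
    <!-- input only: output is not set, so it defaults to "false" -->
    <attribute name="income" field="income" />
    <!-- output field, written to the csv output under the column name "newcolumnname" -->
    <attribute name="result" field="result" output="true" csv-output-field="newcolumnname" />
</mapping>
```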

Specify a relationship mapping

All relationships must have the following two xml attributes:

  1. 'name' - must match the relationship name of the source entity.
  2. 'source-entity' - is the name for the source entity of the relationship.

Specific to one-to-one and one-to-many/many-to-one relationships:

'foreign-key' - is the column name used as the foreign key for the relationship. The foreign key has to be specified on the many side of a one-to-many relationship.

Specific to many-to-many relationships:

  1. 'rel-source' - is the source of the many-to-many mapping. In the example below, it states that the source of the many-to-many mapping is coming from the csv file called plansproducts.
  2. 'source-key' - is the foreign key reference to the primary key of the source table.
  3. 'target-key' - is the foreign key reference to the primary key of the target table.

Note: Many-to-many relationships can be specified at either side.

Example relationship mappings

The example shows the relationships for the entities "customer" and "product". The customer entity has a one-to-many relationship from the global entity (applicanttopincomeearner) and a one-to-many relationship to the product entity (customersfavoriteproductrev). The product entity has a many-to-many relationship with the plan entity.

<mapping entity="customer" table="customer" primary-key="#">
    <relationship name="applicanttopincomeearner" source-entity="global" foreign-key="applicanttopincomeearner" />
    <relationship name="customersfavoriteproductrev" source-entity="product" foreign-key="customersfavoriteproduct" />
</mapping>
<mapping entity="product" table="product" primary-key="#">
    <!-- many-to-many -->
    <relationship name="plansproducts" source-entity="plan" rel-source="plansproducts" source-key="plan" target-key="product"/>
</mapping>

Structure of the XML configuration file

<configuration>
    <options>
    ...
    </options>
    <mappings>
    ...
    </mappings>
</configuration>

The root element of the configuration file is the <configuration> element.

Next there can be a single <options> element which contains all the Batch Processor configuration options (see Set options in the XML configuration file).

Next there can be a single <mappings> element which contains all of the Batch Processor data mappings.
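
To show how the pieces fit together, the following sketch fills in the skeleton with fragments taken from the examples in this document. Pairing the SocialServicesScreening rulebase with the customer entity, and the boolean values Y and N, are assumptions made purely for illustration.

```xml
<configuration>
    <options>
        <rulebase>SocialServicesScreening.zip</rulebase>
        <csv>./data/csv</csv>
        <output type="csv">./data/out/csv</output>
    </options>
    <mappings>
        <!-- hypothetical boolean representation in the csv files -->
        <boolean-format true-value="Y" false-value="N" />
        <mapping entity="global" table="global" primary-key="#">
            <!-- global attributes and relationships -->
        </mapping>
        <mapping entity="customer" table="customer" primary-key="#">
            <!-- input attribute -->
            <attribute name="income" field="income" />
            <!-- output attribute -->
            <attribute name="result" field="result" output="true" />
            <!-- one-to-many relationship from the global entity -->
            <relationship name="applicanttopincomeearner" source-entity="global" foreign-key="applicanttopincomeearner" />
        </mapping>
    </mappings>
</configuration>
```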