Understanding Sun Master Index Configuration Options (Repository)

Master Index Object Definition Configuration (Repository)

The properties for the objects you will store in the master index database are defined in the Object Definition file. This file defines the parent and child objects to be indexed and the fields contained in each object, including key properties for each field, such as the field size, unique record identifiers, and whether certain fields are required or can be updated. After you define the master index framework and create the configuration files, you can modify the object structure that you defined.

The Object Definition is used as a basis for most of the master index application components. The information you specify for this file defines the dynamic Java API and the database structure for the primary tables that store object information in the master index application.

The following topics describe the Object Definition file, which defines the object structure.

Master Index Object Definition Components (Repository)

The object definition includes three primary components (objects, fields, and relationships) that together define the structure of the data in the master index application, the database structure, and the method OTD. Most configuration files in the master index application rely on the objects and fields defined in the Object Definition. For example, the fields you specify for the match string, queries, standardization, and the survivor calculator must all be defined in the Object Definition.

The following topics describe each component of the object definition:

Master Index Object Definition Objects

In a master index application, information is stored in objects. Each object in the data structure represents a different type of information. For example, if you are indexing businesses, you might have one object type to store general information about the business (such as the business name and type), one to store address information, and one to store contact information. When indexing personal information, you might have one object type to store general information about the person (such as their name, date of birth, and gender), one to store address information, and one to store telephone information.

The object structure can have several objects, but only one primary object (called the parent object). This object is the parent to all other objects defined in the Object Definition. The object structure can have multiple child objects or no child objects at all.

Generally, a record in the master index application has information in one parent object and multiple child objects. A record can also have multiple instances of each child object. For example, in the person index example above, a record for a single person would have one name, one date of birth, and one gender, all three stored in the parent object. However, the same record might have several different addresses, each of which is stored in a separate Address object.

Master Index Object Definition Fields

Each object in the object structure contains fields that store the data elements of the object. You can specify properties for each field, such as its name, length, data type, and formatting rules. The fields you define also determine the structure of the method OTD and the database tables, and certain field properties (including the name, length, and data type) determine how the corresponding database columns are defined.

Master Index Object Definition Relationships

In the Object Definition, you must specify the parent and child objects. The object structure must contain one parent object. All remaining objects defined in the structure must be specified as child objects to that parent object.
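
As a rough illustration of the parent and child structure described above, the following sketch outlines the business index mentioned earlier, with one parent object and two child objects. The object names (Company, Address, Contact), the field names, and the field sizes are hypothetical, and each object is reduced to a single field for brevity. The enclosing root element is not shown, and the element order is an assumption, so validate any real file against the schema.

```xml
<!-- Hypothetical business index: the application name matches the parent object -->
<name>Company</name>
<database>oracle</database>
<dateformat>MM/dd/yyyy</dateformat>
<!-- Parent object: general information about the business -->
<nodes>
   <tag>Company</tag>
   <fields>
      <field-name>CompanyName</field-name>
      <field-type>string</field-type>
      <size>60</size>
      <updateable>true</updateable>
      <required>true</required>
      <key-type>false</key-type>
   </fields>
</nodes>
<!-- Child object: a record can hold several addresses -->
<nodes>
   <tag>Address</tag>
   <fields>
      <field-name>City</field-name>
      <field-type>string</field-type>
      <size>30</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
</nodes>
<!-- Child object: a record can hold several contacts -->
<nodes>
   <tag>Contact</tag>
   <fields>
      <field-name>ContactName</field-name>
      <field-type>string</field-type>
      <size>40</size>
      <updateable>true</updateable>
      <required>false</required>
      <key-type>false</key-type>
   </fields>
</nodes>
<!-- Exactly one parent; every other object is listed as a child -->
<relationships>
   <name>Company</name>
   <children>Address</children>
   <children>Contact</children>
</relationships>
```

Every object named in a tag element appears exactly once in the relationships element, either as the parent or as one of the children.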

The Master Index Object Definition File (Repository)

The object structure is defined in the Object Definition file in XML format. The information entered into the default configuration file is based on the objects and fields you defined in the wizard. If you defined the object structure completely in the wizard, this file should not require customization.

The following topics provide information about working with the Object Definition file:

Modifying the Master Index Object Definition

When you use the wizard to define the object structure, all the configuration files for the master index application are automatically generated based on the information you provide. You can modify the Object Definition file at any time prior to deploying the associated project, but you must regenerate the application and redeploy the project after doing so. Changes made to the file do not take effect until the project is regenerated.

If you modify the object structure using the configuration editor, the remaining configuration files are updated accordingly to keep them synchronized. If you modify the file directly, you must also update the remaining configuration files yourself. For example, if you directly delete a field that also appears on the Enterprise Data Manager (EDM), appears in the single best record (SBR), and is defined for standardization and matching, you must remove the field from the Enterprise Data Manager file, the Best Record file, and the Match Field file.

The possible modifications to this file are restricted by the schema definition, so be sure to validate the file after making any changes.

Object Definition File Description

Table 1 lists each element in the Object Definition file and provides a description of each element along with any requirements or constraints for each element.

Table 1 Object Definition File Structure

Element/Attribute 

Description 

name

The name of the master index application. This name must match the name of the parent object. 

database

The database platform used by the master index application. Currently, the only values allowed are Oracle and SQL Server.

dateformat

The date format to use for the master index application. Three formats are allowed for the date: MM/dd/yyyy, yyyy/MM/dd, and dd/MM/yyyy. 

nodes

The configuration information for an object. There can be multiple nodes elements, each defining one parent or child object in the object structure. Each nodes element also defines the fields contained in each object along with the field attributes. The object structure must include one parent object and can include several child objects or no child objects.

tag

The name of the parent or child object defined by the nodes element.


Note –

Due to database naming constraints, the combined length of the parent object name and any one child object name must not exceed 21 characters.


fields

The configuration information for a field. There can be multiple fields elements.

field-name

The name of the field. Follow these guidelines when naming fields. 

  • The name cannot be longer than 30 characters.

  • The name cannot be objectId, where object is the name of an object in the data structure. For example, you cannot create a field named AddressId if there is an Address object in the structure.

  • Field names must conform to Oracle or SQL Server naming conventions, must conform to Java naming standards, and cannot contain XML reserved characters.

field-type

The data type for each field. Possible data types are: 

  • string - Fields of this type contain a string of characters.

  • date - Fields of this type contain a date value.

  • float - Fields of this type contain a floating-point number.

  • int - Fields of this type contain an integer.

  • byte - Fields of this type contain a single character.

  • boolean - Fields of this type can contain either true or false.

size

The number of characters allowed in each field. If you modify this value, be sure to modify the corresponding database column accordingly. 

updateable

An indicator of whether the field can be updated using the EDM or from back-end messages. Specify true if the field can be updated, or specify false if it cannot.

required

An indicator of whether the field is required in order to save an enterprise object in the database. Specify true if the field is required, or specify false if it is not.

code-module

The identification code for the menu list that appears for this field in the EDM. This must match a value in the code column of the sbyn_common_header database table. This element is optional. 

maximum-value

The maximum value allowed in the field. To specify a value for a date field, use the format YYYY-MM-DD. This element is optional. 

minimum-value

The minimum value allowed in the field. To specify a value for a date field, use the format YYYY-MM-DD. This element is optional. 

pattern

The required pattern for the field, expressed as a Java regular expression. For more information about possible values and using Java patterns, see the Pattern class in the java.util.regex package in the Javadocs provided with J2SE Platform. This element is optional.

user-code

The processing code for the drop-down list that appears on the MIDM for the fields defined by the constraint-by property, described below. These codes are used for non-unique IDs, such as account numbers, insurance policies, credit cards, and so on.


Note –

This must match an entry in the code_list column of the sbyn_user_code database table.


constraint-by

The name of the field that contains the corresponding user-code value (described above) to use to validate the current field. The user-code and constraint-by properties are used together to define non-unique ID types, such as credit card numbers or account numbers, and they serve two purposes. The first is to define a drop-down list for the field that contains the user code value. The second is to validate the field that contains the constraint value against the definitions for the field with the user code value.

For example, if you store non-unique IDs such as credit card numbers or insurance policy numbers, you could create a field named ID Type with a user-code of CREDCARD (CREDCARD also needs to be defined as a code in the sbyn_user_code table). This gives the ID Type field a drop-down list based on the definitions for CREDCARD in the sbyn_user_code table. Definitions would be VISA, MASTERCARD, AMEX, and so on. You could then create a field named ID that would be constrained by the formats defined for the ID Type field. Any credit card numbers you enter would be validated against the format defined for the type of credit card you selected in ID Type.

key-type

An indicator of whether the field is used to identify unique objects. Specify true if the field is a unique object identifier; specify false if it is not. This element is optional.


Note –

It is recommended, but not required, that each child object contain at least one field that is a unique object identifier. If two or more fields are unique identifiers, the combined value of these fields must be unique in a given enterprise record.


relationships

The configuration information for the hierarchy of the objects you defined in the nodes elements. Only one object can be the parent object; the remaining objects must be defined as children. The relationship definition allows a record to contain multiple instances of each child object. For example, if you define Address and Telephone child objects, the record can contain multiple addresses and telephone numbers.

name

The name of the parent object, as defined in the nodes elements.

children

The name of a child object, as defined in the nodes elements. You can define multiple children elements.
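
To make the optional range elements in Table 1 more concrete, the following sketch bounds a date field with both minimum-value and maximum-value, using the YYYY-MM-DD format the table requires for date fields. The upper bound is an arbitrary illustrative value, and the position of maximum-value relative to the other elements is an assumption, so validate the file against the schema after making a change like this.

```xml
<!-- Hypothetical date field limited to a range; both bounds use YYYY-MM-DD -->
<fields>
   <field-name>DOB</field-name>
   <field-type>date</field-type>
   <updateable>true</updateable>
   <required>true</required>
   <minimum-value>1900-01-01</minimum-value>
   <!-- Illustrative upper bound only -->
   <maximum-value>2099-12-31</maximum-value>
   <key-type>false</key-type>
</fields>
```

Note that the range values use YYYY-MM-DD regardless of the display format chosen in the dateformat element.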

Object Definition File Example

Following is a short sample illustrating the elements in the Object Definition file. The DOB field shows usage of the minimum-value element, the SSN field shows usage of the pattern element, and the AddressType field illustrates the code-module element. The AddressType field also has key-type set to true, meaning that each record can contain only one address of each address type.


   <name>Person</name>
   <database>oracle</database>
   <dateformat>MM/dd/yyyy</dateformat>
   <nodes>
      <tag>Person</tag>
      <fields>
         <field-name>LastName</field-name>
         <field-type>string</field-type>
         <size>40</size>
         <updateable>true</updateable>
         <required>true</required>
         <key-type>false</key-type>
      </fields>
      <fields>
         <field-name>FirstName</field-name>
         <field-type>string</field-type>
         <size>40</size>
         <updateable>true</updateable>
         <required>true</required>
         <key-type>false</key-type>
      </fields>
      <fields>
         <field-name>DOB</field-name>
         <field-type>date</field-type>
         <updateable>true</updateable>
         <required>true</required>
         <minimum-value>1900-01-01</minimum-value>
         <key-type>false</key-type>
      </fields>
      <fields>
         <field-name>SSN</field-name>
         <field-type>string</field-type>
         <size>16</size>
         <updateable>true</updateable>
         <required>false</required>
         <pattern>[0-9]{9}</pattern>
         <key-type>false</key-type>
      </fields>
   </nodes>
   <nodes>
      <tag>Address</tag>
      <fields>
         <field-name>AddressType</field-name>
         <field-type>string</field-type>
         <size>8</size>
         <updateable>true</updateable>
         <required>true</required>
         <code-module>ADDRTYPE</code-module>
         <key-type>true</key-type>
      </fields>
      ...
   </nodes>
   <nodes>
      <tag>Phone</tag>
      ...
   </nodes>
   <relationships>
      <name>Person</name>
      <children>Address</children>
      <children>Phone</children>
   </relationships>
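
The sample above does not use the user-code and constraint-by elements. A minimal sketch of the credit card scenario described in Table 1 might look like the following. The Account object, the field names IDType and ID (written without spaces to satisfy the field naming guidelines), and the field sizes are hypothetical. The element order is also an assumption, and CREDCARD would have to exist as a code in the sbyn_user_code table.

```xml
<!-- Hypothetical child object holding non-unique IDs -->
<nodes>
   <tag>Account</tag>
   <!-- Drop-down list populated from the CREDCARD entries in sbyn_user_code -->
   <fields>
      <field-name>IDType</field-name>
      <field-type>string</field-type>
      <size>8</size>
      <updateable>true</updateable>
      <required>true</required>
      <user-code>CREDCARD</user-code>
      <key-type>true</key-type>
   </fields>
   <!-- Value validated against the format defined for the selected IDType -->
   <fields>
      <field-name>ID</field-name>
      <field-type>string</field-type>
      <size>20</size>
      <updateable>true</updateable>
      <required>true</required>
      <constraint-by>IDType</constraint-by>
      <key-type>true</key-type>
   </fields>
</nodes>
```

In this sketch both fields have key-type set to true, so the combination of ID type and ID value would need to be unique within a record. To use the object, it would also need its own children entry in the relationships element.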