8
OP Schemas

Oracle9iAS Personalization (OP) uses several database schemas as follows:

Mining Object Repository (MOR), with tables containing mining objects (deployable packages, reports, schedule items, etc., as well as package build results such as rule tables, etc.)
Mining Table Repository (MTR), holding mining tables (bin boundaries, customer information, hot picks, taxonomy, etc.)
Recommendation Engine (RE) schema, with tables supporting current session information and model rules
Site database holding demographic information

All OP schemas reside on the system where Oracle9i is installed.

To see a small example of the MOR, MTR, and RE schemas that are correctly populated, install the REAPI Demo, as described in Chapter 5, and examine the tables there. Alternatively, you can install an unpopulated MTR when you install OP. You can examine the schema of the unpopulated MTR and populate it with your own data.

Note that OP uses a fixed schema for the MTR. By "fixed," we mean that the MTR must be populated with tables matching OP table and column names.

Before you can obtain recommendations, you must create and deploy a package. You cannot create a package until there is some data available in the MTR. There are two ways to populate an MTR:

Collect data using the REProxyRT method addItem or addItems.
Populate with existing data, i.e., convert from existing data that was collected by your Web application and stored in an Oracle database.

Mining Table Repository

The OP MTR consists of several tables and views. They are listed below.

`MTR_ATTR_ID_BIN_BOUNDARY`	`TABLE`
`MTR_ATTR_NAME_TO_ID_MAP`	`TABLE`
`MTR_BIN_BOUNDARY`	`TABLE`
`MTR_CATEGORY`	`TABLE`
`MTR_CONFIGURATION`	`TABLE`
`MTR_CUSTOMER`	`TABLE`
`MTR_CUSTOMER_NAV_DETAIL`	`TABLE`
`MTR_CUSTOMER_RATING_DETAIL`	`TABLE`
`MTR_HOTPICK`	`TABLE`
`MTR_HOTPICK_GROUP`	`TABLE`
`MTR_INTERNAL_CONFIGURATION`	`TABLE`
`MTR_ITEM`	`TABLE`
`MTR_NAVIGATION_DETAIL`	`VIEW`
`MTR_PROFILE_DATA`	`VIEW`
`MTR_PROXY`	`TABLE`
`MTR_PURCHASING_DETAIL`	`TABLE`
`MTR_RATING_DETAIL`	`VIEW`
`MTR_RECOMMENDATION_DETAIL`	`TABLE`
`MTR_SCHEMA_VERSION`	`VIEW`
`MTR_SESSION`	`TABLE`
`MTR_TAXONOMY`	`TABLE`
`MTR_TAXONOMY_CATEGORY`	`TABLE`
`MTR_TAXONOMY_CATEGORY_ITEM`	`TABLE`
`MTR_VISITOR_NAV_DETAIL`	`TABLE`
`MTR_VISITOR_RATING_DETAIL`	`TABLE`

Certain tables that make up the OP MTR must be populated with data specific to your Web site in accordance with the MTR schema. Other tables, such as the tables associated with sessions and recommendations, are automatically populated by OP.

The rest of this section describes the schemas for the MTR tables; tables that you must populate are described in detail.

Item Table

The item table contains a list of all the individual items that a Web site deals with. When OP returns a recommendation, it returns the ID and type of the item; the item table provides more information. The item table is usually mapped to the catalog tables in the site database. The schema for MTR_ITEM has four fields; they are listed below, in order, with their data types.

`ID`	`NUMBER PK`
`ITEM_TYPE`	`VARCHAR2(30) PK`
`LABEL`	`VARCHAR2(150)`
`DESCRIPTION`	`VARCHAR2(4000)`
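
The lookup described above can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for the Oracle MTR, the SQLite column types merely approximate the Oracle types, the two PK columns are assumed to form a composite key, and the sample rows and the `lookup_item` helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the MTR (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE MTR_ITEM (
        ID          INTEGER NOT NULL,
        ITEM_TYPE   TEXT    NOT NULL,
        LABEL       TEXT,
        DESCRIPTION TEXT,
        PRIMARY KEY (ID, ITEM_TYPE)
    )
    """
)

# Hypothetical catalog rows; the same ID may appear under different item types.
conn.executemany(
    "INSERT INTO MTR_ITEM VALUES (?, ?, ?, ?)",
    [
        (101, "MOVIE", "Sample Movie A", "Hypothetical catalog entry"),
        (101, "BOOK", "Sample Book A", "Same ID, different item type"),
    ],
)


def lookup_item(conn, item_id, item_type):
    """Return (LABEL, DESCRIPTION) for a recommended item, or None if absent."""
    return conn.execute(
        "SELECT LABEL, DESCRIPTION FROM MTR_ITEM WHERE ID = ? AND ITEM_TYPE = ?",
        (item_id, item_type),
    ).fetchone()


print(lookup_item(conn, 101, "MOVIE"))
```

Because a recommendation carries both the ID and the item type, the lookup uses both columns; the ID alone would not distinguish the two sample rows.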

Bin Boundaries

The model building algorithms in Oracle9iAS Personalization require discrete data. All numerical data must be discretized or binned before the data is used to build a model. In OP, discretization is performed in a transformation step before model build. The value ranges for discretization (the bin boundaries) must be specified in order for OP to perform discretization.

In release 9.0.1 of OP, the bin boundaries must be explicitly specified in the bin boundaries table.

Discretization is performed by joining the input data and the bin boundaries table.

Categorical data with a large number of distinct values should also be binned. If you bin categorical data, bin boundaries must be specified, just as for numerical data.

In summary, OP requires all numerical data to be binned, and high cardinality categories should also be binned.

When you create bins of numeric values, specify the bounds for each bin (upper and lower values); when you create bins of categorical data, specify the items in each bin. To map several values to the same bin, use several records with the same bin number.

The table MTR_BIN_BOUNDARY has seven fields; they are listed below, in order, with their data types:

`DATA_SOURCE_TYPE`	`NUMBER(3)`
`ITEM_TYPE`	`VARCHAR2(30)`
`ATTRIBUTE_NAME`	`VARCHAR2(30)`
`LOWER_VALUE`	`NUMBER`
`UPPER_VALUE`	`NUMBER`
`STRING_VALUE`	`VARCHAR2(60)`
`BIN_NUMBER`	`NUMBER(15)`

Examples of Specifying Bin Boundaries

The following examples illustrate how to specify bin boundaries.

Consider movie rating data on a scale of 1 to 5. Suppose that you want to discretize ratings as follows:

1 and 2 are in bin number 1
3 is in bin number 2
4 and 5 are in bin number 3

You should insert the following entries into the bin boundaries table:

(3, 'MOVIE', 'VALUE', 1, 2.1, NULL, 1),
(3, 'MOVIE', 'VALUE', 3, 3.1, NULL, 2),
(3, 'MOVIE', 'VALUE', 4, 5.1, NULL, 3).

The range of the bin includes all values that are greater than or equal to the lower value and strictly less than the upper value. The data source type for rating is 3 and string value is set to NULL for numeric data.
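
The join-based discretization can be sketched with the three rating entries above. This is a minimal illustration rather than OP's actual transformation step: it uses Python's `sqlite3` module in place of Oracle, and the `RATINGS` input table with its columns and rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MTR (assumption)
conn.execute(
    """
    CREATE TABLE MTR_BIN_BOUNDARY (
        DATA_SOURCE_TYPE INTEGER,
        ITEM_TYPE        TEXT,
        ATTRIBUTE_NAME   TEXT,
        LOWER_VALUE      REAL,
        UPPER_VALUE      REAL,
        STRING_VALUE     TEXT,
        BIN_NUMBER       INTEGER
    )
    """
)
# The three bin boundary entries for movie ratings from the text.
conn.executemany(
    "INSERT INTO MTR_BIN_BOUNDARY VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (3, "MOVIE", "VALUE", 1, 2.1, None, 1),
        (3, "MOVIE", "VALUE", 3, 3.1, None, 2),
        (3, "MOVIE", "VALUE", 4, 5.1, None, 3),
    ],
)

# Hypothetical input data: one rating per item.
conn.execute("CREATE TABLE RATINGS (ITEM_ID INTEGER, RATING REAL)")
conn.executemany(
    "INSERT INTO RATINGS VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 3), (13, 4), (14, 5)],
)

# A value falls in a bin when LOWER_VALUE <= value < UPPER_VALUE.
binned = conn.execute(
    """
    SELECT r.ITEM_ID, b.BIN_NUMBER
    FROM RATINGS r
    JOIN MTR_BIN_BOUNDARY b
      ON r.RATING >= b.LOWER_VALUE AND r.RATING < b.UPPER_VALUE
    WHERE b.DATA_SOURCE_TYPE = 3
      AND b.ITEM_TYPE = 'MOVIE'
      AND b.ATTRIBUTE_NAME = 'VALUE'
    ORDER BY r.ITEM_ID
    """
).fetchall()

print(binned)
```

In this sketch, a value that falls between two ranges (for example, a rating of 2.5) matches no row and is dropped by the inner join, so the ranges should cover every value that can occur in the data.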

The following bin boundary table entries discretize marital status, a categorical attribute:

(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Single', 1),
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Divorced', 2),
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Separated', 2),
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Married', 3),
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Widowed', 4)

The data source type is 1 and the item type is NONE for demographic data. Lower value and upper value are NULL for categorical data.
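
The categorical entries above amount to a mapping from string value to bin number. The following sketch builds that mapping in plain Python; the `build_category_bins` helper is hypothetical and is not part of OP.

```python
# Rows in MTR_BIN_BOUNDARY column order:
# (DATA_SOURCE_TYPE, ITEM_TYPE, ATTRIBUTE_NAME,
#  LOWER_VALUE, UPPER_VALUE, STRING_VALUE, BIN_NUMBER)
BOUNDARIES = [
    (1, "NONE", "MARITAL_STATUS", None, None, "Single", 1),
    (1, "NONE", "MARITAL_STATUS", None, None, "Divorced", 2),
    (1, "NONE", "MARITAL_STATUS", None, None, "Separated", 2),
    (1, "NONE", "MARITAL_STATUS", None, None, "Married", 3),
    (1, "NONE", "MARITAL_STATUS", None, None, "Widowed", 4),
]


def build_category_bins(rows, data_source_type, item_type, attribute_name):
    """Map each STRING_VALUE of one categorical attribute to its BIN_NUMBER."""
    bins = {}
    for src, itype, attr, _lower, _upper, string_value, bin_number in rows:
        matches = (src, itype, attr) == (data_source_type, item_type, attribute_name)
        if matches and string_value is not None:
            bins[string_value] = bin_number
    return bins


marital_bins = build_category_bins(BOUNDARIES, 1, "NONE", "MARITAL_STATUS")
print(marital_bins)
```

"Divorced" and "Separated" share bin 2 because their two records carry the same bin number.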

Taxonomy

In OP, a taxonomy refers to the structural organization of items in a company's catalog or site. Typically the catalog and/or the site has a hierarchical structure like a tree or collection of trees (branching from broader groups at the top all the way down to individual items at the leaves).

Items can belong to more than one category and to different levels of the structure. The structure of the OP taxonomy is a DAG (directed acyclic graph), which can contain multiple top-level nodes. The different portions of the taxonomy can also be disconnected. Any node can connect to any other node, but there cannot be a path that leads from a node's child back to the node itself.

OP also supports multiple taxonomies (different ways of organizing the items).
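
Before loading a taxonomy, it can be useful to confirm that it satisfies the acyclicity constraint described above. The following sketch is one way to do so, assuming the taxonomy is available as (parent, child) pairs such as the PARENT_ID and CHILD_ID values of MTR_TAXONOMY_CATEGORY; the `has_cycle` helper and the sample edges are hypothetical.

```python
def has_cycle(edges):
    """Return True if the (parent, child) edges contain a directed cycle."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])

    # Depth-first search with three states per node.
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in children}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in children[node]:
            if state[nxt] == IN_PROGRESS:
                return True  # reached a node that is still on the current path
            if state[nxt] == UNVISITED and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNVISITED and visit(node) for node in children)


# Two top-level nodes (1 and 5), a disconnected portion (5 -> 6),
# and a node (4) that has two parents: all allowed in a DAG.
valid_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (5, 6)]
print(has_cycle(valid_edges))

# Adding an edge from 4 back to 1 creates a path from a descendant to its ancestor.
print(has_cycle(valid_edges + [(4, 1)]))
```

The recursive search is adequate for small taxonomies; a very deep hierarchy would call for an iterative traversal to stay within Python's recursion limit.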

The taxonomy is implemented using a group of tables (they are all specified by the customer at installation time):

MTR_TAXONOMY: Lists the different taxonomies used by the site. The schema for this table has three fields; they are listed below, in order, with their data types:

`ID`	`NUMBER PK`
`NAME`	`VARCHAR2(150)`
`DESCRIPTION`	`VARCHAR2(4000)`

MTR_TAXONOMY_CATEGORY: Specifies which categories belong to the different taxonomies. (A category can belong to multiple taxonomies; however, for a given taxonomy, there can be only one instance of any category.) The schema for this table has four fields; they are listed below, in order, with their data types:

`TAXONOMY_LEVEL`	`NUMBER PK`
`TAXONOMY_ID`	`NUMBER PK`
`PARENT_ID`	`NUMBER PK`
`CHILD_ID`	`NUMBER PK`

MTR_TAXONOMY_CATEGORY_ITEM: Specifies which items go with a given taxonomy, category pair. The schema for this table has four fields; they are listed below, in order, with their data types:

`CATEGORY_ID`	`NUMBER PK`
`TAXONOMY_ID`	`NUMBER PK`
`ITEM_ID`	`NUMBER PK`
`ITEM_TYPE`	`VARCHAR2(30) PK`

MTR_CATEGORY: Specifies the different categories used by the site. The schema for this table has three fields; they are listed below, in order, with their data types.

`ID`	`NUMBER PK`
`NAME`	`VARCHAR2(150)`
`DESCRIPTION`	`VARCHAR2(4000)`

Samples of the MTR Taxonomy Tables

The REAPI Demo includes a taxonomy; you can examine the demo MTR to see examples of all of these tables.

Customer Table

The MTR_CUSTOMER table contains demographic information about the customer. Some customer attributes are common to all OP applications and some can be tailored to your application. The common attributes are customer ID, name, creation date, gender, age, marital status, personal income, whether or not the customer is the head of household, household income, household size, and whether the customer rents or owns.

You can specify up to 50 attributes specific to your Web application. These variable attributes are all strings.

The schema of the MTR_CUSTOMER table has the following fields. They are listed below, in order, with their data types.

`ID`	`VARCHAR2(32)`
`NAME`	`VARCHAR2(80)`
`CREATION_DATE`	`DATE`
`GENDER`	`VARCHAR2(10)`
`AGE`	`NUMBER(3)`
`MARITAL_STATUS`	`VARCHAR2(20)`
`PERSONAL_INCOME`	`NUMBER`
`IS_HEAD_OF_HOUSEHOLD`	`CHAR(1)`
`HOUSEHOLD_INCOME`	`NUMBER`
`HOUSEHOLD_SIZE`	`NUMBER(2)`
`RENT_OWN_INDICATOR`	`VARCHAR2(30)`
`ATTRIBUTE1`	`VARCHAR2(150)`
`ATTRIBUTE2`	`VARCHAR2(150)`
`ATTRIBUTE3`	`VARCHAR2(150)`
`ATTRIBUTE4`	`VARCHAR2(150)`
`...`
`ATTRIBUTE49`	`VARCHAR2(150)`
`ATTRIBUTE50`	`VARCHAR2(150)`

Hot Picks

Hot picks are used by some Web sites; for example, the daily specials might be hot picks. Information about hot picks is stored in two MTR tables, as follows:

MTR_HOTPICK_GROUP lists the distinct hot picks groups used by the site. There is one record for each group. Each record contains a group ID, the group name (label), and a brief description of the group. The schema for this table has three fields; they are listed below, in order, with their data types.

`ID`	`NUMBER PK`
`LABEL`	`VARCHAR2(150)`
`DESCRIPTION`	`VARCHAR2(400)`

MTR_HOTPICK lists the items in each hot picks group, arranged according to group ID. Each record consists of a group ID, an item ID, and an item type. The schema for this table has three fields; they are listed below, in order, with their data types.

`ITEM_ID`	`NUMBER`
`ITEM_TYPE`	`VARCHAR2(30)`
`GROUP_ID`	`NUMBER`

A hot pick group can also contain categories. In this case, the item type is set to CATEGORY and the item ID is set to the appropriate ID value in the MTR_CATEGORY table.
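
The following sketch shows how the members of a hot pick group might be listed, with CATEGORY entries resolved against MTR_CATEGORY. It is only an illustration: Python's `sqlite3` module stands in for Oracle, the column types are approximations, and every sample row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MTR (assumption)
conn.executescript(
    """
    CREATE TABLE MTR_HOTPICK_GROUP (ID INTEGER PRIMARY KEY, LABEL TEXT, DESCRIPTION TEXT);
    CREATE TABLE MTR_HOTPICK (ITEM_ID INTEGER, ITEM_TYPE TEXT, GROUP_ID INTEGER);
    CREATE TABLE MTR_CATEGORY (ID INTEGER PRIMARY KEY, NAME TEXT, DESCRIPTION TEXT);

    -- Hypothetical sample data: one group holding one item and one category.
    INSERT INTO MTR_HOTPICK_GROUP VALUES (1, 'Daily specials', 'Hypothetical group');
    INSERT INTO MTR_CATEGORY VALUES (7, 'Comedies', 'Hypothetical category');
    INSERT INTO MTR_HOTPICK VALUES (101, 'MOVIE', 1);
    INSERT INTO MTR_HOTPICK VALUES (7, 'CATEGORY', 1);
    """
)

# For CATEGORY rows, ITEM_ID refers to MTR_CATEGORY.ID; other rows get NULL.
members = conn.execute(
    """
    SELECT h.ITEM_ID, h.ITEM_TYPE, c.NAME
    FROM MTR_HOTPICK h
    LEFT JOIN MTR_CATEGORY c
      ON h.ITEM_TYPE = 'CATEGORY' AND c.ID = h.ITEM_ID
    WHERE h.GROUP_ID = ?
    ORDER BY h.ITEM_ID
    """,
    (1,),
).fetchall()

print(members)
```

The join condition checks the item type first, so an ordinary item whose ID happens to equal a category ID is not mistaken for that category.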

Detail Tables

Several tables in the MTR store the details of various activities.

MTR_CUSTOMER_NAV_DETAIL stores the navigation data corresponding to a customer session. This table is populated with data collected in the RE.
MTR_CUSTOMER_RATING_DETAIL stores rating data for customers. This table is populated using the data collected in the RE during data collection.
MTR_PURCHASING_DETAIL stores purchasing data on a per-session basis. Typically this data is collected by the Web application.
MTR_RECOMMENDATION_DETAIL stores the results of recommendation requests. The data stored in this table is used to generate reports on the performance of OP.
MTR_VISITOR_NAV_DETAIL stores the navigation data corresponding to a visitor session. This table is populated with data collected in the RE.
MTR_VISITOR_RATING_DETAIL stores rating data for visitors. This table is populated using the data collected in the RE during data collection.

Miscellaneous MTR Tables

The following tables are used internally by OP:

MTR_ATTR_NAME_TO_ID_MAP is used to speed up package building.
MTR_CONFIGURATION and MTR_INTERNAL_CONFIGURATION store configuration information.
The MTR_SESSION table stores information about the session that OP creates internally on behalf of the application.
MTR_ATTR_ID_BIN_BOUNDARY is a materialized view of the join of the MTR_BIN_BOUNDARY table and the MTR_ATTR_NAME_TO_ID_MAP table. It is used when transforming data during package builds.
MTR_PROXY is used to set up proxies for new items. When a new item is introduced, there may not be enough detail information about it to build packages; OP uses data about a similar existing product.

Recommendation Engine

The RE schema stores current session data. The data is synchronized back to the MTR automatically. The RE includes the following tables (partial list):

`ATTR_ID_BIN_BOUNDARY`	`TABLE`
`HOTPICK`	`TABLE`
`HOTPICK_GROUP`	`TABLE`
`I_I_ANTECEDENT`	`TABLE`
`I_I_RULE`	`TABLE`
`P_I_CATEGORY_RULES`	`TABLE`
`P_I_ITEM_RULES`	`TABLE`
`RE_ACTIVE_USER`	`TABLE`
`RE_CONFIGURATION`	`TABLE`
`RE_CURRENT_SESSION_DATA`	`TABLE`
`RE_DEPLOYABLE_PACKAGE`	`TABLE`
`RE_DEPLOYABLE_PKG_CONTENTS`	`TABLE`
`RE_ERROR_TABLE`	`TABLE`
`RE_INTERNAL_CONFIGURATION`	`TABLE`
`RE_LOG`	`TABLE`
`RE_MESSAGE_LOG`	`TABLE`
`RE_PROFILE_DATA`	`TABLE`
`RE_RECOMMENDATION_DETAIL`	`TABLE`
`RE_SCHEMA_ACCESS`	`TABLE`
`TAXONOMY_CATEGORY`	`TABLE`
`TAXONOMY_CATEGORY_ITEM`	`TABLE`
`TAXONOMY_TRANS_CLOSURE`	`TABLE`

HOTPICK_GROUP and HOTPICK are copies of the corresponding tables in the MTR.

RE_CURRENT_SESSION_DATA holds all the data collected using the data collection methods. This data is written back to the MTR using data synchronization.

RE_PROFILE_DATA stores the historical profiles of active users. When a user is detected, the profile of that user is loaded from the MTR to this table.

RE_RECOMMENDATION_DETAIL is the source of data for the corresponding table in the MTR. The data is synchronized back to the MTR.

ATTR_ID_BIN_BOUNDARY is a copy of the corresponding table in the MTR.

RE_CONFIGURATION and RE_INTERNAL_CONFIGURATION store the configuration parameters for the RE.

RE_DEPLOYABLE_PACKAGE keeps track of the deployable package that is currently deployed in the RE.

RE_LOG records events occurring in the RE.

RE_ACTIVE_USER stores information about all users who are currently active in the system. Data from this table is used to populate the session table in the MTR.

All other tables are used internally by the RE.

Mining Object Repository

Much of the work that is done by OP uses MOR tables and views. The MOR includes the following tables (partial list):

`MOR_VISITOR_TO_BROWSER_REPORT`	`TABLE`
`MOR_CONFIGURATION`	`TABLE`
`MOR_CROSS_SOLD_ITEMS_REPORT`	`TABLE`
`MOR_DEPLOYABLE_PACKAGE`	`TABLE`
`MOR_EFFECTIVENESS_REPORT`	`TABLE`
`MOR_EMAIL_ADDRESS`	`TABLE`
`MOR_ERROR_TABLE`	`TABLE`
`MOR_INTERNAL_CONFIGURATION`	`TABLE`
`MOR_MESSAGE_LOG`	`TABLE`
`MOR_MINING_MODEL`	`TABLE`
`MOR_MINING_RESULT`	`TABLE`
`MOR_MTR_CONNECTION`	`TABLE`
`MOR_RECOMMENDATION_ENGINE`	`TABLE`
`MOR_RECOMMENDATION_REPORT`	`TABLE`
`MOR_RECOMMENDATION_STRATEGY`	`TABLE`
`MOR_RE_FARM`	`TABLE`
`MOR_SCHEDULE_EVENT`	`TABLE`
`MOR_SCHEDULE_ITEM`	`TABLE`
`MOR_SCHEMA_ACCESS`	`TABLE`
`MOR_TAXONOMY_TRANS_CLOSURE`	`TABLE`
`MOR_TRANS_SUPERVISED_RESULT`	`TABLE`

