Oracle9iAS Personalization Administrator's Guide Release 9.0.1 Part Number A87539-02
Oracle9iAS Personalization (OP) uses several database schemas as follows:
All OP schemas reside on the system where Oracle9i is installed.
To see a small example of the MOR, MTR, and RE schemas that are correctly populated, install the REAPI Demo, as described in Chapter 5, and examine the tables there. Alternatively, you can install an unpopulated MTR when you install OP. You can examine the schema of the unpopulated MTR and populate it with your own data.
Note that OP uses a fixed schema for the MTR. By "fixed," we mean that the MTR must be populated with tables matching OP table and column names.
Before you can obtain recommendations, you must create and deploy a package. You cannot create a package until there is some data available in the MTR. There are two ways to populate an MTR:
The OP MTR consists of several tables and views. They are listed below.
Certain tables that make up the OP MTR must be populated with data specific to your Web site in accordance with the MTR schema. Other tables, such as the tables associated with sessions and recommendations, are automatically populated by OP.
The rest of this section describes the schemas for the MTR tables; tables that you must populate are described in detail.
The item table contains a list of all the individual items that a Web site deals with. When OP returns a recommendation, it returns the ID and type of the item; the item table provides more information. The item table is usually mapped to the catalog tables in the site database. The schema for MTR_ITEM has four fields; they are listed below, in order, with their data types.
The model building algorithms in Oracle9iAS Personalization require discrete data. All numerical data must be discretized or binned before the data is used to build a model. In OP, discretization is performed in a transformation step before model build. The value ranges for discretization (the bin boundaries) must be specified in order for OP to perform discretization.
In release 9.0.1 of OP, the bin boundaries must be explicitly specified in the bin boundaries table.
Discretization is performed by joining the input data and the bin boundaries table.
Categorical data with a large number of distinct values should also be binned. If you bin categorical data, bin boundaries must be specified just as they are for numerical data.
In summary, OP requires all numerical data to be binned, and high-cardinality categorical data should also be binned.
When you create bins of numeric values, specify the bounds for each bin (upper and lower values); when you create bins of categorical data, specify the items in each bin. To map several values to the same bin, use several records with the same bin number.
The table MTR_BIN_BOUNDARY has seven fields; they are listed below, in order, with their data types:
The following examples illustrate how to specify bin boundaries.
Consider movie rating data on a scale of 1 to 5. Suppose that you want to discretize ratings so that ratings of 1 and 2 fall in bin 1, a rating of 3 falls in bin 2, and ratings of 4 and 5 fall in bin 3.
You should insert the following entries into the bin boundaries table:
(3, 'MOVIE', 'VALUE', 1, 2.1, NULL, 1)
(3, 'MOVIE', 'VALUE', 3, 3.1, NULL, 2)
(3, 'MOVIE', 'VALUE', 4, 5.1, NULL, 3)
The range of the bin includes all values that are greater than or equal to the lower value and strictly less than the upper value. The data source type for rating is 3 and string value is set to NULL for numeric data.
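The join described above can be sketched in Python with an in-memory SQLite database. This is only an illustration, not OP's actual transformation step: the table name bin_boundary, its column names, and the rating input table are assumptions made for this example; only the tuple values and the range rule (lower bound inclusive, upper bound exclusive) come from the text.

```python
import sqlite3

# Hypothetical table and column names; the field order follows the example tuples.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE bin_boundary (
           data_source_type INTEGER,
           item_type        TEXT,
           attribute_name   TEXT,
           lower_value      REAL,
           upper_value      REAL,
           string_value     TEXT,
           bin_number       INTEGER)"""
)
conn.executemany(
    "INSERT INTO bin_boundary VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (3, "MOVIE", "VALUE", 1, 2.1, None, 1),
        (3, "MOVIE", "VALUE", 3, 3.1, None, 2),
        (3, "MOVIE", "VALUE", 4, 5.1, None, 3),
    ],
)

# Hypothetical input data: one movie rating per customer.
conn.execute("CREATE TABLE rating (customer_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO rating VALUES (?, ?)",
    [(101, 1), (102, 2), (103, 3), (104, 4), (105, 5)],
)

# Join each rating to the bin whose range contains it:
# lower_value <= value < upper_value.
binned = conn.execute(
    """SELECT r.customer_id, r.value, b.bin_number
         FROM rating r
         JOIN bin_boundary b
           ON b.attribute_name = 'VALUE'
          AND r.value >= b.lower_value
          AND r.value <  b.upper_value
        ORDER BY r.customer_id"""
).fetchall()

for customer_id, value, bin_number in binned:
    print(customer_id, value, bin_number)
```

Because this sketch uses an inner join, a value that falls in none of the ranges (for example 2.5 with the boundaries above) produces no output row.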
The following bin boundary table entries discretize marital status, a categorical attribute:
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Single', 1)
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Divorced', 2)
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Separated', 2)
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Married', 3)
(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Widowed', 4)
The data source type is 1 and the item type is NONE for demographic data. Lower value and upper value are NULL for categorical data.
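To show how several records with the same bin number map several values to one bin, here is a minimal Python sketch. The helper names build_category_bins and bin_for are hypothetical and exist only for this illustration; the tuples are the marital status entries from the example, read in the same field order.

```python
# Entries from the marital status example; None stands in for NULL.
entries = [
    (1, "NONE", "MARITAL_STATUS", None, None, "Single", 1),
    (1, "NONE", "MARITAL_STATUS", None, None, "Divorced", 2),
    (1, "NONE", "MARITAL_STATUS", None, None, "Separated", 2),
    (1, "NONE", "MARITAL_STATUS", None, None, "Married", 3),
    (1, "NONE", "MARITAL_STATUS", None, None, "Widowed", 4),
]


def build_category_bins(rows, attribute):
    """Map each string value (field 6) to its bin number (field 7) for one attribute."""
    return {row[5]: row[6] for row in rows if row[2] == attribute}


marital_bins = build_category_bins(entries, "MARITAL_STATUS")


def bin_for(value):
    """Return the bin number for a value, or None if no record lists it."""
    return marital_bins.get(value)


# 'Divorced' and 'Separated' share bin 2 because both records carry that number.
print(bin_for("Divorced"), bin_for("Separated"))
```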
In OP, a taxonomy refers to the structural organization of items in a company's catalog or site. Typically the catalog and/or the site has a hierarchical structure like a tree or collection of trees (branching from broader groups at the top all the way down to individual items at the leaves).
Items can belong to more than one category and to different levels of the structure. The structure of the OP taxonomy is a DAG (directed acyclic graph), which can contain multiple top-level nodes. Different portions of the taxonomy can also be disconnected from one another. Any node can connect to any other node, but there cannot be a path that connects a node's child back to the node itself.
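If you build a taxonomy from your own catalog data, it can help to confirm that the parent-child relationships contain no cycle before loading them. The following Python sketch is one way to do that, using Kahn's algorithm on a list of (parent, child) pairs. The function name is_acyclic and the sample category and item names are assumptions for illustration; they are not part of OP.

```python
from collections import defaultdict


def is_acyclic(edges):
    """Return True if the (parent, child) pairs form a directed acyclic graph."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    # Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # Any node that was never removed lies on, or below, a cycle.
    return removed == len(nodes)


# Hypothetical taxonomy: two top-level nodes share a child, and a third
# top-level node forms a disconnected portion.
taxonomy = [
    ("Books", "Fiction"),
    ("Gifts", "Fiction"),
    ("Fiction", "item-42"),
    ("Music", "item-7"),
]
print(is_acyclic(taxonomy))

# Adding a path from a descendant back to its ancestor makes the structure invalid.
print(is_acyclic(taxonomy + [("item-42", "Books")]))
```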
OP also supports multiple taxonomies (different ways of organizing the items).
The taxonomy is implemented using a group of tables (they are all specified by the customer at installation time):
TAXONOMY_ID  NUMBER  PK
The REAPI Demo includes a taxonomy; you can examine the demo MTR to see examples of all of these tables.
The MTR_CUSTOMER table contains demographic information about the customer. Some customer attributes are common to all OP applications and some can be tailored to your application. The common attributes are customer ID, name, creation date, gender, age, marital status, personal income, whether or not the customer is the head of household, household income, household size, and whether the customer rents or owns.
You can specify up to 50 attributes specific to your Web application. These variable attributes are all strings.
The schema of the MTR_CUSTOMER table has the following fields. They are listed below, in order, with their data types.
Hot picks are used by some Web sites; for example, the daily specials might be hot picks. Information about hot picks is stored in two MTR tables, as follows:
A hot pick group can also contain categories. In this case, the item type is set to CATEGORY and the item ID is set to the appropriate ID value in the MTR_CATEGORY table.
Several tables in the MTR store the details of various activities.
The following tables are used internally by OP:
The RE schema stores current session data. The data is synchronized back to the MTR automatically. The RE includes the following tables (partial list):
HOT_PICKGROUP and HOTPICK are copies of the corresponding tables in the MTR.
RE_CURRENT_SESSION_DATA holds all the data collected using the data collection methods. This data is written back to the MTR using data synchronization.
RE_PROFILE_DATA stores the historical profiles of active users. When a user is detected, the profile of that user is loaded from the MTR to this table.
RE_RECOMMENDATION_DETAIL is the source of data for the corresponding table in the MTR. The data is synchronized back to the MTR.
ATTR_ID_BIN_BOUNDARY is a copy of the corresponding table in the MTR.
RE_CONFIGURATION and RE_INTERNAL_CONFIGURATION store the configuration parameters for the RE.
RE_DEPLOYABLE_PACKAGE keeps track of the deployable package that is currently deployed in the RE.
RE_LOG records events occurring in the RE.
RE_ACTIVE_USER stores information about all users who are currently active in the system. Data from this table is used to populate the session table in the MTR.
All other tables are used internally by the RE.
Much of the work that is done by OP uses MOR tables and views. The MOR includes the following tables (partial list):
Copyright © 2001 Oracle Corporation. All Rights Reserved.