Understanding the Waveset Repository (Oracle Waveset 8.1.1 System Administrator's Guide)

Oracle Waveset 8.1.1 System Administrator's Guide

Understanding the Waveset Repository

You must understand the Waveset repository database scripts and how Waveset uses that content before you can effectively tune the Waveset repository.

This section provides an overview of the database and the information is organized into the following topics:

Repository Table Types

The Waveset repository contains three types of tables, and each table has slightly different usage characteristics.

Attribute Tables. These tables enable you to query for predefined single-valued or multi-valued object attributes.

For most object types, stored attributes are hard-coded.

Note –
The User and Role object types are exceptions to this rule. The inline attributes that are stored in the object table for User and Role are configurable, so you can configure additional custom attributes as queryable.

When you search for objects based on attribute conditions, Waveset accesses attribute tables in joins with the corresponding object tables. Some form of join (such as a JOIN, an EXISTS predicate, or a SUB-SELECT) occurs for each attribute condition.

The number of rows in the attribute table are proportional to the number of rows in the corresponding object table. The values distribution might exhibit skew, where multi-valued attributes have a row per value and some objects might have more attributes or more attribute values than others. Typically, there is a many-to-one relation between rows in the attribute table and rows in the object table.

Attribute tables have ATTR in the table name.
Change Tables. Waveset uses a change table to track changes made to a corresponding object table. These table sizes are proportional to the rate of object change, but the tables are not expected to grow without bound. Waveset automatically truncates change tables.

Change tables can be highly volatile because the lifetime of a row is relatively short and new rows can be created frequently.

Access to a change table is typically performed by a range scan on the time-stamp field.

Change tables have CHANGE in the table name.
Object Tables. The Waveset repository uses object tables to hold serialized data objects, such as Large Objects (LOBs). Object tables can also hold commonly queried, single-valued object attributes.

For most object types, stored attributes are hard-coded.

Note –
The User and Role object types are exceptions to this rule. The inline attributes that are stored in the object table are configurable, and you can configure additional custom attributes as queryable for User and Role.

The number of rows in an object table equals the number of objects being stored. The number of objects stored in each object table depends on which object types are being stored in the table. Some object types are numerous, while other types are few.

Generally, Waveset accesses an object table by object ID or name, though Waveset can also access the table by using one of the attributes stored in the table. Object IDs and names are unique across a single object type, but attribute values are not unique or evenly distributed. Some attributes have many values, while other attributes have relatively few values. In addition, several object types can expose the same attribute. An attribute may have many values for one object type and few values for another object type. The uneven distribution of values might cause an uneven distribution of index pages, which is a condition known as skew.

Object tables are tables that do not have ATTR or CHANGE suffixes in the table name.

XML Columns

Every object table contains an XML column, which is used to store each serialized object except the LOG table-set. Certain LOG table-set optional attributes are stored in the XML column if these attributes are present. For example, if digital signing is enabled.

Data Classes

You can roughly divide Waveset data into a number of classes that exhibit similar properties with respect to access patterns, cardinality, lifetime, volatility, and so forth. Each of the following classes corresponds to a set of tables in the repository:

User Data. User data consists of user objects.

You can expect this data to grow quite large because there is an object for each managed identity. After an initial population phase, you can expect a proportionally small number of creates because the majority of operations will be updates to existing objects.

User objects are generally long-lived and they are removed at a relatively low rate.

User data is stored in USEROBJ, USERATTR, and USERCHANGE tables.
Role Data. Role data consists of Role objects, including Roles subtypes such as Business Roles, IT Roles, Applications, and Assets.

Role data is similar to organization data, and these objects are relatively static after a customer deploys Waveset.

Note –
An exception to the preceding statement is a deployment that is integrated with an external source containing an authoritative set of roles. One integration style might be to feed role changes into Waveset, which causes Waveset Role data to be more volatile.

Generally, the number of role objects is small when compared to the number of identity objects such as users (assuming that multiple users share each role), but this depends on how each enterprise defines its roles.

Role data is stored in ROLEOBJ, ROLEATTR, and ROLECHANGE tables.
Account Data. Account data solely consists of account objects in the Account Index.

As with user data, account data can become rather large, with an object for each known resource account. Account objects are generally long-lived, removed at a relatively low rate, and after initial population, are created infrequently. Unless you frequently add or remove native accounts, or enable native change detection, account object modifications occur infrequently.

Waveset stores account data in ACCOUNT, ACCTATTR, and ACCTCHANGE tables.
Compliance Violation Data. Compliance Violation data contains violation records that indicate when the evaluation of an Audit Policy failed. These violation records exist until the same Audit Policy is evaluated against the same User and the policy passes. Violation records are created, modified, or deleted as part of an Audit Policy Scan or as part of an Access Review.

The number of violation records is proportional to the number of Audit Policies that are used in scans and the number of Users. An installation with 5000 users and 10 Audit Policies might have 500 violation records (5000 x 10 x 0.01), where the 0.01 multiplier depends on how strict the policies are and how user accounts are changed.

Waveset stores Compliance Violation records in OBJECT, ATTRIBUTE, and OBJCHANGE tables.
Entitlement Data. Entitlement data predominately consists of user entitlement objects, which are only created if you are doing compliance access reviews.

Entitlement records are created in large batches, modified slowly (days) after initial creation, and are then untouched. These records are deleted after an Access Review is deleted.

Waveset stores entitlement data in ENTITLE, ENTATTR, and ENTCHANGE tables.
Organization Data. This data consists of object group or organization objects.

Object group data is similar to configuration data, and this data is relatively static after being deployed. Generally, the number of objects is small (one for each defined organization) when compared to task objects or to identity objects such as users or accounts, however, the number can become large compared to other configuration objects.

Organization data is stored in ORG, ORGATTR, and ORGCHANGE tables.
Task Data. Task data consists of objects that are related to tasks and workflows, including state and result data.

The data contained in these tables is short-lived compared to other classes because objects are created, modified, and deleted at a high rate. The volume of data in this table is proportional to the amount of activity on the system.

Task data is stored in TASK, TASKATTR, and TASKCHANGE tables.
Configuration Data. Configuration data consists of objects related to Waveset system configuration, such as forms, roles, and rules.

Generally, configuration data is:
- Relatively small compared to other classes
- Only expected to change during deployment and upgrade, and changes occur in large batches
- Not expected to change much after being deployed
Waveset stores configuration data in ATTRIBUTE, OBJCHANGE, and OBJECT tables.
Export Queue Data. If you enable Data Exporting, some records are queued inside Waveset until the export task writes those records to the Data Warehouse. The number of records that are queued is a function of Data Exporting configuration and the export interval for all queued types.

The following data types are queued by default, and all other data types are not:
- ResourceAccount
- WorkflowActivity
- TaskInstace
- WorkItem
The number of records in these tables grows until the export task drains the queue. The current table size is visible through a JMX Bean.

Records added to this table are never modified. These records are written during other Waveset activities, such as reconciliation, provisioning, and workflow execution. When the Data Exporter export task runs, the task drains the table.

Waveset stores Export Queue data records in QUEUE, QATTR, and QCHANGE tables.
Log Data. Log data consists of audit and error log objects. Log data is write-once only, so you can create new audit and error log objects, but you cannot modify these objects.

Log data is long-lived and can potentially become very large because you can only purge log data by explicit request. Access to log data frequently relies on attributes that are stored in the object table instead of in the attribute table. Both the distribution of attribute values and queries against the log specifically depend on how you are using Waveset.

For example, the distribution of attribute values in the log tables depends on the following:
- What kind of changes are made
- Which Waveset interface was used to make the changes
- Which types of objects were changed
The pattern of queries against the log table also depends on which Waveset reports, which custom reports, or which external data mining queries a customer runs against the log table.

Waveset stores audit log records in LOG and LOGATTR tables, and error log records in SYSLOG and SLOGATTR tables. This data does not have corresponding change tables.

Object IDs

Waveset generates globally unique identifiers (GUIDs) for objects by using the VMID class provided in the JDK software.

These GUID values exhibit a property that gets sorted by its string representations, based on the order in which the objects are created. For example, when you create new objects with Waveset, the newer objects have object IDs that are greater than the older objects. Consequently, when Waveset inserts new objects into the database, the index based on object IDs can encounter contention for the same block or blocks.

Prepared Statements

Generally, Waveset uses prepared statements for activities such as inserting and updating database rows, but does not use prepared statements for queries.

If you are using Oracle, this behavior can create issues with the library cache. In particular, the large number of statements versions can cause contention on the library cache latch.

To address this contention, change the Oracle CURSOR_SHARING parameter value from EXACT to SIMILAR. Changing this value causes Oracle to replace literals in SQL statements with bind variables, thereby reducing the number of versions.

Character Sets and Encodings

Because Waveset is a Java application that generally reads and writes character data rather than bytes, it does not restrict which encoding the database uses.

Waveset only requires that the data is sent and returned correctly. For example, the data does not become corrupted when written or reread. Use an encoding that supports multi-byte characters and is appropriate for the customer’s data. Generally, UTF-8 encoding is sufficient, but enterprises with a large number of true multi-byte characters, such as Asian or Arabic, might prefer UTF-16.

Most database administrators prefer to use an encoding that supports multi-byte characters because of the following:

Their deployments often grow to support international characters.
Their end users cut-and-paste from a Microsoft application’s text containing characters that look like ASCII but are actually multi-byte, such as em dashes (—).