Sun Identity Manager 8.1 System Administrator's Guide

Tuning Your Deployment Environment

This section provides information about tuning your deployment environment, including:

Tuning Your Java EE Environment

This section describes some tuning suggestions you can use to optimize your Java Platform, Enterprise Edition (Java EE platform) environment.

These tuning suggestions were derived from a series of experiments in which a considerable increase in throughputs was observed for the use cases tested. The increases were attributed to JVM sizing and to switches that affected garbage collector behavior.

Note –

For more information about tuning Java, JConsole, or JVM, visit the web sites noted in Table 4–1 and Table 4–2.

The following sections provide information about tuning Java and the JVM in your Java EE environment.

Tuning Java

For information, best practices, and examples related to Java performance tuning, see the Java Tuning White Paper at:

http://java.sun.com/performance/reference/whitepapers/tuning.html

Tuning the JVM

The following tuning scripts were used to derive the tuning suggestions noted in this section. These scripts were added to the domain.xml file (located in the domain configuration directory, which is typically domain-dir/config ) on a Sun Java System Application Server.

PrintGCStats – A data mining shell script that collects data from verbose:gc logs and displays information such as garbage collection pause times, parameter calculations, and timeline analyses over the application’s runtime by sampling the data at user-specified intervals.

Note –
For more information about how to use this script and garbage collection statistics to derive optimal JVM tunings, see the following web site:

http://java.sun.com/developer/technicalArticles/Programming/turbo/#PrintGCStats
PrintGCDetails – A shell script that can provide more verbose garbage collection statistics.
PrintGCTimeStamps – A shell script that adds time-stamp information to the garbage collection statistics collected by using the PrintGCDetails script.

To help ensure the best JVM performance, verify the following:

Be sure that you are using the required Java version noted in the “Supported Software and Environments” section of the Sun Identity Manager 8.1 Release Notes to ensure you are using the most current features, bug fixes, and performance enhancements.
Be sure that you are using a newer version of garbage collection.

Frequently, customers do not remove the older, default garbage collection scheme when installing an application server. Running Identity Manager with an older garbage collector creates many objects, which forces the JVM to constantly collect garbage.
If you deployed Identity Manager on Sun Java System Application Server, you can increase throughput by adding garbage collection elements to the deployed Identity Manager instance server.xml file.

If you expect a peak load of more than 300 users, try modifying the following settings to increase performance:
- For HTTP listeners configured for the deployed Identity Manager instance, edit the listener definition element in the server.xml file and set the number of acceptor threads to the Number of active CPUs on the host divided by the Number of active HTTP listeners on the host.
  
  For example:
  
  <http-listener id=”http-listener-1” \address=”0.0.0.0” port=”80” \acceptor threads=”Calculated Acceptor Threads” ...>
- Because the static content of most Identity Manager deployments is not projected to change frequently, you can edit the File Cache settings (on the File Cache Configuration page) for static content. Specify a high number (such as the number of seconds in 24 hours) for the maximum age of content within the file cache before the content is reloaded.
  
  To access the File Cache Configuration page, click the File Caching tab on the web-based Administrative Console for the HTTP server node. (See the latest Sun Java System Web Server Administrator’s Guide for detailed instructions.)
  
  Note –
  Sun Java System Application Server exposes tunables that affect the size of various thread pools and connection queues that are maintained by the HTTP container.
  
  By default, most of these tunables are set for a concurrent user load of 300 users or less.

Tuning Your Application Server

The following guidelines are provided to help you tune your application server:

Note –

Other than heap size, you can use the default parameter settings for most application servers. You might want to modify the server’s heap size, depending on the release being used.

Tuning a Sun Java System Application Server

The “Tuning the Application Server” chapter, in the latest Sun Java System Application Server Performance Tuning Guide, contains information about tuning a Sun Java System Application Server. This document is available from the following URL at http://docs.sun.com/app/docs.

In addition, if you are using Sun Java System Application Server 8.2 Enterprise Edition, the following changes solve “concurrent mode failures,” and should give you better and more predictable performance:

If you are constantly running old generation collections, review your application’s heap footprint and consider increasing the size. For example:
- 500 Mbytes is considered a modest size, so increasing this value to 3 Gbytes might improve performance.
- With a 2–Gbyte young generation collection, each scavenge promotes about 70 Mbytes. Consider giving this 70 Mbytes at least one more scavenge before promoting. For example, you might need the following SurvivorRatio:
  
  2 GB/70 M X 2 = 4096/70 = 55
  
  where:
  
  -XX:SurvivorRatio=32 -XX:MaxTenuringThreshold=1
  
  This ratio prevents premature promotion and the added problem of “nepotism,” which can degrade scavenge performance.
If you specified -XX:CMSInitiatingOccupancyFraction=60, and the CMS collections are still starting before they reach that threshold. For example:

56402.647: [GC [1 CMS-initial-mark: 265707K(1048576K)] 1729560K(3129600K), 3.4141523 secs]

Try removing -XX:CMSInitiatingOccupancy=60 (using the default value of 69percent), and add the following line:

-XX:UseCMSInitiatingOccupancyOnly
If your young generation collection is 2 Gbytes and the old generation collection is 1 Gbyte, this situation might also be causing premature CMS collections. Consider reversing this ratio. Use a 1–Gbyte young generation collection and a 2–Gbyte old generation collection, as follows

-Xms3G -Xmx3G -Xmn1G

Also, remove -XX:NewRatio. This ratio is redundant when you have explicitly specified young generation and overall heap sizes.
If you are using a 5uXX version of the Java Development Kit (JDK^TM software), and have excessively long “abortable preclean” cycles, you can use -XX:CMSMaxAbortablePrecleanLoops=5 as a temporary workaround.

You might have to adjust this value further.

Add the following line to view more information about garbage collector performance.

-XX:+PrintHeapAtGC

Note –
Use this command with caution because it increases how much verbose garbage collection data is produced. Be sure that you have enough disk space on which to save the garbage collector output.
If you are using the Sun Fire^TM T2000 server, large-heap data Translation Look-aside Buffers (DTLBs) can become a scarce resource. Using large pages for the Java heap often helps performance. For example:

-XX:+UseLargePages

Tuning a WebSphere Application Server

If you are tuning Identity Manager on an IBM WebSphere® application server, consider limiting how much memory is allocated for the heap because heap memory can affect the memory used by threads.

If many threads are created simultaneously and the heap size increases, the application’s space limit can be quickly impacted and the following error results:

JVMCI015:OutOfMemoryError

Tuning Your Repository Database

Identity Manager relies on the repository database to store and manage its identity and configuration data. For this reason, database performance can greatly influence Identity Manager’s performance.

Note –

Detailed information about performance tuning and managing databases is beyond the scope of this document because this information is dataset-specific and vendor-specific. In addition, customer database administrators (DBAs) should already be experts on their own databases.

This section characterizes the Identity Manager application and provides general information about the nature of Identity Manager data and its typical usage patterns to help you plan, tune, and manage your databases.

This information is organized into the following sections:

Repository Table Types

The Identity Manager repository contains three types of tables, and each table has slightly different usage characteristics. Information about these tables is organized into the following sections:

Attribute Tables

Attribute tables enable you to query for predefined single-valued or multi-valued object attributes.

For most object types, stored attributes are hard-coded.

Note –

The User and Role object types are exceptions to this rule. The inline attributes that are stored in the object table for User and Role are configurable, so you can configure additional custom attributes as queryable.

When you search for objects based on attribute conditions, Identity Manager accesses attribute tables in joins with the corresponding object tables. Some form of join (such as a JOIN, an EXISTS predicate, or a SUB-SELECT) occurs for each attribute condition.

The number of rows in the attribute table are proportional to the number of rows in the corresponding object table. The values distribution might exhibit skew, where multi-valued attributes have a row per value and some objects might have more attributes or more attribute values than others. Typically, there is a many-to-one relation between rows in the attribute table and rows in the object table.

Attribute tables have ATTR in the table name.

Change Tables

Identity Manager uses a change table to track changes made to a corresponding object table. These table sizes are proportional to the rate of object change, but the tables are not expected to grow without bound. Identity Manager automatically truncates change tables.

Change tables can be highly volatile because the lifetime of a row is relatively short and new rows can be created frequently.

Access to a change table is typically performed by a range scan on the time-stamp field.

Change tables have CHANGE in the table name.

Object Tables

The Identity Manager repository uses object tables to hold serialized data objects, such as Large Objects (LOBs). Object tables can also hold commonly queried, single-valued object attributes.

For most object types, stored attributes are hard-coded.

Note –

The User and Role object types are exceptions to this rule. The inline attributes that are stored in the object table are configurable, and you can configure additional custom attributes as queryable for User and Role.

The number of rows in an object table equals the number of objects being stored. The number of objects stored in each object table depends on which object types are being stored in the table. Some object types are numerous, while other types are few.

Generally, Identity Manager accesses an object table by object ID or name, though Identity Manager can also access the table by using one of the attributes stored in the table. Object IDs and names are unique across a single object type, but attribute values are not unique or evenly distributed. Some attributes have many values, while other attributes have relatively few values. In addition, several object types can expose the same attribute. An attribute may have many values for one object type and few values for another object type. The uneven distribution of values might cause an uneven distribution of index pages, which is a condition known as skew.

Object tables are tables that do not have ATTR or CHANGE suffixes in the table name.

XML Columns

Every object table contains an XML column, which is used to store each serialized object except the LOG table-set. Certain LOG table-set optional attributes are stored in the XML column if these attributes are present. For example, if digital signing is enabled.

Data Classes

You can roughly divide Identity Manager data into a number of classes that exhibit similar properties with respect to access patterns, cardinality, lifetime, volatility, and so forth. Each of the following classes corresponds to a set of tables in the repository:

User Data

User data consists of user objects.

You can expect this data to grow quite large because there is an object for each managed identity. After an initial population phase, you can expect a proportionally small number of creates because the majority of operations will be updates to existing objects.

User objects are generally long-lived and they are removed at a relatively low rate.

User data is stored in USEROBJ, USERATTR, and USERCHANGE tables.

Role Data

Role data consists of Role objects, including Roles subtypes such as Business Roles, IT Roles, Applications, and Assets.

Role data is similar to organization data, and these objects are relatively static after a customer deploys Identity Manager.

Note –

An exception to the preceding statement is a deployment that is integrated with an external source containing an authoritative set of roles. One integration style might be to feed role changes into Identity Manager, which causes Identity Manager Role data to be more volatile.

Generally, the number of role objects is small when compared to the number of identity objects such as users (assuming that multiple users share each role), but this depends on how each enterprise defines its roles.

Role data is stored in ROLEOBJ, ROLEATTR, and ROLECHANGE tables.

Account Data

Account data solely consists of account objects in the Account Index.

As with user data, account data can become rather large, with an object for each known resource account. Account objects are generally long-lived, removed at a relatively low rate, and after initial population, are created infrequently. Unless you frequently add or remove native accounts, or enable native change detection, account object modifications occur infrequently.

Identity Manager stores account data in ACCOUNT, ACCTATTR, and ACCTCHANGE tables.

Compliance Violation Data

Compliance Violation data contains violation records that indicate when the evaluation of an Audit Policy failed. These violation records exist until the same Audit Policy is evaluated against the same User and the policy passes. Violation records are created, modified, or deleted as part of an Audit Policy Scan or as part of an Access Review.

The number of violation records is proportional to the number of Audit Policies that are used in scans and the number of Users. An installation with 5000 users and 10 Audit Policies might have 500 violation records (5000 x 10 x 0.01), where the 0.01 multiplier depends on how strict the policies are and how user accounts are changed.

Identity Manager stores Compliance Violation records in OBJECT, ATTRIBUTE, and OBJCHANGE tables.

Entitlement Data

Entitlement data predominately consists of user entitlement objects, which are only created if you are doing compliance access reviews.

Entitlement records are created in large batches, modified slowly (days) after initial creation, and are then untouched. These records are deleted after an Access Review is deleted.

Identity Manager stores entitlement data in ENTITLE, ENTATTR, and ENTCHANGE tables.

Organization Data

Organization data consists of object group or organization objects.

Object group data is similar to configuration data, and this data is relatively static after being deployed. Generally, the number of objects is small (one for each defined organization) when compared to task objects or to identity objects such as users or accounts, however, the number can become large compared to other configuration objects.

Organization data is stored in ORG, ORGATTR, and ORGCHANGE tables.

Task Data

Task data consists of objects that are related to tasks and workflows, including state and result data.

The data contained in these tables is short-lived compared to other classes because objects are created, modified, and deleted at a high rate. The volume of data in this table is proportional to the amount of activity on the system.

Task data is stored in TASK, TASKATTR, and TASKCHANGE tables.

Configuration Data

Configuration data consists of objects related to Identity Manager system configuration, such as forms, roles, and rules.

Generally, configuration data is:

Relatively small compared to other classes
Only expected to change during deployment and upgrade, and changes occur in large batches
Not expected to change much after being deployed

Identity Manager stores configuration data in ATTRIBUTE, OBJCHANGE, and OBJECT tables.

Export Queue Data

If you enable Data Exporting, some records are queued inside Identity Manager until the export task writes those records to the Data Warehouse. The number of records that are queued is a function of Data Exporting configuration and the export interval for all queued types.

The following data types are queued by default, and all other data types are not:

ResourceAccount
WorkflowActivity
TaskInstace
WorkItem

The number of records in these tables grows until the export task drains the queue. The current table size is visible through a JMX^TM Bean.

Records added to this table are never modified. These records are written during other Identity Manager activities, such as reconciliation, provisioning, and workflow execution. When the Data Exporter export task runs, the task drains the table.

Identity Manager stores Export Queue data records in QUEUE, QATTR, and QCHANGE tables.

Log Data

Log data consists of audit and error log objects. Log data is write-once only, so you can create new audit and error log objects, but you cannot modify these objects.

Log data is long-lived and can potentially become very large because you can only purge log data by explicit request. Access to log data frequently relies on attributes that are stored in the object table instead of in the attribute table. Both the distribution of attribute values and queries against the log specifically depend on how you are using Identity Manager.

For example, the distribution of attribute values in the log tables depends on the following:

What kind of changes are made
Which Identity Manager interface was used to make the changes
Which types of objects were changed

The pattern of queries against the log table also depends on which Identity Manager reports, which custom reports, or which external data mining queries a customer runs against the log table.

Identity Manager stores audit log records in LOG and LOGATTR tables, and error log records in SYSLOG and SLOGATTR tables. This data does not have corresponding change tables.

Object IDs

Identity Manager generates globally unique identifiers (GUIDs) for objects by using the VMID class provided in the JDK software.

These GUID values exhibit a property that gets sorted by its string representations, based on the order in which the objects are created. For example, when you create new objects with Identity Manager, the newer objects have object IDs that are greater than the older objects. Consequently, when Identity Manager inserts new objects into the database, the index based on object IDs can encounter contention for the same block or blocks.

Prepared Statements

Generally, Identity Manager uses prepared statements for activities (such as inserting and updating database rows), but does not use prepared statements for queries.

If you are using Oracle, this behavior can create issues with the library cache. In particular, the large number of statements versions can cause contention on the library cache latch.

To address this contention, change the Oracle CURSOR_SHARING parameter value from EXACT to SIMILAR. Changing this value causes Oracle to replace literals in SQL statements with bind variables, thereby reducing the number of versions.

Character Sets and Encodings

Because Identity Manager is a Java application that generally reads and writes character data rather than bytes, it does not restrict which encoding the database uses.

Identity Manager only requires that the data is sent and returned correctly. For example, the data does not become corrupted when written or reread. Use an encoding that supports multi-byte characters and is appropriate for the customer’s data. Generally, UTF-8 encoding is sufficient, but enterprises with a large number of true multi-byte characters, such as Asian or Arabic, might prefer UTF-16.

Most database administrators prefer to use an encoding that supports multi-byte characters because of the following:

Their deployments often grow to support international characters.
Their end users cut-and-paste from a Microsoft application’s text containing characters that look like ASCII but are actually multi-byte, such as em dashes (—).

General Guidelines for Tuning a Repository Database

This section describes some general guidelines for tuning a repository database:

DBAs must frequently run statistics to monitor what is happening with the repository database.
If you are using a data source, set the connectionPoolDisable attribute to true in the RepositoryConfiguration object to disable automatic internal connection pooling in the Identity Manager repository.

For example, setting <RepositoryConfiguration connectionPoolDisable=’true’> allows you to avoid having two connection pools (one for Identity Manager and one for your application server).

You can edit the RepostioryConfiguration object to enhance the performance of searches against specific, single-valued attributes. For example, you might edit this object to add an extended attribute, such as employeeID, that is used to search for Users or as a correlation key.

The default RepositoryConfiguration object looks like the following example:

<RepositoryConfiguration ... >
   <TypeDataStore Type="User" ... attr1="MemberObjectGroups", 
attr2="lastname" attr3="firstname" attr4="" attr5="">
   </TypeDataStore>
</RepositoryConfiguration>

Note –

The ellipses represent XML attributes that are not relevant here.

Each of the attr1, attr2, attr3, attr4, and attr5 XML attributes specifies a single-valued attribute to be copied into the waveset.userobj table. The waveset.userobj table can contain up to five inline attributes. The attribute value named by attr1 in RepositoryConfiguration will be copied into the “attr1” database column in this table.

Inline attributes are stored in the base object table for a Type (rather than as rows in the child attribute table).

Using inline attributes improves the performance of repository queries against those attributes. (Because inline attributes reside in the main “object” table, queries against inline attributes are faster than those against non-inline attributes, which are stored in the child “attribute” table. A query condition against a non-inline attribute requires a “join” to the attribute table.)

By default, Identity Manager uses the MemberObjectGroups, lastname, and firstname inline attributes.

You can add two more attributes to enable faster searching, as long as those attributes are queryable.

For example, if your deployment contains an employeeID extended attribute, adding that attribute inline will improve the performance of repository searches against that attribute.
If you do not need lastname or firstname, you can remove them or replace them with other attributes.
Do not remove MemberObjectGroups. Identity Manager uses this attribute internally to speed up authorization checks.

For more information about which object types are stored in each set of tables, see Data Classes.

Vendor-Specific Database Tuning Guidelines

This section describes some vendor-specific guidelines for tuning Oracle and SQL Server repository databases.

Note –

Currently, MySQL^TM databases are only supported in development and for demonstrations.

Oracle Databases

This section describes guidelines for tuning Oracle repository databases:

The Identity Manager application does not require Oracle database features or options.
If you are using an Oracle repository database and Identity ManagerService Provider or Identity Manager, you might encounter problems with object table fragmentation because Identity Manager uses LONG, rather than LOB, data types by default. Using LONG data types can result in large amounts of “unallocated” extent space, which cannot be made into usable space.

To mitigate this problem, do the following:
- Take EXPORT dumps of the Object table and re-import them to free up unallocated extent space. After importing, you must stop and restart the database.
- Use LOB data types and DataDirect Technologies’ Merant drivers, which provide a standard LOB implementation for Oracle.
- Use Locally Managed Tablespaces (LMTs), which offer automatic free space management. LMTs are available in Oracle 8.1.5.

Identity Manager does not require Oracle init.ora parameter settings for SGA sizing, buffer sizing, open cursors, processes, and so forth.

While the Identity Manager repository is a general-purpose database, it is best described as an object database.

Of the Identity Manager tables, the TASK table-set comes closest to having transaction-processing characteristics. The LOG and SYSLOG table-sets are also exceptional because these tables do not store serialized objects.

See Repository Table Types and Data Classes for descriptions of the tables, the object types stored in each table, and the general access pattern for each table.
If you have performance issues with the Oracle database, check for issues related to poor query plans being chosen for what Identity Manager expects to be relatively efficient queries.

For example, Identity Manager is configured to perform a full table-scan when an index is available for use. These issues are often visible in Automated Workload Repository (AWR) reports provided in the SQL by the buffer gets table. You can also view issues in the Enterprise Manager tool.

Performance problems typically appear to be the result of bad or missing database table statistics. Addressing this problem improves performance for both the database and Identity Manager.

The following articles (available from Oracle) are a good source of information about the cost-based optimizer (CBO) in Oracle:
- Understanding System Statistics
- Oracle MetaLink: Note:114671.1: Gathering Statistics for the Cost Based Optimizer
- Cost Control: Inside the Oracle Optimizer
- You might also investigate using SQL Profiles, which are another method for choosing the best query plans. You can use the SQL Advisor within Enterprise Manager to create these profiles when you identify poorly performing SQL.
If you detect unexpected growth in the Oracle redo log, you might have workflows that are caught in an infinite loop with a manual action. The loop causes constant updates to the repository, which in turn causes the size of each TaskInstance to grow substantially. The workflow errors are caused by improper handling of WF_ACTION_TIMEOUT and by users closing their browser in the middle of a workflow.

To prevent problematic workflows, preview each manual action before a production launch and verify the following:
- Have you set a timeout?
- Have you created appropriate transition logic to handle a timeout for the activity with the manual action?
- Is the manual action using the exposed variables tag when there is a large amount of data in the TaskInstance?
Frequently, you can significantly improve Identity Manager performance if you change the CURSOR_SHARING parameter value from EXACT to SIMILAR.

Identity Manager uses prepared statements for some activities (such as inserting and updating database rows), but does not use these statements for most queries.

When you use Oracle, this behavior can cause issues with the library cache. In particular, the large number of statement versions can create contention on the library cache latch. Changing CURSOR_SHARING to SIMILAR causes Oracle to replace literals in SQL statements with bind variables, which greatly reduces the number of versions.

See Prepared Statements for more information.

SQL Server Databases

Some customers who used an SQL Server 2000 database as a repository reported that as concurrency increased, SQL Server 2000 reported deadlocking problems that were related to SQL Server’s internal use of pessimistic locking (primarily lock escalation).

These deadlock errors display in the following format:

com.waveset.util.IOException:
  ==> com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 51) 
was deadlocked on lock | communication buffer resources with another 
process and has been chosen as the deadlock victim. Rerun the transaction.

To prevent or address deadlocking problems, do the following:

Use the SQL Server 2005 database.
Configure the READ_COMMITTED_SNAPSHOT parameter by formatting the command as follows:

ALTER DATABASE waveset SET READ_COMMITTED_SNAPSHOT ON

Enabling the READ_COMMITTED_SNAPSHOT parameter does the following:
- Removes contention during the execution of SELECT statements that can cause blocks, which greatly reduces the potential for deadlocks internal to SQL Server.
- Prevents uncommitted data from being read and guarantees that SELECT statements receive a consistent view of committed data.
For more information about the READ_COMMITTED_SNAPSHOT parameter, see: http://msdn2.microsoft.com/en-us/library/ms188277.aspx.