After conducting basic performance tuning and following the best practices recommendations described in previous chapters, you may still encounter performance issues. This chapter helps you troubleshoot the most common OpenSSO Enterprise performance issues. Topics in this chapter include:
The amtune tool provided by OpenSSO Enterprise tunes parameter values for the following three LDAP connection pools:
Realm User Authentication LDAP Connection Pool
Realm Data Store LDAP Connection Pool
OpenSSO Enterprise Configuration Store and SMS LDAP Connection Pools
In deployments with a subrealm, you must also tune the subrealm connection pools. Just like the root realm, each subrealm can have its own user authentication LDAP connection pool and data store LDAP connection pool.
You can modify one or more of the three LDAP connection pool configurations. In each configuration, the recommended values are MIN=8 and MAX=32. Under some conditions, you can increase the MAX value up to 64. The following sections describe how to manually tune the connection pools:
You can modify the settings on one of the following depending upon the module you use for user authentication.
This module is used only to authenticate the user. In the OpenSSO Enterprise console, under Configuration, click Authentication > Core.
When the Data Store is used as the authentication module, the Data Store LDAP connection pool settings are used. No additional Authentication connection pool settings are used.
The Data Store LDAP Configuration is used for retrieving user profiles and can also be used for authentication. If the Data Store Authentication module is used for authentication, then the recommended Data Store LDAP configuration settings are MIN=8 and MAX=64. You can modify the settings under Console > Access Control > Realm > Data Store.
The configuration data store is used for storing all the OpenSSO Enterprise configurations and Policy Service configurations. Configuration data is stored in the config directory. The OpenSSO Enterprise server supports Sun Directory Server and the embedded OpenDS as configuration data stores. You can set the LDAP configuration for the configuration data store through the OpenSSO Enterprise administration console. Go to Configuration > Servers and Sites > server > Directory configuration.
Start by setting all the connection pool configurations with MIN=8 and MAX=32.
If you must make adjustments based on performance test results, adhere to the following requirements:
The MIN value should be at least 8.
The MAX value for any pool should not be greater than 64. The MAX value of 32 is enough for most typical deployments.
Special requirements are outside the scope of this document.
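The requirements above can be expressed as a small validation routine. The following is a minimal sketch, not part of OpenSSO Enterprise: the function name `check_pool` and the pool names passed to it are hypothetical, and the check that MAX is not smaller than MIN is an added sanity check that the text does not state explicitly.

```python
MIN_FLOOR = 8         # "The MIN value should be at least 8."
MAX_CEILING = 64      # "The MAX value for any pool should not be greater than 64."


def check_pool(name, min_size, max_size):
    """Return a list of problems found in one LDAP connection pool setting."""
    problems = []
    if min_size < MIN_FLOOR:
        problems.append(f"{name}: MIN={min_size} is below {MIN_FLOOR}")
    if max_size > MAX_CEILING:
        problems.append(f"{name}: MAX={max_size} exceeds {MAX_CEILING}")
    # Assumption: a pool whose MAX is below its MIN is treated as misconfigured.
    if max_size < min_size:
        problems.append(f"{name}: MAX={max_size} is smaller than MIN={min_size}")
    return problems


if __name__ == "__main__":
    # Hypothetical pool settings collected by hand from the console.
    pools = {
        "Realm User Authentication": (8, 32),
        "Realm Data Store": (8, 64),
        "Configuration Store": (4, 128),
    }
    for pool_name, (lo, hi) in pools.items():
        for problem in check_pool(pool_name, lo, hi):
            print(problem)
```

Running a check like this against the values recorded for each realm and subrealm makes it easier to spot a pool that was missed during manual tuning.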
After following steps 1 and 2, if low throughput or slow response times persist, try the following solutions:
Verify that the Directory Server instance is not at 100% CPU usage. If the Directory Server instance is at 100% and the throughput is still low, revisit the indexing on the Directory Server entries. Be sure that Directory Server indexing is configured properly.
Run load tests to verify that OpenSSO Enterprise logging is not slowing performance. First run the tests with logging enabled, and then run the tests with logging disabled. If you find that logging is causing slow response times, you can tune the logging service through the OpenSSO Enterprise console. See the “Logging” section in Chapter 7, Configuration Attributes, in Sun OpenSSO Enterprise 8.0 Administration Reference.
Two modes exist for client-side policy configuration: subtree mode and self mode. Based on the client configuration, server-side policy evaluation is done differently.
In subtree mode, all the policies from the root resource are evaluated. The high performance cost of evaluating a large number of policies makes caching necessary. In self mode, only one resource is evaluated. Self mode is fast, and no caching is required, so there is no need to tune the policy cache when all the clients are running in self mode.
The policy cache is a two-level nested cache, with one hash map contained inside the other. The top level cache is the resource cache. The session cache is a second hash map inside the resource cache.
Resource cache: a hash map whose key is the resource or rule name and whose value is the hash map that holds the policy session cache.
Session cache: a hash map whose key is the session ID and whose value is a map of policy decision objects. For each new resource, a new session cache hash map is created and stored in the policy resource cache.
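The nested layout can be illustrated with a short sketch. This is not the OpenSSO Enterprise implementation: the class name `PolicyCacheSketch` and its methods are hypothetical, the two caps stand in for the resource and session limits discussed in the rest of this section, and evicting by insertion order (rather than by last access) is an assumption. The document describes eviction only for the resource level, so the session-level eviction shown here is also an assumption.

```python
from collections import OrderedDict


class PolicyCacheSketch:
    """Two-level cache: resource name -> (session ID -> policy decision)."""

    def __init__(self, resource_cap=100, session_cap=1000):
        self.resource_cap = resource_cap
        self.session_cap = session_cap
        self._resources = OrderedDict()   # top level: the resource cache

    def put(self, resource, session_id, decision):
        sessions = self._resources.get(resource)
        if sessions is None:
            if len(self._resources) >= self.resource_cap:
                # Drop the oldest cached resource together with every
                # session entry cached under it.
                self._resources.popitem(last=False)
            sessions = OrderedDict()      # second level: the session cache
            self._resources[resource] = sessions
        if session_id not in sessions and len(sessions) >= self.session_cap:
            # Assumption: the oldest session entry is dropped when full.
            sessions.popitem(last=False)
        sessions[session_id] = decision

    def get(self, resource, session_id):
        sessions = self._resources.get(resource)
        return None if sessions is None else sessions.get(session_id)

    def size(self):
        """Total number of cached policy decision objects."""
        return sum(len(s) for s in self._resources.values())
```

The sketch shows why the resource-level limit matters most: losing one resource entry discards the cached decisions of every session that had accessed that resource.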
You can configure the policy cache by setting properties for both server and client.
The following two properties do not exist in the OpenSSO Enterprise administration console by default. These properties must be added manually in the advanced properties section of the OpenSSO Enterprise administration console:
The default value is 100. This means that a maximum of 100 rules can be cached in subtree mode.
This property should always be equal to the total number of rules configured in the system. Otherwise, when the resource cache reaches its maximum limit and a new rule or resource is accessed, the oldest cached rule and all the sessions cached for that rule are removed. If you have a large number of rules, set this value to the number of most frequently accessed rules.
The default value is 1000. The total number of policy objects is (100 * 1000), or 100,000 maximum.
The resourceCap should always be tuned. The SessionCap should be tuned only when you observe high latency for policy requests or responses, and you observe repeated policy requests from the same policy agent for the same user. This usually does not occur unless the user session stays active for a very long period. The policies are also cached on the policy agent.
If you increase the ResourceCap value, you should correspondingly reduce the SessionCap value to limit the total number of policy objects cached and to keep the maximum number of sessions supported on the server unchanged. The following table illustrates how the policy cache configuration affects the number of sessions supported. The SDK cache size is set to 10,000 for all of the tests. If the SDK cache is increased, the maximum number of sessions is reduced accordingly.
Table 3–1 Policy Session Cache Configuration and Number of Sessions Supported
Policy Session Cache Configuration | Maximum Number of Sessions Supported |
---|---|
1000 (100 * 1000 = 100,000 policy decision objects) | 200,000 |
2000 (100 * 2000 = 200,000 policy decision objects) | 150,000 |
3000 (100 * 3000 = 300,000 policy decision objects) | 90,000 |
4000 (100 * 4000 = 400,000 policy decision objects) | 40,000 |
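The arithmetic behind the table is the product of the two cache limits. The following sketch, with hypothetical function names, computes the maximum number of cached policy decision objects and shows how far the session limit would need to drop to keep that total constant when the resource limit is raised.

```python
def total_policy_objects(resource_cap, session_cap):
    """Upper bound on cached policy decision objects."""
    return resource_cap * session_cap


def session_cap_for_budget(resource_cap, object_budget):
    """Largest session cap that keeps the total within object_budget."""
    if resource_cap <= 0:
        raise ValueError("resource_cap must be positive")
    return object_budget // resource_cap


if __name__ == "__main__":
    # Defaults from the text: 100 resources and 1000 sessions per resource.
    budget = total_policy_objects(100, 1000)
    print(budget)
    # Doubling the resource cap while holding the total steady.
    print(session_cap_for_budget(200, budget))
```

For example, doubling the resource limit from 100 to 200 while holding the total at 100,000 objects implies a session limit of 500.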
The client-side SDK and policy agent cache properties apply only to Java EE policy agents. The properties do not apply to web agents.
The default value is 20. This means the Java EE policy agent can cache a maximum of 20 rules or resources.
This property should be set equal to the number of rules configured on the server for the FQDN the Agent is protecting. Otherwise, when the maximum cache limits are reached for the resource cache, and if a new rule or resource is accessed, then the oldest cached rule and all the sessions cached for that rule will be removed.
The default value is 10000. This means the Java EE policy agent can cache a maximum of 10000 sessions per rule or resource. This property should be reduced or increased based on the memory available on the container.
The ResourceCap value should always be tuned. Because the policy agents coexist with the application, you should increase or reduce the SessionCap on the policy agent based on the memory use of the application protected by the policy agent. You can increase the SessionCap value as long as you do not observe frequent full GCs.
The amtune tool automatically tunes all memory related parameters. In most deployments, this is sufficient. However, occasionally the amtune tuning may not be sufficient and you may run into memory issues. Memory issues manifest themselves through excessively frequent garbage collection (GC) operations or frequent “Out of Memory” errors.
To resolve memory related issues, use the OpenSSO Enterprise administration console to tune the following parameters:
User cache/SDK cache
Go to Configuration > Servers and Sites > server > SDK > SDK Caching Max Size
Max Active Session the system should allow
Go to Configuration > Servers and Sites > server > Maximum Sessions.
Session Notification Thread Pool Size (Number of threads to process session notifications)
Go to Configuration > Servers and Sites > server > Notification Pool Size.
Session Notification Queue size
Go to Configuration > Servers and Sites > server > Notification Thread Pool Threshold.
Session Purge Delay (Number of minutes to delay purging a timed-out session)
Go to Configuration > Servers and Sites > server > Session Purge Delay.
To tune the policy cache, see Tuning the Policy Cache.
The tuning of this property depends on the JVM heap size configured in the web container where OpenSSO Enterprise is deployed. The minimum required JVM heap size for OpenSSO Enterprise is 1024 MB, and the number of sessions supported for 1024 MB is approximately 7000. See the table below for various JVM heap sizes with the default configuration.
The default value is 10000, which is suitable for most deployments. The SDK cache value can be increased up to the maximum number of sessions as long as you do not encounter frequent full GCs. Increasing this value results in slightly better performance, but reduces the maximum number of sessions.
The Notification Queue size should be less than or equal to 30% of the Max Sessions, up to a maximum value of 30,000.
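The queue size rule is a capped percentage and can be sketched as follows. The function name is hypothetical, and rounding down to a whole number is an assumption.

```python
QUEUE_FRACTION_PERCENT = 30    # at most 30% of Max Sessions
QUEUE_HARD_LIMIT = 30_000      # never more than 30,000


def max_notification_queue_size(max_sessions):
    """Upper bound for the Notification Queue size."""
    by_percentage = max_sessions * QUEUE_FRACTION_PERCENT // 100
    return min(by_percentage, QUEUE_HARD_LIMIT)


if __name__ == "__main__":
    for sessions in (7_000, 45_000, 200_000):
        print(sessions, max_notification_queue_size(sessions))
```

With Max Sessions set to 45,000 the bound is 13,500. From 100,000 sessions upward the 30,000 ceiling applies.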
The following table lists the maximum number of sessions supported for various JVM heap sizes with the default tuning.
Table 3–2 Maximum Number of Sessions Supported for Various JVM Heap Sizes
JVM Heap Size | Maximum Number of Sessions Supported |
---|---|
3136 MB | 200,000 |
2560 MB | 145,000 |
1536 MB | 45,000 |
1024 MB | 7,000 |
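When choosing a Max Sessions value for a heap size that falls between the rows of Table 3–2, a conservative approach is to use the nearest row at or below the configured heap. The sketch below does this with a hypothetical helper. It deliberately does not interpolate, because the table gives only four measured points and the relationship between them is not linear.

```python
# Rows of Table 3-2 as (JVM heap in MB, maximum sessions), largest heap first.
HEAP_TO_MAX_SESSIONS = [
    (3136, 200_000),
    (2560, 145_000),
    (1536, 45_000),
    (1024, 7_000),
]


def max_sessions_for_heap(heap_mb):
    """Return the session limit of the largest table row that fits heap_mb."""
    for table_heap_mb, sessions in HEAP_TO_MAX_SESSIONS:
        if heap_mb >= table_heap_mb:
            return sessions
    # 1024 MB is the minimum required heap size stated in the text.
    raise ValueError("heap size is below the 1024 MB minimum")


if __name__ == "__main__":
    print(max_sessions_for_heap(2048))
```

A 2048 MB heap, for instance, maps to the 1536 MB row. These figures apply to the default tuning only.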
These settings may not be suitable for certain deployments. When the number of user attributes retrieved is large, the SDK cache size will increase. Similarly, if the Extra Session properties are set, the Session size will increase.
In these cases, use one of the following options to solve the memory related issues:
Reduce the Max Sessions limit and make sure you follow the above rules. If you reduce the Max Sessions you may need to add additional instances to support additional sessions. If you do not want to add additional instances you can use the 64-bit JRE.
Reduce the SDK cache size. If you reduce the SDK cache size, your performance will go down. For better performance it is always better to set the SDK cache size equal to Max Sessions, and add additional instances to support more sessions.
Set the value of com.iplanet.am.notification.threadpool.size based on number of CPUs and based on the purgedelay value. See To Tune the Purge Delay Settings for related information.
If purgedelay is set to 0, the threadpool should be set using the following formula: (number of CPUs) x 3 = threadpool size. For example, for a machine with 8 CPUs, the threadpool size is 24. For CMT T1, T2, and T2 plus machines, use the formula: (number of cores) x 3 = threadpool size. The amtune tool sets this value based on the above rules, when purgedelay is set to 0, which is the default setting.
If the purgedelay value is greater than 0, then the threadpool should be set using the following formula: (number of CPUs) x 4 = threadpool size. For CMT T1, T2, and T2 plus machines, use the formula: (number of cores) x 4 = threadpool size. In this case you must set the notification threadpool size manually. If, with this setting, you still see frequent "Cannot send notification" or "Notification task queue full" errors in the amSession debug file, the session notification queue is full. The problem could be related to the Policy Agent or SDK client that receives the notifications and is not able to process them properly. Consider disabling notification mode on the Policy Agent.
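The two formulas differ only in the multiplier. A minimal sketch follows, with a hypothetical function name. On CMT T1, T2, and T2 plus machines, pass the number of cores instead of the number of CPUs.

```python
def notification_threadpool_size(cpu_or_core_count, purge_delay_minutes):
    """Suggested value for com.iplanet.am.notification.threadpool.size."""
    if cpu_or_core_count < 1:
        raise ValueError("cpu_or_core_count must be at least 1")
    if purge_delay_minutes < 0:
        raise ValueError("purge_delay_minutes must not be negative")
    # Multiplier 3 when timed-out sessions are purged immediately,
    # 4 when they are kept in memory for a purge delay.
    multiplier = 3 if purge_delay_minutes == 0 else 4
    return cpu_or_core_count * multiplier


if __name__ == "__main__":
    print(notification_threadpool_size(8, 0))   # purgedelay = 0
    print(notification_threadpool_size(8, 5))   # purgedelay > 0
```

For the 8-CPU machine used as the example in the text, the result is 24 with purgedelay set to 0 and 32 with any purgedelay greater than 0.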
The purgedelay property is used to keep the session in memory in a timed-out state after the session has timed out. If the value is set to 0, then the session is removed from memory immediately. If the value is greater than zero, then the session is maintained in the memory until the purgedelay time elapses.
In almost all deployments, purgedelay should be set to 0. The amtune tool will set the value to 0 when run.
In special cases when the purgedelay value is greater than 0, reduce the number of active sessions (com.iplanet.am.session.maxSessions). Additionally, increase the notification threadpool size (com.iplanet.am.notification.threadpool.size).
The property com.iplanet.am.session.maxSessions describes the maximum number of active sessions that the system will allow. When purgedelay is set to 0, the total number of sessions (active sessions and timed-out sessions) in memory is equal to the value set for com.iplanet.am.session.maxSessions. If purgedelay is greater than 0, then the total number of sessions (active and timed-out sessions) in memory can be greater than the number of active sessions. The difference depends on three factors: the purgedelay time, the percentage of timed-out sessions, and the authentication rate. Therefore, when purgedelay is greater than zero, the maximum active sessions value should be reduced accordingly.
The simplest way to do this is to look in the OpenSSO Enterprise session stats file. The amMasterSessionTable shows the current and peak values for maxSessions (active sessions + timed-out sessions) and maxActive (only active sessions) in memory. Based on this information, the maxSessions value reported in the stats file should not exceed 90000 for a JVM heap size of 3136 MB. When purgedelay is set to 0, only one notification is sent when a session is removed from memory. When purgedelay is greater than 0, two notifications are sent for each timed-out session. Because the number of notifications for timed-out sessions increases, more notification threads are needed, so the notification thread pool size should also be increased.
For more information on performance tuning and troubleshooting, see the following resources:
Java Performance portal site
http://java.sun.com/javase/technologies/performance.jsp
Java Tuning Whitepaper
http://java.sun.com/performance/reference/whitepapers/tuning.html
Java Hotspot VM Options
http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp
Solaris TCP Tuning Parameters
http://docs.sun.com/app/docs/doc/817-0404/6mg74vsaj?a=view
Understanding Tuning TCP
http://www.sun.com/blueprints/1205/819-5144.pdf
Tuning for Linux platforms
http://www.redbooks.ibm.com/abstracts/redp4285.html
Java 5.0 Troubleshooting and Diagnostic Guide
http://java.sun.com/j2se/1.5/pdf/jdk50_ts_guide.pdf
Troubleshooting JRE 6 Deployment
http://java.sun.com/developer/technicalArticles/javase/troubleshoot/