Chapter 12. Optimization Techniques

Developers can use a number of techniques to make Kodo run as quickly and efficiently as possible. The following guidelines are listed in approximate order of importance:

  1. Database indices: Indices created by Kodo's schematool may not always be the most appropriate for your application. Manually adjusting indices to include frequently-queried fields (as well as dropping indices on rarely-queried fields) can yield significant performance benefits.

  2. Use the best JDBC Driver: The JDBC driver provided by the database vendor is not always the fastest and most efficient. Some JDBC drivers do not support features like batched statements, the lack of which can significantly slow down Kodo's data access.

  3. JVM optimizations: Manipulating various parameters of the Java Virtual Machine (such as hotspot compilation modes and the maximum memory) can result in performance improvements. For more details about optimizing the JVM execution environment, please see http://java.sun.com/docs/hotspot/PerformanceFAQ.html.

  4. Use the data cache: Using the Kodo Data Cache feature (available as a separate product) can often result in a dramatic improvement in performance. See the DataStore Cache chapter for more details.

  5. Disable logging, SynchronizeSchema: Developer options such as logging and SynchronizeSchema will result in serious performance hits for your application. Disable all of these options before evaluating Kodo's performance.

  6. Set IgnoreCache to true, and configure FlushBeforeQueries to flush data automatically before queries: When the javax.jdo.option.IgnoreCache property is set to false and com.solarmetric.kodo.FlushBeforeQueries is set to false, Kodo must evaluate in-memory dirty instances against the datastore values that are returned from a Query. This can sometimes force Kodo to evaluate the entire extent of objects in memory in order to return the correct query results, which can have drastic performance consequences. If it is appropriate for your application, configuring FlushBeforeQueries to flush data automatically before queries will ensure that this never happens. Setting IgnoreCache to false will result in a small performance hit even if FlushBeforeQueries is true, as incremental flushing is not as efficient overall as delaying all flushing to a single operation during commit. This is because incremental flushing decreases Kodo's ability to maximize statement batching, and increases resource utilization.

    Note that the default setting of FlushBeforeQueries is with-connection, which means that data will be flushed only if a dedicated connection is already in use by the PersistenceManager. So, the default value may not be appropriate for you.
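
    As a rough sketch, the two settings discussed in this item could be collected in a java.util.Properties object and merged into the properties used to obtain the PersistenceManagerFactory. The property names are the ones given above, and the value "true" for FlushBeforeQueries follows the wording of this item. The class and method names are hypothetical, and the factory lookup is left out so that the snippet compiles with the JDK alone.

```java
import java.util.Properties;

// Hypothetical helper; only the two property names come from the text above.
class QueryFlushSettings {

    // Collects the query-related settings discussed in this item.
    static Properties build() {
        Properties props = new Properties();
        // Let queries ignore uncommitted in-memory changes.
        props.setProperty("javax.jdo.option.IgnoreCache", "true");
        // Flush dirty data before queries instead of relying on the
        // default "with-connection" behavior.
        props.setProperty("com.solarmetric.kodo.FlushBeforeQueries", "true");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings; an application would merge them into the
        // properties it uses to create its PersistenceManagerFactory.
        build().list(System.out);
    }
}
```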

  7. Ensure that batch updates are available: When performing bulk inserts, updates, or deletes, Kodo will use batched statements. If this feature is not available in your JDBC driver, then Kodo will need to issue multiple SQL statements instead of a single batch statement.

  8. Use single-table inheritance: Using a single-table inheritance model is faster for most operations than a multi-table inheritance model. If it is appropriate for your application, you should use the single-table inheritance model whenever possible.

  9. Use unordered Sets instead of Lists: There is extra overhead for Kodo to maintain ordered Collections (either as relations or privately-owned Collections). If your application does not require ordering for a relation or Collection, you should always use a HashSet as opposed to a LinkedList, ArrayList, or SortedSet.
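
    A minimal sketch of what this looks like in a persistent class, assuming a hypothetical Team class whose member names need no ordering. The JDO metadata and enhancement steps are omitted, and the generic type parameters are a modern-Java convenience rather than something the text above requires.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical persistent class; JDO metadata and enhancement are omitted.
class Team {
    // Unordered collection: declared as a Set and backed by a HashSet,
    // so no element ordering has to be maintained.
    private Set<String> memberNames = new HashSet<>();

    void addMember(String name) {
        memberNames.add(name);
    }

    // Read-only view of the members, in no particular order.
    Set<String> getMemberNames() {
        return Collections.unmodifiableSet(memberNames);
    }

    public static void main(String[] args) {
        Team team = new Team();
        team.addMember("Ada");
        team.addMember("Grace");
        team.addMember("Ada"); // a Set silently ignores the duplicate
        System.out.println(team.getMemberNames().size() + " members");
    }
}
```

    Keep in mind that moving from a List to a Set changes behavior as well as performance: duplicates are dropped and iteration order is unspecified.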

  10. High increment in DBSequenceFactory: For applications that perform large bulk inserts, the retrieval of sequence numbers can become a bottleneck. Increasing the value of the Increment parameter of the com.solarmetric.kodo.impl.jdbc.SequenceFactoryProperties property can reduce this bottleneck. In some cases, implementing your own SequenceFactory can further optimize sequence number retrieval.
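
    To illustrate why a larger increment helps, the following self-contained simulation reserves ids in blocks and counts how often the sequence table would have to be contacted. BlockSequence is a hypothetical stand-in written for this illustration; it is not Kodo's DBSequenceFactory and makes no claim about how Kodo implements block allocation.

```java
// Hypothetical stand-in for a sequence table; this is not Kodo's DBSequenceFactory.
class BlockSequence {
    private final int increment;   // ids reserved per simulated database round trip
    private long nextId = 0;       // next id to hand out from the current block
    private long blockEnd = 0;     // first id beyond the current block
    private long storedValue = 0;  // simulated value held in the sequence table
    private int roundTrips = 0;    // how many times the "database" was contacted

    BlockSequence(int increment) {
        if (increment < 1) {
            throw new IllegalArgumentException("increment must be at least 1");
        }
        this.increment = increment;
    }

    // Returns the next id, reserving a new block only when the current one is used up.
    long next() {
        if (nextId == blockEnd) {
            nextId = storedValue;
            storedValue += increment;   // simulated update of the sequence table
            blockEnd = storedValue;
            roundTrips++;
        }
        return nextId++;
    }

    int roundTrips() {
        return roundTrips;
    }

    // Counts the simulated round trips needed to allocate the given number of ids.
    static int roundTripsFor(int increment, int ids) {
        BlockSequence seq = new BlockSequence(increment);
        for (int i = 0; i < ids; i++) {
            seq.next();
        }
        return seq.roundTrips();
    }

    public static void main(String[] args) {
        System.out.println("increment 1:   " + roundTripsFor(1, 1000) + " round trips");
        System.out.println("increment 100: " + roundTripsFor(100, 1000) + " round trips");
    }
}
```

    In this simulation, allocating 1000 ids costs 1000 round trips with an increment of 1 and 10 round trips with an increment of 100. The trade-off is that ids reserved in a block but never used are lost when the process exits.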

  11. Use optimistic transactions: Using datastore transactions translates into pessimistic database row locking, which can be a performance hit (depending on the database). If appropriate for your application, optimistic transactions are typically faster than datastore transactions.

  12. Perform nontransactional data reads outside a transaction.
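
    The two preceding items can be sketched as configuration, assuming the standard JDO option names javax.jdo.option.Optimistic and javax.jdo.option.NontransactionalRead, which are not named in the text above. The class and method names are hypothetical, and the factory lookup is omitted so that the snippet compiles with the JDK alone.

```java
import java.util.Properties;

// Hypothetical helper; the property names are assumed to be the standard JDO option names.
class TransactionSettings {

    // Collects the transaction-related settings for items 11 and 12.
    static Properties build() {
        Properties props = new Properties();
        // Use optimistic rather than datastore (pessimistic) transactions.
        props.setProperty("javax.jdo.option.Optimistic", "true");
        // Allow persistent fields to be read while no transaction is active.
        props.setProperty("javax.jdo.option.NontransactionalRead", "true");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings; an application would merge them into the
        // properties it uses to create its PersistenceManagerFactory.
        build().list(System.out);
    }
}
```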

  13. Always close PersistenceManagers and Query results: It is important to bear in mind that a PersistenceManager and the result from a Query are often backed by resources in the database. For example, if a Query result has not been completely instantiated, it will hold open a ResultSet object, which, in turn, will hold open a Statement object (preventing it from being re-used). Garbage collection will eventually clean up these resources, so it is never strictly necessary to close them explicitly, but it is always faster to do so at the application level.

    Example 12.1. Explicitly closing resources

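    import java.util.Collection;
    import javax.jdo.PersistenceManager;
    import javax.jdo.PersistenceManagerFactory;
    import javax.jdo.Query;

    public int getPersonCount (String jdoql)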
    {
      PersistenceManagerFactory factory = ...; // obtain a PersistenceManagerFactory
      PersistenceManager pm = factory.getPersistenceManager ();
      try
      {
        Query query = pm.newQuery (Person.class, jdoql);
        try
        {
          return ((Collection)query.execute ()).size ();
        }
        finally
        {
          // close all results from this query
          query.closeAll (); 
        }
      }
      finally
      {
        // close the PersistenceManager and any associated resources
        pm.close (); 
      }
    }
    
    

  14. Optimize connection pool settings: The default settings of Kodo's built-in connection pool may not be optimal for all applications. For applications that instantiate and close many PersistenceManagers (such as a web application), increasing the size of the connection pool will reduce the overhead of waiting on free connections or opening new connections. As you increase the connection pool size, you should also increase the prepared statement pool size, since in Kodo 2.5 the prepared statement pool size is global with respect to the connection pool (as opposed to a per-connection size).

  15. Utilize the PM cache: When possible and appropriate, re-using PersistenceManager objects will result in huge performance gains, since the PersistenceManager's built-in object cache will be used.

  16. Enable Multithreaded operation only when necessary: Kodo respects the javax.jdo.option.Multithreaded option in that it does not impose synchronization overhead for applications that set this value to false. If your application is guaranteed to only use a given PersistenceManager from a single thread (for example, EJB applications fall into this category), setting this option to false will eliminate the synchronization overhead, and may result in a modest performance increase.

  17. Disable large data set handling: By default, Kodo JDO creates statements with the ResultSet.TYPE_SCROLL_INSENSITIVE flag. On some databases (SQLServer for example), result sets that support bidirectional scrolling are much slower than unidirectional result sets. So, if you do not have lots of data or your application always fully traverses large data sets, then you should disable large data set handling by setting the DefaultFetchThreshold property to -1.

  18. Use the OnDemandForwardResultList: By default, Kodo JDO uses a lazy result list implementation. This relies on result sets created with the ResultSet.TYPE_SCROLL_INSENSITIVE flag. On some databases (Oracle for example), result sets that support bidirectional scrolling are very memory-intensive, effectively faking bidirectional support in the client tier by loading all data from the beginning to whatever index is requested. This interferes with the behavior of Kodo's LazyResultList -- in particular, the method that LazyResultList uses to compute the size of the list. So, if you have lots of data and do not want to pay the memory price of loading all the data into memory at one time, and you are using a driver that loads all data when jumping to the end of a result set, then you should use the OnDemandForwardResultList. This list is useful in two primary situations -- when it is acceptable to not know the size of the result list as you iterate through it, and when you need to iterate through large amounts of data and are using an inefficient JDBC driver.

    If you just want to be able to walk through a result list without incurring memory penalties when computing the size of the underlying result set, set the DefaultFetchThreshold property to -1 to force Kodo to use ResultSet.TYPE_FORWARD_ONLY result sets, and set the ResultListClass property to com.solarmetric.kodo.runtime.objectprovider.OnDemandForwardResultList. The result lists returned from query execution will be lazily loaded and will not know their correct size until the entire list is iterated, and it will be possible to jump to any index in the list.

    If you also expect the result list to be quite large and do not want to hold references to all the data at the same time, then you should also set the UseWindow configuration property in the com.solarmetric.kodo.ResultListProperties property:

    com.solarmetric.kodo.ResultListClass: com.solarmetric.kodo.runtime.objectprovider.OnDemandForwardResultList
    com.solarmetric.kodo.ResultListProperties: UseWindow=true
    com.solarmetric.kodo.DefaultFetchThreshold: -1
    
    Note

    Bear in mind that for most drivers and situations, the default LazyResultList will work fine. It is only in situations where query execution seems to take an inordinately long time or memory usage becomes an issue that the OnDemandForwardResultList becomes useful.

  19. Develop a custom SubclassProvider, use the com.solarmetric.kodo.impl.jdbc.ormapping.IntegerSubclassProvider, or turn off the subclass indicator column. Kodo JDO's default subclass provider is quite robust, in that it can handle any class and needs no configuration, but the downside of this robustness is that it puts a relatively lengthy string into each row of the database. With the IntegerSubclassProvider and a little application-specific configuration, you can easily reduce this to an integer. This can result in significant performance gains when dealing with many small objects, since the subclass indicator data can become a significant proportion of the data transferred between the JVM and the database.

    Alternately, if your application does not make use of inheritance, then you can disable the subclass provider column altogether.
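
    As a back-of-the-envelope illustration of the data volume involved, the sketch below compares a string indicator with a 4-byte integer indicator. It assumes the string is about as long as a fully qualified class name and is encoded as UTF-8; the class name and row count are made up, and actual storage and wire sizes depend on the database and driver.

```java
import java.nio.charset.StandardCharsets;

// Rough estimate only; the class name and row count are hypothetical.
class IndicatorSizeEstimate {

    // Bytes needed to carry a class-name string indicator for the given rows, assuming UTF-8.
    static long stringIndicatorBytes(String className, long rows) {
        return className.getBytes(StandardCharsets.UTF_8).length * rows;
    }

    // Bytes needed to carry a 4-byte integer indicator for the given rows.
    static long integerIndicatorBytes(long rows) {
        return Integer.BYTES * rows;
    }

    public static void main(String[] args) {
        String className = "com.example.model.LineItem"; // hypothetical class
        long rows = 1_000_000L;
        System.out.println("string indicator:  " + stringIndicatorBytes(className, rows) + " bytes");
        System.out.println("integer indicator: " + integerIndicatorBytes(rows) + " bytes");
    }
}
```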

  20. Set the com.solarmetric.kodo.CacheReferenceSize property to -1: Setting this property to -1 will cause the PersistenceManager to maintain hard references to all objects loaded through that PM, causing potential memory issues. However, it will no longer be necessary to maintain any ordering in this cache, so systems that load lots of objects may see some performance improvements.

  21. Do not use XA transactions unless distributed transaction functionality is required by your application. Distributed transactions are much slower than standard transactions (sometimes by as much as a factor of 500 to 1).

  22. Set the com.solarmetric.kodo.TransactionCacheClass property to com.solarmetric.kodo.runtime.ClassGroupStateManagerSet: See the StateManagerSet documentation for details.