10 Tuning Oracle TopLink

You can tune EclipseLink, an open-source persistence framework used with Oracle TopLink, to optimize its performance as the Java Persistence API (JPA) implementation.

About Oracle TopLink and EclipseLink

Oracle TopLink includes the open source EclipseLink as the Java Persistence API (JPA) implementation. Oracle TopLink extends EclipseLink with advanced integration into the Oracle Application Server.

The information here assumes that you are familiar with the basic functionality of EclipseLink. Before you begin tuning, consider reviewing the following introductory information:

For more information on Oracle TopLink, see the TopLink page on the Oracle Technology Network (OTN).

Note:

The information here serves as a Quick Start guide to performance tuning JPA in the context of a Java EE environment. While this information provides common performance tuning considerations and related documentation resources, it is not meant to be a comprehensive list of areas to tune.

Basic Tuning Considerations

To achieve optimal performance, you can follow the tuning recommendations that apply to your own use case scenarios.

SQL Statement and Query Tuning Parameters

Table 10-1 and Table 10-2 show tuning parameters and performance recommendations related to SQL statements and querying.

Table 10-1 EJB/JPA Using Efficient SQL Statements and Querying

Tuning Parameter Description Performance Notes

Parameterized SQL Binding

By using parameterized SQL and prepared statement caching, you can improve performance by reducing the number of times the database SQL engine parses and prepares SQL for a frequently called query. EclipseLink enables parameterized SQL by default. However, not all databases and JDBC drivers support these options. Note that the Oracle JDBC driver bundled with Oracle Application Server does support this option. Use the persistence property eclipselink.jdbc.bind-parameters in the persistence.xml file to configure this.

See also "Understanding Caching" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache.htm#CDEFHHEH and "Understanding Querying" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/queries.htm#CHDGGCJB

Default Value: PERSISTENCE_UNIT_DEFAULT, which is true by default.

Leave parameterized SQL binding enabled for selected databases and JDBC drivers that support these options.

JDBC Statement Caching

Statement caching is used to lower the performance impact of repeated cursor creation and repeated statement parsing and creation; this can improve performance for applications by using a database.

Note: For Java EE applications, use the data source's statement caching and do not use EclipseLink Statement Caching for EJB3.0/JPA. For example: eclipselink.jdbc.cache-statements="true".

Set this option in an Oracle Weblogic data source by setting Statement Cached Type and Statement Cached Size configuration options.

See also Increasing Performance with the Statement Cache in Administering JDBC Data Sources for Oracle WebLogic Server.

Default Value: The Oracle Weblogic Server data source default statement cache size is 10 statements per connection.

If your JDBC driver supports this option, you should always enable statement caching. The Oracle JDBC driver supports this option.

Fetch Size

The JDBC fetch size gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed.

For large queries that return a large number of objects, you can configure the row fetch size used in the query to improve performance by reducing the number database hits required to satisfy the selection criteria.

Most JDBC drivers use a default fetch size of 10. If you are reading 1000 objects, increasing the fetch size to 256 can significantly reduce the time required to fetch the query's results.

Note: The default value means use the JDBC driver default value, which is typically 10 rows for the Oracle JDBC driver.

To configure this, use query hint eclipselink.jdbc.fetch-size.

Default Value: 0.

The optimal fetch size is not always obvious. Usually, a fetch size of one half or one quarter of the total expected result size is optimal. Note that if you are unsure of the result set size, incorrectly setting a fetch size too large or too small can decrease performance.

Batch Writing

Batch writing can improve the database performance by sending groups of INSERT, UPDATE, and DELETE statements to the database in a single transaction, rather than individually.

Use the persistence property "eclipselink.jdbc.batch-writing"="JDBC" in the persistence.xml file to configure this.

Default Value: Off.

Enable for the persistence unit.

Change Tracking

This is an optimization feature that lets you tune the way EclipseLink detects changes in an entity.

Default Value: If using weaving (Java EE default) AttributeLevelotherwise Deferred.

Leave at default AttributeLevel for best performance.

Weaving

Can disable through persistence.xml properties eclipselink.weaving

Default Value: On.

Leave On for best performance.

Read Only

Setting an EJB3.0 JPA entity to Read Only ensures that the entity cannot be modified and enables EclipseLink to optimize unit of work performance.

Set through query hint eclipselink.read-only.

Can also be set at entity level by using the @ReadOnly class annotation.

Default Value: False.

For optimal performance use Read Only on any query where the resulting objects are not changed.

firstResult and maxRows

These are JPA query properties that are used for paging large queries. Typically, these properties can be used when the entire result set of a query returning a large number of rows is not needed. For example, when a user scans the result set (a page at a time) looking for a particular result and then discards the rest of the data after the record is found.

Use on queries that can have a large result set and only a subset of the objects is needed.

Sequence number pre-allocation

Sequence number preallocation enables a batch of ids to be queried from the database simultaneously to avoid accessing the database for an id on every insert.

Default Value: 50.

Always use sequence number preallocation for best performance for inserts. SEQUENCE or TABLE sequencing should be used for optimal performance, not IDENTITY, which does not allow preallocation.

Entity Relationships Query Tuning Parameters

Table 10-2 shows the entity relationship between the query parameters for performance tuning.

Table 10-2 EJB3.0 Entity Relationship Query Performance Options

Tuning Parameter Description Performance Notes

Batch Fetching

The eclipselink.batch hint supplies EclipseLink with batching information so subsequent queries of related objects can be optimized in batches instead of being retrieved one-by-one or in one large joined read.

Batch fetching has three types: JOIN, EXISTS , and IN. The type is set through the query hint eclipselink.batch.type.

Note that batching is only allowed on queries that have a single object in their select clause. The query hint to configure this is eclipselink.batch. Batch fetching can also be set by using the @BatchFetch annotation.

Default Value: Off.

Use it to query the tables with columns mapping to the table data you need. You should only use either batch fetching or joining if you know that you are going to access all the data; if you do not intend to access the relationships, then let the indirection defer their loading.

Batch fetching is more efficient than joining because it avoids reading duplicate data; therefore for best performance for queries where batch fetching is supported, consider using batch fetching instead of join reading.

Join Fetching

Join fetching is a query optimization feature that enables a single query for a class to return the data to build the instances of that class and its related objects.

Use this feature to improve query performance by reducing database access. By default, relationships are not join-read; if you are using lazy-loading, each relationship is fetched separately when accessed or as a separate database query if you are not using lazy-loading.

You can specify the use of join in JPQL (JOIN FETCH), or you can set it as multilevel in the query hint eclipselink.join-fetch. It also can be set in the mapping annotation @JoinFetch.

Joining is part of the JPA specification, whereas batch fetching is not. And, joining works on queries that do not work with batch fetching. For example, joining works on queries with multiple objects in the select clause, queries with a single result, and for cursors and first or max results, whereas batch fetching does not.

See also "Join Fetching" at http://www.eclipse.org/eclipselink/documentation/2.5/solutions/performance001.htm#CHDEGCHH

Default Value: Not Used.

Use it to query the tables with columns mapping to the table data you need. You should only use either batch fetching or joining if you know that you are going to access all the data; if you do not intend to access the relationships, then let the indirection defer their loading. For the best performance of selects, where batch fetching is not supported, a join is recommended

Lazy loading

Without lazy loading on, when EclipseLink retrieves a persistent object, it retrieves all the dependent objects to which it refers. When you configure lazy reading (also known as indirection, lazy loading, or just-in-time reading) for an attribute mapped with a relationship mapping, EclipseLink uses an indirection object as a place holder for the referenced object.

EclipseLink defers reading the dependent object until you access that specific attribute. This can result in a significant performance improvement, especially if the application is interested only in the contents of the retrieved object, rather than the objects to which it is related.

See also "Using Lazy Loading" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/mappingintro001.htm#CEGBCJAG.

Default Value: On for collection mapping (ToMany mappings, @OneToMany and @ManyToMany)

Default Value: Off for reference (ToOne mappings, @OneToOne and @ManyToOne)

Note:

Setting lazy loading to On for @OneToOneand @ManyToOne requires weaving, which is set to On by default for Java EE.

Use lazy loading for all mappings. Using lazy loading and querying the referenced objects by using batch fetching or Join is more efficient than Eager loading.

You may also consider using optimized loading with LoadGroups, which allows a query to force instantiation of relationships.

Cache Configuration Tuning Parameters

You can tune the default internal cache that is provided by EclipseLink. Oracle Toplink or EclipseLink can also be integrated with Oracle Coherence. For information on configuring and tuning an EclipseLink Entity Cache by using Oracle Coherence, see .

The default settings for EJB3.0/JPA that is used with the EclipseLink persistence manager and cache are no locking, no cache refresh, and cache-usage DoNotCheckCache. To ensure that your application uses the cache and does not read stale data from the cache (when you do not have exclusive access), you must configure these and other isolation related settings appropriately. Table 10-3 shows the cache configuration options.

For more information on cache configuration, see "Understanding Caching" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache.htm#CDEFHHEH.

Note:

By default, EclipseLink assumes that your application has exclusive access to the data it is using that is, there are no external, non-EclipseLink, or applications that are modifying the data. If your application does not have exclusive access to the data, then you must change some of the defaults from Table 10-3.

Table 10-3 EJB3.0 JPA Entities and Cache Configuration Options

Tuning Parameter Description Performance Notes

Object Cache

EclipseLink sessions provide an object cache. EJB3.0 JPA applications that use the EclipseLink persistence manager create EclipseLink sessions that by default use this cache. This cache, known as the session cache, retains information about objects that are read from or written to the database, and is a key element for improving the performance of an EclipseLink application.

Typically, a server session's object cache is shared by all client sessions that are acquired from it. Isolated sessions provide their own session cache isolated from the shared object cache.

The annotation type @Cacheable specifies whether an entity should be cached. Caching is enabled when the value in the persistence.xml file caching element is ENABLE_SELECTIVE or DISABLE_SELECTIVE. The value of the Cacheable annotation is inherited by subclasses; it can be overridden by specifying Cacheable on a subclass.

Cacheable(false) means that the entity and its state must not be cached by the provider.

Default Value: Enabled (shared is True).

Generally, it is recommended that you leave caching enabled. If you have an object that is always read from the database, as in a pessimistic locked object, then the cache for that entity should be disabled. Also, consider disabling the cache for infrequently accessed entities.

Query Result Set Cache

In addition to the object cache in EclipseLink, EclipseLink also supports a query cache:

  • The object cache indexes objects by their primary key, allowing primary key queries to obtain cache hits. By using the object cache, queries that access the data source can avoid the cost of building the objects and their relationships if the object is already present.

  • The query cache is distinct from the object cache. The query cache is indexed by the query and the query parameters-not the object's primary key. This enables any query executed with the same parameters to obtain a query cache hit and return the same result set.

The query hints for a query cache are:

eclipselink.query-cache

eclipselink.query-cache.size

eclipselink.query-cache.invalidation

See also "Understanding Caching" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache.htm#CDEFHHEH and "JPA Query Customization Extensions" at http://www.eclipse.org/eclipselink/documentation/2.5/jpa/extensions/queryhints.htm#sthref498.

Default Value: Not Used.

Use for frequently executed non-primary key queries with infrequently changing result sets. Use with a cache invalidation time out to refresh as needed.

Cache Size

Cache size can be configured through the following persistence properties: eclipselink.cache.size.entity

eclipselink.cache.size.default

eclipselink.cache.type.default

See also "About the Persistence Unit" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/appdeployment002.htm#BABHCJDG and "Class PersistenceUnitProperties" at http://www.eclipse.org/eclipselink/api/2.3/org/eclipse/persistence/config/PersistenceUnitProperties.html.

Default Value: Type SoftWeak, Size 100 (per Entity). The default value may be different if Toplink is running on Exalogic. See Enable the Exalogic Automated Tuner in the Solutions Guide for Oracle TopLink for more information about the Exalogic default.

Based on your tolerance for stale data, set the cache size relative to how much memory you have available, how many instances of the class you have, the frequency the entities are accessed, and how much caching you want.

Consider creating larger cache sizes for entities that have many instances that are frequently accessed and stale data is not a big issue.

Consider using smaller cache sizes or no cache for frequently updated entities that must always have fresh data, or infrequently accessed entities.

Locking

Oracle supports the locking policies shown in Table 10-4: No Locking, Optimistic, Pessimistic, and Read Only.

Locking is set through JPA @Version annotation, eclipselink.read-only

See "Descriptors and Locking" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/descriptors002.htm#CHEEEIEA.

Default Value: No Locking.

For entities that can be updated concurrently, consider using the locking policy to prevent a user from writing over another users changes. To optimize performance for read-only entities, consider defining the entity as Read Only or use a read-only query hint.

Cache Usage

By default, all query types search the database first and then synchronize with the cache. Unless refresh has been set on the query, the cached objects can be returned without being refreshed from the database. You can specify whether a given query runs against the in-memory cache, the database, or both.

To get performance gains by avoiding the database lookup for objects already in the cache, you can configure that the search attempts to retrieve the required object from the cache first, and then search the data source only if the object is not in the cache. For a query that looks for a single object based on a primary key, this is done by setting the query hint eclipselink.cache-usage to CheckCacheByExactPrimaryKey.

Default Value: DoNotCheckCache.

For faster performance on primary key queries, where the data is typically in the cache and does not require a lot of refreshing, it is recommended to check the cache first on these queries by using CheckCacheByExactPrimaryKey.

This avoids the default behavior of retrieving the object from the database first and then for objects already in the cache, returning the cached values, which are not updated from the database access, unless refresh has been set on the query.

Isolation

There is not a single tuning parameter that sets a particular database transaction isolation level in a JPA application that uses EclipseLink.

In a typical EJB3.0 JPA application, a variety of factors affect when database transaction isolation levels apply and to what extent a particular database transaction isolation can be achieved, including the following:

  • Locking mode

  • Use of the Session Cache

  • External Applications

  • Database Login method setTransactionIsolation

See also Isolated Cache at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache001.htm#CDEEGICF.

 

Cache Refreshing

By default, EclipseLink caches objects read from a data source. Subsequent queries for these objects access the cache and thus improve performance by reducing data source access and avoiding the cost of rebuilding object's and their relationships. Even if a query accesses the data source, if the objects corresponding to the records returned are in the cache, EclipseLink uses the cached objects. This default caching policy can lead to stale data in the application.

Refreshing can be enabled at the entity level (alwaysRefresh or refreshOnlyIfNewer and expiry) and at the query level (with the eclipselink.refresh query hint). You can also force queries to go to the database with (disableHits). Using an appropriate locking policy is the only way to ensure that stale or conflicting data does not get committed to the database.

See About Cache Refreshing.

See also Understanding Caching at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/cache.htm#CDEFHHEH.

Default Value: No Cache Refreshing

Try to avoid entity level cache refresh and instead, consider configuring the following:

  • cache refresh on a query-by-query basis

  • cache expiration

  • isolated caching

About Cache Refreshing

There are a few scenarios to consider for data refreshing in the cache, all with performance implications:

  • In the case where you never want cached data and always want fresh data, consider using an isolated cache (Shared=False). This is the case when certain data in the application changes so frequently that it is desirable to always refresh the data, instead of only refreshing the data when a conflict is detected.

  • In the case when you want to avoid stale data, but getting stale data is not a major issue, then using a cache expiry policy would be the recommended solution. In this case you should also use optimistic locking, which automatically refresh stale objects when a locking error occurs. If using optimistic locking, you could also enable the entity @Cache attributes alwaysRefresh and refreshOnlyIfNewer to allow queries that access the database to refresh any stale objects returned, and avoid refreshing invalid objects when unchanged. You may also want to enable refreshing on certain query operations when you know you want refreshed data, or even provide the option of refreshing something from the client that would call a refreshing query.

  • In the case when you are not concerned about stale data, you should use optimistic locking; this automatically refreshes stale objects in the cache on locking errors.

Locking Mode Policy Options

The locking modes, as shown in Table 10-4, along with EclipseLink cache-usage and query refreshing options, ensures data consistency for EJB entities using JPA. The different combinations have both functional and performance implications, but often the functional requirements for up-to-date data and data consistency lead to the settings for these options, even when it may be at the expense of performance.

For more information, see "Descriptors and Locking" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/descriptors002.htm#CHEEEIEA.

Table 10-4 Locking Mode Policies

Locking Option Description Performance Notes

No Locking

The application does not prevent users overwriting each other's changes. This is the default locking mode. Use this mode if the entity is never updated concurrently or concurrent reads and updates to the same rows with read-committed semantics is sufficient.

Default Value: No Locking.

In general, no locking is faster, but may not meet your needs for data consistency.

Optimistic

All users have read access to the data. When a user attempts to make a change, the application checks to ensure that the data has not changed since the user read the data.

See also "Using Optimistic Locking" at http://www.eclipse.org/eclipselink/documentation/2.5/concepts/mappingintro005.htm#CEGDIIIB.

If infrequent concurrent updates to the same rows are expected, then optimistic locking may provide the best performance while providing data consistency guarantees.

Pessimistic

The first user who accesses the data with the purpose of updating it locks the data until completing the update.

If frequent concurrent updates to the same rows are expected, pessimistic locking may be faster than optimistic locking that is getting a lot of concurrent access exceptions and retries.

When using pessimistic locking at the entity level, it is recommended that you use it with an isolated cache (Shared=False) for best performance.

Read Only

Setting an EJB3.0 JPA entity to Read Only ensures that the entity cannot be modified and enables EclipseLink to optimize the unit of work performance.

Set at the entity level by using @ReadOnly class annotation. Can also be set at the query level through the query hint eclipselink.read-only.

Defining an entity as Read Only can perform better than an entity that is not defined as Read Only, yet does no inserts, updates, or deletes, since it enables EclipseLink to optimize the unit of work performance. Always use Read Only for all read-only operations.

About Mapping and Descriptor Configurations

EclipseLink can transform data between an object representation and a representation specific to a data source. This transformation is called mapping and it is the core of a EclipseLink project.

A mapping corresponds to a single data member of a domain object. It associates the object data member with its data source representation and defines the means of performing the two-way conversion between object and data source.

For information on Mapping, see “Mapping and Descriptors” at http://www.eclipse.org/eclipselink/documentation/2.5/solutions/performance002.htm#sthref153.

About Data Partitioning

EclipseLink allows you to configure data partitioning by using the @Partitioned annotation. Partitioning enables an application to scale information across multiple databases; including clustered databases.

For more information on using @Partitioned and other partitioning policy annotations, see "Partitioning Annotations" at http://www.eclipse.org/eclipselink/documentation/2.5/jpa/extensions/annotations_ref.htm#CACHIHIB.

Advanced Tuning Considerations

After you have performed the recommended modifications, you can make additional changes that are specific to your deployment. Consider carefully whether the advanced tuning recommendations are appropriate for your environment.

Integrating with Oracle Coherence

Oracle Toplink can be integrated with Oracle Coherence. This integration is provided through the Oracle TopLink Grid feature. With TopLink Grid, there are several types of integration with EclipseLink JPA features.

For example:

  • Replace the default EclipseLink L2 cache with Coherence. This provides support for very large L2 caches that span cluster nodes. EclipseLink's default L2 cache improves performance for multithreaded and Java EE server hosted applications that are running in a single JVM, and requires configuring special cache coordination features if used across a cluster.

  • Configure entities to execute queries in the Coherence data grid instead of the database. This allows clustered application deployments to scale beyond database-bound operations.

For using EclipseLink JPA with a Coherence Cache, see Grid Cache Configuration in Integrating Oracle Coherence.

For details on Oracle Toplink integration with Oracle Coherence, see Integrating Toplink Grid with Oracle Coherence in Integrating Oracle Coherence.

Analyzing EclipseLink JPA Entity Performance

The following features in EclipseLink can help you analyze your JPA application performance: