Terracotta 3.2.1 Documentation

2.2 Testing and Tuning Terracotta Distributed Ehcache for Hibernate

This document shows you how to test and tune Terracotta Distributed Ehcache for Hibernate.

TIP: Top Tuning Tips

- Turn Off Query Cache
- Prevent Unnecessary Database Connections (see Reducing Unnecessary Database Connections)
- Configure Database Connection Pool (see Connection Pools)
- Turn off Unnecessary Statistics Gathering

2.2.1 Testing the Cache

The main benefit of a Hibernate second-level cache is raising performance by decreasing the number of times an application accesses the database. To gauge the level of database offloading provided by the Terracotta Distributed Ehcache for Hibernate second-level cache, look for these benefits:

The number of threads that can simultaneously access the distributed second-level cache can be scaled up more easily and efficiently than database connections, which generally are limited by the size of the connection pool.

You should record measurements for all of these factors before enabling the Terracotta Distributed Ehcache for Hibernate second-level cache to create a benchmark against which you can assess the impact of using the cache. You should also record measurements for all of these factors before tuning the cache to gauge the impact of any tuning changes you make.

In addition to performance testing, verify that the expected data is being loaded. For example, loading one entity can result in multiple cache entries. One approach to tracking cache operations is to set Hibernate cache logging to "debug" in log4j.properties:

log4j.logger.org.hibernate.cache=debug

This level of logging should not be used during performance testing.

NOTE: Optimizing Cache Performance

Before doing performance testing, you should read through the rest of this document to learn about optimizing cache performance. Some performance optimization can be done ahead of time, while some may require testing to reveal its applicability.

When using a testing framework, ensure that the framework does not cause a performance bottleneck and skew results.

2.2.2 Optimizing the Cache Size

Caches that get too large may become inefficient and suffer from performance degradation. A growing rate of flushing and faulting is an indication of a cache that's become too large and should be pruned.

2.2.2a Eviction Parameters

The most important parameters for tuning cache size and cache performance in general are the following:

  • Time to Idle (TTI) – This parameter controls how long an entity can remain in the cache without being accessed at least once. TTI is reset each time an entity is accessed. Use TTI to evict little-used entities to shrink the cache or make room for more frequently used entities. Adjust the TTI up if the faulting rate (data faulted in from the database) seems too high, and lower it if flushing (data cleared from the cache) seems too high.
  • Time to Live (TTL) – This parameter controls how long an entity can remain in the cache, regardless of how often it is used (it is never overridden by TTI). Use TTL to prevent the cache from holding stale data. As entities are evicted by TTL, fresh versions are cached the next time they are accessed.

TTI and TTL are set in seconds.
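The interaction between TTI and TTL described above can be modeled in a few lines. This is a minimal sketch for reasoning about parameter values, not Terracotta's implementation: the class and method names are hypothetical, and treating an entry as expired exactly when the limit is reached is an assumption.

```java
/** Illustrative model of TTI/TTL expiry; not Terracotta's implementation. */
final class ExpirySketch {

    /**
     * Returns true if an entry should be treated as expired at time nowSec.
     * All values are in seconds. A ttiSec or ttlSec of 0 disables that check.
     */
    static boolean isExpired(long createdSec, long lastAccessSec, long nowSec,
                             long ttiSec, long ttlSec) {
        // TTI is measured from the last access, so every access resets it.
        boolean idleTooLong = ttiSec > 0 && nowSec - lastAccessSec >= ttiSec;
        // TTL is measured from creation and is unaffected by accesses.
        boolean livedTooLong = ttlSec > 0 && nowSec - createdSec >= ttlSec;
        return idleTooLong || livedTooLong;
    }

    public static void main(String[] args) {
        // Created at t=0, last accessed at t=100, TTI=120, TTL=0 (no hard lifetime).
        System.out.println(isExpired(0, 100, 200, 120, 0)); // idle 100 s
        System.out.println(isExpired(0, 100, 220, 120, 0)); // idle 120 s
        // TTL still applies to an entry that was accessed one second ago.
        System.out.println(isExpired(0, 86399, 86400, 3600, 86400));
    }
}
```

With both parameters set to 0 the model never expires an entry, which matches the "never evicted" behavior described for eviction parameters set to 0.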

You can also control Hibernate region sizes using the following parameters:

  • Target Max In-Memory Count - The maximum number of elements allowed in a region in any one client (any one application server). If this target is exceeded, eviction occurs to bring the count within the allowed target. 0 means no eviction takes place (infinite size is allowed).
  • Target Max Total Count - The maximum total number of elements allowed for a region in all clients (all application servers). If this target is exceeded, eviction occurs to bring the count within the allowed target. 0 means no eviction takes place (infinite size is allowed).
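The count-based targets above can be pictured with a small bounded map. This is only a sketch of the general idea: the class name is hypothetical, and evicting the least recently used entry is an assumption made for illustration, since the source does not state which entries Terracotta chooses to evict.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative count-bounded region that evicts least-recently-used entries. */
final class BoundedRegionSketch<K, V> extends LinkedHashMap<K, V> {
    // 0 means unbounded, mirroring the meaning of 0 for the targets above.
    private final int targetMaxCount;

    BoundedRegionSketch(int targetMaxCount) {
        super(16, 0.75f, true); // access order: reads move an entry to the "newest" end
        this.targetMaxCount = targetMaxCount;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict only when a non-zero target is exceeded.
        return targetMaxCount > 0 && size() > targetMaxCount;
    }

    public static void main(String[] args) {
        BoundedRegionSketch<String, String> region = new BoundedRegionSketch<>(2);
        region.put("a", "1");
        region.put("b", "2");
        region.get("a");      // "a" is now more recently used than "b"
        region.put("c", "3"); // exceeds the target of 2, so "b" is evicted
        System.out.println(region.keySet());
    }
}
```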

 

TIP: Default Eviction Parameters

When you first install Terracotta Distributed Ehcache for Hibernate, eviction is turned off by default because all eviction parameters are set to 0. You should set these parameters to non-zero values to turn eviction on, then tune them based on your application's requirements and performance characteristics.

How to Set Eviction Parameters

You can set eviction parameters in two different ways:

  • In ehcache.xml – Configuration file for Terracotta Distributed Ehcache for Hibernate with properties for controlling eviction on a per-cache basis. See Setting Cache Eviction for more information.
  • The Terracotta Developer Console – The GUI for Hibernate second-level cache allows you to apply real-time values to eviction parameters and export a configuration file.

After setting eviction parameters, be sure to test the effect on performance (see Testing the Cache).

2.2.2b Reducing the Cache Miss Rate

The cache miss rate is a measure of requests that the cache could not meet. Each miss can lead to a fault which requires a database query. (However, misses and faults are not one-to-one since a query can return results that satisfy more than one miss.) A high or growing cache miss rate indicates the cache should be optimized.
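As a rough illustration, the miss rate can be computed from hit and miss counters sampled at two points in time. The helper below is a hypothetical sketch, not part of any Terracotta or Hibernate API. In a Hibernate application the counters could plausibly come from the Statistics object (getSecondLevelCacheHitCount() and getSecondLevelCacheMissCount()), assuming statistics are enabled while measuring.

```java
/** Illustrative miss-rate arithmetic over cumulative hit/miss counters. */
final class MissRateSketch {

    /** Fraction of lookups that missed, from 0.0 to 1.0; 0.0 if there were no lookups. */
    static double missRate(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) misses / lookups;
    }

    /** Miss rate for the interval between two samples of cumulative counters. */
    static double intervalMissRate(long hitsBefore, long missesBefore,
                                   long hitsAfter, long missesAfter) {
        return missRate(hitsAfter - hitsBefore, missesAfter - missesBefore);
    }

    public static void main(String[] args) {
        // 900 hits and 100 misses overall.
        System.out.println(missRate(900, 100));
        // Between the two samples: 800 more hits and 200 more misses.
        System.out.println(intervalMissRate(900, 100, 1700, 300));
    }
}
```

Comparing interval rates, rather than the cumulative rate, makes a growing miss rate easier to spot after a tuning change.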

To lower the miss rate, adjust the eviction parameters of regions containing frequently accessed entities so that eviction occurs less often. This keeps popular entities in the cache for longer periods of time. You should adjust eviction parameter values incrementally and carefully observe the effect on the cache miss rate. Keep in mind that TTI and TTL values that are set too high can introduce other drawbacks, such as stale data or overly large caches.

2.2.2c Examinator Example

Examinator, the Terracotta reference application that uses Terracotta Distributed Ehcache for Hibernate to implement the second-level cache, supports thousands of concurrent user sessions. This web-based test-taking application caches exams and must have TTI and TTL properly tuned to prevent unnecessarily large data caches and stale exam pages.

The following sections detail how certain cached Examinator data is configured for second-level caching. Included are snippets from the Terracotta Distributed Ehcache for Hibernate configuration file (see Cache Configuration File).

User Roles

The data defining user roles has the following characteristics:

  • Never changes – User roles are fixed (read only).
  • Accessed frequently – Each user session must have a user role.

Therefore, user roles are cached and never evicted (TTI=0, TTL=0). In general, read-only data that is used frequently and never grows stale should be cached continuously.

<cache name="org.terracotta.reference.exam.domain.UserRole"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="0"
    timeToLiveSeconds="0"
    overflowToDisk="false">
       <terracotta/>
</cache>

User Data

User data, which includes the user entity and its role, is useful only while the user is active. This data has the following characteristics:

  • Access is unpredictable – User interaction with the application is unpredictable and can be sporadic.
  • Lifetime is unpredictable – The data is useful as long as the user session has activity. Only when the user becomes inactive are the associated entities idle.

Therefore, these entities should have a short idle time of two minutes (TTI=120) to allow data associated with inactive user sessions to be evicted. However, there should never be eviction based on a hard lifetime (TTL=0), thus allowing the associated entities to be cached indefinitely as long as TTI is reset by activity.

<cache name="org.terracotta.reference.exam.domain.User"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="120"
    timeToLiveSeconds="0"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.User.roles"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="120"
    timeToLiveSeconds="0"
    overflowToDisk="false">
       <terracotta/>
</cache>

Exam Data

Exam data includes the actual exams being taken by users. It has the following characteristics:

  • Rarely changes – There is the potential for exam questions to be changed in the database, but this happens infrequently.
  • Data set is large – There can be any number of exams, and not all of them can be cached due to limitations on the size of the cache.

Since there can be many different exams, and the potential exists for a cached exam to become stale, cached exams should be periodically evicted based on lack of access (TTI=3600) and to ensure they are up-to-date (TTL=86400).

<cache name="org.terracotta.reference.exam.domain.Exam"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section.questions"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section.sections"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Question"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Question.choices"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Choice"
    maxElementsInMemory="1000"
    eternal="false"
    timeToIdleSeconds="3600"
    timeToLiveSeconds="86400"
    overflowToDisk="false">
       <terracotta/>
</cache>

2.2.3 Optimizing for Read-Only Data

If your application caches read-only data, the following may improve performance:

2.2.4 Reducing Unnecessary Database Connections

The JDBC mode Autocommit automatically writes changes to the database, making it unnecessary for an application to do so explicitly. However, unnecessary database connections can result from Autocommit because of the way JDBC drivers are designed. For example, transactional read-only operations in Hibernate, even those that are resolved in the second-level cache, still generate "empty" database connections. This situation, which can be tracked in database logs, can quickly have a detrimental effect on performance.

Turning off Autocommit should prevent empty database connections, but may not work in all cases. Lazily fetching JDBC connections resolves the issue by preventing JDBC calls until a connection to the database is actually needed.

NOTE: Autocommit

While Autocommit should be turned off to reduce unnecessary database connections for applications that create their own transaction boundaries, it may be useful for applications with on-demand (lazy) loading of data. You should investigate Autocommit with your application to discover its effect.

Two options are provided for implementing lazy fetching of database connections:

2.2.4a Lazy Fetching with Spring-Managed Transactions

If your application is based on the Spring framework, turning off Autocommit may not be enough to reduce unnecessary database connections for transactional read operations. You can prevent these empty database connections from occurring by using the Spring LazyConnectionDataSourceProxy proxy definition. The proxy holds unnecessary JDBC calls until a connection to the database is actually required, at which time the held calls are applied.

To implement the proxy, create a target DataSource definition (or rename your existing target DataSource) and a LazyConnectionDataSourceProxy proxy definition in the Spring application context file:

<!-- Renamed the existing target DataSource to 'dataSourceTarget' which will be used by the proxy. -->
<bean id="dataSourceTarget"
      class="org.apache.commons.dbcp.BasicDataSource"
      destroy-method="close">
   <property name="driverClassName"><value>com.mysql.jdbc.Driver</value></property>
   <property name="url"><value>jdbc:mysql://localhost:3306/imagedb</value></property>
   <property name="username"><value>admin</value></property>
   <property name="password"><value></value></property>
   <!-- other datasource configuration properties -->
</bean>
<!-- This is the lazy DataSource proxy that interacts with the target DataSource once a real statement is sent to the database. Users use this DataSource to set up their Hibernate session factory, which in turn forces the Hibernate second-level cache and also everything that interacts with that Hibernate session factory to use it. -->
<bean id="dataSource"
      class="org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy">
   <property name="targetDataSource"><ref local="dataSourceTarget"/></property>
</bean>

Your application's SessionFactory, transaction manager, and all DAOs should access the proxy. Since the proxy implements the DataSource interface too, it can simply be passed in instead of the target DataSource.

See the Spring documentation for more information.

2.2.4b Lazy Fetching for Non-Spring Applications

By implementing a custom Hibernate connection provider, you can use the LazyConnectionDataSourceProxy in a non-Spring based application:

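import java.sql.Connection;
import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;
import org.apache.commons.dbcp.BasicDataSource;
import org.apache.commons.dbcp.BasicDataSourceFactory;
import org.hibernate.HibernateException;
import org.hibernate.connection.ConnectionProvider; // Hibernate 3.x package
import org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy;

public class LazyDBCPConnectionProvider implements ConnectionProvider {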
    private DataSource ds;
    private BasicDataSource basicDs;
    public void configure(Properties props) throws HibernateException {
        // DBCP properties used to create the BasicDataSource
        Properties dbcpProperties = new Properties();
        // set some DBCP properties or implement logic to get them from the Hibernate config
        try {
            // Let the factory create the pool
            basicDs = (BasicDataSource)BasicDataSourceFactory.createDataSource(dbcpProperties);
            ds = new LazyConnectionDataSourceProxy(basicDs);
            // The BasicDataSource has lazy initialization
            // borrowing a connection will start the DataSource
            // and make sure it is configured correctly.
            Connection conn = ds.getConnection();
            conn.close();
        } catch (Exception e) {
            String message = "Could not create a DBCP pool";
            if (basicDs != null) {
                try {
                    basicDs.close();
                } catch (Exception e2) {
                    // ignore
                }
                ds = null;
                basicDs = null;
            }
            throw new HibernateException(message, e);
        }
    }
    public Connection getConnection() throws SQLException {
        return ds.getConnection();
    }
    public void closeConnection(Connection conn) throws SQLException {
        conn.close();
    }
    public void close() throws HibernateException {
        try {
            if (basicDs != null) {
                basicDs.close();
                ds = null;
                basicDs = null;
            }
        } catch (Exception e) {
            throw new HibernateException("Could not close DBCP pool", e);
        }
    }
    public boolean supportsAggressiveRelease() {
        return false;
    }
}

To use the custom connection provider, update hibernate.cfg.xml with the following property:

<property name="connection.provider_class">LazyDBCPConnectionProvider</property>
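The connection provider above leaves open how the DBCP properties are populated. One possible approach, sketched here under assumptions, is to copy the standard Hibernate connection settings into the property names that Commons DBCP's BasicDataSourceFactory reads. The helper class DbcpPropertyMapper is hypothetical, and the sketch assumes Hibernate passes the settings to configure() under their hibernate.connection.* keys.

```java
import java.util.Properties;

/** Hypothetical helper: copies Hibernate connection settings to DBCP property names. */
final class DbcpPropertyMapper {

    // Left: Hibernate connection setting. Right: DBCP BasicDataSourceFactory property.
    private static final String[][] MAPPINGS = {
        {"hibernate.connection.driver_class", "driverClassName"},
        {"hibernate.connection.url", "url"},
        {"hibernate.connection.username", "username"},
        {"hibernate.connection.password", "password"},
        {"hibernate.connection.autocommit", "defaultAutoCommit"},
    };

    /** Returns a new Properties object holding only the settings that were present. */
    static Properties toDbcpProperties(Properties hibernateProps) {
        Properties dbcp = new Properties();
        for (String[] mapping : MAPPINGS) {
            String value = hibernateProps.getProperty(mapping[0]);
            if (value != null) {
                dbcp.setProperty(mapping[1], value);
            }
        }
        return dbcp;
    }

    public static void main(String[] args) {
        Properties hibernate = new Properties();
        hibernate.setProperty("hibernate.connection.url", "jdbc:mysql://localhost:3306/imagedb");
        hibernate.setProperty("hibernate.connection.username", "admin");
        hibernate.setProperty("hibernate.connection.autocommit", "false");
        System.out.println(toDbcpProperties(hibernate));
    }
}
```

In configure(), the result of such a mapping could be used in place of the empty dbcpProperties object. Pool sizing properties would still need to be set separately.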

2.2.5 Reducing Memory Usage with Batch Processing

If your application must perform a large number of insertions or updates with Hibernate, a potential antipattern can emerge from the fact that all transactional insertions or updates in a session are stored in the first-level cache until flushed. Therefore, waiting to flush until the transaction is committed can result in an OutOfMemoryError (OOME) during large operations of this type.

You can prevent OOMEs in this case by processing the insertions or updates in batches, flushing after each batch. The Hibernate core documentation gives the following example for inserts:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
   
for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
   
tx.commit();
session.close();

 

TIP: session.clear()

The performance of session.clear() has been improved in Hibernate 3.3.2.

Updates can be batched similarly. The JDBC batch size referred to in the comment above is set in the Hibernate configuration property hibernate.jdbc.batch_size. For more information, see "Batch processing" in the Hibernate core documentation.
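The flush-per-batch pattern can also be modeled without Hibernate to see how it bounds the number of pending objects. The sketch below is an illustration only: BatchingSketch and its flush callback are hypothetical names, and in a real application the callback would stand in for the session.flush() and session.clear() calls. Unlike the loop above, it also flushes a final partial batch explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative batching loop: at most batchSize items are pending at any time. */
final class BatchingSketch {

    /** Hands items to flush in groups of batchSize and returns the number of flushes. */
    static <T> int processInBatches(List<T> items, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> pending = new ArrayList<>(batchSize);
        int flushes = 0;
        for (T item : items) {
            pending.add(item);
            if (pending.size() == batchSize) {
                flush.accept(new ArrayList<>(pending)); // stands in for flush() + clear()
                pending.clear();                        // release the batch's memory
                flushes++;
            }
        }
        if (!pending.isEmpty()) {
            flush.accept(new ArrayList<>(pending)); // final partial batch
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 45; i++) {
            items.add(i);
        }
        int flushes = processInBatches(items, 20,
                batch -> System.out.println("flushing " + batch.size() + " items"));
        System.out.println("flushes: " + flushes);
    }
}
```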

2.2.6 Other Important Tuning Factors

The following factors could affect the performance of your second-level cache.

2.2.6a Query Cache

This Hibernate feature creates overhead regardless of how many queries are actually cached. For example, it records timestamps for entities even if not caching the related queries. Query cache is on if the following element is set in hibernate.cfg.xml:

<property name="hibernate.cache.use_query_cache">true</property>

If query cache is turned on, two specially-named cache regions appear in the Terracotta Developer Console cache-regions list. The two regions are the query cache and the timestamp cache.

Unless you are certain that the query cache benefits your application, it is recommended that you turn it off (set hibernate.cache.use_query_cache to "false").

2.2.6b Connection Pools

If your installation of Hibernate uses JDBC directly, you use a connection pool to create and manage the JDBC connections to a database. Hibernate provides a default connection pool and supports a number of different connection pools. The low-performance default connection pool is inadequate for anything more than initial development and testing. Use one of the supported connection pools, such as C3P0 or DBCP, and be sure to set the number of connections to an optimal amount for your application.

2.2.6c Local Key Cache

Terracotta Distributed Ehcache for Hibernate can cache a "hotset" of keys on clients to add locality-of-reference, a feature suitable for read-only cases. Note that the set of keys must be small enough for available memory.

See Cache Configuration File for more information on configuring a local key cache.

2.2.6d Hibernate CacheMode

CacheMode is the Hibernate class that controls how a session interacts with second-level and query caches.

If your application explicitly warms the cache (reloads entities), CacheMode should be set to REFRESH to prevent unnecessary reads and null checks.

2.2.6e Cache Concurrency Strategy

If your application can tolerate somewhat inconsistent views of data, and the data does not change frequently, consider changing the cache concurrency strategy from READ_WRITE to NONSTRICT_READ_WRITE to boost performance. See Cache Concurrency Strategies for more information on cache concurrency strategies.

2.2.6f Terracotta Server Optimization

You can optimize the Terracotta servers in your cluster to improve cluster performance with a second-level cache. Some server optimization requires editing the Terracotta configuration file. For more information on the Terracotta configuration file, see the Terracotta Configuration Guide and Reference.

Test the following recommendations to gauge their impact on performance.

Less Aggressive Memory Management

By default, Terracotta servers clear a certain amount of heap memory based on the percentage of memory used. You can configure a Terracotta server to be less aggressive in clearing heap memory by raising the threshold that triggers this action. Allowing more data to remain in memory makes larger caches more efficient by reducing the server's swap-to-disk dependence. Be sure to test any changes to the threshold to confirm that the server doesn't suffer an OOME by failing to effectively manage memory at the new threshold level.

The default threshold is 70 (70 percent of heap memory used). Raise the threshold by setting a higher value for the Terracotta property l2.cachemanager.threshold in one of the following ways.

Create a Java Property

To set the threshold at 90, add the following option to $JAVA_OPTS before starting the Terracotta server:

-Dcom.tc.l2.cachemanager.threshold=90

Be sure to export JAVA_OPTS. If you adjust the threshold value after the server is running, you must restart the Terracotta server for the new value to take effect.

Add to Terracotta Configuration

Add the following configuration to the top of the Terracotta configuration file (tc-config.xml by default) before starting the Terracotta server:

<tc-properties>
     <property name="l2.cachemanager.threshold" value="90" />
</tc-properties>

You must start the Terracotta server with the configuration file you've updated:

start-tc-server.sh -f <path_to_configuration_file>

Use start-tc-server.bat in Microsoft Windows.

Run in Non-Persistent Mode

If your data is backed by a database, and no critical data exists only in memory, you can run the Terracotta server in non-persistent mode (temporary-swap-only mode). By default, Terracotta servers are set to non-persistent mode. For more information on persistence, see the Terracotta Configuration Guide and Reference.

Reduce the Berkeley DB Memory Footprint

Terracotta allots a certain percentage of memory to Berkeley DB, the database application used to manage the disk store. The default is 25 percent. Under the following circumstances, this percentage can be reduced:

  • Running in temporary-swap-only mode (see Run in Non-Persistent Mode) requires less memory for Berkeley DB since it is managing less data.
  • Running with a large heap size may require a smaller percentage of memory for Berkeley DB.

For example, if Berkeley DB has a fixed requirement of 300–400MB of memory, and the heap size is set to 6GB, Berkeley DB can be allotted eight percent. You can set the percentage using the Terracotta property l2.berkeleydb.je.maxMemoryPercent in one of the following ways.

Create a Java Property

To set the percentage at 8, add the following option to $JAVA_OPTS (or $JAVA_OPTIONS) before starting the Terracotta server:

-Dcom.tc.l2.berkeleydb.je.maxMemoryPercent=8

Be sure to export JAVA_OPTS (or JAVA_OPTIONS). If you adjust the percentage value after the server is running, you must restart the Terracotta server for the new value to take effect.

Add to Terracotta Configuration

Add the following configuration to the top of the Terracotta configuration file (tc-config.xml by default) before starting the Terracotta server:

<tc-properties>
     <property name="l2.berkeleydb.je.maxMemoryPercent" value="8" />
</tc-properties>

You must start the Terracotta server with the configuration file you've updated:

start-tc-server.sh -f <path_to_configuration_file>

Use start-tc-server.bat in Microsoft Windows.

If you lower the value of l2.berkeleydb.je.maxMemoryPercent, be sure to test the new value's effectiveness by noting the amount of flushing to disk that occurs in the Terracotta server. If flushing rises to a level that impacts performance, increase the value of l2.berkeleydb.je.maxMemoryPercent incrementally until an optimal level is observed.
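To sanity-check a candidate percentage before testing it, the arithmetic behind the earlier example (a 300–400MB requirement on a 6GB heap) can be sketched as follows. The class and method names are hypothetical, and treating 6GB as 6144MB is an assumption. The result is only a lower bound, and the observed flushing behavior should still decide the final value.

```java
/** Illustrative arithmetic for choosing a Berkeley DB memory percentage. */
final class BerkeleyDbPercentSketch {

    /** Smallest whole percentage of heapMb that is at least requiredMb. */
    static int minPercent(long requiredMb, long heapMb) {
        if (requiredMb < 0 || heapMb <= 0) {
            throw new IllegalArgumentException("requiredMb must be >= 0 and heapMb > 0");
        }
        // Ceiling division of (requiredMb * 100) by heapMb.
        return (int) ((requiredMb * 100 + heapMb - 1) / heapMb);
    }

    public static void main(String[] args) {
        long heapMb = 6 * 1024; // 6GB heap, assuming 1GB = 1024MB
        System.out.println(minPercent(300, heapMb)); // 5
        System.out.println(minPercent(400, heapMb)); // 7
    }
}
```

Under these assumptions the requirement works out to between 5 and 7 percent of the heap, so a setting of 8 leaves a small margin above the upper end of the range.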

2.2.6g JDK Version

While both JDK 1.5 and 1.6 are supported, JDK 1.6 may deliver better performance.

2.2.6h Statistics Gathering

Each time you connect to the Terracotta cluster with the Developer Console and go to the second-level cache node, Hibernate and cache statistics gathering is automatically started. Since this may have a negative impact on performance, consider disabling statistics gathering during performance tests and in production if you continue to use the Developer Console. To disable statistics gathering, navigate to the Overview panel in the Hibernate view, then click Disable Statistics.

2.2.6i Logging

Logging has a negative impact on performance when it is enabled. Consider disabling statistics logging during performance tests and in production.

To disable statistics logging in the Terracotta Developer Console, navigate to the Configuration panel in the Hibernate view, then select the target regions in the list and clear Logging enabled if it is set.

To disable debug logging for Terracotta Distributed Ehcache, set the logging level for the clustered store to be less granular than FINE.

2.2.6j Java Garbage Collection

Garbage Collection (GC) should be aggressive. Consider using the -server Java option on all application servers to force a "server" GC strategy.

2.2.6k Database Tuning

A well-tuned database reduces latency and improves performance:

  • Indexes should be optimized for your application.
  • Databases should be indexed to load data quickly, based on the types of queries your application performs (type of key used, for example).
  • Database tables should be of a format that is optimized for your application. In MySQL, for example, the InnoDB format provides better performance than the default MyISAM (or the older ISAM) format if your application performs many transactions and uses foreign keys.
  • Ensure that the database is set to accept at least as many connections as the connection pool can open. See Connection Pools for more information.
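The last point can be checked with simple arithmetic when planning pool sizes. The sketch below assumes each application server runs its own connection pool with the same maximum size. That assumption, and the class and method names, are illustrative and not from the source.

```java
/** Illustrative check that the database accepts enough connections for all pools. */
final class ConnectionBudgetSketch {

    /** Worst-case number of connections opened if every pool reaches its maximum. */
    static int requiredDbConnections(int appServers, int maxPoolSizePerServer) {
        return Math.multiplyExact(appServers, maxPoolSizePerServer);
    }

    /** True if the database connection limit covers the worst case. */
    static boolean databaseCanServePools(int dbMaxConnections, int appServers,
                                         int maxPoolSizePerServer) {
        return dbMaxConnections >= requiredDbConnections(appServers, maxPoolSizePerServer);
    }

    public static void main(String[] args) {
        // Four application servers, each with a pool of up to 50 connections.
        System.out.println(requiredDbConnections(4, 50));
        System.out.println(databaseCanServePools(150, 4, 50));
        System.out.println(databaseCanServePools(200, 4, 50));
    }
}
```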

The following are issues that could affect the functioning of Terracotta Distributed Ehcache for Hibernate.

2.2.6l Unwanted Synchronization with Hibernate Direct Field Access

When direct field access is used, Hibernate uses reflection to access fields, triggering unwanted synchronization that can degrade performance across a cluster. See this JIRA issue for more information.

2.2.6m Hibernate Exception Thrown With Cascade Option

Under certain circumstances, using cascade="all-delete-orphan" can throw a Hibernate exception. See this Hibernate troubleshooting issue for more information.

2.2.6n Cacheable Entities and Collections Not Cached

Certain data that should be in the second-level cache may not have been configured for caching (or may not have been configured correctly). This oversight may not cause an error, but may impact performance. See Finding Cacheable Entities and Collections for more information.

