This document contains technical reference information for Terracotta Distributed Ehcache for Hibernate.
Cache eviction removes elements from the cache based on parameters with configurable values. Having an optimal eviction configuration is critical to maintaining cache performance.
Initially, the clustered second-level cache is configured with default values that turn off eviction. It is unlikely that these default values, which allow for infinite element lifetimes, are appropriate for most use cases. This "eternal" cache configuration may work in the early phases of the development and testing process, but eviction should be introduced before performance and load testing take place.
When you run Terracotta Distributed Ehcache for Hibernate, the default cache-configuration values are used when no cache configuration file is found on your application's classpath. To override these values, use the cache configuration file ehcache.xml to set eviction parameters. Click ehcache.xml to view a sample file.
To add eviction and control the size of the cache, edit the values of the following <cache> attributes and tune these values based on results of performance tests:
Ensure that the edited ehcache.xml is in your application's classpath. If you are using a WAR file, ehcache.xml should be in WEB-INF/classes.
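As a rough sketch of what an eviction-enabled ehcache.xml might look like: the attribute list itself is not reproduced in this section, so the example assumes the standard Ehcache attributes maxElementsInMemory, timeToIdleSeconds, and timeToLiveSeconds. All values, and the server address in terracottaConfig, are illustrative placeholders to be tuned against your own performance tests.

```xml
<ehcache>
  <!-- Fallback for regions that have no <cache> entry of their own.
       eternal="false" lets the two timeout attributes take effect. -->
  <defaultCache
      maxElementsInMemory="10000"
      eternal="false"
      timeToIdleSeconds="1200"
      timeToLiveSeconds="1200">
    <terracotta/>
  </defaultCache>

  <!-- Region for one entity; the name matches the fully qualified class name.
       Elements idle for 5 minutes, or older than 10 minutes, become eligible for eviction. -->
  <cache name="com.my.package.Foo"
         maxElementsInMemory="500"
         eternal="false"
         timeToIdleSeconds="300"
         timeToLiveSeconds="600">
    <terracotta clustered="true"/>
  </cache>

  <!-- Assumed address of a Terracotta server; replace with your own. -->
  <terracottaConfig url="localhost:9510"/>
</ehcache>
```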
See Terracotta Clustering Configuration Elements for definitions of other available configuration properties.
Note the following about ehcache.xml in a Terracotta cluster:
Changes take effect immediately but are not written to the original on-disk copy of ehcache.xml.
The original (on-disk) ehcache.xml is loaded.
If you are using the Terracotta servers with persistence of shared data, and you want the cluster to load the original (on-disk) ehcache.xml, the servers' database must be wiped by removing the data files from the servers' server-data directory. This directory is specified in the Terracotta configuration file in effect (tc-config.xml by default). Wiping the database causes all persisted shared data to be lost.
See Terracotta Clustering Configuration Elements for more information.
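For orientation, the fragment below sketches where the data directory and persistence mode typically appear in a Terracotta 3.x tc-config.xml. It is an assumption-laden illustration, not a copy of any shipped file: the host, server name, and directory path are hypothetical, and the element layout should be checked against the configuration reference for your Terracotta version.

```xml
<tc:tc-config xmlns:tc="http://www.terracotta.org/config">
  <servers>
    <server host="localhost" name="server1">
      <!-- Hypothetical path: the directory whose data files are removed to wipe the database. -->
      <data>/opt/terracotta/server-data</data>
      <dso>
        <persistence>
          <!-- permanent-store persists shared data across server restarts;
               the default, temporary-swap-only, does not. -->
          <mode>permanent-store</mode>
        </persistence>
      </dso>
    </server>
  </servers>
</tc:tc-config>
```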
To create a cache configuration, set the values you want in the Terracotta Developer Console, then click Show configuration... to open a window containing a formatted cache configuration based on those values. Copy the contents into ehcache.xml.
To persist custom cache configuration values, create a cache configuration file by exporting customized configuration from the Terracotta Developer Console or create a file that conforms to the required format.
If you are migrating from another second-level cache provider, recreate the structure and values of your cache configuration in ehcache.xml. Then follow the directions for installing and configuring Terracotta Distributed Ehcache for Hibernate in Terracotta Distributed Ehcache for Hibernate Express Installation.
A cache concurrency strategy controls how the second-level cache is updated based on how often data is likely to change. Cache concurrency is set using the usage attribute in one of the following ways:
@Cache(usage=CacheConcurrencyStrategy.READ_WRITE)
hibernate.cfg.xml.
Supported cache concurrency strategies are described in the following sections.
The READ_ONLY strategy works well for unchanging reference data. It can also work in use cases where the cache is periodically invalidated by an external event. That event can flush the cache, then allow it to repopulate.
The READ_WRITE strategy works well for data that changes and must be committed. READ_WRITE guarantees correct data at all times by using locks to ensure that transactions are not open to more than one thread. If a cached element is created or changed in the database, READ_WRITE updates the cache. The cached element is guaranteed to be the same version as the one in the database.
Terracotta Distributed Ehcache for Hibernate is designed to maximize performance with READ_WRITE strategies when the data involved is partitioned by your application (using sticky sessions, for example). However, caching needs are application-dependent and should be investigated on a case-by-case basis.
The NONSTRICT_READ_WRITE strategy is similar to READ_WRITE, but may provide better performance. NONSTRICT_READ_WRITE works well for data that changes and must be committed, but it does not guarantee exclusivity or consistency (and so avoids the associated performance costs). This strategy allows more than one transaction to simultaneously write to the same entity, and is intended for applications able to tolerate caches that may at times be out of sync with the database.
Because it does not guarantee the stability of data as it is changed in the database, NONSTRICT_READ_WRITE does not update the cache when an element is created or changed in the database.
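To make the choice between these strategies concrete, the fragment below sketches how different entities might be assigned different usage values in hibernate.cfg.xml. The class names are hypothetical, and the property and mapping elements that would normally precede these entries are omitted.

```xml
<hibernate-configuration>
  <session-factory>
    <!-- property and mapping elements omitted -->

    <!-- Unchanging reference data: READ_ONLY -->
    <class-cache class="com.my.package.Country" usage="read-only"/>

    <!-- Data that changes and must stay consistent with the database: READ_WRITE -->
    <class-cache class="com.my.package.Account" usage="read-write"/>

    <!-- Data that changes but can tolerate a briefly stale cache: NONSTRICT_READ_WRITE -->
    <class-cache class="com.my.package.PageViewCount" usage="nonstrict-read-write"/>
  </session-factory>
</hibernate-configuration>
```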
The TRANSACTIONAL strategy is intended for use in an environment utilizing the Java Transaction API (JTA) to manage transactions across a number of XA resources. This strategy guarantees that a cache remains in sync with other resources, such as databases and queues.
The TRANSACTIONAL strategy is supported in Ehcache 2.0 and higher. For more information on how to set up a second-level cache with transactional caches, see Setting Up Transactional Caches.
To set up transactional caches in a second-level cache with Terracotta Distributed Ehcache for Hibernate, ensure the following:
transactionalMode is set to "xa". For example, the following cache is configured to be transactional:
<cache name="com.my.package.Foo"
       maxElementsInMemory="500"
       eternal="false"
       overflowToDisk="false"
       transactionalMode="xa">
  <terracotta clustered="true"/>
</cache>
net.sf.ehcache.hibernate.EhCacheRegionFactory.
current_session_context_class is jta.
transaction.manager_lookup_class is the name of a TransactionManagerLookup class (see your Transaction Manager).
transaction.factory_class is the name of a TransactionFactory class to use with the Hibernate Transaction API.
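Taken together, the Hibernate settings listed above might be expressed in hibernate.cfg.xml roughly as follows. This is a sketch under assumptions: the property names carry the standard hibernate. prefix, the lookup class shown is the one Hibernate 3.x provides for the JBoss transaction manager (substitute the one for your own transaction manager), and JTATransactionFactory is only one possible factory class.

```xml
<hibernate-configuration>
  <session-factory>
    <property name="hibernate.cache.use_second_level_cache">true</property>
    <property name="hibernate.cache.region.factory_class">net.sf.ehcache.hibernate.EhCacheRegionFactory</property>
    <property name="hibernate.current_session_context_class">jta</property>
    <!-- Assumption: JBoss transaction manager; use the lookup class that matches yours. -->
    <property name="hibernate.transaction.manager_lookup_class">org.hibernate.transaction.JBossTransactionManagerLookup</property>
    <property name="hibernate.transaction.factory_class">org.hibernate.transaction.JTATransactionFactory</property>
  </session-factory>
</hibernate-configuration>
```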
For example, to set the cache concurrency strategy for com.my.package.Foo in hibernate.cfg.xml:
<class-cache class="com.my.package.Foo" usage="transactional"/>
<cache usage="transactional"/>
@Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL)
public class Foo {...}
For more on cache concurrency strategies, see Cache Concurrency Strategies.
If you are using more than one Hibernate web application with the Terracotta second-level cache, additional configuration is needed to allow for multiple classloaders. See the section on configuring an application group (app-groups) in the Configuration Guide and Reference for more information.
Certain data that should be in the second-level cache may not have been configured for caching. This oversight may not cause an error, but may impact performance.
Using the Terracotta Developer Console, you can compare the set of cached regions with the set of all Hibernate entities and collections. Note any items, such as collections containing fixed or slow-changing data, that appear as Hibernate entities but do not have corresponding cache regions.
If the Terracotta Distributed Ehcache for Hibernate second-level cache is being clustered correctly, a Terracotta root representing the second-level cache appears in the Terracotta Developer Console's object browser. Under this root, which exists in every client (application server), are the cached regions and their children.
You can use this root to verify that the second-level cache is running and is clustered with Terracotta:
[PROMPT] ${TERRACOTTA_HOME}\bin\dev-console.bat
Using the Terracotta Developer Console, verify that there is a root named default:terracottaHibernateCaches. For each Terracotta client (application server), the caches should appear as MapEntry objects under this root, one per cache region. The data itself is found inside these cache-region entries.
The second-level cache runtime statistics are pulled from Hibernate statistics, which are sampled at a fixed rate of once per second. The Terracotta Developer Console's sampling rate for display purposes, however, is adjustable.
To display all of the Hibernate statistical counts, set the Terracotta Developer Console's sampling rate to one second. To set the sampling rate, choose Options... from the Developer Console's Tools menu, then set Poll period seconds to "1".
For example, if the sampled Hibernate statistics record the Cache Miss Count values "15, 25, 62, 10, 12, 43" at one-second intervals, and the Terracotta Developer Console's sampling rate is set to one second, then all of these values are graphed. However, if the Terracotta Developer Console's sampling rate is set to three seconds, then only every third value is graphed: "15, 10" (assuming that the first poll period coincides with the first value recorded).
Some use cases may present hurdles to realizing benefits from a second-level Hibernate cache implementation.
Volatile data requires frequent cache invalidation, which increases the overhead of maintaining the cache. At some point this overhead impacts performance at a cost too high to make the cache favorable. Identifying "hotsets" of data can mitigate this situation by limiting the amount of data that requires reloading. Another solution is scaling your cluster to keep more data in memory (see Terracotta Server Arrays).
Huge data sets that are queried randomly (or across the set with no clear pattern or "hotsets") are difficult to cache because of the impact on memory of attempting to load that set or having to evict and load elements at a very high rate. Solutions include scaling the cluster to allow more data into memory (see Terracotta Server Arrays), adding storage to allow Terracotta to spill more data to disk, and using partitioning strategies to prevent any one node from loading too much data.
As the rate of updating cached data goes up, application performance goes down as Hibernate attempts to manage and persist the changes. Write-behind or some asynchronous approach to writing the data may be a good solution for this issue (see DSO Async Processing).
The benefits of caching are maximized when cached data is queried multiple times before expiring. If cached data is infrequently accessed, or often expires before it is used, the benefits of caching may be lost. Solutions to this situation include invalidating data in the cache more often to force updates, and refactoring your application to cache more frequently queried data and avoid caching data that tends to expire unused.
Cached data cannot be guaranteed to be coherent at all times with the data in a database. In situations where this must be guaranteed, such as when an application requires auditing, access to the data must be through the System of Record (SoR). Financial applications, for example, require auditing, and for this the database must be accessed directly. If critical data is changed in a cache, however, the data obtained from the database could be erroneous.
If data in the database can be modified by applications outside of your application with Hibernate, and that same data is eligible for the second-level cache, unpredictable results could occur. One solution is a redesign to prevent data that can end up in the cache from being modified by applications outside of the scope of your Hibernate application.