Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)

We have a Java application running Quartz in a cluster (several nodes).
Is it possible in Quartz to capture the IP address of the node executing a Quartz job and log it to the database?
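
One possible approach, sketched here with the standard library only: the job resolves the address of the JVM it runs in and includes it in whatever record it writes. The class name NodeAddress, the method names, and the job name are hypothetical, and the database write itself is left out. In a Quartz job this would be called from inside execute().

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class NodeAddress {

    /** Returns the host address of this JVM, or "unknown" if it cannot be resolved. */
    static String localIp() {
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    /** Builds the text a job could store alongside its execution record (hypothetical format). */
    static String describeExecution(String jobName, String nodeIp) {
        return "job=" + jobName + " node=" + nodeIp;
    }

    public static void main(String[] args) {
        // Inside a real job this line would go to the database instead of stdout.
        System.out.println(describeExecution("nightlyReport", localIp()));
    }
}
```

Note that InetAddress.getLocalHost() may return a loopback address on hosts whose name resolves to 127.0.0.1, so the result should be checked on each node.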

Our clocks are all sync'd.
We increased org.quartz.jobStore.clusterCheckinInterval to 1 minute and it seems to solve the problem.

We see this happening in the Quartz 2.1 release:
Both nodes are up but Quartz thinks that one has failed. Quartz then tries to recover (if the job has the recoverable flag set to true) and fires immediately.
So we have the same job executing twice on different nodes.

Did anything change in cluster recovery in the 2.x release? Was the logic of finding a failed instance using only the last check-in time tested under high load?
Can it be that a table lock prevents a node from checking in when it is actually up and running the job?

Code:

     /**
      * Get a list of all scheduler instances in the cluster that may have failed.
      * This includes this scheduler if it is checking in for the first time.
      */
     protected List<SchedulerStateRecord> findFailedInstances(Connection conn)
         // ... (excerpt: start of the method and the loop header are omitted)
 
                 // find own record...
                 if (rec.getSchedulerInstanceId().equals(getInstanceId())) {
                     foundThisScheduler = true;
                     if (firstCheckIn) {
                         failedInstances.add(rec);
                     }
                 } else {
                     // find failed instances...
                     if (calcFailedIfAfter(rec) < timeNow) {
                         failedInstances.add(rec);
                     }
                 }
             }
         // ... (excerpt: rest of the method is omitted)
 
 
     protected long calcFailedIfAfter(SchedulerStateRecord rec) {
         return rec.getCheckinTimestamp() + Math.max(rec.getCheckinInterval(), (System.currentTimeMillis() - lastCheckin)) + 7500L;
     }
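
To make the threshold in calcFailedIfAfter concrete, here is a small stand-alone sketch of the same arithmetic. It is a simplification and not the Quartz code: the class CheckinMath and its parameter names are hypothetical, a single "now" value stands in for both timeNow and System.currentTimeMillis(), and the timestamps are made up for illustration.

```java
final class CheckinMath {

    /** Fixed grace period used in the excerpt above, in milliseconds. */
    static final long GRACE_MS = 7500L;

    /** Same formula as calcFailedIfAfter, with the inputs passed in explicitly. */
    static long failedIfAfter(long peerCheckinTs, long peerIntervalMs,
                              long now, long ownLastCheckin) {
        return peerCheckinTs + Math.max(peerIntervalMs, now - ownLastCheckin) + GRACE_MS;
    }

    /** A peer is treated as failed once its threshold lies in the past. */
    static boolean looksFailed(long peerCheckinTs, long peerIntervalMs,
                               long now, long ownLastCheckin) {
        return failedIfAfter(peerCheckinTs, peerIntervalMs, now, ownLastCheckin) < now;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        long peerCheckinTs = now - 20_000L;   // peer's last check-in landed 20 s ago
        long ownLastCheckin = now - 7_500L;   // this node checked in 7.5 s ago

        // 7.5 s interval: 80000 + 7500 + 7500 = 95000 < 100000, so the peer looks failed.
        System.out.println(looksFailed(peerCheckinTs, 7_500L, now, ownLastCheckin));
        // 60 s interval: 80000 + 60000 + 7500 = 147500 >= 100000, so it does not.
        System.out.println(looksFailed(peerCheckinTs, 60_000L, now, ownLastCheckin));
    }
}
```

Under these assumptions, a live node whose check-in is delayed by more than its interval plus 7.5 seconds would be flagged as failed, and a longer interval widens that window.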

Is it possible to configure Quartz so you will get notifications (email alerts) when Quartz thinks there is a node failure in a cluster and it tries to pick up a job that failed on a different node?

This issue is not resolved.
We deployed EHCACHE replication with JGroups using the recommended JGroups configuration.
http://ehcache.org/EhcacheUserGuide.html#id.s33.9

However, we can see significant thread leaks (and a CPU increase).
We used DynaTrace Thread Snapshots to analyse it.
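
For anyone without a profiler at hand, a rough in-process check is to count live threads by name prefix and compare the counts over time. The sketch below uses only the standard library. The class name ThreadCensus and the way names are grouped are assumptions, and it makes no claim about how JGroups names its threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ThreadCensus {

    /** Drops a trailing number (and its separator) so "Timer-3" and "Timer-4" group together. */
    static String prefixOf(String threadName) {
        return threadName.replaceAll("[-#\\s]*\\d+$", "");
    }

    /** Counts how many of the given thread names share each prefix. */
    static Map<String, Integer> countByPrefix(Iterable<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(prefixOf(name), 1, Integer::sum);
        }
        return counts;
    }

    /** Takes a snapshot of all live threads in this JVM and groups them by prefix. */
    static Map<String, Integer> liveThreadCounts() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByPrefix(names);
    }

    public static void main(String[] args) {
        System.out.println(liveThreadCounts());
    }
}
```

A prefix whose count keeps growing between snapshots taken a few minutes apart would be a candidate for the leak.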

Is it a known issue with JGroups?
http://osdir.com/ml/java.javagroups.general/2006-03/msg00057.html

Should it be documented?

We implemented EHCACHE (version 2.3.2) replication using JGroups.
(jgroups-2.10.0.GA.jar and ehcache-jgroupsreplication-1.4.jar)

<cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
properties="connect=UDP(mcast_addr=231.12.21.132;mcast_port=45566;ip_ttl=32;
mcast_send_buf_size=150000;mcast_recv_buf_size=80000):
PING(timeout=2000;num_initial_members=6):
MERGE2(min_interval=5000;max_interval=10000):
FD_SOCK:VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):
UNICAST(timeout=5000):
pbcast.STABLE(desired_avg_gossip=20000):
FRAG:
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000)"
propertySeparator="::" />

Periodically we are getting the following exception:

WARNING: No TransactionManagerLookup found in Hibernate config, XA Caches will be participating in the two-phase commit!
2011-05-19 10:26:29,710 (JChannel.java:1679) INFO org.jgroups.JChannel - JGroups version: 2.10.0.GA
2011-05-19 10:26:30,303 (Configurator.java:965) WARN org.jgroups.stack.Configurator - GMS property join_retry_timeout was deprecated and is ignored

-------------------------------------------------------------------
GMS: address=LNNJNOCCB0J3M1-39886, cluster=ORMCacheManager, physical address=10.30.106.144:1333
-------------------------------------------------------------------
2011-05-19 10:18:21,872 (TP.java:1109) WARN org.jgroups.protocols.UDP - LNNJNOCCB0J3M1-39886: no physical address for a06e70eb-eba2-5326-52d0-1ffd66a98d6c, dropping message
2011-05-19 10:18:26,885 (ClientGmsImpl.java:145) WARN org.jgroups.protocols.pbcast.GMS - join(LNNJNOCCB0J3M1-39886) sent to a06e70eb-eba2-5326-52d0-1ffd66a98d6c timed out (after 5000 ms), retrying

Does anyone have any idea what is causing it?

According to the User Guide, EHCACHE supports updateViaInvalidate:
http://ehcache.org/EhcacheUserGuide.html

30.3 Replicated Caches
...
Update supports updateViaCopy or updateViaInvalidate. The latter sends a remove message out to the cache cluster, so that other caches remove the Element, thus preserving coherency. It is typically a lower cost option than a copy.

I am trying to replicate a cache using JGroups:
http://ehcache.org/documentation/replicated_caching_with_jgroups.html

The configuration options that I found for JGroupsCacheReplicatorFactory are:

* replicatePuts=true | false - whether new elements placed in a cache are replicated to others. Defaults to true.
* replicateUpdates=true | false - whether new elements which override an element already existing with the same key are replicated. Defaults to true.
* replicateRemovals=true | false - whether element removals are replicated. Defaults to true.
* replicateAsynchronously=true | false - whether replications are asynchronous (true) or synchronous (false). Defaults to true.
* replicateUpdatesViaCopy=true | false - whether the new elements are copied to other caches (true), or whether a remove message is sent. Defaults to true.
* asynchronousReplicationIntervalMillis - time between updates when replication is asynchronous. Defaults to 1000 ms.

Which JGroupsCacheReplicatorFactory property will replicate puts, updates and removals via invalidate?

Figured it out myself. It can be done this way:

<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ehcache.xsd"
updateCheck="true" name="myCacheManager">

From:
http://ehcache.org/ehcache.xsd

<xs:attribute name="name" use="optional"/>

Thanks to Serban Balamaci.
http://balamaci.wordpress.com/2010/02/22/multiplecachemanager

I have registered the CacheStatistics MBean with the server.
However, every time I restart the server, the cache manager name is different.

In JConsole it looks like this:

before server restart:

net.sf.ehcache.CacheManager@680387a

after server restart:

net.sf.ehcache.CacheManager@2ec76032

Is it possible to set the cache manager name declaratively in ehcache.xml so that it always has the same name?

We are using several cache managers in our application, each with its own ehcache configuration file.

After a couple of weeks of using method caching we started getting the following error:

Code:

[ - ]: 2011-03-24 09:40:14,409 ERROR [ApplicationLogger] - <Exception occured in runQueryCheck() method: The method.dataPDP Cache is not alive.>
 java.lang.IllegalStateException: The method.dataPDP Cache is not alive.
         at net.sf.ehcache.Cache.checkStatus(Cache.java:2291)
         at net.sf.ehcache.Cache.get(Cache.java:1397)
         at net.sf.ehcache.Cache.get(Cache.java:1378)
         at com.googlecode.ehcache.annotations.interceptor.EhCacheInterceptor.invokeCacheable(EhCacheInterceptor.java:119)
         at com.googlecode.ehcache.annotations.interceptor.EhCacheInterceptor.invoke(EhCacheInterceptor.java:77)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
         at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:625)

Any thoughts?

Can it be a conflict between different cache factories?
Our Hibernate cache is using SingletonEhCacheRegionFactory.

We are using ehcache 2.3.2

Thanks. I found it:

http://sourceforge.net/projects/ehcache/files/ehcache-jgroupsreplication/ehcache-jgroupsreplication-1.4

http://mvnrepository.com/artifact/net.sf.ehcache/ehcache-jgroupsreplication

http://svn.terracotta.org/svn/ehcache/tags/jgroupsreplication-1.4/pom.xml

Looks like the latest version is 1.4

What version of ehcache are you using?
I cannot find the JGroupsCacheManagerPeerProviderFactory class in ehcache-core-2.3.1.jar

What happened to JGroups support in EHCache 2.3.1?
http://ehcache.org/documentation/distributed_caching_with_jgroups.html

I do not see JGroupsCacheManagerPeerProviderFactory in ehcache-core-2.3.1.jar. What happened to it? How does EHCache now provide distributed caching using JGroups?

Why do you think hitting parent class cache region is acceptable?

I want to achieve different Hibernate L2 caching strategies for subclasses!

Some of them should be read-only, others should be read-write.

Read-write subclasses should have their own cache region and be distributed.

For example,
I have a parent class AbstractCode.

Its child class CountryCode should be read-write and distributed.
Another child class CompanyCode should be read-only and not distributed.

It's supposed to be one table with multiple classes, and therefore multiple caching strategies.

Alex,

I can see hits on other cache regions.
I tested it in JUnit using the following Statistics methods:

String[] regionNames = stats.getSecondLevelCacheRegionNames();
if (regionNames != null && regionNames.length > 0) {
    for (String regionName : regionNames) {
        System.out.println("Second Level Region " + regionName + " : "
                + stats.getSecondLevelCacheStatistics(regionName));
    }
}

I can see hits on other cache regions using EHCACHE MONITOR.

Cache configurations are loaded from ehcache.xml file.

Classes that I want to be cached are marked using annotations:

@Entity
@DiscriminatorValue(value=CodeType.COUNTRY)
@Inheritance(strategy=InheritanceType.SINGLE_TABLE)
@Cache(usage=CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Country extends SomeCode {

I have the following properties in the Hibernate configuration file:

<property name="hibernate.cache.use_second_level_cache" value="true" />
<property name="hibernate.cache.region.factory_class" value="net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory"/>
<property name="net.sf.ehcache.configurationResourceName" value="ehcache.xml" />
<property name="hibernate.cache.use_query_cache" value="true" />
<property name="hibernate.cache.use_structured_entries" value="true" />
<property name="hibernate.generate_statistics" value="true" />

I can see that the cache regions are created on disk and I can see them in EHCache Monitor.

However, the Session.get() method is hitting the parent class cache region.
It is not hitting the derived class cache region.

Is this clear?