njain wrote:
Absolutely.
The recommended approach is to use a combination of Quartz and Ehcache: use clustered Quartz for scheduling the work and distributed Ehcache for sharing the state.
Quartz Where, which is an enterprise feature, gives you finer control over where jobs are scheduled.
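The scheduling half of that recommendation looks roughly like this: a minimal, non-clustered sketch using the plain Quartz 2.x API (making it clustered or "Where"-aware requires the Terracotta job store configuration, and the WorkDispatchJob name here is made up for illustration):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class ScheduleSketch {
    static final CountDownLatch fired = new CountDownLatch(1);

    // hypothetical job: in the recommended setup this would enqueue Work
    // items into the distributed Ehcache for other nodes to pick up
    public static class WorkDispatchJob implements Job {
        @Override
        public void execute(JobExecutionContext ctx) {
            fired.countDown();
        }
    }

    public static void main(String[] args) throws Exception {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        JobDetail job = JobBuilder.newJob(WorkDispatchJob.class)
                .withIdentity("dispatch", "cluster-work")
                .build();
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("every-10s", "cluster-work")
                .startNow()
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInSeconds(10)
                        .repeatForever())
                .build();
        scheduler.start();
        scheduler.scheduleJob(job, trigger);
        fired.await(5, TimeUnit.SECONDS); // wait for the first firing
        scheduler.shutdown(true);
    }
}
```

With the enterprise bits in place, the same JobDetail/Trigger setup is pointed at the Terracotta-backed job store instead of the default in-memory one.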
This is an old question, but I'd like to know how to do this properly in express mode. I believe the answer refers to working with asynchronous jobs; however, I don't think it's that simple if one needs to actually collect the results of a distributed computation (like joining the threads in a cluster). I see that Terracotta seems to be integrating with Hadoop; however, I'd like to know if there's another, more lightweight approach.
In the "old" way, which I think still works if I configure a DSO cluster, it was straightforward to collect and merge the results, since the state was shared across Work instances and queues thanks to DSO magic.
In express mode, messy logic is involved in updating the state of Work instances, and error handling, joining, etc. are not trivial to implement on top of a clustered cache.
I'm currently trying to implement this with a cache-based solution, and the patterns involved usually look more or less like this:
Code:
// copy the keys to a mutable list so we can shuffle them
List<Object> keys = new ArrayList<Object>(cache.getKeys());
// shuffle them to avoid a sequential bottleneck on lock acquisition
Collections.shuffle(keys);
for (Object key : keys) {
    cache.acquireWriteLockOnKey(key);
    try {
        Element element = cache.get(key);
        if (element != null) {
            Work w = (Work) element.getObjectValue();
            if (w.getState() == State.PENDING) {
                w.setState(State.RUNNING);
                // put back a serialized copy - overhead of re-serializing
                // the whole state contained in w
                cache.replace(new Element(key, w));
                // submit work to queue ...
            }
        }
    } finally {
        cache.releaseWriteLockOnKey(key);
    }
}
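To make the joining part concrete, here is the shape of the collect/merge step I mean, sketched against a plain ConcurrentHashMap and local threads instead of the clustered cache (the Work, State, and runAndMerge names are illustrative; a real express-mode version would additionally have to re-put serialized copies under locks and handle failures and timeouts, which is where the mess comes from):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class JoinSketch {
    enum State { PENDING, RUNNING, DONE }

    static class Work {
        volatile State state = State.PENDING;
        volatile int result;
    }

    static int runAndMerge() throws InterruptedException {
        ConcurrentMap<String, Work> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < 4; i++) {
            cache.put("w" + i, new Work());
        }

        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (Map.Entry<String, Work> e : cache.entrySet()) {
            pool.submit(() -> {
                Work w = e.getValue();
                w.state = State.RUNNING;
                w.result = e.getKey().length(); // pretend computation
                w.state = State.DONE;           // publish completion
            });
        }

        // the "join": poll shared state until every Work is DONE; a real
        // clustered version also needs timeouts and failure handling here
        while (cache.values().stream().anyMatch(w -> w.state != State.DONE)) {
            Thread.sleep(20);
        }
        pool.shutdown();
        // merge step: fold the individual results together
        return cache.values().stream().mapToInt(w -> w.result).sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("merged result: " + runAndMerge()); // prints 8
    }
}
```

With DSO this polling loop could just read shared objects; over a clustered cache every state transition is a get/replace round-trip, which is exactly the overhead I'd like to avoid.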