Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)

Hi,

We are going from 2 web servers to 4 web servers and in the past, we had one Terracotta master instance on one web server and one Terracotta slave instance on the other web server.

I'm now wondering if we should just keep one master/slave configuration, as opposed to one master and three slaves, to save on the replication and I/O cost that the extra slaves might introduce.

Thoughts?

Thanks,
Terracha

Hi,

I posted on a thread in the past where I thought I had this figured out:

https://forums.terracotta.org/forums/posts/list/9118.page

My issue is that I don't believe this works if you have more than one web server. Say web server #1 (w1) has two Tomcat instances, t1 and t2, and web server #2 (w2) has t3 and t4. I'm not sure how a session could fail over if a request originally comes to w1 but a subsequent one goes to w2.

Is the only way this would work for requests from a given source IP to always go to the same web server?

We would like to use least-connection load balancing, but we seemingly can't right now because a session can't fail over from w1 to w2 unless requests always come to the same web server.

Any suggestions?

Thanks!

I believe I know what the issue is. I'm not sure if this is the standard configuration others would use with multiple web servers, apache and mod_jk.

I believe our issue stems from the fact that subsequent requests from a user after logging in usually go to the same web server. Therefore, we will choose to make sure that all requests for a specific user always go to the same web server.

The only way I see to get around this is to list all Tomcat instances from both web servers in the workers.properties file on each web server, so that mod_jk can detect when all Tomcat instances on one server are down and fail over to the other, i.e.:

workers.properties:

worker.list=w1lbworker

worker.w1worker1.type=ajp13
worker.w1worker1.host=localhost
worker.w1worker1.port=8009
worker.w1worker1.connection_pool_timeout=10
worker.w1worker1.lbfactor=1

worker.w1worker2.type=ajp13
worker.w1worker2.host=localhost
worker.w1worker2.port=8010
worker.w1worker2.connection_pool_timeout=10
worker.w1worker2.lbfactor=1

worker.w1worker3.type=ajp13
worker.w1worker3.host=localhost
worker.w1worker3.port=8011
worker.w1worker3.connection_pool_timeout=10
worker.w1worker3.lbfactor=1

worker.w2worker4.type=ajp13
worker.w2worker4.host=2ndwebserver
worker.w2worker4.port=8009
worker.w2worker4.connection_pool_timeout=10
worker.w2worker4.lbfactor=1

worker.w2worker5.type=ajp13
worker.w2worker5.host=2ndwebserver
worker.w2worker5.port=8010
worker.w2worker5.connection_pool_timeout=10
worker.w2worker5.lbfactor=1

worker.w2worker6.type=ajp13
worker.w2worker6.host=2ndwebserver
worker.w2worker6.port=8011
worker.w2worker6.connection_pool_timeout=10
worker.w2worker6.lbfactor=1

worker.w1lbworker.type=lb
worker.w1lbworker.balance_workers=w1worker1, w1worker2, w1worker3, w2worker4, w2worker5, w2worker6
worker.w1lbworker.sticky_session=True
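
To make the intent of the combined worker list concrete, here is a rough Java sketch of how I understand sticky routing with fail-over to behave: the suffix after the last '.' in the JSESSIONID names the preferred worker, and another live worker is chosen only when that one is down. This is not mod_jk's actual code. The class StickyRoutingSketch, its method names, and the sample session ID are made up for illustration, and the real balancer also takes lbfactor and load into account.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only; not mod_jk's implementation.
class StickyRoutingSketch {

    // Returns the route suffix after the last '.', or null if there is none.
    static String routeOf(String sessionId) {
        if (sessionId == null) return null;
        int dot = sessionId.lastIndexOf('.');
        return (dot < 0 || dot == sessionId.length() - 1) ? null : sessionId.substring(dot + 1);
    }

    // Picks the sticky worker if it is alive, otherwise the first live worker in list order.
    // Returns null when no worker is alive.
    static String pickWorker(String sessionId, List<String> workers, Set<String> alive) {
        String route = routeOf(sessionId);
        if (route != null && workers.contains(route) && alive.contains(route)) {
            return route;
        }
        for (String w : workers) {
            if (alive.contains(w)) return w;
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> workers = Arrays.asList(
            "w1worker1", "w1worker2", "w1worker3",
            "w2worker4", "w2worker5", "w2worker6");
        // Simulate every Tomcat on the first web server being down.
        Set<String> alive = new HashSet<>(Arrays.asList("w2worker4", "w2worker5", "w2worker6"));
        System.out.println(pickWorker("0123ABCD.w1worker2", workers, alive));
    }
}
```

With all three w1 workers marked down, the sketch falls through to w2worker4, which is the cross-server fail-over the balance_workers list is meant to allow. The session data itself would still have to come from Terracotta for the user to stay logged in.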

TerraCha

Hi,

We are using Terracotta for 2 web servers with 3 Tomcat instances each. Our application fails over when we kill the Tomcat instance in use and jump to another Tomcat on the same web server.

However, if I start on web01 and kill all Tomcat instances on web01, I don't fail over to web02. Is this by design or is this likely a configuration issue?

Thanks,
TerraCha

Actually, found the issue. Web sessions were not activated in our staging environment.

I tried the same thing in our production environment when I got a chance. We fail over when requests hit the same web server, but we might not be failing over when requests go to a different web server, which I still need to test.

Thanks,
Charles

Hi,

We are having an issue on our site with losing session data and I am doing a test on our staging environment which has the following:

- Terracotta 3.4.1
- Apache 2.2
- Mod_jk 1.28 I believe
- 2 tomcat instances

I tried the following test: log into our site, check the JSESSIONID cookie to see which Tomcat worker I'm on, then kill that Tomcat worker.

Expected behavior: I can continue using our site like nothing happened.
Actual behavior: I get sent back to our log in page.

I'm not sure if this ever worked but then I saw this article:

http://forums.terracotta.org/forums/posts/list/202.page

Could somebody elaborate on that article and whether I need to do this? If my Tomcat worker names are tomcat2 and tomcat3, do I put the following in my Tomcat environment variables?

-Dcom.terracotta.session.serverid=.tomcat2

Any suggestions are appreciated!

Cheers!

Hi,

I was running some load tests to try to figure out an intermittent issue that we have. While analyzing Tomcat thread dumps I got a blocking "alert", and the stack trace below is what I saw. Any ideas on what the blocking was about?

- locked <0x00002af0b2f020f0> (a java.lang.Object)
at com.tc.net.core.CoreNIOServices$CommThread.addSelectorTask(CoreNIOServices.java:354)
at com.tc.net.core.CoreNIOServices$CommThread.handleRequest(CoreNIOServices.java:696)
at com.tc.net.core.CoreNIOServices$CommThread.requestWriteInterest(CoreNIOServices.java:761)
at com.tc.net.core.CoreNIOServices.requestWriteInterest(CoreNIOServices.java:250)
at com.tc.net.core.TCConnectionJDK14.putMessageImpl(TCConnectionJDK14.java:491)
at com.tc.net.core.TCConnectionJDK14.putMessage(TCConnectionJDK14.java:642)
at com.tc.net.protocol.transport.MessageTransportBase.sendToConnection(MessageTransportBase.java:183)
at com.tc.net.protocol.transport.MessageTransportBase.send(MessageTransportBase.java:176)
- locked <0x00002af0b2ed8be0> (a com.tc.net.protocol.transport.MessageTransportStatus)
at com.tc.net.protocol.tcm.AbstractMessageChannel.send(AbstractMessageChannel.java:195)
at com.tc.net.protocol.tcm.ClientMessageChannelImpl.send(ClientMessageChannelImpl.java:99)
at com.tc.net.protocol.tcm.TCMessageImpl.basicSend(TCMessageImpl.java:362)
at com.tc.net.protocol.tcm.TCMessageImpl.send(TCMessageImpl.java:357)
at com.tc.object.locks.RemoteLockManagerImpl.sendMessage(RemoteLockManagerImpl.java:186)
at com.tc.object.locks.RemoteLockManagerImpl.recallCommit(RemoteLockManagerImpl.java:125)
at com.tc.object.locks.RemoteLockManagerImpl.recallCommit(RemoteLockManagerImpl.java:130)
at com.tc.object.locks.ClientLockImpl.recallCommit(ClientLockImpl.java:932)
- locked <0x00002af0bfebfc50> (a com.tc.object.locks.ClientLockImpl)
at com.tc.object.locks.ClientLockImpl.doRecall(ClientLockImpl.java:905)
- locked <0x00002af0bfebfc50> (a com.tc.object.locks.ClientLockImpl)
at com.tc.object.locks.ClientLockImpl.recall(ClientLockImpl.java:400)
- locked <0x00002af0bfebfc50> (a com.tc.object.locks.ClientLockImpl)
at com.tc.object.locks.ClientLockManagerImpl.recall(ClientLockManagerImpl.java:462)
at com.tc.object.locks.ClientLockManagerImpl.recall(ClientLockManagerImpl.java:449)
at com.tc.object.handler.LockResponseHandler.handleEvent(LockResponseHandler.java:47)
at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:127)

Thanks!
TerraCha

Ok, thanks a lot for the information!

It is much appreciated!

Cheers,
Terracha

No, sorry if this sounds obvious but is that a must? I've got more of a SysAdmin background and less of a Java background so this is an area that I'm trying to learn as much as I can.

2nd question: If upgrading Spring is required for the new Quartz jar to work, could we just keep using the old jar if the only Quartz functionality we require still seems to work with the new Terracotta and the old jar?

We want to upgrade Terracotta because of a weird bug that occurs in our production environment, and we would like to do as few upgrades as possible because of time constraints.

Thanks!

Cheers,
Terracha

Sure, I'll give you what I know.

The jar that seems to work is quartz-1.6.1-RC3.jar . The one that won't work is quartz-2.1.0.jar .

We are getting the following error with 2.1.0:

SEVERE: StandardWrapper.Throwable
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'newUserNotificationJob' defined in ServletContext resource [/WEB-INF/tlb-servlet.xml]: Invocation of init method failed; nested exception is java.lang.InstantiationError: org.quartz.JobDetail
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1336)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:471)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory$1.run(AbstractAutowireCapableBeanFactory.java:409)
at java.security.AccessController.doPrivileged(Native Method)

Caused by: java.lang.InstantiationError: org.quartz.JobDetail
at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean.afterPropertiesSet(MethodInvokingJobDetailFactoryBean.java:176)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1367)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1333)

Our config in the tlb-servlet.xml (Spring MVC 2.5.3):

<bean id="newUserNotificationJob"
      class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
    <property name="targetObject" ref="notificationMgr" />
    <property name="targetMethod" value="processNewUserNotifications" />
</bean>

<bean id="trigger" class="org.springframework.scheduling.quartz.SimpleTriggerBean">
    <property name="jobDetail" ref="newUserNotificationJob" />
    <property name="startDelay" value="5000" />
    <property name="repeatInterval" value="600000" />
</bean>
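
A possible explanation, offered as my understanding and not something confirmed in this thread: in Quartz 1.x org.quartz.JobDetail is a concrete class, while in Quartz 2.x it became an interface, and Spring 2.5's MethodInvokingJobDetailFactoryBean was compiled to instantiate it directly. java.lang.InstantiationError is what the JVM throws when old bytecode tries to do that. The small check below reports how a type is declared on whatever classpath it runs with. The class name QuartzApiCheck and the method kindOf are my own, not part of Quartz or Spring.

```java
// Diagnostic sketch; reports how a type on the current classpath is declared.
class QuartzApiCheck {

    // Returns "interface", "class" (including abstract classes), or "missing"
    // for the given fully qualified type name.
    static String kindOf(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers
            Class<?> c = Class.forName(className, false, QuartzApiCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        System.out.println("org.quartz.JobDetail -> " + kindOf("org.quartz.JobDetail"));
        System.out.println("org.quartz.Trigger   -> " + kindOf("org.quartz.Trigger"));
    }
}
```

Run with the webapp's jars on the classpath: if JobDetail shows up as an interface, the Spring 2.5 factory beans in the config above would not be expected to work with that jar.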

Thanks!

Cheers,
Terracha

Hi,

We recently upgraded Terracotta from 3.4.1 to 3.5.3. I copied all the new jars into our code base.

Strangely enough, all the new jars work except the new Quartz jar, which for some reason won't work with our code; only the old jar works.

Any suggestions on what could be happening here?

Thanks,
Terracha

Hi Rajoshi,

Thanks very much for the response!

Is Terracotta 3.5.1 a safe version? I seem to recall reading about people having issues with it and Terracotta 3.5.3 or 3.5.4 looking like safe versions.

Any thoughts on this?

Thanks,
Charles

Hi,

I believe we are having the same issue as this thread:

http://forums.terracotta.org/forums/posts/list/6470.page

We are using Terracotta 3.4.1 and the thread dump is pretty similar during the weird locking event:

"TP-Processor396" daemon prio=10 tid=0x00002aff2ba9d000 nid=0x5a7b waiting on condition [0x00002aff61e98000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002afecfb91da0> (a java.util.concurrent.Semaphore$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281)
at java.util.concurrent.Semaphore.acquire(Semaphore.java:286)
at com.tc.object.locks.LockStateNode$PendingLockHold.park(LockStateNode.java:179)
at com.tc.object.locks.ClientLockImpl.acquireQueued(ClientLockImpl.java:723)
at com.tc.object.locks.ClientLockImpl.acquireQueued(ClientLockImpl.java:701)
at com.tc.object.locks.ClientLockImpl.lock(ClientLockImpl.java:52)
at com.tc.object.locks.ClientLockManagerImpl.lock(ClientLockManagerImpl.java:98)
at com.tc.object.bytecode.ManagerImpl.lock(ManagerImpl.java:747)
at com.tc.object.bytecode.ManagerUtilInternal.beginLock(ManagerUtilInternal.java:33)
at org.terracotta.locking.strategy.LongLockStrategy.beginLock(LongLockStrategy.java:16)
at org.terracotta.locking.strategy.LongLockStrategy.beginLock(LongLockStrategy.java:7)
at com.terracotta.toolkit.collections.ConcurrentDistributedMapDso.beginLock(ConcurrentDistributedMapDso.java:828)
at com.terracotta.toolkit.collections.ConcurrentDistributedMapDso.get(ConcurrentDistributedMapDso.java:195)
at com.terracotta.toolkit.collections.ConcurrentDistributedMapDsoArray.get(ConcurrentDistributedMapDsoArray.java:175)
at org.terracotta.collections.ConcurrentDistributedMap.get(ConcurrentDistributedMap.java:190)
at org.terracotta.cache.TerracottaDistributedCache.getNonExpiredEntry(TerracottaDistributedCache.java:197)
at org.terracotta.cache.TerracottaDistributedCache.getNonExpiredEntryCoherent(TerracottaDistributedCache.java:131)
at org.terracotta.cache.TerracottaDistributedCache.containsKey(TerracottaDistributedCache.java:126)
at org.terracotta.modules.ehcache.store.ClusteredStore.internalContainsKey(ClusteredStore.java:476)
at org.terracotta.modules.ehcache.store.ClusteredStore.containsKey(ClusteredStore.java:456)
at org.terracotta.modules.ehcache.store.ClusteredStore.containsKeyInMemory(ClusteredStore.java:463)
at net.sf.ehcache.Cache.searchInStoreWithStats(Cache.java:1742)
at net.sf.ehcache.Cache.get(Cache.java:1405)
at net.sf.ehcache.hibernate.regions.EhcacheTransactionalDataRegion.get(EhcacheTransactionalDataRegion.java:91)
at net.sf.ehcache.hibernate.strategy.AbstractReadWriteEhcacheAccessStrategy.get(AbstractReadWriteEhcacheAccessStrategy.java:65)
at org.hibernate.event.def.DefaultLoadEventListener.loadFromSecondLevelCache(DefaultLoadEventListener.java:524)
at org.hibernate.event.def.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:397)
at org.hibernate.event.def.DefaultLoadEventListener.load(DefaultLoadEventListener.java:165)
at org.hibernate.event.def.DefaultLoadEventListener.proxyOrLoad(DefaultLoadEventListener.java:223)
at org.hibernate.event.def.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:126)
at org.hibernate.impl.SessionImpl.fireLoad(SessionImpl.java:906)
at org.hibernate.impl.SessionImpl.internalLoad(SessionImpl.java:874)
at org.hibernate.type.EntityType.resolveIdentifier(EntityType.java:590)

My question is, will upgrading to Terracotta 3.5.1 fix this or do we have to switch the HttpSession to HttpSessionMutexListener?

I'm not really sure what going to HttpSessionMutexListener means, but a fix that only requires upgrading Terracotta would obviously be attractive.

We don't see any client timeouts in the client logs, but we have the same behaviour where some weird locking happens until the Tomcat connection pool threads are maxed out.
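
To get a feel for how many request threads are stuck at the same place, a dump can be scanned for a given frame. Below is a minimal Java sketch of that idea, assuming a jstack-style dump where each thread starts with a quoted name. The class ThreadDumpCount and the second sample thread in main are made up for illustration; only the first thread resembles the dump above.

```java
import java.util.ArrayList;
import java.util.List;

// Rough helper for eyeballing a jstack-style dump; not a full parser.
class ThreadDumpCount {

    // Returns the names of threads whose stack contains the given frame text.
    static List<String> threadsWithFrame(String dump, String frame) {
        List<String> names = new ArrayList<>();
        String current = null;      // name of the thread whose stack is being read
        boolean matched = false;    // whether the current thread was already recorded
        for (String line : dump.split("\\r?\\n")) {
            String t = line.trim();
            if (t.startsWith("\"")) {
                // A header line such as: "TP-Processor396" daemon prio=10 ...
                int end = t.indexOf('"', 1);
                current = end > 0 ? t.substring(1, end) : t;
                matched = false;
            } else if (current != null && !matched && t.contains(frame)) {
                names.add(current);
                matched = true;
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String dump =
            "\"TP-Processor396\" daemon prio=10 waiting on condition\n" +
            "   java.lang.Thread.State: WAITING (parking)\n" +
            "   at com.tc.object.locks.ClientLockImpl.acquireQueued(ClientLockImpl.java:723)\n" +
            "\n" +
            "\"TP-Processor12\" daemon prio=10 runnable\n" +      // hypothetical second thread
            "   at java.net.SocketInputStream.socketRead0(Native Method)\n";
        System.out.println(threadsWithFrame(dump, "ClientLockImpl.acquireQueued"));
    }
}
```

Comparing the count of threads parked in ClientLockImpl.acquireQueued across a few dumps taken seconds apart would show whether the pile-up is growing toward the Tomcat thread limit.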

We use Spring MVC 2.5.1, I believe, and Tomcat 6.0.30.

Thanks,
Charles

Hi,

I'm looking to see if I can get more logging out of the Terracotta server to try to debug a production issue we're having.

I see this in the documentation:

**********************************
How can I control the logging level for Terracotta servers and clients?

Create a file called .tc.custom.log4j.properties and edit it as a standard log4j.properties file to configure logging, including level, for the Terracotta node that loads it. This file is searched for in the path specified by the environment variable TC_INSTALL_DIR (if defined), user.home, and user.dir.

***********************************

I don't feel I understand how to set up this .tc.custom.log4j.properties file. Does anyone have an example file?
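
My best guess so far, going only by the "standard log4j.properties" wording in the documentation: the file would hold ordinary log4j 1.x lines of the form log4j.logger.<name>=<LEVEL>. The Java sketch below just generates such a file. The logger name com.tc is an assumption taken from the package prefixes in Terracotta stack traces, and the class TcLog4jOverride is made up; I have not verified which logger names Terracotta actually honours.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: writes a small log4j 1.x style override file. The logger names are guesses.
class TcLog4jOverride {

    static final String FILE_NAME = ".tc.custom.log4j.properties";

    // Renders one "log4j.logger.<name>=<LEVEL>" line per entry, in insertion order.
    static String render(Map<String, String> levels) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : levels.entrySet()) {
            sb.append("log4j.logger.").append(e.getKey())
              .append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Writes the rendered text into the given directory and returns the file path.
    static Path write(Path dir, Map<String, String> levels) throws IOException {
        Path file = dir.resolve(FILE_NAME);
        Files.write(file, render(levels).getBytes(StandardCharsets.ISO_8859_1));
        return file;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> levels = new LinkedHashMap<>();
        levels.put("com.tc", "DEBUG"); // assumed logger name, not confirmed by the docs
        // Writes into a temp directory; copy the result to user.home or user.dir to try it.
        Path file = write(Files.createTempDirectory("tc-log4j"), levels);
        System.out.println(file);
        System.out.print(render(levels));
    }
}
```

The generated file would then go in one of the locations the documentation lists (TC_INSTALL_DIR, user.home, or user.dir) for the node whose logging should change.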

Thanks,
Terracha

We use version 3.4.1, so I would venture your version is better than ours and that shouldn't be the issue.

Still might be worth a try I suppose.

Cheers,
Terracha