Terracotta Discussion Forums
Problem using 2.6-stable4
Forum Index -> Terracotta for Web Sessions
Author Message
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

Hi, I have two servers running as a 2.6-stable4 cluster. When I shut down the active server, the other server became active, but it printed the message below in the console:
"2008-05-09 06:35:40,949 INFO - Unable to find communications stack. ConnectionID(2.e3efe3c35a364bcf9647f0271fad1554) not found. This is usually caused by a client from a prior run trying to illegally reconnect to the server. While that client is being rejected, everything else should proceed as normal. "

why??
zeeiyer

consul

Joined: 05/24/2006 14:28:28
Messages: 493
Offline


This perhaps means one of your client JVMs did not connect to the standby Terracotta server and is being rejected from joining the cluster - is that what you observed?

If so, look at your client-reconnect-window setting (in tc-config.xml) and your l2.l1reconnect settings (in tc.properties), and at what your client and server were doing when this happened, to find out why one of your client JVMs was not able to connect to the standby server.


Sreeni Iyer, Terracotta.
gkeim

ophanim

Joined: 12/05/2006 10:22:37
Messages: 685
Location: Terracotta, Inc.
Offline

Is this happening on Linux?

Gary Keim (terracotta developer)
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

Yes, it is happening on Linux.
gkeim

ophanim

Joined: 12/05/2006 10:22:37
Messages: 685
Location: Terracotta, Inc.
Offline

This probably means you have an old client that was once connected to that server lingering about. If this is not the case, please try to provide more details or a script to reproduce the problem.

Gary Keim (terracotta developer)
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

I found another exception in the cluster. It was fatal and brought both Terracotta servers down. The log below was shown in the console:
2008-05-15 03:31:22,910 [WorkerThread(group_events_dispatch_stage,0)] INFO com.terracottatech.console - NodeID[192.168.100.55:9530] joined the cluster
2008-05-15 03:31:22,910 [TCComm Main Selector Thread (listen 0:0:0:0:0:0:0:0:9530)] INFO com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. TCGroupManager - Health monitoring agent started for 192.168.100.55:48075
2008-05-15 03:31:23,074 [WorkerThread(group_handshake_message_stage,0)] INFO com.tc.net.protocol.transport.ConnectionHealthCheckerImpl: TCGroupManager - Connection to [192.168.100.55:48075] CLOSED. Health Monitoring for this node is now disabled.
2008-05-15 03:31:23,403 [WorkerThread(receive_group_message_stage,0)] INFO com.tc.l2.objectserver.ReplicatedObjectManagerImpl - Send response to Active's query : known id lists = 1387586
2008-05-15 03:31:25,615 [WorkerThread(receive_group_message_stage,0)] WARN com.tc.l2.ha.L2HAZapNodeRequestProcessor - Terminating due to Zap request from NodeID : NodeID[192.168.100.55:9530] Error Type : Newly Joined Node Contains dirty database. (Please clean up DB and restart node) Details : Nodes joining the cluster after startup shouldnt have any Objects. NodeID[192.168.100.50:9530] contains 1387586 Objects !!! : Exception :
java.lang.Throwable
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleObjectListResponse(ReplicatedObjectManagerImpl.java:165)
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleClusterObjectMessage(ReplicatedObjectManagerImpl.java:146)
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.messageReceived(ReplicatedObjectManagerImpl.java:120)
at com.tc.net.groups.TCGroupManagerImpl.fireMessageReceivedEvent(TCGroupManagerImpl.java:588)
at com.tc.net.groups.TCGroupManagerImpl.messageReceived(TCGroupManagerImpl.java:548)
at com.tc.objectserver.handler.ReceiveGroupMessageHandler.handleEvent(ReceiveGroupMessageHandler.java:22)
at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)


2008-05-15 03:31:25,615 [WorkerThread(receive_group_message_stage,0)] WARN com.terracottatech.console - Terminating due to Zap request from NodeID : NodeID[192.168.100.55:9530] Error Type : Newly Joined Node Contains dirty database. (Please clean up DB and restart node) Details : Nodes joining the cluster after startup shouldnt have any Objects. NodeID[192.168.100.50:9530] contains 1387586 Objects !!! : Exception :
java.lang.Throwable
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleObjectListResponse(ReplicatedObjectManagerImpl.java:165)
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleClusterObjectMessage(ReplicatedObjectManagerImpl.java:146)
at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.messageReceived(ReplicatedObjectManagerImpl.java:120)
at com.tc.net.groups.TCGroupManagerImpl.fireMessageReceivedEvent(TCGroupManagerImpl.java:588)
at com.tc.net.groups.TCGroupManagerImpl.messageReceived(TCGroupManagerImpl.java:548)
at com.tc.objectserver.handler.ReceiveGroupMessageHandler.handleEvent(ReceiveGroupMessageHandler.java:22)
at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)


2008-05-15 03:31:25,615 [CommonShutDownHook] INFO com.terracottatech.dso - L2 Exiting...

ssubbiah

jedi

Joined: 05/24/2006 14:25:22
Messages: 115
Location: Saravanan Subbiah
Offline

I only see one server going down. Did the active server go down too? If so, please post both logs.

This exception is normal when you start a passive server with a persistent database. The active server is asking the passive server to quit because there is data in the persistent data store. If you clean up the store and then restart the passive server, this won't happen.

In future TC versions, this will be automatic.
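Until then, the cleanup step described above can be scripted. The following is only a minimal sketch: the function name `clean_passive_store` is hypothetical, and the directory to remove is an assumption that must be taken from the data path in your own tc-config.xml. It should only ever be run on the passive server, and only while that server process is stopped.

```python
import shutil
from pathlib import Path


def clean_passive_store(store_dir: str) -> bool:
    """Delete the persistent store directory of a STOPPED passive server.

    store_dir is an assumption: point it at the store directory that your
    own tc-config.xml configures for the passive node.
    Returns True if a directory was removed, False if there was nothing to do.
    """
    path = Path(store_dir)
    if not path.is_dir():
        # Nothing to clean (already clean, or the path is wrong).
        return False
    # Remove the directory and everything below it.
    shutil.rmtree(path)
    return True


# Hypothetical usage on the passive node, before restarting it:
# clean_passive_store("/path/to/passive/server/data/store")
```

Never run this against the active server's store, since that is the only remaining copy of the clustered data.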

cheers,

Saravanan Subbiah
Terracotta Engineer
ari

seraphim

Joined: 05/24/2006 14:23:21
Messages: 1665
Location: San Francisco, CA
Offline

I think you should step back a moment. It sounds like you may have several things misconfigured. What are you trying to do? Are you in production or running a test? Are you trying to test what happens when TC servers fail? Clients fail?

You are definitely encountering several configuration issues, but nothing that we see thus far is a bug in the software. With a bit more information we should be able to help.

Can you share your tc-config.xml?
Can you explain the test you are trying to run?
Can you explain a bit about what you did / what happened when you found these errors? Was the system down when you expected it to be up and so you scanned the logs looking for problems? Or were you explicitly testing TC active / passive failover?

More info please.

--Ari
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

Yes, when the passive server went down, the active server went down too. The attached file is the Terracotta cluster L2 log.
Attachment: log.txt (terracotta cluster log, 32 Kbytes)

ssubbiah

jedi

Joined: 05/24/2006 14:25:22
Messages: 115
Location: Saravanan Subbiah
Offline

Again the log is for the passive server. (192.168.100.55)

Can you attach the log from the active server ? (192.168.100.50)

From the passive server's log, I see that there may have been some transient network problem between the active and the passive for about a second or so.


2008-05-14 08:40:14,031 [WorkerThread(group_events_dispatch_stage,0)] WARN com.tc.l2.ha.L2HACoordinator - NodeID[192.168.100.50:9530] left the cluster
....
2008-05-14 08:40:15,274 [WorkerThread(group_events_dispatch_stage,0)] INFO com.tc.l2.ha.L2HACoordinator - NodeID[192.168.100.50:9530] joined the cluster
 


This caused the active to request the passive to quit. If you want to protect against such transient network failures, there are some configuration parameters. Our field engineers will be able to help you tune them.
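To check how long such a drop lasted, the gap between a "left the cluster" line and the next "joined the cluster" line for the same node can be computed from the log timestamps. This is a rough sketch that assumes the log line layout quoted above; the function name `membership_gaps` is made up for illustration and is not part of Terracotta.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2008-05-14 08:40:14,031 [...] WARN ... - NodeID[192.168.100.50:9530] left the cluster
EVENT = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*?"
    r"NodeID\[(?P<node>[^\]]+)\] (?P<event>left|joined) the cluster"
)


def membership_gaps(lines):
    """Return (node, seconds) for every left -> joined pair found in lines."""
    left_at = {}  # node id -> time of its most recent "left" event
    gaps = []
    for line in lines:
        match = EVENT.match(line)
        if match is None:
            continue
        when = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        node = match.group("node")
        if match.group("event") == "left":
            left_at[node] = when
        elif node in left_at:
            # A "joined" that follows a recorded "left" closes the gap.
            gaps.append((node, (when - left_at.pop(node)).total_seconds()))
    return gaps


sample = [
    "2008-05-14 08:40:14,031 [WorkerThread(group_events_dispatch_stage,0)] WARN com.tc.l2.ha.L2HACoordinator - NodeID[192.168.100.50:9530] left the cluster",
    "2008-05-14 08:40:15,274 [WorkerThread(group_events_dispatch_stage,0)] INFO com.tc.l2.ha.L2HACoordinator - NodeID[192.168.100.50:9530] joined the cluster",
]
print(membership_gaps(sample))  # [('192.168.100.50:9530', 1.243)]
```

For the two lines quoted above, the gap is 1.243 seconds, which matches the "about a second or so" estimate.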

I still don't see the active server quitting.

cheers,

Saravanan Subbiah
Terracotta Engineer
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

Thanks. The attached file is the active server log.
How do I configure the parameters that protect against such transient network failures?
Attachment: nohup.out (30 Kbytes)

ari

seraphim

Joined: 05/24/2006 14:23:21
Messages: 1665
Location: San Francisco, CA
Offline

You shouldn't simply configure Terracotta to "fix" transient network failures. I think your network / machines / operating systems are not configured right. Saravanan, correct me if I am wrong, but shouldn't cljhyjs fix the network and not try to work around the problem using Terracotta?

--Ari
cljhyjs

journeyman

Joined: 05/07/2008 03:22:42
Messages: 10
Offline

ari wrote:
I think you should step back a moment. It sounds like you may have several things misconfigured. What are you trying to do? Are you in production or running a test? Are you trying to test what happens when TC servers fail? Clients fail?

You are definitely encountering several configuration issues, but nothing that we see thus far is a bug in the software. With a bit more information we should be able to help.

Can you share your tc-config.xml?
Can you explain the test you are trying to run?
Can you explain a bit about what you did / what happened when you found these errors? Was the system down when you expected it to be up and so you scanned the logs looking for problems? Or were you explicitly testing TC active / passive failover?

More info please.

--Ari
 


OK, thank you for the response!
I am just running a test. I currently run a system that has millions of users. The peak load is 10,000 concurrent access requests per second, and up to 10,000 user sessions are added every second. I want to use Terracotta, but I want to know whether that is feasible.

Currently I'm doing a performance test of session sharing. I have two servers (Dell PC servers, 2 CPUs at 2.4 GHz, 6 GB memory), each with Terracotta installed, and 4 web servers, each with Tomcat 5.5 installed.
Here is the test result:
When the number of sessions rose to 1.8 million, I restarted the standby server. Then the following error happened, which brought both servers down.
The tc-config files are attached.


Attachment: tc-config-server.xml (terracotta server config file, 3 Kbytes)
Attachment: tc-config-tomcat.xml (4 Kbytes)

 