Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)
Client reconnect problems
michal

journeyman

Joined: 09/13/2009 06:34:52
Messages: 22
Offline

Hi.
We use a single TC server and a TC client running on different machines.
We had a scenario in which the Terracotta server could not see the TC client due to a communication problem. After a while the TC server disconnected this client. When the communication between the client and the server returned, the client could not reconnect to the server (the server did not accept it), and the client lost the connection forever.
Our TC client is a critical application that must not be restarted.

We tried to configure:
<property name="l2.l1reconnect.enabled" value="true" />
<property name="l2.l1reconnect.timeout.millis" value="300000" />

but it didn't help.

My questions are:
(1) What can we do in order to solve this problem?
(2) How can we increase the time the server waits before it considers the client dead?

Server log:

2010-07-07 11:12:30,080 [L2_L1:TCComm Main Selector Thread (listen 0.0.0.0:9510)] INFO com.terracottatech.console - Client Cannot Reconnect ConnectionID(45.8781aca8a7e14697a6292d47c429486b) not found. Connection attempts from the Terracotta client at 10.1.154.151:64470 are being rejected by the Terracotta server array. Restart the client to allow it to rejoin the cluster. Many client reconnection failures can be avoided by configuring the Terracotta server array for "permanent-store" and tuning reconnection parameters. For more information, see http://www.terracotta.org/ha
2010-07-07 11:12:30,080 [L2_L1:TCComm Main Selector Thread (listen 0.0.0.0:9510)] INFO com.tc.net.protocol.transport.ServerStackProvider - Client Cannot Reconnect ConnectionID(45.8781aca8a7e14697a6292d47c429486b) not found. Connection attempts from the Terracotta client at 10.1.154.151:64470 are being rejected by the Terracotta server array. Restart the client to allow it to rejoin the cluster. Many client reconnection failures can be avoided by configuring the Terracotta server array for "permanent-store" and tuning reconnection parameters. For more information, see http://www.terracotta.org/ha

zeeiyer

consul

Joined: 05/24/2006 14:28:28
Messages: 493
Offline

Assuming you are on a recent version of Terracotta, those properties should add a grace period (300 s = 5 min) during which a disconnected client is allowed to reconnect to the cluster. Are you sure all services, including the TC server, were restarted after this config change? The config that was read in is printed in the tc-server logs at startup.
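
For reference, here is a minimal sketch of where these two properties would typically sit in tc-config.xml. It assumes the Terracotta 3.x layout, in which a tc-properties section appears near the top of the file. The server entry (host and name) is a hypothetical placeholder and is not taken from any configuration posted in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<tc:tc-config xmlns:tc="http://www.terracotta.org/config">
  <!-- Assumed placement: tc-properties ahead of the servers section -->
  <tc-properties>
    <!-- Allow a disconnected client (L1) to resume its connection to the server (L2) -->
    <property name="l2.l1reconnect.enabled" value="true" />
    <!-- Reconnect window in milliseconds: 300000 ms = 300 s = 5 min -->
    <property name="l2.l1reconnect.timeout.millis" value="300000" />
  </tc-properties>

  <servers>
    <!-- Hypothetical single server; replace host and name with real values -->
    <server host="tc-server.example.com" name="server1">
      <dso-port>9510</dso-port>
    </server>
  </servers>
</tc:tc-config>
```

After a restart, the properties the server actually read should appear in its startup log, which is a quick way to confirm that the values above were picked up.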



Sreeni Iyer, Terracotta.
michal

journeyman

Joined: 09/13/2009 06:34:52
Messages: 22
Offline

Yes. I'm sure.
steve

ophanim

Joined: 05/24/2006 14:22:53
Messages: 619
Offline

Can you package up a reproducible case? It will help us take a look.

michal

journeyman

Joined: 09/13/2009 06:34:52
Messages: 22
Offline

Hi.
I don't have a simple reproducible case, so I'm attaching our tc-config.xml.
We are using TC 3.2.1.
Please let me know if you need any other information.
Thanks.
Attachment: tc-config.xml (10 Kbytes)

ssubbiah

jedi

Joined: 05/24/2006 14:25:22
Messages: 117
Location: Saravanan Subbiah
Offline

The full logs of the server and the client would be helpful to understand what is going on.

Saravanan Subbiah
Terracotta Engineer
mgovinda

journeyman

Joined: 10/16/2007 12:32:55
Messages: 30
Offline

Client and server logs will help us determine the duration of the network outages. As per your configuration, the server should tolerate network outages of up to 5 minutes, and the client should be able to rejoin at any time within those 5 minutes.

But if either of them was in a long GC pause, then the "l2.l1reconnect.timeout.millis" property would not govern the disconnect and reconnect behaviour. Once we have looked at your log files, we will have more clues about the exact problem.
 