Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)
Messages posted by: ari
DSO's start scripts now make the bootjars if the JVM version doesn't match or if one doesn't exist.

We chose to stop making builds for EVERY platform and this allows DSO to run on many platforms for which we don't have a formal build.

Can't remember where they are placed though...sorry.
Yes...close enough.

The reconnect window is discussed in more detail in the architecture guide. Again, remember that the reconnect window is designed to give every JVM in your app cluster a chance to find the new active TC server and get back into the cluster. If it fails to get into the cluster, that JVM will be quarantined.

http://www.terracotta.org/confluence/display/docs1/Concept+and+Architecture+Guide

Your app can be notified via JMX when these events occur.

http://www.terracotta.org/confluence/display/docs1/JMX+Guide

--Ari
John,

Your heartbeat mechanism is a good idea. The direct answer is that it both does and does not change my advice:

1. In your case, you can live w/o the load balancer, so my advice changes.

2. In your TEST, however, you are killing both TomcatA and TerracottaA. The logs show 2 JVMs (channels 5 and 6) and only channel 6 coming back when failing over to TerracottaB.

If I am correct, you do not need to change your RECONNECT window; instead, change your test to kill only TerracottaA w/o killing TomcatA. You can break this into 2 tests: (1) test Terracotta failover and (2) test Tomcat failover. As for what you have uncovered, that losing both Tomcat AND Terracotta takes 2 minutes to recover...well, for this you can just lower the RECONNECT window to an acceptable level. And that is totally fine / ok.

But you do want to make sure you confirm for yourself that failover for loss of JUST EITHER Terracotta or Tomcat will indeed happen almost instantaneously, regardless of your RECONNECT setting...it should. Your test is different, though: simultaneous loss of Tomcat and Terracotta (even if you had only one Terracotta server and it just restarted itself while a Tomcat instance died) is exactly what the RECONNECT WINDOW is for.
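
For reference, that setting lives under servers/server/dso in tc-config.xml (see the configuration reference linked in the reply below). A minimal sketch, assuming the 2.x-era element name; the 30-second value is only illustrative:

<servers>
  <server host="terracottaA" name="terracottaA">
    <dso>
      <!-- how long the server waits for previously connected JVMs to rejoin -->
      <client-reconnect-window>30</client-reconnect-window>
    </dso>
  </server>
</servers>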

Make sense?

--Ari
http://www.terracotta.org/confluence/display/docs1/Configuration+Guide+and+Reference#ConfigurationGuideandReference-tc%3Atcconfig%2Fservers%2Fserver%2Fdso%2Fclientreconnec...

I think what is happening here (and I am guessing based on your post) is that you are connecting a DIFFERENT set of Tomcat instances to TerracottaB versus TerracottaA.

Our fail-over model keeps an app cluster running exactly as it was before Terracotta failed. This means many things, like lock state and data state, are maintained across TC server failures. What matters in your case is that the connected JVMs are also remembered. So if TomcatA dies and TomcatB takes over, Terracotta lets TomcatB in right away but waits for TomcatA until a configured timeout. Why? Because the cluster was made up of TomcatA before the failure occurred.

2 ways fwd for you (and I recommend the first...not the second):
1. Run both TomcatA and TomcatB underneath an HTTP load balancer--sticky load balancing is PREFERRED for performance reasons but not required. Now, kill ONLY TerracottaA when testing failover, and failover to TerracottaB will happen IMMEDIATELY w/o invoking the RECONNECT window, because TomcatA and B were connected BEFORE the failure and reconnect immediately as TerracottaB starts up.

2. Shrink the reconnect window if you MUST run only one of TomcatA or B at a time. This is undesirable because any locks that TomcatA was holding, and any requests it was busy servicing, will be LOST--losses you would avoid under the load-balanced model above.

In short, TC lets JVMs keep running after a TC failure, and since you are killing TomcatA, TC is trying to give TomcatA time to come back. You can shorten that timeout, but you shouldn't--you should go load balanced in your Tomcat layer.
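
To make that concrete, here is a rough sketch of the <servers> block both TomcatA and TomcatB would share in their tc-config.xml so that either one can fail over to whichever Terracotta server is active; hostnames and ports are placeholders, so check the configuration reference above for your version:

<servers>
  <!-- both Tomcats point at this same list, so when terracottaA dies they both
       reconnect to terracottaB right away and the RECONNECT window never comes
       into play -->
  <server host="terracottaA" name="terracottaA">
    <dso-port>9510</dso-port>
    <jmx-port>9520</jmx-port>
  </server>
  <server host="terracottaB" name="terracottaB">
    <dso-port>9510</dso-port>
    <jmx-port>9520</jmx-port>
  </server>
</servers>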

--Ari
FWIW, Dell Hosting Services mentioned to us on a con call that they are building in-house expertise in operating Terracotta servers inside customer applications and use cases.

You might want to start there.

--Ari
Right: what Iyer said.

Or, you have a configuration alternative: have the same field in 2 different apps use "named roots" instead of letting DSO auto-name your root by its fully-qualified field name.

Basically, use 2 different TC configs: in app #1, name the root "foo-app1" and in app #2, name the root "foo-app2". The same field in the same class under the same classloader in 2 different apps will then no longer be shared across those 2 apps.
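
Roughly, the roots section of each app's tc-config.xml would look like the following (the field and root names are made up for illustration; the config reference below has the exact schema):

<!-- app #1 -->
<roots>
  <root>
    <field-name>com.example.Foo.sharedMap</field-name>
    <root-name>foo-app1</root-name>
  </root>
</roots>

<!-- app #2: same field, different root-name, so nothing is shared across the 2 apps -->
<roots>
  <root>
    <field-name>com.example.Foo.sharedMap</field-name>
    <root-name>foo-app2</root-name>
  </root>
</roots>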

See:
http://www.terracotta.org/confluence/display/docs1/Configuration+Guide+and+Reference#ConfigurationGuideandReference-%2Ftc%3Atcconfig%2Fapplication%2Fdso%2Froots
Depends. Hibernate w/ 2nd level caching is creating its own objects inside EHCache. These are not the same as the POJO-like objects Hibernate hands back to your app.

There is currently no way for your POJO objects and your Hibernate 2nd level cache objects to work as one / act as the same reference in heap.

The only way those Hibernate objects (flattened raw field data, actually) would be shared with other objects in Terracotta is if all the apps spoke to the same DB through the same Hibernate configuration. In other words, all use of the data has to route through Hibernate.


Just to be clear, if your question means "can Hibernate data be shared across app clusters" the answer is "yes." If your question means "can Hibernate 2nd level data be shared with regular POJOs" the answer is "no."

--Ari

Please file a feature request in JIRA, erezhara. Taylor and Steve are currently at work on runtime-changeable capabilities for an upcoming release. There are a few things we could expose through JMX, such as "object pre-fetch depth" and "log level" or a "debug mode" of some sort. There are even more things we could do, such as "add another passive Terracotta server to the list".

As zeeiyer points out, though, changing locks and roots at runtime would be difficult for us to figure out. Example: adding some static field named "foo" as a new root after 3 days of operation. Each JVM will have its own state for "foo". Which version wins? We can't figure that out, and we definitely cannot configure it without breaking many of the running app instances / threads that are currently relying on certain data in "foo".

A better example would be locking. If we allowed the introduction of locks at runtime, you could easily deadlock. If you removed locks at runtime, you would likely introduce race conditions.

Make sense? Please file the JIRA for the types of things you want to see, though...we are very interested and working on this feature RIGHT NOW.

--Ari
Client shutdown includes a disconnect protocol, BTW. This is how we make it possible to tell a network burp apart from a deliberate shutdown.

--Ari
You need to put the server in persistent mode in order to restart it. It then keeps track of connected JVMs and allows them back into the cluster upon restart. If you do not enable persistent mode, there is no way for the Terracotta server to know about clients, so a client that connects to a "fresh" server and announces itself as "reconnecting" is turned away: the server knows nothing about what that client had been doing in the past.
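
For reference, something like this in the server section of tc-config.xml should do it ("permanent-store" is the 2.x-era mode name, so double-check it against your version's schema):

<server host="terracottaA" name="terracottaA">
  <dso>
    <persistence>
      <!-- the default, temporary-swap-only, forgets connected clients on restart -->
      <mode>permanent-store</mode>
    </persistence>
  </dso>
</server>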

This search gives many links explaining the situation, architecture, and the appropriate remedies:

http://www.terracotta.org/confluence/display/orgsite/Search+Results?cx=011330805590965408378%3Al-uxr17yka0&cof=FORID%3A9&q=persistent+mode&sa=Search

Hope this helps,

--Ari
Looks like we are working on the same problem in 2 places. We might be a tad further along on the following thread (should just move this conversation there, IMHO):

http://forums.terracotta.org/forums/posts/list/15/342.page#1873
I saw this sort of behavior in the middle of a presentation at JavaOne this year. We later fiddled with my firewall settings on my Mac and it immediately connected.

Sadly, I later could not connect even though my firewall was disabled, so that was indeed not the problem. But we suspected the RMI library inside the product, which we use for password authentication of the admin console to the L2.

Right now, I suspect the localhost port binding more. Try running the admin console on the same machine as Terracotta's server process and connecting to localhost. Next, screenshot the output of "netstat -a -n" for us, and send us your config files so we can see which port you expect JMX services to bind to and which port they are actually bound to.

Hopefully, we will find that the interface we are listening on for connections is sometimes too restrictive (localhost instead of *, meaning all interfaces).
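
For comparison against that netstat output, this is the part of tc-config.xml that declares where the admin console expects to reach JMX (host and port values here are only placeholders):

<server host="192.168.1.10" name="tc-server-1">
  <dso-port>9510</dso-port>
  <!-- the admin console connects to this port over JMX -->
  <jmx-port>9520</jmx-port>
</server>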

Thanks for your patience,

--Ari
When using Terracotta to cluster sessions inside Tomcat, you need to do less than when using Terracotta for clustering your own roots.

The configurator writes a sessions-specific Terracotta config for you, but the basic steps it takes are ones you can do on your own. You will have to specify some things yourself. Try this guide:

http://www.terracotta.org/confluence/display/docs1/Setting+Up+a+Tomcat+Web+Cluster

BTW, you are going to want to set your Terracotta servers on box_a and box_b in what is called "network active/passive" mode. On both Tomcat_A and Tomcat_B, make your Terracotta config such that Terracotta_A is primary and Terracotta_B is secondary. Both Tomcat_A and Tomcat_B MUST BE TALKING TO THE SAME TERRACOTTA SERVER to share data. From your question, my first guess is that you might have Terracotta configured mostly right but have Tomcat_A talking to Terracotta_A and Tomcat_B talking to Terracotta_B.
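
A rough sketch of the shared tc-config.xml for that setup, assuming box_a and box_b as the hostnames; the "networked-active-passive" mode name comes from the 2.x-era docs, so verify it against your version:

<servers>
  <!-- BOTH Tomcat_A and Tomcat_B point at this same pair of servers;
       whichever server wins election is active, the other stays passive -->
  <server host="box_a" name="terracotta_a"/>
  <server host="box_b" name="terracotta_b"/>
  <ha>
    <mode>networked-active-passive</mode>
  </ha>
</servers>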

The best way to validate whether things are working at all is to start up Terracotta, then Tomcat, on 1 machine (don't worry about clustering Terracotta yet), and then start up $TC_HOME/bin/admin.sh. That console application should be able to connect to your Terracotta server instance, show you whether you have any connected clients (your Tomcat instance would be a client, for example), and show you whether those clients have any shared data in them.

If the admin console shows nothing, your config is incorrect. The easiest way to get a configuration is usually the configurator, but it is possible to write one by hand. What happened when you tried the configurator?

Try this doc out if you haven't already: http://www.terracotta.org/confluence/display/docs1/Sessions+Configurator+Reference+Guide

Let us know if it helps.
 