[Logo] Terracotta Discussion Forums
Messages posted by: ismith99  XML
Profile for ismith99 -> Messages posted by ismith99 [19]
Author Message
We upgraded to TC 3.3.0 and verified with JDK (JRE) 1.6.0_22. Now we are at the customer site, and they have JDK 1.6.0_15.

Are these compatible? Where is this documented?
>>1. possible problems depending on use cases
>>3. Have you tried excluding it and rebuilding the bootjar? Do our bootjar tools ignore your excludes in this case or something?

This helped me find the problem. Thank you.

In TC 3.3.0, when ${TC_HOME}/platform/bin/dso-java.sh launches make-boot-jar.sh, the extra 28 awt classes are instrumented because "loaded base configuration from Java resource at '/com/tc/config/schema/setup/default-config.xml', relative to class com.tc.config.schema.setup.StandardXMLFileConfigurationCreator"

In TC 3.1.1, when ${TC_HOME}/bin/dso-java.sh launches make-boot-jar.sh, only a single awt class is instrumented because our own tc-config.xml is loaded (from "dot" current directory).

When make-boot-jar.sh is run standalone, the results are the same for TC 3.3 and TC 3.1.1. That is, the awt classes are not instrumented, because it loads our own tc-config.xml.

This seems like a tc bug to me.
Why does Java 1.6.0_20 under TC 3.3 instrument Java2D classes (28 classes),
but under TC 3.1.1 no Java2D classes are instrumented? How can I exclude them?
I have excluded GUI from l1.modules.default.

The following demonstrates my question: 28 awt classes are instrumented in TC 3.3 but none in 3.1.1.

<property name="l1.modules.default"
value="org.terracotta.modules.excludes-config;bundle-version:=3.3.0,org.terracotta.modules.jdk15-preinst-config;bundle-version:=3.3.0,org.terracotta.modules.standard-config;bundle-version:=3.3.0" />

skupunit:dso-boot issmith1$ pwd

skupunit:dso-boot issmith1$ jar tvf dso-boot-hotspot_osx_160_20.jar |grep awt
5108 Sat Sep 11 15:07:12 PDT 2010 java/awt/geom/Line2D.class
4253 Sat Sep 11 15:07:12 PDT 2010 java/awt/geom/Rectangle2D.class

jar tvf dso-boot-hotspot_osx_160_20.jar |grep awt |wc -l

skupunit:dso-boot issmith1$ jar tvf dso-boot-hotspot_osx_160_20.jar | grep awt
742 Wed Jun 09 09:40:14 PDT 2010 java/awt/AWTException.class
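For the record, the kind of exclude stanza I was experimenting with in tc-config.xml looks like this (a sketch only; the java.awt..* and sun.awt..* patterns are my assumption about the matcher syntax, not something verified against the boot-jar tool):

```xml
<application>
  <dso>
    <instrumented-classes>
      <!-- assumption: the '..*' suffix matches the package and all subpackages -->
      <exclude>java.awt..*</exclude>
      <exclude>sun.awt..*</exclude>
    </instrumented-classes>
  </dso>
</application>
```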
I just ported it. Here are the steps I needed to take:

Get TIMs:
./bin/tim-get.sh install-for tc-config.xml --dry-run

Just to be clear: starting with the 3.3.0 release, you don't want to be referencing tim-concurrent-collections, tim-distributed-cache, tim-async-processing, or tim-annotations. They have all been rolled into the terracotta-toolkit-1.0 artifact.

I know for sure that we removed <statistics> from <client> in 3.3.0, so you should just remove that from your tc-config.xml.

If you are overriding l1.modules.default, ensure the versions are changed:

<property name="l1.modules.default"
value="org.terracotta.modules.excludes-config;bundle-version:=3.3.0,org.terracotta.modules.jdk15-preinst-config;bundle-version:=3.3.0,org.terracotta.modules.standard-config;bundle-version:=3.3.0" />

Change <modules> in tc-config.xml:

<module name="tim-cglib-2.1.3" version="1.6.0"/>
<module name="tim-equinox-3.5.1" version="1.2.0"/>
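Put together, the <modules> section of my tc-config.xml ended up looking like this (a sketch; the placement under <clients> reflects my reading of the 3.3 config schema):

```xml
<clients>
  <modules>
    <module name="tim-cglib-2.1.3" version="1.6.0"/>
    <module name="tim-equinox-3.5.1" version="1.2.0"/>
  </modules>
</clients>
```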

Note: the dso-boot jar now lives under the platform subdirectory.
I am requesting a reply to my previous post:

Does DSO support a way to revert a DSO object's state upon client ejection from the cluster? Or does TC have a way for us to set a timeout, i.e. the state of a DSO object would revert to its original state upon timeout?
Pardon me, I didn't explain that what we are trying to release is an application-level lock. The application object (call it AppLock) is instrumented for TC DSO. The use case is:
- In our application the user acquires AppLock, so its state in the TC cluster is changed to this applicationID.
- Our app is abnormally killed.
- Our shutdown hook intends to revert the state of AppLock in the cluster, i.e. to remove applicationID from that object.

Perhaps we can achieve this use case in a better way, or perhaps the TC server can handle this situation for us. Does DSO support a way to revert a DSO object's state upon client ejection from the cluster? Or does TC have a way for us to set a timeout, i.e. the state of a DSO object would revert to its original state upon timeout?
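For what it's worth, one application-level alternative we are considering is a lease-style lock: instead of expecting the cluster to revert AppLock's state when a client is ejected, the holder records an expiry time, and an expired lease is treated as released. This is only a sketch; LeaseLock and its fields are hypothetical names, not our real AppLock:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a lease-style application lock. A holder records when its
// lease expires; a stale holder is treated as released, so a killed
// client's lock frees itself after the timeout.
public class LeaseLock {
    private static final class Holder {
        final String applicationId;
        final long expiresAtMillis;
        Holder(String applicationId, long expiresAtMillis) {
            this.applicationId = applicationId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final AtomicReference<Holder> holder = new AtomicReference<>();
    private final long leaseMillis;

    public LeaseLock(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Try to acquire; succeeds if the lock is free or the previous lease expired. */
    public boolean tryAcquire(String applicationId, long nowMillis) {
        Holder h = holder.get();
        if (h != null && h.expiresAtMillis > nowMillis) {
            return false; // still held by a live lease
        }
        return holder.compareAndSet(h, new Holder(applicationId, nowMillis + leaseMillis));
    }

    public boolean isHeld(long nowMillis) {
        Holder h = holder.get();
        return h != null && h.expiresAtMillis > nowMillis;
    }
}
```

In DSO terms the Holder reference would live in the shared AppLock object, and acquisition would happen under the existing cluster lock; a live holder would also need to renew its lease periodically.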

Regarding the shutdown hook, this is what I had already concluded, so thanks for your answer. We should not code service-dependent tasks in a JVM shutdown hook; shutdown hooks should normally be quick and simple.
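Along those lines, a minimal sketch of keeping the hook bounded (runBounded and the timeout value are my own invention, not TC API): run the cleanup on a daemon thread and give up after a deadline, so a cluster call that can no longer complete does not stall JVM exit:

```java
public class BoundedCleanup {
    /**
     * Run cleanup on a daemon thread, waiting at most timeoutMillis for it
     * to finish. Returns true if the cleanup completed in time.
     */
    static boolean runBounded(Runnable cleanup, long timeoutMillis) {
        Thread worker = new Thread(cleanup, "bounded-cleanup");
        worker.setDaemon(true); // a hung cleanup will not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

The shutdown hook itself would then just call something like runBounded(() -> recoverLocks(), 5000) and let the JVM exit either way.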

Why does 3.3 eject the client before our shutdown hook runs, when 3.1 did not?
Should I try TC 3.2.1 instead? Have the health checker settings changed?

The stacks are already in this post: here is the event-queue thread, with our shutdown hook thread below it.

Name: AWT-EventQueue-0
State: WAITING on java.lang.Thread@1335ca6c
Total blocked: 384 Total waited: 372

Stack trace:
java.lang.Object.wait(Native Method)
- locked java.lang.Class@4b901dc7

Here is our shutdown hook stack:
"Thread-21" prio=5 tid=101dd8800 nid=0x15687e000 in Object.wait() [15687d000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <107d6e068> (a com.tcclient.util.concurrent.locks.ConditionObject$SyncCondition)
at java.lang.Object.wait(Object.java:485)
at com.tc.object.bytecode.ManagerImpl.wait(ManagerImpl.java:834)
at com.tc.object.bytecode.ManagerUtil.objectWait(ManagerUtil.java:521)
at com.tcclient.util.concurrent.locks.ConditionObject.await(ConditionObject.java:103)
at com.tc.object.locks.ClientLockManagerImpl.waitUntilRunning(ClientLockManagerImpl.java:597)
at com.tc.object.locks.ClientLockManagerImpl.lock(ClientLockManagerImpl.java:91)
at com.tc.object.bytecode.ManagerImpl.lock(ManagerImpl.java:740)
at com.tc.object.bytecode.ManagerImpl.monitorEnter(ManagerImpl.java:874)
at com.tc.object.bytecode.ManagerUtil.instrumentationMonitorEnter(ManagerUtil.java:590)
at gov.nasa.arc.mct.lock.MCTNonBlockingLock.isLocked(MCTNonBlockingLock.java)
at gov.nasa.arc.mct.session.MCTSessionImpl.recoverLocksIfNecessary(MCTSessionImpl.java:101)
at gov.nasa.arc.mct.session.MCTSessionImpl.stop(MCTSessionImpl.java:162)
- locked <1095db2f0> (a gov.nasa.arc.mct.session.MCTSessionImpl)
at gov.nasa.arc.mct.session.SessionShutDownHook.run(SessionShutDownHook.java:14)
at java.lang.Thread.run(Thread.java:637)
We are trying to upgrade from 3.1.1 to 3.3.0. Our application (t1) uses the DSO platform. The client has a JVM shutdown hook that invokes TC-instrumented classes to determine whether the application is holding any TC locks.

Upon shutdown, our t1 application hangs. Using the dev-console I observe that the t1 node has left the cluster. This explains why our JVM shutdown hook is hanging: the L1 is trying to reach the L2 after it has left the cluster.

What has changed from 3.1.1 to 3.3.0? For example, certain health check properties? I am not sure of the best debugging route to take.

thank you
We are deploying to our customer's network. We learned at the site that some machines had very low memory. One of the TC L1 clients went into an application hang, most likely due to swap thrashing. The entire cluster hung and we needed to tell our customers to abort all sessions.

The TC documentation indicates the L2-to-L1 health checker tests are ping/socket based, so we suspect that during the hang at our customer site the L2-to-L1 connection status was up (at the socket/ping level).

Does TC provide a mechanism for the L2 server to detect an L1 hang at the application level? When the TC doc says HealthChecker can determine if a peer node is in a GC operation, how is this done?

2) http://www.terracotta.org/documentation/ga/high-availability.html says the default for l2.l1reconnect.enabled is true. What is the actual default for TC 3.1.1?
Any reply?
To state my question better: the TC doc states that the client reconnect mechanism is a hybrid of Automatic Client Reconnect (l2.l1reconnect.enabled = true) and the Health Checker.

What is TC's recommendation to ensure that a client holding a cluster lock will not hang the cluster?
I am reading TC doc 5.2.2c, Automatic Client Reconnect.

We are running tc in production with active/passive HA.

If a client is holding a cluster lock, how do we avoid a cluster-wide hang?

1) If a client holding a cluster lock hangs, how do we configure the server to eject the client from the cluster? Should we set l2.l1reconnect.enabled to false, and then let the Health Checker control everything?

2) The above documentation is somewhat unclear. Can you point me to further documentation?
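For context, these are the kinds of tc.properties overrides I have been looking at (a sketch; the property names are from the HealthChecker documentation as I understand it, and the values are placeholders, not recommendations):

```properties
# Disable the L1 reconnect window so the HealthChecker alone decides
# when a dead or hung client is ejected (illustrative only).
l2.l1reconnect.enabled = false

# L2 -> L1 HealthChecker tuning: how long a connection may be idle
# before probing, how often to probe, and how many probes before
# declaring the client dead.
l2.healthcheck.l1.ping.idletime = 5000
l2.healthcheck.l1.ping.interval = 1000
l2.healthcheck.l1.ping.probes = 3
```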

Thank you! I tried moving my Mac down from 1.6.0_20 to 1.6.0_17... much hassle, and Java minor version 17 on the Mac still fails... now I know I can wait for a fix.
Please confirm:
Linux, Java 1.6.0_20, TC 3.2.1: broken?
Mac OS, Java 1.6.0_20, TC 3.2.1: broken?
(Sorry for all these posts.) About when will the 3.2.2 open source release be available?
Powered by JForum 2.1.7 © JForum Team