Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)
Problem: OutOfMemory when reattaching a server node to the cluster
blackza

neo

Joined: 08/25/2008 00:58:07
Messages: 5
Offline

Hi,

I have 3 questions; please see below.

I have a problem when testing the clustering of the TC server. First I start server1 and server2, which works. Then I press Ctrl+C on server2 so that it leaves the cluster while server1 keeps running. When I try to reconnect server2 to the cluster, it complains that the DB is dirty and asks me to clean it, so I delete /data/* and start server2 again. It goes to [ PASSIVE-UNINITIALIZED ] for a while, and I notice that server2's /data/objectdb/ keeps growing, but at about 200 - 300+ MB it throws an OutOfMemoryError.

If I stop server1, manually copy server1/data/objectdb/* to server2/data/objectdb/, and start server2, it starts fine, but then I can't start server1 because it complains about the same problem.

On the active node, the objectdb files total about 900+ MB.
The roots contain only one ConcurrentHashMap holding around 800,000+ objects. (My production data will be larger than this, maybe 1M - 5M objects, which should be around 1 - 5 GB from what I have seen.)


1) What is the best way to reconnect a TC server node to the cluster, and how do I solve this problem? Am I doing anything wrong?


Please see the TC server logs below.

SERVER 1
=====================================================

Code:
 D:\Work\project\terracotta\TestTerraCotta1\runnable\logs>cd ..
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable>server1.bat
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable>"D:\Terracotta\terracotta-2.
 6.2\bin"\start-tc-server.bat -f .\tc-config.xml -n "Server 1"
 2551-08-29 18:19:09,430 INFO - Terracotta 2.6.2, as of 20080626-150612 (Revision
  8952 by cruise@WXPMO0 from 2.6)
 2551-08-29 18:19:09,930 INFO - Configuration loaded from the file at 'D:\Work\pr
 oject\terracotta\TestTerraCotta1\runnable\.\tc-config.xml'.
 2551-08-29 18:19:09,968 INFO - Log file: 'D:\Work\project\terracotta\TestTerraCo
 tta1\runnable\.\logs\terracotta-server.log'.
 2551-08-29 18:19:12,355 INFO - Statistics store: 'D:\Work\project\terracotta\Tes
 tTerraCotta1\runnable\.\statistics'.
 2551-08-29 18:19:14,651 INFO - Statistics buffer: 'D:\Work\project\terracotta\Te
 stTerraCotta1\runnable\.\statistics'.
 2551-08-29 18:19:14,713 INFO - JMX Server started. Available at URL[service:jmx:
 jmxmp://0.0.0.0:9520]
 2551-08-29 18:19:21,475 INFO - Becoming State[ ACTIVE-COORDINATOR ]
 2551-08-29 18:19:21,540 INFO - Terracotta Server has started up as ACTIVE node o
 n 0.0.0.0:9510 successfully, and is now ready for work.
 2551-08-29 18:19:35,879 INFO - NodeID[192.168.0.6:9531] joined the cluster
 2551-08-29 18:19:39,446 WARN - Requesting node to quit due to the following erro
 r
 NodeID : NodeID[192.168.0.6:9531] Error Type : Newly Joined Node Contains dirty
 database. (Please clean up DB and restart node) Details : Nodes joining the clus
 ter after startup shouldnt have any Objects. NodeID[192.168.0.6:9531] contains 2
 827183 Objects !!! : Exception :
 java.lang.Throwable
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleObjectListRe
 sponse(ReplicatedObjectManagerImpl.java:175)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleClusterObjec
 tMessage(ReplicatedObjectManagerImpl.java:146)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.messageReceived(Re
 plicatedObjectManagerImpl.java:120)
         at com.tc.net.groups.TCGroupManagerImpl.fireMessageReceivedEvent(TCGroup
 ManagerImpl.java:589)
         at com.tc.net.groups.TCGroupManagerImpl.messageReceived(TCGroupManagerIm
 pl.java:549)
         at com.tc.objectserver.handler.ReceiveGroupMessageHandler.handleEvent(Re
 ceiveGroupMessageHandler.java:22)
         at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)
 
 
 2551-08-29 18:19:40,098 WARN - NodeID[192.168.0.6:9531] left the cluster
 2551-08-29 18:21:00,230 INFO - NodeID[192.168.0.6:9531] joined the cluster
 2551-08-29 18:24:01,424 WARN - NodeID[192.168.0.6:9531] left the cluster
 



SERVER 2
=====================================================
Code:
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2\statistics>cd ..
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>server2.bat
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>"D:\Terracotta\terracotta-2
 .6.2\bin"\start-tc-server.bat -f .\tc-config.xml -n "Server 2"
 2551-08-29 18:11:32,708 INFO - Terracotta 2.6.2, as of 20080626-150612 (Revision
  8952 by cruise@WXPMO0 from 2.6)
 2551-08-29 18:11:33,240 INFO - Configuration loaded from the file at 'D:\Work\pr
 oject\terracotta\TestTerraCotta1\runnable2\.\tc-config.xml'.
 2551-08-29 18:11:33,280 INFO - Log file: 'D:\Work\project\terracotta\TestTerraCo
 tta1\runnable2\.\logs\terracotta-server.log'.
 2551-08-29 18:11:35,673 INFO - Statistics store: 'D:\Work\project\terracotta\Tes
 tTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:11:37,957 INFO - Statistics buffer: 'D:\Work\project\terracotta\Te
 stTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:11:38,018 INFO - JMX Server started. Available at URL[service:jmx:
 jmxmp://0.0.0.0:9521]
 2551-08-29 18:11:58,492 INFO - Becoming State[ ACTIVE-COORDINATOR ]
 2551-08-29 18:11:58,541 INFO - Terracotta Server has started up as ACTIVE node o
 n 0.0.0.0:9511 successfully, and is now ready for work.
 2551-08-29 18:13:28,428 INFO - NodeID[192.168.0.6:9530] joined the cluster
 2551-08-29 18:13:37,003 WARN - Requesting node to quit due to the following erro
 r
 NodeID : NodeID[192.168.0.6:9530] Error Type : Newly Joined Node Contains dirty
 database. (Please clean up DB and restart node) Details : Nodes joining the clus
 ter after startup shouldnt have any Objects. NodeID[192.168.0.6:9530] contains 2
 827183 Objects !!! : Exception :
 java.lang.Throwable
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleObjectListRe
 sponse(ReplicatedObjectManagerImpl.java:175)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleClusterObjec
 tMessage(ReplicatedObjectManagerImpl.java:146)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.messageReceived(Re
 plicatedObjectManagerImpl.java:120)
         at com.tc.net.groups.TCGroupManagerImpl.fireMessageReceivedEvent(TCGroup
 ManagerImpl.java:589)
         at com.tc.net.groups.TCGroupManagerImpl.messageReceived(TCGroupManagerIm
 pl.java:549)
         at com.tc.objectserver.handler.ReceiveGroupMessageHandler.handleEvent(Re
 ceiveGroupMessageHandler.java:22)
         at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)
 
 
 2551-08-29 18:13:37,599 WARN - NodeID[192.168.0.6:9530] left the cluster
 2551-08-29 18:14:11,193 INFO - NodeID[192.168.0.6:9530] joined the cluster
 2551-08-29 18:16:45,468 WARN - NodeID[192.168.0.6:9530] left the cluster
 Terminate batch job (Y/N)? y
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>server2.bat
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>"D:\Terracotta\terracotta-2
 .6.2\bin"\start-tc-server.bat -f .\tc-config.xml -n "Server 2"
 2551-08-29 18:19:28,357 INFO - Terracotta 2.6.2, as of 20080626-150612 (Revision
  8952 by cruise@WXPMO0 from 2.6)
 2551-08-29 18:19:28,877 INFO - Configuration loaded from the file at 'D:\Work\pr
 oject\terracotta\TestTerraCotta1\runnable2\.\tc-config.xml'.
 2551-08-29 18:19:28,919 INFO - Log file: 'D:\Work\project\terracotta\TestTerraCo
 tta1\runnable2\.\logs\terracotta-server.log'.
 2551-08-29 18:19:31,322 INFO - Statistics store: 'D:\Work\project\terracotta\Tes
 tTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:19:33,614 INFO - Statistics buffer: 'D:\Work\project\terracotta\Te
 stTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:19:33,680 INFO - JMX Server started. Available at URL[service:jmx:
 jmxmp://0.0.0.0:9521]
 2551-08-29 18:19:35,883 INFO - NodeID[192.168.0.6:9530] joined the cluster
 2551-08-29 18:19:35,909 INFO - Moved to State[ PASSIVE-UNINITIALIZED ]
 2551-08-29 18:19:39,466 WARN - Terminating due to Zap request from NodeID : Node
 ID[192.168.0.6:9530] Error Type : Newly Joined Node Contains dirty database. (Pl
 ease clean up DB and restart node) Details : Nodes joining the cluster after sta
 rtup shouldnt have any Objects. NodeID[192.168.0.6:9531] contains 2827183 Object
 s !!! : Exception :
 java.lang.Throwable
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleObjectListRe
 sponse(ReplicatedObjectManagerImpl.java:175)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.handleClusterObjec
 tMessage(ReplicatedObjectManagerImpl.java:146)
         at com.tc.l2.objectserver.ReplicatedObjectManagerImpl.messageReceived(Re
 plicatedObjectManagerImpl.java:120)
         at com.tc.net.groups.TCGroupManagerImpl.fireMessageReceivedEvent(TCGroup
 ManagerImpl.java:589)
         at com.tc.net.groups.TCGroupManagerImpl.messageReceived(TCGroupManagerIm
 pl.java:549)
         at com.tc.objectserver.handler.ReceiveGroupMessageHandler.handleEvent(Re
 ceiveGroupMessageHandler.java:22)
         at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:142)
 
 
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>server2.bat
 
 D:\Work\project\terracotta\TestTerraCotta1\runnable2>"D:\Terracotta\terracotta-2
 .6.2\bin"\start-tc-server.bat -f .\tc-config.xml -n "Server 2"
 2551-08-29 18:20:54,210 INFO - Terracotta 2.6.2, as of 20080626-150612 (Revision
  8952 by cruise@WXPMO0 from 2.6)
 2551-08-29 18:20:54,703 INFO - Configuration loaded from the file at 'D:\Work\pr
 oject\terracotta\TestTerraCotta1\runnable2\.\tc-config.xml'.
 2551-08-29 18:20:54,742 INFO - Log file: 'D:\Work\project\terracotta\TestTerraCo
 tta1\runnable2\.\logs\terracotta-server.log'.
 2551-08-29 18:20:57,108 INFO - Statistics store: 'D:\Work\project\terracotta\Tes
 tTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:20:59,391 INFO - Statistics buffer: 'D:\Work\project\terracotta\Te
 stTerraCotta1\runnable2\.\statistics'.
 2551-08-29 18:20:59,454 INFO - JMX Server started. Available at URL[service:jmx:
 jmxmp://0.0.0.0:9521]
 2551-08-29 18:21:00,229 INFO - NodeID[192.168.0.6:9530] joined the cluster
 2551-08-29 18:21:00,239 INFO - Moved to State[ PASSIVE-UNINITIALIZED ]
 java.lang.OutOfMemoryError: Java heap space
         at java.util.LinkedList.addBefore(LinkedList.java:778)
         at java.util.LinkedList.addFirst(LinkedList.java:153)
         at com.tc.stats.LossyStack.push(LossyStack.java:24)
         at com.tc.stats.counter.sampled.SampledCounterImpl.recordSample(SampledC
 ounterImpl.java:84)
         at com.tc.stats.counter.sampled.SampledCounterImpl$1.run(SampledCounterI
 mpl.java:35)
         at java.util.TimerThread.mainLoop(Timer.java:512)
         at java.util.TimerThread.run(Timer.java:462)
 java.lang.OutOfMemoryError: Java heap space
         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject
 .addConditionWaiter(AbstractQueuedSynchronizer.java:1739)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject
 .awaitNanos(AbstractQueuedSynchronizer.java:1954)
         at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.jav
 a:395)
         at com.tc.util.concurrent.TCLinkedBlockingQueue.poll(TCLinkedBlockingQue
 ue.java:30)
         at com.tc.async.impl.StageQueueImpl.poll(StageQueueImpl.java:103)
         at com.tc.async.impl.StageImpl$WorkerThread.run(StageImpl.java:129)
 


tc-config.xml

Code:
 <?xml version="1.0" encoding="UTF-8"?>
 <tc:tc-config xmlns:tc="http://www.terracotta.org/config">
         <servers>
                 <server name="Server 1">
                         <dso-port>9510</dso-port>
                         <jmx-port>9520</jmx-port>
                         <l2-group-port>9530</l2-group-port>
                         <dso>
                                 <persistence>
                                         <mode>permanent-store</mode>
                                 </persistence>
                         </dso>
                 </server>
                 <server name="Server 2">
                         <dso-port>9511</dso-port>
                         <jmx-port>9521</jmx-port>
                         <l2-group-port>9531</l2-group-port>
                         <dso>
                                 <persistence>
                                         <mode>permanent-store</mode>
                                 </persistence>
                         </dso>
                 </server>
                 <ha>
                         <mode>networked-active-passive</mode>
                         <networked-active-passive>
                                 <election-time>5</election-time>
                         </networked-active-passive>
                 </ha>
         </servers>
         <application>
                 <dso>
                         <roots>
                                 <root>
                                         <field-name>test.PolicyMapManager.map</field-name>
                                 </root>
                         </roots>
                         <instrumented-classes>
                                 <include>
                                         <class-expression>test.PolicyMapManager</class-expression>
                                 </include>
                                 <include>
                                         <class-expression>test.PolicyManager</class-expression>
                                 </include>
                                 <include>
                                         <class-expression>test.model.SendingHistoryList</class-expression>
                                 </include>
                                 <include>
                                         <class-expression>test.model.SendingHistory</class-expression>
                                 </include>
                         </instrumented-classes>
                 </dso>
         </application>
 </tc:tc-config>
 


class that contains root

Code:
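 import java.util.Collections;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 public class PolicyMapManager implements PolicyManager {
 
         // Declared as the root test.PolicyMapManager.map in tc-config.xml
         private Map<String, SendingHistoryList> map = new ConcurrentHashMap<String, SendingHistoryList>();
 
         public List<SendingHistory> getSendDateHistory(String mobileNumber,
                 MessageType messageType, boolean quiet) {
 
                 // Look up the existing history and drop entries older than 30 days
                 SendingHistoryList historyList = map.get(mobileNumber);
                 if(historyList != null) {
                         historyList.clearOldHistory();
                 } else {
                         historyList = new SendingHistoryList();
                 }
 
                 // Work on a copy instead of the list stored in the map
                 SendingHistoryList cloned = new SendingHistoryList();
                 cloned.addAll(historyList);
 
                 if(!quiet) {
                         cloned.add(new SendingHistory(messageType));
                 }
 
                 // Replace the stored list with the modified copy
                 map.put(mobileNumber, cloned);
 
                 return Collections.unmodifiableList(cloned);
         }
 }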
 


Code:
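 import java.util.ArrayList;
 import java.util.Calendar;
 import java.util.Date;
 import java.util.List;
 
 public class SendingHistoryList extends ArrayList<SendingHistory> {
 
         // Removes every entry older than 30 days
         public synchronized void clearOldHistory() {
                 Calendar calendar = Calendar.getInstance();
                 calendar.add(Calendar.DATE, -30);
                 Date date30 = calendar.getTime();
 
                 // Collect the entries themselves (not their dates) so that
                 // removeAll() can match them against the list contents
                 List<SendingHistory> clear = new ArrayList<SendingHistory>();
 
                 for(SendingHistory history : this) {
                         if(history.getDate().before(date30)) {
                                 clear.add(history);
                         }
                 }
 
                 removeAll(clear);
         }
 }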
 


Code:
 public class SendingHistory {
 
         private MessageType messageType;
 
         private Date date;
 
         public SendingHistory(MessageType messageType) {
                 this(messageType, new Date());
         }
 
         public SendingHistory(MessageType messageType, Date date) {
                 this.messageType = messageType;
                 this.date = date;
         }
 
         ...setter & getter...
 }
 


2) In the admin console, when I connect a client to the TC server, I see a WARNING about the client sending too much data over the network (sorry, I can't remember the exact words in the log; it was about tcconnect or something like that), around 17,000,000 - 20,000,000 bytes. (I am still testing everything on localhost for now, but I worry about my real environment.) This number keeps growing as my map holds more data, and I wonder whether it could grow up to 100 M. Is this abnormal behaviour?

3) Please see my PolicyMapManager code. I have to clone the SendingHistoryList, modify the clone, and put it back under the same key. Is that OK? And even if it is OK, what is the correct way to synchronize this code? At first I tried to modify the historyList itself; TC told me to also instrument the SendingHistoryList and SendingHistory classes, and after I did that I always got a TCUnlockException at this line:

cloned.add(new SendingHistory(messageType));
// the first version, which failed, was: historyList.add(new SendingHistory(messageType));

I tried making the method synchronized ("public synchronized List<SendingHistory> getSendDateHistory(String mobileNumber, MessageType messageType, boolean quiet) {"), but it still does not work, and I don't know how to synchronize things in this method to make it work.

I tried adding synchronized(map) { } around all the code in this method, and it still does not work.

I tried configuring auto-synchronization for this method in the lock config so that TC synchronizes it for me, and it still does not work.
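
For question 3, one possible shape for the method is sketched below in plain Java. The get/clone/put sequence in the posted code is not atomic: two threads can read the same list, each add to their own copy, and the later put silently discards the other thread's entry. The sketch instead inserts the per-number list once with putIfAbsent and then mutates it in place while holding that list's monitor. HistoryEntry and HistoryMapSketch are hypothetical stand-ins for the posted classes (MessageType is reduced to a String and the current time is passed in so the behaviour can be checked). This has not been run against Terracotta 2.6.2; under DSO the synchronized block would presumably also have to be covered by a write-level lock in the locks section of tc-config.xml, which is an assumption here and not something this thread confirms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for SendingHistory (MessageType reduced to a String).
class HistoryEntry {
    final String messageType;
    final Date date;

    HistoryEntry(String messageType, Date date) {
        this.messageType = messageType;
        this.date = date;
    }
}

// Hypothetical stand-in for PolicyMapManager.
class HistoryMapSketch {
    private static final long THIRTY_DAYS_MS = 30L * 24 * 60 * 60 * 1000;

    private final ConcurrentMap<String, List<HistoryEntry>> map =
            new ConcurrentHashMap<String, List<HistoryEntry>>();

    List<HistoryEntry> getSendDateHistory(String mobileNumber, String messageType,
                                          boolean quiet, Date now) {
        // Insert the per-number list exactly once, even if two threads race.
        List<HistoryEntry> list = map.get(mobileNumber);
        if (list == null) {
            List<HistoryEntry> fresh = new ArrayList<HistoryEntry>();
            List<HistoryEntry> raced = map.putIfAbsent(mobileNumber, fresh);
            list = (raced != null) ? raced : fresh;
        }

        // All reads and writes of the list happen while holding its monitor.
        synchronized (list) {
            Date cutoff = new Date(now.getTime() - THIRTY_DAYS_MS);
            for (Iterator<HistoryEntry> it = list.iterator(); it.hasNext();) {
                if (it.next().date.before(cutoff)) {
                    it.remove(); // drop entries older than 30 days
                }
            }
            if (!quiet) {
                list.add(new HistoryEntry(messageType, now));
            }
            // Hand back a read-only snapshot so callers never touch the stored list.
            return Collections.unmodifiableList(new ArrayList<HistoryEntry>(list));
        }
    }
}
```

The returned list is a copy, so callers can iterate over it without holding any lock, while the list stored in the map is only ever changed inside the synchronized block.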
tgautier

seraphim

Joined: 06/05/2006 12:19:26
Messages: 1781
Offline

You should give 2.7 a try - it has improved support for these kinds of things - the DGC is improved, so it may behave better with respect to your memory issues.

As for the restartability, 2.7 also fixes those issues.

If 2.7 does not fix your memory issues, you may need some tuning of heap sizes and the JVM collectors. We do this on a regular basis with our customers. The server is still a Java process at the end of the day, and intensive applications still require Java memory tuning.

For example, the default memory setting in the start server script is 256MB for the heap size. You might try increasing that to 1GB or 2GB.
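
After changing the heap options, it can help to confirm what a JVM started with those options actually received. The following is a minimal sketch using only the standard Runtime API; HeapReport is a hypothetical helper, not part of Terracotta, and how the options reach the server JVM depends on the start script, which this thread does not show.

```java
// Hypothetical helper: prints the heap limits of the JVM it runs in.
class HeapReport {

    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx limit the JVM was started with.
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it with the same options as the server, for example `java -Xmx1g HeapReport`, should report a maximum close to the requested size; the exact figure can differ slightly depending on the JVM and collector.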
blackza

neo

Joined: 08/25/2008 00:58:07
Messages: 5
Offline

Thanks,

I tried increasing the memory and it works now.
 