Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)
Messages posted by: halbert
I finally figured it out.

Do not use this JVM option to start Terracotta: -XX:+DisableExplicitGC

Using that JVM option prevented Terracotta from garbage collecting the direct memory, so it just grew without bounds!!!
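
A likely explanation, based on how the JDK (not Terracotta) handles direct buffers: the native memory behind a direct ByteBuffer is released only after the buffer object itself has been garbage collected, and when the direct memory limit is reached the JDK 6/7 allocation path (java.nio.Bits.reserveMemory, visible in the stack traces) calls System.gc() and retries. With -XX:+DisableExplicitGC that call does nothing. The sketch below is only a way to watch the effect from inside a JVM, assuming Java 7 or later; the class name DirectMemoryProbe is made up for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Terracotta.
final class DirectMemoryProbe {

    /** Bytes the JVM reports as used by its "direct" buffer pool, or -1 if no such pool is reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("before allocation: " + directBytesUsed());

        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.println("after allocating " + buffer.capacity() + " bytes: " + directBytesUsed());

        // Drop the only reference and request a collection.
        // With -XX:+DisableExplicitGC this request is ignored, so the
        // native memory stays reserved until some other GC cycle runs.
        buffer = null;
        System.gc();
        System.out.println("after System.gc(): " + directBytesUsed());
    }
}
```

Running it once with and once without -XX:+DisableExplicitGC should show whether the last figure drops back; the exact numbers depend on the JVM and on GC timing.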

Arrgggghhhh.

So much pain
Ok, the issue has not gone away.

One thing I do have as a JVM option is -XX:+DisableExplicitGC. I read somewhere that having that option may lead to NIO memory leaks so I removed the option and tried again.

My observations were that it was still leaking into direct memory AND ignoring the -XX:MaxDirectMemorySize option !!!

With the DisableExplicitGC option, the MaxDirectMemorySize option was honored and the process would die once the direct memory maximum size was reached/exceeded.

Without the DisableExplicitGC option, direct memory continues to grow beyond the bounds specified by MaxDirectMemorySize, and my belief is that it will continue to grow until all system memory has been exhausted, leading to a machine crash.

I would have expected MaxDirectMemorySize to be honored with or without DisableExplicitGC specified.
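
When comparing runs like these, it helps to confirm which options the running JVM actually received, since a wrapper script can add or drop flags. A minimal sketch, assuming it is run inside the JVM in question; JvmFlagCheck and valueOf are hypothetical names, and only the standard RuntimeMXBean API is used.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Terracotta.
final class JvmFlagCheck {

    /** Returns the text after the given prefix for the last matching argument, or null if none matches. */
    static String valueOf(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                // Keep the last occurrence, on the assumption that a later flag overrides an earlier one.
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("MaxDirectMemorySize: " + valueOf(jvmArgs, "-XX:MaxDirectMemorySize="));
        System.out.println("DisableExplicitGC set: " + jvmArgs.contains("-XX:+DisableExplicitGC"));
    }
}
```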

The behavior is quite odd and unexpected.

Does anyone have any explanation for this behavior?

Thank you.
Yes, that size warning you see is from some fairly large pieces of data that are being used.

We have these fairly large data files that are parsed then cached.

Not quite sure why there are so many of these warnings.

But these shouldn't cause a Direct buffer leak.
Any progress on resolving this issue?
The machines have 40GB of memory.

The Direct Buffer OOM happened with this JVM option set: -XX:MaxDirectMemorySize=5g

Without that option set, Terracotta will eat up every last byte of available system memory and crash the machine itself.

I've included 2 log files; both are excerpts from the same log.

head.log is the top (2000 lines) of the log file
tail.log is the bottom (10,000 lines) of the log file

Hope you folks manage to get to the bottom of this.

Thanks.

The frequency of the Direct Memory (BigMemory) OOM crashes has decreased since upgrading to 3.7.7 but has not disappeared.

The BigMemory OOM crashes are at a level that's currently tolerable since I have a cron job that runs every minute and checks to see if Terracotta is running. If it is not, it restarts it automatically.
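
The check-and-restart idea can be sketched roughly as follows. This is not the actual cron job, just an illustration in Java: it treats "running" as "accepting TCP connections on the server port". The class name PortWatchdog is made up, and the host, port, and restart command are whatever the caller passes in.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;

// Hypothetical watchdog, intended to be launched once a minute from cron.
final class PortWatchdog {

    /** True if something accepts a TCP connection on host:port within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    // Usage: java PortWatchdog <host> <port> <restart command...>
    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: PortWatchdog <host> <port> <restart command...>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        if (isListening(host, port, 2000)) {
            return; // server is up, nothing to do
        }
        // Server is not answering: launch the restart command given on the command line.
        String[] command = Arrays.copyOfRange(args, 2, args.length);
        new ProcessBuilder(command).inheritIO().start();
    }
}
```

A port probe only notices a process that has died or stopped listening; it will not catch a server that is still accepting connections but is otherwise stuck.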
I've updated to Terracotta 3.7.7 Open Source.

The direct memory leak has not gone away. It still runs out of direct memory.

Here's the stack trace:

Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:990)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:941)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:420)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:481)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:367)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:324)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:317)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
 java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:990)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:941)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:388)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:481)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:367)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:324)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:317)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)


Can somebody at Terracotta take a look at that, please? Why is something that's not supposed to be using direct memory at all using it, and using gigs of it!!!

It's not making sense to me.
I'm in the process of updating to Terracotta Open Source 3.7.7.

I did notice that the client process would infrequently shut down on its own with this in the log:

Code:
2014-02-12 19:56:26,931 [L1_L2:TCComm Main Selector Thread_R (listen 0:0:0:0:0:0:0:0:57642)] ERROR STDERR -     at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 2014-02-12 19:56:26,931 [L1_L2:TCComm Main Selector Thread_R (listen 0:0:0:0:0:0:0:0:57642)] ERROR STDERR - Caused by: com.tc.util.TCAssertionError: Unexpected message while in handshaking mode: Message Class: com.tc.net.protocol.delivery.OOOProtocolMessageImpl
 Sealed: true, Header Length: 48, Data Length: 369, Total Length: 417
 Header (com.tc.net.protocol.delivery.OOOProtocolMessageHeader)
         Type=TYPE_SEND sessId=4aea2806906649db89dd2f1b2a6d46b0 seq=341847
 Payload:
         Buffer 0: TCByteBufferJDK14@1170860088(java.nio.HeapByteBuffer[pos=0 lim=369 cap=369])
         369 bytes:
         00000000: 0102 0045 0000 0000 0000 0001 0000 0000  ...E............
         00000010: 0b00 0000 0000 002e 2f00 0000 0100 0000  ......../.......
         00000020: 0100 0000 0006 cd0e 8000 8000 0000 0000  ................
         00000030: 0000 0000 0000 002e 5f00 0000 0100 0000  ........_.......
         00000040: 0100 0000 0006 cd0e 8010 0000 0000 0000  ................
         00000050: 0000 0000 0000 002e 8200 0000 0100 0000  ................
         00000060: 0100 0000 0006 cd0e 8004 0000 0000 0000  ................
         00000070: 0000 0000 0000 002e b200 0000 0100 0000  ................
         00000080: 0100 0000 0006 cd0e 8000 2000 0000 0000  .......... .....
         00000090: 0000 0000 0000 002d d300 0000 0100 0000  .......-........
         000000a0: 0100 0000 0006 cd0e 8008 0000 0000 0000  ................
         000000b0: 0000 0000 0000 002e 9100 0000 0100 0000  ................
         000000c0: 0100 0000 0006 cd0e 8000 0200 0000 0000  ................
         000000d0: 0000 0000 0000 002e 7100 0000 0100 0000  ........q.......
         000000e0: 0100 0000 0006 cd0e 8000 0400 0000 0000  ................
         000000f0: 0000 0000 0000 002e ae00 0000 0100 0000  ................
         00000100: 0100 0000 0006 cd0e 8001 0000 0000 0000  ................
         00000110: 0000 0000 0000 002e b800 0000 0100 0000  ................
         00000120: 0100 0000 0006 cd0e 8000 0800 0000 0000  ................
         00000130: 0000 0000 0000 002e 3000 0000 0100 0000  ........0.......
         00000140: 0100 0000 0006 cd0e 8000 4000 0000 0000  ..........@.....
         00000150: 0000 0000 0000 002e 3200 0000 0100 0000  ........2.......
         00000160: 0100 0000 0006 cd0e 8000 1000 0000 0000  ................
         00000170: 00                              .
 


Looks like it's NIO related.
Here's the complete stack trace of the exception:

Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:414)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
 java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:382)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
 java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:395)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
 java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
         at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
         at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
         at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:414)
         at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
         at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
         at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
         at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
         at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)


That's with the JVM option -XX:MaxDirectMemorySize=5g, so it's using a lot of direct memory for something that's not supposed to be using it at all !!!!!!

It's clear from the stack trace that it is writing to direct memory. Seems related to NIO operations.

The tc-config.xml is attached.

klalithr wrote:
That's correct - OSS doesn't use BigMemory (direct memory). There is something else that you are doing that's incorrect. Probably a look at your tc-config should help.
The other recommendation I have is to move to the latest version on the 3.7 line. 


If Terracotta Open Source (OSS) isn't intentionally using BigMemory (direct memory), then how does one explain the exception:

java.lang.OutOfMemoryError: Direct buffer memory

That came from the Terracotta log.
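
One way to read the trace: the failing call is java.nio.ByteBuffer.allocateDirect, reached from com.tc.bytes.TCByteBufferImpl on the network write path, and any buffer allocated that way counts against the direct memory limit regardless of whether BigMemory is in use. The sketch below only illustrates the heap/direct distinction in plain JDK terms; BufferKinds and kind are made-up names.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not Terracotta code.
final class BufferKinds {

    /** Returns "direct" for a buffer backed by native memory, "heap" otherwise. */
    static String kind(ByteBuffer buffer) {
        return buffer.isDirect() ? "direct" : "heap";
    }

    public static void main(String[] args) {
        // Backed by a byte[] on the Java heap, so it is bounded by -Xmx.
        ByteBuffer heap = ByteBuffer.allocate(4096);
        // Backed by native memory outside the heap, bounded by -XX:MaxDirectMemorySize.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096);

        System.out.println(kind(heap));
        System.out.println(kind(direct));
    }
}
```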
Thanks for the reply.

It's obviously an unintentional direct memory leak bug. The fact that it grew well past the heap maximum (-Xmx) is proof to me that it's misbehaving. It actually grew 20G more than the configured heap max!!!

And the fact that setting -XX:MaxDirectMemorySize caused Terracotta to quit after some time means it's leaking.

I tried using JDK 6 and 7 but neither helped. I initially thought it was a Java bug.

I'll see if 3.7.7 is better, but I did not see anything relating to a leak fix in the list of changes.

Surely somebody must know something about this.

It's my understanding that the Terracotta Open Source shouldn't even be using direct memory - is that correct?
What exactly is going on with Terracotta Open Source?

It seems to me that there is a deliberate effort to hide links to versions of Terracotta Open Source in an attempt to convince people to pay for the commercial version.

I find this act despicable.
I have a Terracotta 3.7.0 (OpenSource) array comprised of 2 servers running on Linux boxes.

Recently the host machines would freeze up and sometimes reboot on their own. It was discovered that they ran out of memory.

Further investigation showed that Terracotta was the source of the memory leak.

The leak was not in the regular java heap - it was a leak into direct memory.

Basically, the Terracotta process would continuously grow until it crashed the machine.

So I placed a limit on the direct memory buffer size by using the JVM option: -XX:MaxDirectMemorySize

Well that prevented the process from growing infinitely, but Terracotta would die after it attempted to grow beyond that size.

The exception I get is:

Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
         at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
 Caused by: java.lang.OutOfMemoryError: Direct buffer memory
         at java.nio.Bits.reserveMemory(Bits.java:632)
         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
         at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
         at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
         at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
         at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:


I have a few questions concerning this direct memory.

Why is Terracotta (Open Source) putting stuff into this direct memory?

How can I prevent it from putting stuff into direct memory or at least limit the amount of data it puts into it?

This is somewhat urgent so I do need a reply.

Thanks.

I created a bug report:

https://jira.terracotta.org/jira/browse/EHCTERR-45
 