Author |
Message |
02/05/2014 11:31:26
|
halbert
journeyman
Joined: 12/22/2010 14:18:29
Messages: 29
Offline
|
I have a Terracotta 3.7.0 (OpenSource) array comprised of 2 servers running on Linux boxes.
Recently the host machines would freeze up and sometimes reboot on their own. It was discovered that they ran out of memory.
Further investigation showed that Terracotta was the source of the memory leak.
The leak was not in the regular java heap - it was a leak into direct memory.
Basically, the Terracotta process would grow continuously until it crashed the machine.
So I placed a limit on the direct memory buffer size by using the JVM option: -XX:MaxDirectMemorySize
Well that prevented the process from growing infinitely, but Terracotta would die after it attempted to grow beyond that size.
The exception I get is:
Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:
I have a few questions concerning this direct memory.
Why is Terracotta (Open Source) putting stuff into this direct memory?
How can I prevent it from putting stuff into direct memory or at least limit the amount of data it puts into it?
This is somewhat urgent so I do need a reply.
Thanks.
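One way to check that the growth really is in direct buffers rather than the Java heap is to read the JVM's buffer-pool MXBeans (available from Java 7). The sketch below is only an illustration and is not taken from Terracotta: the class name DirectPoolSnapshot and the helper directBytesUsed are made up for this example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

class DirectPoolSnapshot {
    /** Returns the bytes currently used by the JVM's "direct" buffer pool, or -1 if it is not reported. */
    static long directBytesUsed() {
        List<BufferPoolMXBean> pools = ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // 1 MiB allocated outside the Java heap, so -Xmx does not limit it.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);
        long after = directBytesUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes (buffer capacity " + buf.capacity() + ")");
    }
}
```

The same figures are exposed over JMX under java.nio:type=BufferPool,name=direct, so they can also be watched on a running server with a JMX console instead of custom code.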
|
|
|
02/07/2014 17:00:24
|
halbert
|
Surely somebody must know something about this.
It's my understanding that the Terracotta Open Source shouldn't even be using direct memory - is that correct?
|
|
|
02/07/2014 17:13:08
|
klalithr
consul
Joined: 01/23/2011 10:58:07
Messages: 489
Offline
|
That's correct - OSS doesn't use BigMemory (direct memory). There is something else that you are doing that's incorrect. A look at your tc-config should probably help.
The other recommendation I have is to move to the latest version on the 3.7 line.
|
Karthik Lalithraj (Terracotta) |
|
|
02/07/2014 17:29:06
|
halbert
|
Thanks for the reply.
It's obviously an unintentional direct memory leak bug. The fact that it grew well past the heap maximum (-Xmx) is proof to me that it's misbehaving. It actually grew 20G more than the configured heap max !!!
And the fact that setting -XX:MaxDirectMemorySize caused Terracotta to quit after some time means it's leaking.
I tried using JDK 6 and 7 but neither helped. I initially thought it was a Java bug.
I'll see if 3.7.7 is better, but I did not see anything relating to a leak fix in the list of changes.
|
|
|
02/07/2014 18:07:17
|
halbert
|
klalithr wrote:
That's correct - OSS doesn't use BigMemory (direct memory). There is something else that you are doing that's incorrect. A look at your tc-config should probably help.
The other recommendation I have is to move to the latest version on the 3.7 line.
If Terracotta Open Source (OSS) isn't intentionally using BigMemory (direct memory), then how does one explain the exception:
java.lang.OutOfMemoryError: Direct buffer memory
That came from the Terracotta log.
|
|
|
02/07/2014 19:48:41
|
klalithr
|
Fair point. I need to look at the complete logs and the tc-config.xml to be sure. Nevertheless, I would strongly recommend the latest version in the 3.7 line, just to be sure.
|
Karthik Lalithraj (Terracotta) |
|
|
02/07/2014 20:21:06
|
halbert
|
Here's the complete stack trace of the exception:
Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:414)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:382)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:395)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:963)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:914)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:414)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:475)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:361)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:317)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:312)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
That's with the JVM option -XX:MaxDirectMemorySize=5g, so it's using a lot of direct memory for something that's not supposed to be using it at all !!!!!!
It's clear from the stack trace that it is writing to direct memory. Seems related to NIO operations.
The tc-config.xml is attached.
Attachment: tc-config.xml (3 Kbytes)
|
|
|
02/07/2014 20:49:01
|
klalithr
|
The JVM option (-XX:MaxDirectMemorySize=5g) is not an issue.
All it does is tell the JVM that up to 5 GB of space could potentially be allocated outside of the heap. This doesn't allocate any. You can safely ignore that parameter.
Your tc-config looks good. You have a number of extra config parameters which I would remove but that is unrelated to the current issue. I would suggest to use the latest in the 3.7 line and report back.
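To illustrate the point about the flag: -XX:MaxDirectMemorySize is an upper bound that is checked when a direct buffer is requested, not a reservation made at startup. The sketch below is a generic demonstration, not Terracotta code; the class DirectCapDemo and the method allocateUntilRefused are hypothetical names. It keeps allocating 1 MiB direct buffers until the JVM refuses or a chosen maximum is reached.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class DirectCapDemo {
    /**
     * Allocates direct buffers of chunkBytes each and keeps them reachable until either
     * maxChunks buffers exist or the JVM refuses with an OutOfMemoryError.
     * Returns the number of buffers that were successfully allocated.
     */
    static int allocateUntilRefused(int chunkBytes, int maxChunks) {
        List<ByteBuffer> held = new ArrayList<ByteBuffer>();
        try {
            while (held.size() < maxChunks) {
                held.add(ByteBuffer.allocateDirect(chunkBytes));
            }
        } catch (OutOfMemoryError e) {
            // The exact message text differs between JDK versions, so it is printed as-is.
            System.out.println("refused after " + held.size() + " chunks: " + e.getMessage());
        }
        return held.size();
    }

    public static void main(String[] args) {
        // Suggested experiment: java -XX:MaxDirectMemorySize=16m DirectCapDemo
        int got = allocateUntilRefused(1024 * 1024, 64);
        System.out.println("allocated " + got + " MiB of direct buffers");
    }
}
```

Run with a small cap, the loop should stop early with an OutOfMemoryError, which is the same kind of failure seen in the stack traces above once the 5g limit was reached. Without the flag, all 64 allocations would normally succeed on a default-sized JVM.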
|
Karthik Lalithraj (Terracotta) |
|
|
02/12/2014 18:07:24
|
halbert
|
I'm in the process of updating to Terracotta Open Source 3.7.7.
I did notice that the client process would occasionally shut down on its own with this in the log:
Code:
2014-02-12 19:56:26,931 [L1_L2:TCComm Main Selector Thread_R (listen 0:0:0:0:0:0:0:0:57642)] ERROR STDERR - at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
2014-02-12 19:56:26,931 [L1_L2:TCComm Main Selector Thread_R (listen 0:0:0:0:0:0:0:0:57642)] ERROR STDERR - Caused by: com.tc.util.TCAssertionError: Unexpected message while in handshaking mode: Message Class: com.tc.net.protocol.delivery.OOOProtocolMessageImpl
Sealed: true, Header Length: 48, Data Length: 369, Total Length: 417
Header (com.tc.net.protocol.delivery.OOOProtocolMessageHeader)
Type=TYPE_SEND sessId=4aea2806906649db89dd2f1b2a6d46b0 seq=341847
Payload:
Buffer 0: TCByteBufferJDK14@1170860088(java.nio.HeapByteBuffer[pos=0 lim=369 cap=369])
369 bytes:
00000000: 0102 0045 0000 0000 0000 0001 0000 0000 ...E............
00000010: 0b00 0000 0000 002e 2f00 0000 0100 0000 ......../.......
00000020: 0100 0000 0006 cd0e 8000 8000 0000 0000 ................
00000030: 0000 0000 0000 002e 5f00 0000 0100 0000 ........_.......
00000040: 0100 0000 0006 cd0e 8010 0000 0000 0000 ................
00000050: 0000 0000 0000 002e 8200 0000 0100 0000 ................
00000060: 0100 0000 0006 cd0e 8004 0000 0000 0000 ................
00000070: 0000 0000 0000 002e b200 0000 0100 0000 ................
00000080: 0100 0000 0006 cd0e 8000 2000 0000 0000 .......... .....
00000090: 0000 0000 0000 002d d300 0000 0100 0000 .......-........
000000a0: 0100 0000 0006 cd0e 8008 0000 0000 0000 ................
000000b0: 0000 0000 0000 002e 9100 0000 0100 0000 ................
000000c0: 0100 0000 0006 cd0e 8000 0200 0000 0000 ................
000000d0: 0000 0000 0000 002e 7100 0000 0100 0000 ........q.......
000000e0: 0100 0000 0006 cd0e 8000 0400 0000 0000 ................
000000f0: 0000 0000 0000 002e ae00 0000 0100 0000 ................
00000100: 0100 0000 0006 cd0e 8001 0000 0000 0000 ................
00000110: 0000 0000 0000 002e b800 0000 0100 0000 ................
00000120: 0100 0000 0006 cd0e 8000 0800 0000 0000 ................
00000130: 0000 0000 0000 002e 3000 0000 0100 0000 ........0.......
00000140: 0100 0000 0006 cd0e 8000 4000 0000 0000 ..........@.....
00000150: 0000 0000 0000 002e 3200 0000 0100 0000 ........2.......
00000160: 0100 0000 0006 cd0e 8000 1000 0000 0000 ................
00000170: 00 .
Looks like it's NIO related.
|
|
|
02/18/2014 10:28:23
|
halbert
|
I've updated to Terracotta 3.7.7 Open Source.
The direct memory leak has not gone away. It still runs out of direct memory.
Here's the stack trace:
Code:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:990)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:941)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:420)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:481)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:367)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:324)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:317)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:293)
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at com.tc.bytes.TCByteBufferImpl.<init>(TCByteBufferImpl.java:33)
at com.tc.bytes.TCByteBufferFactory.createNewInstance(TCByteBufferFactory.java:78)
at com.tc.bytes.TCByteBufferFactory.getFromPoolOrCreate(TCByteBufferFactory.java:125)
at com.tc.bytes.TCByteBufferFactory.getFixedSizedInstancesForLength(TCByteBufferFactory.java:162)
at com.tc.net.core.TCConnectionImpl$WriteContext.getPackedUpMessage(TCConnectionImpl.java:990)
at com.tc.net.core.TCConnectionImpl$WriteContext.<init>(TCConnectionImpl.java:941)
at com.tc.net.core.TCConnectionImpl.buildWriteContextsFromMessages(TCConnectionImpl.java:388)
at com.tc.net.core.TCConnectionImpl.doWriteToBufferInternal(TCConnectionImpl.java:481)
at com.tc.net.core.TCConnectionImpl.doWriteToBuffer(TCConnectionImpl.java:367)
at com.tc.net.core.TCConnectionImpl.doWriteInternal(TCConnectionImpl.java:324)
at com.tc.net.core.TCConnectionImpl.doWrite(TCConnectionImpl.java:317)
at com.tc.net.core.CoreNIOServices$CommThread.selectLoop(CoreNIOServices.java:630)
at com.tc.net.core.CoreNIOServices$CommThread.run(CoreNIOServices.java:290)
Can somebody at Terracotta take a look at that, please? Why is something that's not supposed to be using direct memory at all using it, and using gigs of it !!!
It's not making sense to me.
|
|
|
02/19/2014 12:10:15
|
halbert
|
The frequency of the Direct Memory (BigMemory) OOM crashes has decreased since upgrading to 3.7.7 but has not disappeared.
The BigMemory OOM crashes are at a level that's currently tolerable, since I have a cron job that runs every minute and checks whether Terracotta is running. If it is not, it restarts it automatically.
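The cron job itself is not shown in this thread. As a rough sketch of the "is it running" half of such a watchdog, the check could be a plain TCP connect to the server port. Everything here is an assumption for illustration: the class PortProbe, the method isListening, and the default port 9510 (use whatever port your tc-config.xml actually configures). The restart step is left out because it depends on the local install.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unreachable
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9510; // assumed port
        boolean up = isListening(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
        // Exit code 0 when up, 1 when down, so a cron wrapper can decide whether to restart.
        System.exit(up ? 0 : 1);
    }
}
```

A port check only shows that something is accepting connections; it says nothing about how close the process is to its direct memory limit.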
|
|
|
02/20/2014 19:45:56
|
hhuynh
cherubim
Joined: 06/16/2006 11:54:06
Messages: 761
Offline
|
How much memory does the box that runs the Terracotta server have? Could you also post the Terracotta server log? You might not be using BigMemory, but the Java NIO library (which the TC server uses) does use direct memory, so it's not unusual to see a stack trace with a direct buffer memory error.
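For anyone unfamiliar with the distinction, the JDK offers two kinds of ByteBuffer: heap buffers, like the java.nio.HeapByteBuffer that appears in the handshake log earlier in this thread, and direct buffers, which come from ByteBuffer.allocateDirect as seen in the stack traces. A minimal sketch of the difference (HeapVsDirect and describe are names invented for this example):

```java
import java.nio.ByteBuffer;

class HeapVsDirect {
    /** Describes whether a buffer lives on the Java heap or in native (direct) memory. */
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "heap") + " buffer, capacity " + buf.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(369);         // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(369); // backed by native memory outside the heap
        System.out.println(describe(heap));
        System.out.println(describe(direct));
        System.out.println("heap buffer has a backing array: " + heap.hasArray());
    }
}
```

Only the first kind counts against -Xmx. The second kind counts against -XX:MaxDirectMemorySize, which is why a process can grow well past its configured heap maximum.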
|
|
|
02/21/2014 12:33:37
|
halbert
|
The machines have 40GB of memory.
The Direct Buffer OOM happened with this JVM option set: -XX:MaxDirectMemorySize=5g
Without that option set, Terracotta will eat up every last byte of available system memory and crash the machine itself.
I've included 2 log files; they are from the same log.
head.log is the top (2000 lines) of the log file
tail.log is the bottom (10,000 lines) of the log file
Hope you folks manage to get to the bottom of this.
Thanks.
Attachments: tail.log (953 Kbytes), head.log (423 Kbytes)
|
|
|
02/25/2014 14:01:56
|
halbert
|
Any progress on resolving this issue?
|
|
|
02/25/2014 17:40:20
|
hhuynh
|
Could you describe your use case in a little more detail?
This log suggests very large buffers being requested over and over (around 16MB each time):
2014-02-21 13:27:28,512 [L2_L1:TCWorkerComm # 10_W] WARN com.tc.bytes.TCByteBufferFactory - Asking for a large amount of memory: 15806723 bytes
Could you point to what that might be?
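To get a feel for how often these warnings occur and how large the requests are, the server log can be scanned for that message. The sketch below assumes the warning always has the form shown above ("Asking for a large amount of memory: N bytes"); LargeAllocScan and requestedBytes are hypothetical names, and nothing here comes from Terracotta itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LargeAllocScan {
    private static final Pattern LARGE_ALLOC =
            Pattern.compile("Asking for a large amount of memory: (\\d+) bytes");

    /** Returns the requested byte count from a matching log line, or -1 if the line does not match. */
    static long requestedBytes(String logLine) {
        Matcher m = LARGE_ALLOC.matcher(logLine);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        long count = 0, total = 0, max = 0;
        String line;
        while ((line = in.readLine()) != null) {
            long bytes = requestedBytes(line);
            if (bytes >= 0) {
                count++;
                total += bytes;
                max = Math.max(max, bytes);
            }
        }
        if (count == 0) {
            System.out.println("no large-allocation warnings found");
        } else {
            System.out.println(count + " warnings, total " + total + " bytes, largest " + max + " bytes");
        }
    }
}
```

It reads the log from standard input, for example: java LargeAllocScan < tail.log. For scale, the 15806723 bytes in the line quoted above is about 15.8 MB (roughly 15.1 MiB).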
|
|
|
|