[Logo] Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)
Messages posted by: ssubbiah
Profile for ssubbiah -> Messages posted by ssubbiah [115] Go to Page: 1, 2, 3, 4, 5, 6, 7, 8 Next 
What version of Terracotta are you using? Please open a ticket at https://jira.terracotta.org. We would most probably need all the logs and a cluster dump to help you debug this problem.
Which version of the software are you using? Can you attach the logs from the failure?
Please file your request at http://jira.terracotta.org and attach the thread dumps and cluster dumps there for analysis.
Please open a jira and add the logs, the config, and, if possible, a reproducible test case so that we can track this issue.
Can you please open a jira and attach the logs there? It's probably a better place to track this.
BTW, we are in the process of revamping some of the Toolkit interfaces (Toolkit 2.0) for the next major release. It will address this exact issue by providing an AtomicToolkit interface that you can use to create atomic transactions explicitly.
If I have to guess, it could be a Berkeley DB bug or a corrupt disk. With the upgrade of Terracotta, did you throw away the data? If you started fresh and still hit this exception, it could be caused by a corrupt disk.

The last time we closely looked at a problem similar to this, we found bad sectors in the hard disk caused the corruption.
It will be in 3.6, which is scheduled to release at the end of September; beta 2 will be out sometime next week. Give it a try and let us know how it works for you.
Another thing to note is that we are greatly improving DGC times for distributed Ehcache in the next release with the new inline DGC feature. With this feature, you don't have to run DGC often; garbage is collected as and when an object becomes garbage, so the system doesn't have to pause at all during normal operations.
The workaround is to run all your distributed Ehcache caches in "DCV2" mode rather than "classic" mode. The workaround is only available for the distributed Ehcache use case.
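For reference, a minimal sketch of what selecting the mode might look like in ehcache.xml. The storageStrategy attribute on the terracotta element is per the Ehcache 2.x configuration docs; check the reference for the exact version you are running, and the cache name here is just a placeholder.

```xml
<!-- ehcache.xml fragment (illustrative; verify attribute names against
     the Ehcache version you are running) -->
<cache name="myDistributedCache"
       maxElementsOnDisk="100000"
       timeToIdleSeconds="300">
  <!-- storageStrategy="DCV2" selects the new store; "classic" is the old one -->
  <terracotta clustered="true" storageStrategy="DCV2"/>
</cache>
```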
This problem should be fixed on trunk and the 3.5 line. The next release, 3.5.2, should contain the fix.
Let me try and explain how things work in DCV2. Like Gary said, it is highly tuned for significantly large caches.

By default DCV2 has 2048 segments. Each segment acts as an individual unit, and the maxElementsOnDisk setting from the cache is divided across the segments. So setting a number as low as 20 for maxElementsOnDisk will make each segment's maxElementsOnDisk 1, since it can't go any lower than that. So even if you add 40 elements, they are probably all getting hashed to different segments and not triggering eviction. But you can reduce the number of segments by lowering the concurrency in your config to, say, 1 or 2 to work around this.
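To make the arithmetic concrete, here is a tiny standalone sketch of the per-segment capacity calculation described above (the class and variable names are mine, not Terracotta's, and the rounding rule is a simplification):

```java
public class Dcv2SegmentMath {
    public static void main(String[] args) {
        int segments = 2048;            // DCV2 default segment count
        int maxElementsOnDisk = 20;     // cache-wide limit from the config

        // Each segment gets its share of the limit, but never less than 1.
        int perSegment = Math.max(1, maxElementsOnDisk / segments);

        // With 2048 segments each allowed 1 element, the effective
        // cache-wide capacity is 2048 -- far above the configured 20,
        // which is why adding 40 elements never triggers eviction.
        int effectiveCapacity = perSegment * segments;
        System.out.println(perSegment + " " + effectiveCapacity);

        // Lowering concurrency to 2 brings the effective capacity back down.
        int lowSegments = 2;
        int lowPerSegment = Math.max(1, maxElementsOnDisk / lowSegments);
        System.out.println(lowPerSegment + " " + lowPerSegment * lowSegments);
    }
}
```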

Also, in DCV2 we do lazy expiry of elements based on tti/ttl. What this means is that we don't unnecessarily scan elements looking for expired ones until either an element is accessed or the number of elements overshoots way beyond maxElementsOnDisk. This is also done for performance reasons.
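The check-on-access idea can be sketched in a few lines. This is purely illustrative, not Terracotta's actual implementation: there is no background thread scanning for stale entries; tti is only evaluated when an entry is read.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of lazy (check-on-access) tti expiry.
public class LazyTtiCache<K, V> {
    private static class Entry<V> {
        final V value;
        long lastAccessNanos;
        Entry(V value, long now) { this.value = value; this.lastAccessNanos = now; }
    }

    private final Map<K, Entry<V>> store = new HashMap<>();
    private final long ttiNanos;

    public LazyTtiCache(long ttiMillis) {
        this.ttiNanos = ttiMillis * 1_000_000L;
    }

    public void put(K key, V value) {
        store.put(key, new Entry<>(value, System.nanoTime()));
    }

    public V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        long now = System.nanoTime();
        if (now - e.lastAccessNanos > ttiNanos) {
            store.remove(key);   // expired: evicted lazily, on this access
            return null;
        }
        e.lastAccessNanos = now; // touch: accessing resets the idle timer
        return e.value;
    }
}
```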

Even though DCV2 is tuned out of the box to perform well for large caches, it's easy to make it behave the way you want with smaller caches. So I would not suggest moving to "classic" mode unless you have a strong reason to do so. Going forward, all new features and improvements will only be supported for "DCV2".
You can find some examples in this project.

https://svn.terracotta.org/repo/forge/projects/terracotta-ehcache/trunk

Tests can be found under system-tests/src/test/java/org/terracotta/ehcache/tests. One such test is CacheConsistencyTest.

Hope that helps.
How are you spawning the servers (L2) and clients (L1) in your unit tests? If you are spawning a new L1 for every test while reusing the same JVM, I can see how you might run out of PermGen space. This is probably just a test-setup issue.

Can you try running each test separately, spawning a new VM every time? The other thing to try is to not spawn a new L1 but to reuse the same L1 and run each test within it. I don't know if that's possible with your test setup.

You could look at how the tests are set up in our source repository. We spawn L2s externally and run most tests individually in their own VMs.
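If you happen to be running the tests with Maven, one way to get a fresh VM per test class is the Surefire plugin's fork settings. The fragment below is illustrative only; forkMode is the older setting name, and Surefire 2.14+ replaced it with forkCount/reuseForks, so check the plugin docs for your version.

```xml
<!-- pom.xml fragment (illustrative): fork a fresh JVM for every test class
     so each run gets its own PermGen -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- older Surefire versions -->
    <forkMode>always</forkMode>
    <!-- Surefire 2.14+ equivalent:
    <forkCount>1</forkCount>
    <reuseForks>false</reuseForks>
    -->
  </configuration>
</plugin>
```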

I have created a Jira to track this.

https://jira.terracotta.org/jira/browse/CDV-1563

If you have a reproducible test case, please add it to the jira.
 
Powered by JForum 2.1.7 © JForum Team