Using Terracotta to increase heap space beyond limits of 32-bit jvm
potatoe

journeyman

Joined: 03/27/2008 08:01:17
Messages: 10
Offline

I am considering using Terracotta to increase the heap size of an app beyond the 4GB limitations of a 32-bit JVM. Switching to a 64-bit JVM for the application has been ruled out as an option. I have a few questions about how the server stores clustered objects.

My plan is to take select parts of the application that use a lot of memory and strategically pick objects there as shared roots.

My understanding is that the terracotta server will store all the clustered objects the applications share in memory. Terracotta will replace less frequently used objects on a client when it is full and request the objects it needs from the server (much like memory and virtual memory). Is this correct?

If the terracotta server is running on a 32-bit jvm, is it limited to 4GB of memory?

Does it tap into virtual memory if we use it to store more than 4GB?

I suppose I'm failing to see how a 32-bit application can have up to 1TB of heap as mentioned on the site.
amiller

ophanim

Joined: 08/29/2007 09:05:48
Messages: 722
Location: St. Louis, MO
Offline

The server also pages objects out of memory onto disk in persistent mode, which allows it to store more objects in the clustered heap than can fit in the heap of any particular JVM in the cluster (including the server).

The data itself is stored on disk using Berkeley DB in a sort of condensed serialized form (not Java serialization, but something with less overhead).

Memory management on the client is based on a combination of LFU (least frequently used) and LRU (least recently used) eviction.
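
As a rough illustration of what combining LFU and LRU can look like, here is a minimal Java sketch. It is not Terracotta's actual client memory manager; the class name `LfuLruCache` and the eviction rule (lowest hit count first, oldest access as tie-breaker) are assumptions made purely for the example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative only: a tiny bounded cache that evicts the least frequently
 * used entry and breaks ties by least recent use. Hypothetical class, not
 * part of Terracotta.
 */
class LfuLruCache<K, V> {
    private static final class Entry<V> {
        V value;
        long hits;      // access count (the LFU component)
        long lastUsed;  // logical time of last access (the LRU component)

        Entry(V value, long now) {
            this.value = value;
            this.hits = 1;
            this.lastUsed = now;
        }
    }

    private final int capacity;
    private final Map<K, Entry<V>> entries = new HashMap<>();
    private long clock = 0; // logical clock, incremented on every access

    LfuLruCache(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
    }

    V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        e.hits++;
        e.lastUsed = ++clock;
        return e.value;
    }

    void put(K key, V value) {
        Entry<V> e = entries.get(key);
        if (e != null) {            // update in place, counts as an access
            e.value = value;
            e.hits++;
            e.lastUsed = ++clock;
            return;
        }
        if (entries.size() >= capacity) evictOne();
        entries.put(key, new Entry<>(value, ++clock));
    }

    boolean contains(K key) {
        return entries.containsKey(key);
    }

    // Linear scan for the victim: fewest hits, then oldest access.
    private void evictOne() {
        K victim = null;
        Entry<V> worst = null;
        for (Map.Entry<K, Entry<V>> me : entries.entrySet()) {
            Entry<V> c = me.getValue();
            if (worst == null || c.hits < worst.hits
                    || (c.hits == worst.hits && c.lastUsed < worst.lastUsed)) {
                victim = me.getKey();
                worst = c;
            }
        }
        entries.remove(victim);
    }
}
```

A real memory manager would also react to heap pressure rather than a fixed entry count; the fixed capacity here just keeps the sketch short.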

Hope that helps - if not we can go from there.

Alex Miller (Terracotta Engineer)
tgautier

seraphim

Joined: 06/05/2006 12:19:26
Messages: 1781
Offline

potatoe wrote:
My understanding is that the terracotta server will store all the clustered objects the applications share in memory. Terracotta will replace less frequently used objects on a client when it is full and request the objects it needs from the server (much like memory and virtual memory). Is this correct?
 


Yes

potatoe wrote:
If the terracotta server is running on a 32-bit jvm, is it limited to 4GB of memory?
 


No - it swaps to disk.

potatoe wrote:
Does it tap into virtual memory if we use it to store more than 4GB?
 


Yes - as Alex explained, it uses a filesystem for external storage.
potatoe

journeyman

Joined: 03/27/2008 08:01:17
Messages: 10
Offline

tgautier wrote:

potatoe wrote:
My understanding is that the terracotta server will store all the clustered objects the applications share in memory. Terracotta will replace less frequently used objects on a client when it is full and request the objects it needs from the server (much like memory and virtual memory). Is this correct?
 


Yes
 


I'm not sure if I made my question clear. Does the server hold every single shared object in its memory? Or is the server managing the memory of all the connected JVMs, and moving objects between them to make a larger heap (up to 4GB*<# of JVMs>)? (on 32-bit JVMs.)

P.S. thanks a lot for all the help so far
tgautier

seraphim

Joined: 06/05/2006 12:19:26
Messages: 1781
Offline


potatoe wrote:
I'm not sure if I made my question clear. Does the server hold every single shared object in its memory? Or is the server managing the memory of all the connected JVMs, and moving objects between them to make a larger heap (up to 4GB*<# of JVMs>)? (on 32-bit JVMs.)
 


Yes

Let me explain in more detail - hope it helps.

The Terracotta Server cluster always has the most up to date version of the objects in the clustered heap.

It acts as a peer to all members in the cluster. This architecture means that no cluster member is a peer to another cluster member, but they are all peers to the server cluster.

The Terracotta server(s) can move data in and out of their local heap to disk, so the data they store is not limited to the physical heap size of any one JVM.

The cluster members, which we refer to as Terracotta clients or "L1"s, page data in and out of their physical heap (to and from the Terracotta servers, or "L2"s) as necessary.

You can think of this as a hierarchical memory subsystem, much like the CPU registers and caches work in a typical computer architecture.

A typical computer architecture will work with at least three levels of data replication - the CPU registers, the CPU cache (nowadays there are multiple levels of cache) and the main memory.

The CPU "register space" is not bound by its physical size - the registers load and save data from the CPU cache. The CPU cache likewise is not bound by its physical size - it loads and stores data from main memory.

So the total memory available to a typical computer architecture is the amount of physical memory installed. Of course modern OSes take this one step further and make main memory appear to be larger than it is by swapping pages of memory to and from disk.

Terracotta operates on exactly the same principles. The heap in the Terracotta client is dynamically paged in and out (we call it "faulting" and "flushing" respectively) to and from the Terracotta server(s).

The Terracotta server(s) dynamically fault and flush their data to and from disk. So here disk is equivalent to main memory in the core computer architecture, and determines the fundamental size of the whole system. Any connected client can therefore see as much Virtual Heap as is provisioned under the Terracotta server(s).

In that way you should consider the physical heap of the Terracotta client (your application) to be a window onto the cluster-wide data. With Terracotta, unlike traditional peer-to-peer systems, every client can have a different set of data in its "window". This happens due to the organic relocation of data to the application heap. If your application splits its workload, Terracotta will move the data to the application to accommodate the workload. As such, if the workload is well partitioned, Terracotta knows what data is where, and will not waste network bandwidth updating objects across the network. We call this feature "update only where necessary". It eliminates unnecessary network broadcasts.

Again, a parallel with a traditional computer architecture can be drawn by considering how that system behaves when a second CPU is added. When this is done, each CPU can either have the same or different data loaded -- it depends on the workload the CPU is asked to handle by the application. The data in each CPU is a replicated copy of the data in the main memory.

So the parallel to draw is CPU == Terracotta client, and main memory/system bus == Terracotta server(s). And again, the parallel here is that each CPU can have different parts of the main memory loaded into its cache, and work independently from one another, while still seeing a coherent view of the entire system memory.
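
To make the L1/L2 picture above concrete, here is a small Java sketch of the idea. It is a toy model under stated assumptions, not Terracotta code: `SketchServer` and `SketchClient` are hypothetical names, the "disk" is just a second in-memory map, writes go straight through to the server, and both tiers use plain LRU ordering to decide what to flush.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy stand-in for a server ("L2"): a small memory tier backed by a simulated disk. */
class SketchServer {
    private final int memoryCapacity;
    // accessOrder=true makes iteration order least-recently-used first
    private final LinkedHashMap<String, String> memory = new LinkedHashMap<>(16, 0.75f, true);
    private final Map<String, String> disk = new HashMap<>(); // simulated persistent store

    SketchServer(int memoryCapacity) {
        this.memoryCapacity = memoryCapacity;
    }

    void store(String id, String value) {
        disk.remove(id);          // drop any stale on-disk copy
        memory.put(id, value);
        while (memory.size() > memoryCapacity) {
            // flush the least recently used object from server memory to "disk"
            Iterator<Map.Entry<String, String>> it = memory.entrySet().iterator();
            Map.Entry<String, String> eldest = it.next();
            disk.put(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }

    String load(String id) {
        String v = memory.get(id);
        if (v == null) {
            v = disk.remove(id);
            if (v != null) store(id, v); // fault from disk back into server memory
        }
        return v;
    }

    int inMemory() { return memory.size(); }
    int onDisk() { return disk.size(); }
}

/** Toy stand-in for a client ("L1"): a bounded window onto the server's data. */
class SketchClient {
    private final SketchServer server;
    private final int windowSize;
    private final LinkedHashMap<String, String> window = new LinkedHashMap<>(16, 0.75f, true);
    int faults = 0; // number of objects faulted in from the server

    SketchClient(SketchServer server, int windowSize) {
        this.server = server;
        this.windowSize = windowSize;
    }

    void put(String id, String value) {
        server.store(id, value); // assumption: write-through, so the server is always current
        cache(id, value);
    }

    String get(String id) {
        String v = window.get(id);
        if (v == null) {
            v = server.load(id); // "fault" the object in from the server
            if (v == null) return null;
            faults++;
            cache(id, v);
        }
        return v;
    }

    private void cache(String id, String value) {
        window.put(id, value);
        while (window.size() > windowSize) {
            // "flush": drop the least recently used local copy; the server still has it
            Iterator<String> it = window.keySet().iterator();
            it.next();
            it.remove();
        }
    }

    int windowCount() { return window.size(); }
}
```

With a server memory tier of 2 and a client window of 2, storing five objects leaves two in the client window, two in server memory and three on the simulated disk, yet the client can still read any of the five. A second `SketchClient` on the same server can hold a completely different window.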
monster

journeyman

Joined: 10/22/2008 09:36:23
Messages: 36
Offline

NB: I'm new here. Didn't see anything against bumping threads, so I thought I should post my question in a thread about what I want to know...

So, I can use Terracotta as an "unlimited virtual heap". This is exactly what I would like. Now the tricky part is: Do I really need to have a server in its own JVM? What I want to know is, can I run my application in the "server" JVM (making it a client as well), so that I basically get the "unlimited virtual heap" without any NAM/DOS ?

If that works, does it have some not-so-obvious problems, like the memory footprint being doubled because all objects have to be represented as "server object" and "client object" at the same time, or the "client" and "server" parts still communicating over sockets, making it sub-optimal?

Really, I have nothing to share, I just need more than 1.5GB RAM. We have a client/server architecture, our clients have run-of-the-mill XP PCs, and we distribute our app through Web-Start. We have had to refuse clients whose data doesn't fit in a 32-bit VM, so now I'm looking for a simple way of "expanding the heap" of client PCs without the clients having to change or install anything.

I've been looking at EHCache and the like, but Terracotta looks like a more "transparent" solution.
amiller

ophanim

Joined: 08/29/2007 09:05:48
Messages: 722
Location: St. Louis, MO
Offline

Probably better in the future to post to a new thread so your new question doesn't get lost.

Do you need a server in its own JVM?

Yes.

Can I run my application in the "server" JVM?  

No.

I don't know that those answers will be correct for all time, but that's the current state.

Terracotta can be used to serve more than 1.5 GB of heap, but you need to consider how your application is architected. In particular, it's difficult to get the true benefits of larger heaps without partitioning at the application layer, where your clients are each using a subset of the heap. If you are regularly using all of the clustered heap on a client, then that inevitably means paging objects in and out of memory from the server, which has a certain amount of overhead.
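
As a sketch of what partitioning at the application layer could mean, the Java below assigns each key to exactly one client by hashing, so each client only ever touches its own subset of the data. This is one simple scheme chosen for illustration; `WorkPartitioner` is a hypothetical helper and not a Terracotta API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: routes each key to exactly one of N clients. */
class WorkPartitioner {

    /** Index of the client responsible for this key (stable for a fixed client count). */
    static int ownerOf(String key, int clientCount) {
        if (clientCount < 1) throw new IllegalArgumentException("clientCount must be >= 1");
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), clientCount);
    }

    /** Splits the keys into one list per client. */
    static List<List<String>> partition(List<String> keys, int clientCount) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < clientCount; i++) {
            parts.add(new ArrayList<>());
        }
        for (String key : keys) {
            parts.get(ownerOf(key, clientCount)).add(key);
        }
        return parts;
    }
}
```

If requests for a given key are always routed to `ownerOf(key, n)`, each client's working set stays within its own slice of the clustered heap and there is less paging to and from the server. Note that changing the client count reshuffles ownership with this simple modulo scheme.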

I know that in the past there have been problems with using Java Web Start apps with Terracotta as they lack a way to modify the bootclasspath of the application (which is required for Terracotta clients). I can't remember if that's changed or not.

Alex Miller (Terracotta Engineer)