Terracotta Discussion Forums (LEGACY READ-ONLY ARCHIVE)

Hi,

I have to deal with the following architectural scenario here:

There are several server nodes, each running a Java server application (no application server is used) and each with its own database, so each node works with its own data. A special "clustering" service synchronizes the databases at the database level, so that each server node ends up using the same data.
Another simple load-balancing mechanism dispatches the clients between the nodes. Small inconsistencies in the data across nodes can occur: a client treats all the nodes as one, but may receive different data from each node before they are synchronized. This race condition happens rarely, and when it does, there is no impact on the business.

Now, I'd like to prepare the future architecture, but for now I have to create a common cluster from these nodes with minimal pain, minimal code changes and minimal refactoring. From what I have read, I think Terracotta may be capable of doing this, so my idea is:

Create one cluster based on Terracotta, with each JVM as one node of the cluster. The whole cluster will use one database, synchronized with a second one.

And the problems and my questions:
1) How should I handle the database in this case (if I use Terracotta)?
2) I saw the async db sync use case on your web site. If I use Terracotta for application clustering, may I also use it to provide a similar async db sync to the secondary database?
3) Do you have any better idea how I can do this?

I know this is a rather unusual use case, and I'll be very grateful for any suggestions or ideas!

Thanks in advance!

Regards
YF

The question is, what's in the db? Does the app or business need an actual SQL interface? Apps typically need SQL in order to search for data, to make sure that data is on disk, and perhaps to report to the business on various metrics such as revenue, products sold, user activity, etc.

If you absolutely need the data in SQL form for reporting or what have you, then use TC as either a write-behind or a write-through cache of the database.
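To make the difference between the two cache styles concrete, here is a minimal sketch in plain Java. It does not use any Terracotta API: `RecordStore`, `WriteThroughCache` and `WriteBehindCache` are hypothetical names invented for illustration, and the assumption is that in a TC deployment the map and the pending-write queue would be the clustered, shared structures.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the real SQL database (for example a JDBC-backed DAO).
interface RecordStore {
    void save(String key, String value);
}

// Write-through: every put reaches the database before the call returns.
class WriteThroughCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final RecordStore store;

    WriteThroughCache(RecordStore store) {
        this.store = store;
    }

    void put(String key, String value) {
        store.save(key, value); // synchronous database write
        cache.put(key, value);  // then update the cached copy
    }

    String get(String key) {
        return cache.get(key);
    }
}

// Write-behind: put only updates the cache and queues the database write.
class WriteBehindCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<String[]> pending = new LinkedBlockingQueue<>();
    private final RecordStore store;

    WriteBehindCache(RecordStore store) {
        this.store = store;
    }

    void put(String key, String value) {
        cache.put(key, value);                   // visible to readers immediately
        pending.add(new String[] {key, value});  // database write is deferred
    }

    String get(String key) {
        return cache.get(key);
    }

    // Drains the queued writes to the database and returns how many were written.
    // In a real system a background thread would call this periodically.
    int flush() {
        int written = 0;
        String[] entry;
        while ((entry = pending.poll()) != null) {
            store.save(entry[0], entry[1]);
            written++;
        }
        return written;
    }
}
```

With write-through the database is never behind the cache, at the cost of a database round trip on every write. With write-behind the application does not wait for the database, but anything still in the queue is missing from SQL reports until the next flush.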

If you don't need the data in SQL form, consider leveraging the fact that TC Servers store the data on disk, so copying the data to a DB is redundant and wasted effort.

Assuming you need a DB, consider Sequoia with Terracotta to get the data to an active DB instance and a passive copy / backup instance. You could use db-level replication, since you are already comfortable with it, but Sequoia always seemed elegant to me.

Glad you found the async db replication use case on our site. I think the key requirement in that use case, which may not be clear, is that the active and passive DB instances were across a WAN link from each other, and DB-based replication, even when both db servers were on the LAN, was lowering db server performance by 10X. TC-based replication was not. Since you are not talking about a WAN, nor are you concerned with the overhead of db-based replication, I would not undertake TC-based replication and would instead rely on Sequoia or the db itself to replicate.

Cheers,

--Ari

Hi Ari, thanks for the reply!

The DB is needed in my case. I checked the Sequoia project and it looks great. The data stored in the database is processed periodically by other detached systems responsible for auditing, statistics and many other things.

Let me talk about the caching now a little bit:

My plan is to use Terracotta to create one coherent cluster from these central nodes, as I mentioned. The caching approach is another problem:

At the moment, up to 1000 client nodes are connected to the cluster. Each client node uses a small subset of the data from the cluster, so there is a requirement for partitioned caching. These client nodes are connected using various types of connections, and in the worst case the connection may be very faulty and slow.

My plan is to use Terracotta to create that coherent cluster. For caching, I'm thinking about another layer that uses this cluster as its backend. That layer would be partitioned (using either one cache capable of partitioning or many caches, JBoss Cache for example), with one cache per client node sitting over the central coherent Terracotta-based cluster. I hope I'm clear enough here ...
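The layering described above can be sketched roughly as follows, in plain Java. This is only an illustration under stated assumptions: `PartitionedCache` is a hypothetical class, the `backend` function stands in for a lookup against the central cluster, and neither Terracotta nor JBoss Cache APIs are used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// One small cache partition per client node, filled on demand from the central cluster.
class PartitionedCache {
    // clientId -> (key -> value); each client only ever holds its own subset of the data.
    private final Map<String, Map<String, String>> partitions = new ConcurrentHashMap<>();

    // Hypothetical lookup against the central coherent cluster.
    private final Function<String, String> backend;

    PartitionedCache(Function<String, String> backend) {
        this.backend = backend;
    }

    // Returns the cached value for this client, loading it from the backend on a miss.
    String get(String clientId, String key) {
        Map<String, String> partition =
                partitions.computeIfAbsent(clientId, id -> new ConcurrentHashMap<>());
        return partition.computeIfAbsent(key, backend);
    }

    // Number of entries currently held for one client node (0 if it has no partition yet).
    int partitionSize(String clientId) {
        Map<String, String> partition = partitions.get(clientId);
        return partition == null ? 0 : partition.size();
    }
}
```

For example, `new PartitionedCache(key -> lookupInCluster(key))`, where `lookupInCluster` is again a made-up name, would contact the central cluster once per client and key and serve repeated reads from that client's partition. The sketch leaves out eviction and invalidation, which a real caching layer over slow or faulty client links would need.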

Once the central cluster is done and the caching layer is working, the whole cluster can use one database and replicate it to the second (in which case I'll have to implement some failover logic for the cluster), or I can use Sequoia here as a more sophisticated (and complex) approach.

So, what do you think about that caching layer? Is it unnecessary overhead? Could this be solved directly and better using Terracotta and one of the caching interfaces that Terracotta provides? Again, what I'm facing here is a non-standard clustering scenario because of that coherent cluster and its partitioned client nodes.

Thanks again for any idea!

Cheers,
YF