I am experimenting with DSO as a database replacement. While trying to populate the object graph with data, I discovered a scalability problem.
version used: 2.4.3
server heap: 1G
client heap: 512M
Most of the data is stored in: ConcurrentHashMap<String, Object>
Data is inserted individually, with no batching; approx. 1.2M inserts in total for 8G.
The ConcurrentHashMap implementation produces a lot of garbage. After the data migration, the pre-GC object count reached 41 million.
GC was able to load all object IDs into memory and remove the reachable ones.
Rescuing collected only a small number of new references, because the data migration was over. The call to ObjectIDSet2.retainAll() then crashed the VM because of an extremely wasteful preallocation of an ArrayList. The size of toRemove can't exceed the size of rescuables, so there is no need to allocate more. There is no need to even create an ArrayList; a long[] is sufficient. In the scenario described above, ObjectIDSet2.retainAll() tries to allocate new ArrayList(40000000) to hold fewer than 100 entries.
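To illustrate the kind of change I mean, here is a minimal sketch. It is not the actual ObjectIDSet2 code: the class name RetainAllSketch, the use of Set<Long>, and the assumption that the small set is the one being filtered are all mine. The point is only that the removal buffer starts small, grows on demand, and is capped by the size of the set being filtered.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical sketch, not Terracotta code: retainAll with a small long[] buffer.
final class RetainAllSketch {

    /** Keeps in target only the ids also present in other; returns the number removed. */
    static int retainAll(Set<Long> target, Set<Long> other) {
        // Start small. The number of removals can never exceed target.size(),
        // so the buffer is never sized from the (possibly huge) other set.
        long[] toRemove = new long[Math.min(16, target.size())];
        int count = 0;
        for (long id : target) {
            if (!other.contains(id)) {
                if (count == toRemove.length) {
                    // Double the buffer, capped at target.size().
                    toRemove = Arrays.copyOf(toRemove, Math.min(target.size(), count * 2));
                }
                toRemove[count++] = id;
            }
        }
        // Remove after iterating to avoid modifying the set during iteration.
        for (int i = 0; i < count; i++) {
            target.remove(toRemove[i]);
        }
        return count;
    }
}
```

With fewer than 100 rescuables, the buffer in this sketch never grows beyond the number of rescuables, regardless of how many object IDs the other set holds.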
We also have known issues with our CHM implementation, which are being remedied now. You should find those improvements in an upcoming nightly release. We'll let you know which nightly has all the required fixes (partial collection for CHM, improved read performance, etc.). 2.4.3 doesn't have these...