Wouldn't this be cool?

Chris_Neal · October 15, 2015, 6:34pm

Enticing subject I hope.

Here's my thought. I've got servers with 256GB RAM. The optimal heap size for an ES JVM is 32GB (or just under that). How cool would it be if there was something like a "data-shared data node" (TM) that could run on the same server as my "regular" data node, but NOT need its own copy of the shard data on disk? It would instead refer to the data already there from the "regular" data node.

This data-shared data node could use its entire heap for queries alone, and query off the data that is "owned" by the "regular" data node. I could run 1 regular and 2 shared JVMs per server, and dramatically increase my query potential without having to create new copies of the data!

That would make my day.

magnusbaeck · October 15, 2015, 6:40pm

The optimal heap size for an ES JVM is 64GB (or just under that)

Almost—64 GB is a good RAM size if you want to go with the recommendation of giving ~50% of the RAM to the JVM. The JVM heap should be kept below 30.5 GB to avoid uncompressed pointers.
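As a rough illustration of that sizing rule, here is a minimal sketch in Python. It assumes the two numbers mentioned in this thread (about 50% of RAM, capped around 30.5 GB); the function name and parameters are made up for the example, and this is not an official Elasticsearch formula.

```python
def suggested_heap_gb(ram_gb, heap_fraction=0.5, heap_cap_gb=30.5):
    """Rule-of-thumb heap size: a fraction of RAM, capped at heap_cap_gb.

    heap_fraction and heap_cap_gb are assumptions taken from this thread
    (~50% of RAM, heap kept under roughly 30.5 GB so the JVM can keep
    using compressed pointers). Treat the cap as an upper bound and
    stay a little under it in practice.
    """
    return min(ram_gb * heap_fraction, heap_cap_gb)


for ram_gb in (16, 64, 256):
    heap_gb = suggested_heap_gb(ram_gb)
    print(f"{ram_gb} GB RAM: heap up to {heap_gb} GB, "
          f"{ram_gb - heap_gb} GB left for the OS and file system cache")
```

On a 256 GB server the cap dominates, so one JVM leaves most of the RAM outside the heap, which is the situation the original post describes.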

Chris_Neal · October 15, 2015, 6:41pm

Correct. I mistyped. Edited my post.

Christian_Dahlqvist · October 15, 2015, 6:47pm

This sounds a lot like the shadow replica functionality, although that requires the use of a shared file system for the entire cluster, not just a single server.

Chris_Neal · October 15, 2015, 6:59pm

Similar for sure. Seems like searching a shared cluster-wide file system would be a bit on the slow side. Multiple JVMs searching the same local disk data could be super fast.

Ivan · October 15, 2015, 8:53pm

Considering the data stored on disk is not quite used as-is, the only real benefit you would see is lower disk space utilization, which is not the bottleneck.

Thanks to mmap/docvalues, more than just the JVM heap is used to store process data. Perhaps if you found a way to share doc value data, then you would have something.

Cheers,

Ivan
