RAM and Data size

Hello,

I am a newbie exploring Elasticsearch and am having a hard time understanding how Elasticsearch handles data that is larger than the available RAM.

I am going through the documentation and it says to allocate 50% of your RAM to the Elasticsearch heap. One of the pages also says to disable swapping.

https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html

If swapping is disabled, how will Elasticsearch get the data for a request when that data is not available in RAM?

Any information on how Elasticsearch works will be helpful.

Thanks!!!

Elasticsearch uses inverted indexes :slight_smile:
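To make that concrete, here is a toy sketch in plain Python (nothing Elasticsearch-specific, just an illustration of the data structure): an inverted index maps each term to a small posting list of document IDs, so answering a term query only touches those postings rather than reading every document.

```python
from collections import defaultdict

# Toy corpus: doc id -> text. In Elasticsearch the documents themselves
# live on disk inside the index; this dict is only for illustration.
docs = {
    1: "elasticsearch stores data on disk",
    2: "the heap is working memory",
    3: "disk reads are cached by the operating system",
}

# Build the inverted index: term -> set of doc ids containing that term.
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        inverted[term].add(doc_id)

def search(term):
    """Look up one term; only its (small) posting list is read."""
    return sorted(inverted.get(term, set()))

print(search("disk"))  # [1, 3]
print(search("heap"))  # [2]
```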

A "rule of thumb" is that a node can handle 5-7Tb of data, best on a 64G system with about 32G heap (look at discussions on max heap size).

There are also limits on how much data a single request can return, such as the number of aggregation "buckets" and the number of documents (hits) returned.
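For example (the setting names are real, but the values below are placeholders and `my-index` is a hypothetical index, so check the defaults for your version): the number of hits a search can page through is capped by the `index.max_result_window` index setting, and aggregations are capped by the `search.max_buckets` cluster setting.

```
PUT my-index/_settings
{
  "index.max_result_window": 10000
}

PUT _cluster/settings
{
  "persistent": {
    "search.max_buckets": 65536
  }
}
```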

Len Rugen,

Thanks for the reply.

Apologies, but I am still not clear on how a node can handle 5-7 TB of data with 32 GB of RAM.

Does swapping occur to fetch data that is not available in RAM?

Coming from a SQL Server background, I am still thinking in terms of physical and logical reads.

Any insight into how Elasticsearch manages such a huge amount of data with around 32 GB of RAM would be helpful.

Thanks!!!

Elasticsearch does not store all data on the heap. Instead, data is read from disk when required and the heap is basically used as working memory. This is why the heap should be at most 50% of available RAM (ideally as small as the use case allows). The rest of the available RAM is used for some off-heap storage and the operating system page cache, which are both essential for good performance.
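As a concrete illustration (the path and values below are examples, not a recommendation for your cluster): on a 32 GB machine you might pin the JVM heap to 16 GB, leaving the other half for off-heap memory and the filesystem cache. On recent versions this goes in a file under `config/jvm.options.d/` (older versions use `config/jvm.options`), with minimum and maximum heap set to the same value:

```
# config/jvm.options.d/heap.options (example values for a 32 GB host)
-Xms16g
-Xmx16g
```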


Thanks Christian_Dahlqvist for the reply.

Could you please shed some light on why the documentation asks us to disable swapping?
https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration-memory.html
Thanks

That page does mention this:

Most operating systems try to use as much memory as possible for file system caches and eagerly swap out unused application memory. This can result in parts of the JVM heap or even its executable pages being swapped out to disk.

Swapping is very bad for performance, for node stability, and should be avoided at all costs. It can cause garbage collections to last for minutes instead of milliseconds and can cause nodes to respond slowly or even to disconnect from the cluster. In a resilient distributed system, it’s more effective to let the operating system kill the node.
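In practice the docs offer a few ways to do this; the snippets below are a rough sketch, so check the page above for the variant that fits your platform (note that `bootstrap.memory_lock` also requires the memlock ulimit to be raised):

```
# Option 1: turn swap off entirely (lasts until reboot)
sudo swapoff -a

# Option 2: have Elasticsearch lock its memory so it cannot be swapped out
# (add to elasticsearch.yml)
bootstrap.memory_lock: true
```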
