What levers are present to control memory usage in Elasticsearch?

Since Elasticsearch is a search engine, the amount of memory given to it would typically be less than the total amount of data stored in it.

Is this assumption valid?
If we have 100 GB of data, can Elasticsearch be configured so that memory is limited to, say, 10 GB while we still have the ability to search all of the data? What would happen as the data grows to 200 GB: can Elasticsearch still work with the same allocated 10 GB of memory? Basically, is there a combination of in-memory structures and on-disk structures for when physical memory is not enough?

At present, the following are the memory categories that X-Pack Monitoring shows under memory usage. Can some light be shed on what levers exist to control each of these types of memory usage?

Elasticsearch Memory Usage

Index Memory - Lucene 1

Stored Fields: Heap memory used by Stored Fields (e.g _source)
Doc Values: Heap memory used by Doc Values
Norms: Heap memory used by Norms (normalization factors for query-time text scoring)

Index Memory - Lucene 2
Terms: Heap memory used by Terms (e.g text)
Points: Heap memory used by Points (e.g numbers, IPs and Geo Data)

Index Memory - Lucene 3
Fixed Bitsets: Heap memory used by Fixed Bit Sets (e.g deeply nested documents)
Term Vectors: Heap memory used by Term Vectors
Version Maps: Heap memory used by Versioning (e.g updates and deletes)

Index Memory - Elasticsearch
Query Cache : Heap memory used by Query Cache (e.g cached filters)
Request Cache : Heap memory used by Request Cache (e.g instant aggregations)
Field data : Heap memory used by Fielddata (e.g global ordinals or explicitly enabled field data on text fields).
Index Writer : Heap memory used by the Index Writer

Apart from reducing the number of documents to save memory, can the remaining levers be exposed so that we can control how Elasticsearch uses memory for each of these categories? If a quota is exceeded, can Elasticsearch spill some of that data to disk and reload it again when a query needs it?
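From what I can find, a few of the Elasticsearch-level buckets above do have node-level size settings. A sketch of what they might look like in elasticsearch.yml (the values are illustrative, not recommendations):

# elasticsearch.yml (static node settings)
# Node query cache (cached filters); defaults to 10% of heap.
indices.queries.cache.size: 10%
# Shard request cache (e.g. cached aggregation results); defaults to 1% of heap.
indices.requests.cache.size: 2%
# Fielddata cache (global ordinals, fielddata on text fields); unbounded by default.
indices.fielddata.cache.size: 20%
# Shared indexing buffer used by the index writers; defaults to 10% of heap.
indices.memory.index_buffer_size: 10%

As far as I can tell, the Lucene segment structures (Terms, Norms, Points, Fixed Bitsets) have no equivalent caps.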

I found one article on how to control fielddata:
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/modules-fielddata.html
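Based on that page, besides the cache size setting there is also a fielddata circuit breaker, which rejects requests that would push fielddata past a limit rather than evicting entries. It is a dynamic cluster setting, so something like the following should work (the 40% is illustrative):

PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "40%"
  }
}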

Can the corresponding documentation be exposed, or can someone point to how to limit memory usage for the other categories, specifically Terms and Fixed Bitsets?

Elasticsearch is not an in-memory data store, so you don't need a 1:1 memory-to-data ratio.

Trying to optimise for something that may not be a problem is a waste of time. So my question is: why are you asking all these questions? Are you having issues that may be solved in other ways?

Once we have confirmed that it's not a 1:1 memory-to-data ratio, the problem becomes obvious.

Let's say we have 1 node with 1 index on a machine with 10 GB of memory, of which half (5 GB) is given to the JVM heap and the other half (5 GB) is left for Lucene, as recommended in "Give less than half your memory to Lucene" (https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html).
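In that setup, the only explicit allocation is the heap in jvm.options; the remaining 5 GB is not assigned anywhere, it is simply left for the OS filesystem cache. Illustratively:

# config/jvm.options on the 10 GB node: 5 GB heap, min and max set to the
# same value; whatever is left over goes to the OS page cache for Lucene files.
-Xms5g
-Xmx5g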

As we load data into this index (10 GB, 20 GB ... 100 GB), the node's memory keeps increasing in some or all of the buckets mentioned above (Stored Fields, Terms, Fixed Bitsets). Eventually the heap fills up (4.9 GB of 5 GB, etc.), and then the problem starts: all new ingestion proceeds at an extremely slow pace.
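For reference, the same per-bucket numbers that X-Pack Monitoring charts can also be pulled from the stats APIs, which is how we watch which bucket is growing (my_index is a placeholder name):

# Segment memory broken down by bucket (terms, stored fields, norms, points,
# doc values, term vectors, fixed bit sets), plus index writer and version map.
GET my_index/_stats/segments?human

# Heap usage and total segment memory per node.
GET _cat/nodes?v&h=name,heap.percent,heap.current,segments.memory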

So how do we get a lever that says "keep only X bytes in memory for each memory bucket (Stored Fields, Terms, etc.)"?

Even if we could, how do we make sure something like Least Recently Used (LRU) is applied to decide what gets evicted from memory and kept on disk?

The problem becomes compounded when we now have multiple indices in a Node and multiple Nodes having multiple indices.

Let's say we have 3 nodes: Node1, Node2 and Node3.
We have 3 indices: Index1, Index2 and Index3, each with 5 primary shards and 1 replica.
We now have a node-to-shard distribution of
15 primary shards and 15 replica shards, a total of 30 shards across 3 nodes, i.e. 10 shards per node.

Now, if one of the nodes, say Node2, happens to hold shards of both Index1 and Index2 and its memory is full (say largely because of Index1), ingestion for the other indices on Node2 suffers as well. So the lever is required not only at the node level; we also need levers at the index level.
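For reference, the shard placement and per-node heap pressure in this scenario can be checked with the _cat APIs, e.g.:

# Which shards of which index sit on which node, and how big they are on disk.
GET _cat/shards?v&h=index,shard,prirep,store,node

# Heap and RAM pressure per node.
GET _cat/nodes?v&h=name,heap.percent,ram.percent,node.role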

Note:
With X-Pack Monitoring we are not sure where the other half, the off-heap memory given to native Lucene, actually goes. There is no tracking of the memory utilised in this space, namely the 5 GB left for Lucene.

Elasticsearch already uses LRU in a number of caches. It marks indices that haven't been used in 5 minutes as inactive and removes certain data structures from the heap. There are no levers to manage things the way you describe.
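Those evictions are visible in the node stats if you want to watch them happening, e.g.:

# Memory used and eviction counts for the fielddata, query and request caches.
GET _nodes/stats/indices?filter_path=**.fielddata,**.query_cache,**.request_cache

You can also drop cache contents on demand with POST <index>/_cache/clear, but that is a manual flush, not a size limit.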

Yet you seem to be talking about theory here, so I will ask again: are you having specific problems you can provide details on?

It's not allocated to Lucene, it's used by the OS for Lucene files. You'd need to monitor your OS for details on that.
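Elasticsearch does report the OS-level memory it sees via node stats, which can be a starting point before digging into OS tools, e.g.:

# OS memory totals per node, plus the Elasticsearch process's virtual memory.
GET _nodes/stats/os,process?filter_path=**.name,**.os.mem,**.process.mem

How much of the page cache is held for Lucene files specifically still has to come from OS tools.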

If there are no levers to control what is stored in memory, and the only option is to wait for the 5-minute inactivity eviction, then the original assumption that we could store any amount of data with the same memory allocated to Elasticsearch is flawed whenever all the indices are used at the same time.

This means that Elasticsearch would need its memory increased as more and more data is ingested into the system.

Is this the same when the data is at rest, i.e. when no records are being added? For the existing records too, the memory can't be controlled, which means memory is required even at rest.

Not sure I understand the theory comment. This is the design described in the abstract: if we substitute Index1 and Index2 with, say, tweet and customer indices, you get the practical problem. Let me know if you would prefer specifics as opposed to abstraction.


Are you asking these questions because you have a specific problem you are solving, or are you just asking how Elasticsearch handles memory allocation?

I am having the exact problem I have described, which occurs as more and more data is ingested into multiple indices spread across multiple nodes. We allocate X memory to the Elasticsearch nodes and load Y data until memory X peaks; we then increase the nodes' memory to X' and continue loading Y' data.

So I want to understand whether this is the expected design of memory management in Elasticsearch, or whether anything else is available to control this behaviour. Does it use the disk at all when memory is full, or can we force it to use the disk when memory is full?

How can we expect to size this system if we are not sure what the memory requirement will be once all the data is loaded, for both current and future needs? Or, put another way, if we can only afford X memory, we want Elasticsearch to work with only X memory. Unlike disk, which is expandable, memory is not, since it comes in standard sizes; to increase memory, most cloud providers also expect you to increase CPU, which compounds the cost.


Elasticsearch will use any memory you provide. It does that to make performance as optimal as possible. But if you aren't seeing OOMs or other problems, then it's doing as it was designed. If you don't add more heap, then it'll manage within that accordingly.

If you have a specific problem then providing specifics about that would be best. If you are seeing increasing heap use, what does that mean to your cluster? What exactly is "memory peaks" to you? What is your heap size? What is your infrastructure? What is the use pattern? Are you using Monitoring to check things? Are you OOMing? What are the GC cycles like? What version of Elasticsearch are you on? What JVM? What OS? What do your mappings look like? What queries are you running? How are you sending data to Elasticsearch? How much data is that? How many indices and shards? What are your future volume estimates?
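Most of those can be pulled from a handful of APIs, e.g.:

# Elasticsearch version, JVM and OS details per node.
GET _nodes/jvm,os

# Heap usage and GC counts/times per node.
GET _nodes/stats/jvm?filter_path=**.heap_used_percent,**.gc

# Indices with shard counts, doc counts and on-disk size.
GET _cat/indices?v&h=index,pri,rep,docs.count,store.size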


This conversation started with the observation that Elasticsearch is not efficient in managing the memory it is provided, or in working with constrained memory when the volume of data is not proportionate to the memory provided. There are not many levers or details available to control and influence how the memory given to it is managed.

I have raised the following feature requests

Ok, thanks for raising those :slight_smile:

Do you have specific "how to" guides or articles on how to find this usage for whichever OS the development team uses? This question has been asked many times, but the answers lack specifics. How do we find how much memory is used by the OS in relation to the root Elasticsearch process? Basically, what is the basis of the recommendation to reserve 50%, and how do we validate it? Is the 50% applied up to a machine maximum of 32 GB, i.e. 16 GB for heap and 16 GB for the OS? Or is it up to a maximum heap of 32 GB, which would mean a 64 GB machine giving 32 GB to the heap and 32 GB to the OS? With such a large memory allocation for the OS, we should be able to verify that it is actually being used, rather than relying on a trust statement.

The basis is to let the OS cache things. What do you mean by validate?

It's up to the max heap, which is 31GB.
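So on a 64 GB machine the split discussed in this thread would look roughly like this in jvm.options (illustrative):

# ~31 GB heap keeps compressed ordinary object pointers enabled;
# the remaining ~33 GB is left to the OS page cache for Lucene files.
-Xms31g
-Xmx31g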

https://www.linuxatemyram.com/ is a good rundown of what is happening here and has links to other resources :slight_smile:
