High Elasticsearch heap memory consumption while indexing huge files


(Akash Sudhakar) #1

Hi,
We are trying to index large files in Elasticsearch. Heap memory consumption shoots up while indexing.

Our basic observation is that heap usage can go up to 30x the file size while indexing, and it remains at around 10x after indexing.
For example, while indexing a 600 MB file, memory usage is around 6 GB and can shoot up to 18 GB.
While indexing a 1 GB file, memory usage is around 10 GB and can shoot up to 30 GB.
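
As a quick sanity check, the multiples quoted above can be recomputed from the reported figures. This is only a sketch of the arithmetic: it assumes 1 GB = 1000 MB, and the labels "typical" and "peak" are my own names for the two heap figures given for each file.

```python
# Figures reported in this post; 1 GB is treated as 1000 MB (an assumption).
MB_PER_GB = 1000


def heap_ratio(file_mb, heap_gb):
    """Return heap usage as a multiple of the input file size."""
    return heap_gb * MB_PER_GB / file_mb


# (file size in MB, typical heap in GB, peak heap in GB)
observations = [
    (600, 6, 18),
    (1000, 10, 30),
]

for file_mb, typical_gb, peak_gb in observations:
    typical = heap_ratio(file_mb, typical_gb)
    peak = heap_ratio(file_mb, peak_gb)
    print(f"{file_mb} MB file: typical {typical:.0f}x, peak {peak:.0f}x")
```

Both files give the same multiples, so the overhead appears to scale roughly linearly with document size in these two measurements.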

Is this expected behaviour? Is there a way we can reduce the memory footprint?

Any help with this query is highly appreciated.

This behavior is seen in both ES 2.4 and ES 5.5.

Cluster: 2 data nodes, 1 dedicated master node.
ES heap memory: 31 GB on all nodes.


(Christian Dahlqvist) #2

How are you indexing the data? How large are your documents? How large are your bulk requests? How many concurrent indexing threads do you use?


(Akash Sudhakar) #3

Hi Christian,

Please find the details below:

How are you indexing the data?

We are calling the bulk indexing API using curl. We initially tried the transport client, but we were getting OutOfMemoryError.

How large are your documents?

We have tested with 600 MB and 1 GB of data.

How large are your bulk requests?

We were trying to index a single document containing 1 GB of data. The document contains a single key-value pair.

How many concurrent indexing threads do you use?

Not sure about this. How can we find this out?


(Christian Dahlqvist) #4

Questions around indexing very large documents have been asked before on the forum. Please have a look at:


(Akash Sudhakar) #5

The above links are similar to our requirement, where we need to index large documents. They suggest splitting the document, which would add complexity to the search part.
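
For reference, the splitting approach mentioned above could look roughly like the sketch below. It is not taken from the linked threads: the index name `docs`, the type `chunk`, the fields `parent_id`, `seq`, and `content`, and the fixed-size character split are all hypothetical choices made for illustration.

```python
import json


def split_into_chunks(text, chunk_chars):
    """Yield consecutive slices of `text`, each at most `chunk_chars` long."""
    for start in range(0, len(text), chunk_chars):
        yield text[start:start + chunk_chars]


def build_bulk_body(parent_id, text, chunk_chars, index="docs", doc_type="chunk"):
    """Build an NDJSON body for the _bulk API: one action line and one
    source line per chunk. Index, type, and field names are hypothetical."""
    lines = []
    for seq, chunk in enumerate(split_into_chunks(text, chunk_chars)):
        action = {"index": {"_index": index, "_type": doc_type,
                            "_id": f"{parent_id}-{seq}"}}
        # Every chunk carries the parent id so hits can be grouped later.
        source = {"parent_id": parent_id, "seq": seq, "content": chunk}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Tiny demonstration with a 10-character "file" and 4-character chunks.
body = build_bulk_body("file1", "abcdefghij", chunk_chars=4)
print(body)
```

The extra search complexity comes from having to group hits by `parent_id` to get back to the original file. A fixed character split can also cut a word in half at a chunk boundary, so a real splitter would probably break on whitespace or paragraph boundaries, and would send the chunks across several moderately sized bulk requests rather than one large body.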

Any idea why heap memory goes up to 30x while indexing and remains at around 10x after indexing?

Are there any settings we can change to reduce memory usage to, say, around 3x or 4x?


(Christian Dahlqvist) #6

I am not surprised the heap usage grows a lot, as there is a lot of analysis and processing going on. As mentioned in the links I provided, documents of that size are really beyond what Elasticsearch was designed for, and you may also face issues searching them.


(Akash Sudhakar) #7

OK, thanks a lot for the information.

