High Elasticsearch heap memory consumption while indexing huge files

akash · August 23, 2017, 6:26am

Hi,
We are trying to index large files in Elasticsearch. Heap memory consumption shoots up while indexing.

Our basic observation is that memory usage can go up to 30x the file size while indexing, and it remains at around 10x after indexing.
For example, while indexing a 600 MB file, memory usage is around 6 GB and can shoot up to 18 GB.
While indexing a 1 GB file, memory usage is around 10 GB and can shoot up to 30 GB.

Is this expected behaviour? Is there a way we can reduce the memory footprint?

Any help with this query is highly appreciated.

This behaviour is seen in both ES 2.4 and ES 5.5.

Cluster: 2 data nodes, 1 dedicated master node.
ES heap memory: 31 GB on all nodes.
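
For reference, the multipliers quoted above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, 1 GB is taken as 1000 MB, and the last step assumes heap usage scales linearly with file size, which the two data points suggest but do not prove.

```python
def heap_multiplier(heap_mb, file_mb):
    """Ratio of observed heap usage to the size of the indexed file."""
    return heap_mb / file_mb


def largest_file_mb(heap_limit_mb, peak_multiplier):
    """Largest file that fits under the heap limit, assuming linear scaling."""
    return heap_limit_mb / peak_multiplier


# (file size, typical heap usage, peak heap usage), all in MB, from the post above
observations = [(600, 6000, 18000), (1000, 10000, 30000)]

for file_mb, typical_mb, peak_mb in observations:
    print(file_mb, heap_multiplier(typical_mb, file_mb), heap_multiplier(peak_mb, file_mb))

# With a 31 GB heap and a 30x peak, the ceiling is only slightly above 1 GB per file.
print(largest_file_mb(31000, 30))
```

Under that linearity assumption, a 1 GB document already sits close to the limit of a 31 GB heap.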

Christian_Dahlqvist · August 23, 2017, 8:57am

How are you indexing the data? How large are your documents? How large are your bulk requests? How many concurrent indexing threads do you use?

akash · August 23, 2017, 9:12am

Hi Christian,

Please find the details below.

How are you indexing the data?

We are calling the bulk indexing API using curl. We initially tried the transport client, but we were getting an OutOfMemoryError.

How large are your documents?

We have tested with 600 MB and 1 GB of data.

How large are your bulk requests?

We were trying to index a single document containing 1 GB of data. The document contains a single key-value pair.

How many concurrent indexing threads do you use?

Not sure about this. How can we find this out?

Christian_Dahlqvist · August 23, 2017, 9:21am

Questions around indexing very large documents have been asked before on the forum. Please have a look at:

akash · August 23, 2017, 9:35am

The above links are similar to our requirement, where we need to index large documents. They suggest splitting the document, which would add complexity to the search part.

Any idea why heap memory goes up to 30x while indexing and remains at around 10x after indexing?

Are there any settings we can change to reduce memory usage to, say, around 3x or 4x?
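
For readers wondering what the splitting approach mentioned above might look like, here is a minimal Python sketch. It cuts one large text into fixed-size chunks and builds several small bulk request bodies instead of one 1 GB document. The field names (`file_id`, `chunk`, `content`), the index and type names, and the chunk sizes are all assumptions chosen for illustration, not something recommended in this thread. Cutting at fixed character offsets can also split a word in two, so a real implementation would probably cut on whitespace or line boundaries.

```python
import json


def split_into_chunks(text, chunk_chars):
    """Yield (chunk_number, piece) pairs that cover the text in order."""
    for number, start in enumerate(range(0, len(text), chunk_chars)):
        yield number, text[start:start + chunk_chars]


def bulk_bodies(index_name, doc_type, file_id, text,
                chunk_chars=1_000_000, chunks_per_request=10):
    """Yield newline-delimited bulk bodies, each holding a few chunks."""
    lines = []
    for number, piece in split_into_chunks(text, chunk_chars):
        # Action line: ES 2.x/5.x expect _index and _type here or in the URL.
        action = {"index": {"_index": index_name, "_type": doc_type,
                            "_id": f"{file_id}-{number}"}}
        # Source line: file_id ties every chunk back to the original file.
        source = {"file_id": file_id, "chunk": number, "content": piece}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
        if len(lines) >= 2 * chunks_per_request:
            yield "\n".join(lines) + "\n"  # a bulk body must end with a newline
            lines = []
    if lines:
        yield "\n".join(lines) + "\n"


# Tiny demonstration with a 10-character "file" and 4-character chunks.
for body in bulk_bodies("files", "chunk", "doc1", "abcdefghij",
                        chunk_chars=4, chunks_per_request=2):
    print(body)
```

Each body can then be sent to the `_bulk` endpoint as a separate request. The extra search-side complexity is that hits come back per chunk, so results have to be grouped by `file_id` to map them back to the original file.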

Christian_Dahlqvist · August 23, 2017, 9:45am

I am not surprised that heap usage grows a lot, as there is a lot of analysis and processing going on. As mentioned in the links I provided, documents of that size are really beyond what Elasticsearch was designed for, and you may also face issues searching them.

akash · August 23, 2017, 10:14am

OK, thanks a lot for the information.

system · September 20, 2017, 10:15am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
