Hi,
We are trying to index large files in Elasticsearch, and heap memory consumption shoots up while indexing.
Our basic observation is that memory usage can grow to about 30x the file size while indexing, and post indexing it remains at around 10x.
For example: while indexing a 600 MB file, memory usage is around 6 GB and can shoot up to 18 GB;
while indexing a 1 GB file, memory usage is around 10 GB and can shoot up to 30 GB.
Is this expected behaviour? Is there a way we can reduce the memory footprint?
Any help with this query is highly appreciated.
This behavior is seen in both ES 2.4 and ES 5.5.
Cluster: 2 data nodes, 1 dedicated master node.
ES heap memory - 31 GB on all nodes.
The links above are similar to our requirement of indexing large documents. They suggest splitting the document, which would add complexity to the search side.
Any idea why heap memory goes up to 30x while indexing and remains at around 10x afterwards?
Are there any settings we can change to reduce memory usage to, say, around 3x or 4x?
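In case it helps others reading this: a minimal sketch of the splitting approach mentioned in the linked threads, assuming a Python client. The chunk size, index name, and helper names (`split_text`, `make_bulk_actions`, `parent_id`) are all illustrative choices, not part of any Elasticsearch API; each chunk carries a pointer back to its parent document so results can be grouped again at search time.

```python
# Hypothetical sketch: split one large document into smaller chunks and
# build bulk-index actions for them, instead of indexing one huge document.
# CHUNK_SIZE and all field names below are assumptions; tune for your data.

CHUNK_SIZE = 1_000_000  # ~1 MB of text per chunk (assumption)

def split_text(text, chunk_size=CHUNK_SIZE):
    """Yield consecutive slices of at most chunk_size characters."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]

def make_bulk_actions(doc_id, text, index="large_docs"):
    """Build one bulk-API action per chunk, keyed back to the parent doc."""
    for part, chunk in enumerate(split_text(text)):
        yield {
            "_index": index,
            "_id": f"{doc_id}-{part}",
            "_source": {"parent_id": doc_id, "part": part, "content": chunk},
        }
```

The generated actions can then be fed to `elasticsearch.helpers.bulk`; at search time, grouping hits by `parent_id` recovers the original document, which is the extra search-side complexity the splitting approach brings.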
I am not surprised that heap usage grows a lot, as there is a lot of analysis and processing going on. As mentioned in the links I provided, documents of that size are really beyond what Elasticsearch was designed for, and you may also face issues searching them.