I have a 3-node cluster (16 vCPU, 64 GB of RAM, 3 TB of data per node, JVM heap at 30 GB) with 450 indices (1 primary shard and 1 replica per index).
Following an upgrade from 6.7 to 6.8, the activation of TLS on Transport and HTTP, and the activation of security (native authentication), we started seeing circuit breaking exceptions in the Elasticsearch logs.
After some investigation I found out that the JVM heap is mainly used by fielddata, and most of the fielddata memory is used by the "_id" field.
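For anyone wanting to check the same thing on their own cluster, here is a minimal sketch of how the per-node fielddata size for a single field can be read from the `_cat/fielddata` API. The base URL and the credentials in the usage comment are placeholders, not values from my cluster, and the helper names are made up for this example. It assumes the cluster certificate is trusted by the Python installation.

```python
import base64
import json
import urllib.request
from urllib.parse import urlencode


def fielddata_url(base_url, field):
    """Build the _cat/fielddata URL for one field, sizes reported in bytes."""
    params = urlencode({"format": "json", "bytes": "b", "fields": field})
    return f"{base_url.rstrip('/')}/_cat/fielddata?{params}"


def summarize(rows):
    """Sum the reported size per node from the JSON rows of _cat/fielddata."""
    usage = {}
    for row in rows:
        usage[row["node"]] = usage.get(row["node"], 0) + int(row["size"])
    return usage


def fetch_fielddata(base_url, field, user=None, password=None):
    """Query the cluster; basic auth is optional (needed once security is on)."""
    request = urllib.request.Request(fielddata_url(base_url, field))
    if user is not None:
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return summarize(json.load(response))


# Hypothetical usage (placeholder host and credentials):
# print(fetch_fielddata("https://localhost:9200", "_id", "elastic", "changeme"))
```

Running it a few times after a node restart should show whether the `_id` entry grows over time.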
Is this normal behavior? How can I decrease the memory used?
The third node was added to the cluster recently to try to spread the load, but it didn't change anything.
I have a lot of fields in my indices; would decreasing the number of fields change that?
We checked our searches, visualizations and dashboards and didn't find any sorting or aggregations using the _id field in them.
We are using ElastAlert (https://github.com/Yelp/elastalert) to query the logs we are ingesting in Elasticsearch, and none of our ElastAlert rules are using it either.
Also, I don't know if this information will be of any use, but when I restart the node it takes a while for the _id fielddata memory to build up.
I'm restarting each node twice a day to free the memory.
We found what causes the issue: when an ElastAlert rule matches, we add a link to Kibana in the alert with the _id of the log that matched the rule. When someone clicks on the link, _id values are loaded into the JVM heap.
I don't think copying the _id value into another field would change that: we have a lot of logs, so at some point Elasticsearch will have to load these values to search them. And doc_values would require reading this information from disk, so I guess performance would not be great either.
TIL we have an API for clearing caches which includes field data. It isn't a good long-term fix but it is a lot less disruptive than restarting nodes to clear this memory usage.
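As a rough illustration of that API, the sketch below builds a `POST <index>/_cache/clear?fielddata=true` request, optionally limited to specific fields through the `fields` parameter. The host, the index pattern and the function name are assumptions for the example, not taken from this cluster, and authentication headers would need to be added when security is enabled.

```python
import urllib.request
from urllib.parse import quote, urlencode


def clear_fielddata_request(base_url, index_pattern="_all", fields=None):
    """Build (but do not send) a request that clears the fielddata cache."""
    params = {"fielddata": "true"}
    if fields:
        # Restrict the clear to the named fields, e.g. ["_id"].
        params["fields"] = ",".join(fields)
    # Keep wildcards and commas readable in the index part of the path.
    index_part = quote(index_pattern, safe="*,")
    url = f"{base_url.rstrip('/')}/{index_part}/_cache/clear?{urlencode(params)}"
    return urllib.request.Request(url, method="POST")


# Hypothetical usage (placeholder host; add an Authorization header if needed):
# req = clear_fielddata_request("https://localhost:9200", "logstash-*", ["_id"])
# with urllib.request.urlopen(req) as response:
#     print(response.read().decode())
```

The memory will fill up again as soon as something loads `_id` fielddata, so this only buys time between occurrences.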
The link goes to the Discover tab in Kibana, so yes, a search is performed at that time.
I configured the "Logs" app in Kibana to display the logs that are in our logstash-* indices and performed a few searches on _id. I do not have the issue that way, so we will modify the links in ElastAlert to use this app.