I have a 3-node Elasticsearch 2.2.1 cluster. Each node: 4 CPU / 14 GB RAM / SSD. Java(TM) SE 1.8.0_66-b17
ES_HEAP is 7 GB.
Refresh interval is set to 30s.
mlockall: true
etc.
I'm bulk indexing documents. There are no search queries other than the ones for Marvel monitoring.
I see a lot of spikes in JVM heap usage - https://infinit.io/_/3jvRpGs
What could explain this behaviour? Is it normal? Are there any parameters I should check?
I have another cluster with the same configuration, and I don't see these spikes there.
If you're indexing a ton with a 30s refresh, that's not terribly surprising.
Despite the transaction log, Lucene is building up data in memory to be flushed to a new segment. The point of the transaction log is to act as a backup to this process: if Elasticsearch goes down, you can replay the index updates.
So yes I'd expect quite a bit of JVM churn when writing, and probably a lot of in-memory objects with a 30s refresh interval.
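For reference, here is a minimal sketch of how the refresh interval can be changed on a live index through the update index settings API. The index name `my_index` and the helper `refresh_settings_body` are made up for illustration; the script only builds and prints the request body, so send it with whatever HTTP client you normally use.

```python
import json


def refresh_settings_body(interval):
    """Build the JSON body for PUT /<index>/_settings that sets refresh_interval.

    `interval` is a time value string such as "1s" (the default) or "30s".
    """
    return json.dumps({"index": {"refresh_interval": interval}})


# "my_index" is a hypothetical index name, not one from this thread.
print("PUT /my_index/_settings")
print(refresh_settings_body("30s"))
```

Running the same request against both clusters' indices, or reading the settings back with GET /my_index/_settings, is a quick way to confirm they really use the same interval.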
So it could be worth lowering the refresh interval to reduce the amount of data held in memory. But refreshing more often is itself a costly operation and will have an impact on indexing performance, right?
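A rough way to put numbers on that trade-off is the back-of-the-envelope sketch below. It assumes a steady indexing rate (the 5000 docs/s figure is invented, not measured on this cluster) and ignores the indexing buffer limit and segment merging, so treat it as an illustration of the direction of the effect, not a prediction.

```python
def refresh_tradeoff(docs_per_second, refresh_interval_s):
    """Return (refreshes per hour, documents accumulated between two refreshes).

    Simplified model: constant indexing rate, no indexing buffer cap, no merges.
    """
    refreshes_per_hour = 3600 / refresh_interval_s
    docs_per_refresh = docs_per_second * refresh_interval_s
    return refreshes_per_hour, docs_per_refresh


# 5000 docs/s is a hypothetical rate chosen only for the example.
for interval in (1, 5, 30):
    per_hour, per_refresh = refresh_tradeoff(5000, interval)
    print(f"{interval:>2}s: {per_hour:.0f} refreshes/hour, ~{per_refresh} docs per refresh")
```

A shorter interval means fewer documents waiting between refreshes but many more refreshes, each producing a small segment that later has to be merged.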