Thanks for your product!
We run several Elasticsearch clusters, and one of them has a problem after upgrading from 7.7.1 to 7.10.1.
We needed the upgrade because of a problem with the circuit breaker (https://github.com/elastic/elasticsearch/pull/59394).
We use Elasticsearch as a database for searching and displaying data.
- Elasticsearch runs in Docker (in k8s), image docker.elastic.co/elasticsearch/elasticsearch-oss
- Metrics exporter: justwatch/elasticsearch_exporter:1.1.0
- Load profile: ~2000-3000 bulk inserts (batch size 200) and 150-200 read requests per node.
- 3 servers (8 CPU (Xeon E3-1240 v6 3.7 GHz) / 64 GB RAM / 4x480 GB SSD (JBOD)).
- 1 server (16 CPU (Xeon E-2288G 3.7 GHz) / 64 GB RAM / 4x480 GB SSD (JBOD)).
- Pod limits: memory 52Gi, cpu 8.
- 1 main index (the most active): 80 GB across 3 shards, with 2 replicas per shard.
- After updating ES to 7.10.1, indexing time increased several times over, which caused product problems (we can no longer update data in Elasticsearch fast enough), and almost all server resources are now consumed.
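For context, here is a minimal sketch of the bulk write pattern described above (the index name and document fields are placeholders, not our real schema):

```python
import json

def build_bulk_body(docs, index_name):
    """Build an NDJSON _bulk request body: one action line plus one
    source line per document, terminated by a newline as the Bulk API
    requires."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# One batch of 200 documents, matching the load profile above.
batch = [{"id": i, "title": f"item {i}"} for i in range(200)]
body = build_bulk_body(batch, "main-index")
print(body.count("\n"))  # 400: two NDJSON lines per document
```

Each node receives ~2000-3000 such batches, posted to the `_bulk` endpoint.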
This is what the CPU load looks like:
Nothing has changed on the application side; the application's load profile against ES is the same.
We have spent the last week investigating, but cannot find a way to bring indexing time back to its previous values.
The only thing we noticed is a significant change in the memory used for index segments:
Displayed metric: elasticsearch_indices_segments_memory_bytes
But we have not found how to influence this metric, either to return it to its old values or to see how its changes affect indexing.
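As far as we understand, the exporter derives this metric from the node stats API (`indices.segments.memory_in_bytes`). A small sketch of how we cross-check the raw value behind the graph, assuming that field mapping (the sample payload below is made up, in the shape `GET _nodes/stats/indices/segments` returns):

```python
def segments_memory_per_node(node_stats):
    """Extract indices.segments.memory_in_bytes for every node from a
    _nodes/stats response body (already parsed from JSON)."""
    return {
        info.get("name", node_id): info["indices"]["segments"]["memory_in_bytes"]
        for node_id, info in node_stats["nodes"].items()
    }

# Made-up sample payload mimicking the API response shape.
sample = {
    "nodes": {
        "abc123": {
            "name": "es-node-1",
            "indices": {"segments": {"memory_in_bytes": 1258291200}},
        }
    }
}
print(segments_memory_per_node(sample))  # {'es-node-1': 1258291200}
```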
Please help us analyze this problem.
- How can we affect elasticsearch_indices_segments_memory_bytes?
- Did we need to make specific changes to Elasticsearch after upgrading to version 7.10.1?
- Should we provide more information about our configuration / load to understand the problem?