Concurrent Indices stats requests cause cluster to go red

We use the elasticsearch-prometheus-exporter plugin. Occasionally the cluster status seems to turn red when index stats requests are made through the plugin.

Has anyone else experienced this issue?

I obtained a thread dump when the cluster was red and saw this:

It seems the issue is that the management thread pool is doing too much work. Does this sound right?

I should add I tried to open a bug here: https://github.com/elastic/elasticsearch/issues/36773

How often are you requesting indices stats? The completion stats in particular look nontrivial to calculate, and are not even mentioned in the list of stats in your plugin's README so maybe you can simply avoid asking for them. The indices stats API supports requesting subsets of stats.
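To illustrate the subset idea, here is a minimal sketch that builds an indices stats path limited to a chosen set of metrics, leaving out `completion`. The helper name `stats_path` and the `WANTED_METRICS` list are made up for this example, and whether the plugin itself lets you restrict the metrics it requests is not something this thread confirms. The path is mainly useful for checking by hand how much cheaper a narrower request is.

```python
# Metric names accepted in the indices stats path: GET /<index>/_stats/<metrics>.
# WANTED_METRICS is an illustrative selection; trim it to what you actually graph.
WANTED_METRICS = ["docs", "store", "indexing", "search", "get", "merge", "refresh", "flush"]


def stats_path(metrics, index="_all"):
    """Build an indices stats path restricted to the given metrics."""
    if not metrics:
        raise ValueError("at least one metric is required")
    return "/{}/_stats/{}".format(index, ",".join(metrics))


# Prints the path to request instead of the unrestricted /_all/_stats
print(stats_path(WANTED_METRICS))
```

Requesting that path with curl or a similar tool and comparing its response time against a plain `/_all/_stats` should show whether the completion stats are the expensive part.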

The plugin's readme also says this:

NOTE: The exporter fetches information from an ElasticSearch cluster on every scrape, therefore having a too short scrape interval can impose load on ES master nodes, particularly if you run with -es.all and -es.indices . We suggest you measure how long fetching /_nodes/stats and /_all/_stats takes for your ES cluster to determine whether your scraping interval is too short. As a last resort, you can scrape this exporter using a dedicated job with its own scraping interval.

This seems like wise advice.
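A rough way to do the measurement the note suggests is sketched below. It times one request to each of the two endpoints named in the note. The function names are made up for this example, and the node address in the usage comment (`http://localhost:9200`, no authentication) is an assumption you would need to adjust.

```python
import time
import urllib.request


def http_get(url):
    """Fetch a URL and read the full body so the timing includes the transfer."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.read()


def time_fetch(fetch, url):
    """Return the elapsed wall-clock seconds for one call to fetch(url)."""
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


def measure(base_url, paths=("/_nodes/stats", "/_all/_stats"), fetch=http_get):
    """Time one request per path and return {path: seconds}."""
    return {path: time_fetch(fetch, base_url + path) for path in paths}


# Example usage (assumed address, uncomment to run against a real node):
# for path, seconds in measure("http://localhost:9200").items():
#     print("{}: {:.2f}s".format(path, seconds))
```

If either request takes a noticeable fraction of the scrape interval, the interval is probably too short for this cluster, since a new scrape may start while the previous stats request is still running.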
