How to diagnose when indexing performance suddenly dropped

111336 · May 13, 2020, 12:40pm

Hi. I'm running an ES cluster with 50 data nodes (20 for a logging index, 20 for a data index, 10 for other indices), plus 4 coordinator nodes and 3 master nodes.

One day the indexing performance of the logging index suddenly dropped.

ES 7.4.0 + docker environment
CPU usage was about 10%
GC was less than 5ms
SSD disk I/O was about 20MB/s
hot threads showed nothing on most of the logging data nodes.
-- https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
the task management API showed a lot of 'indices:data/write/bulk[s]' or '[s][p]' actions on both coordinator and data nodes, with running_time_in_nanos higher than 5,000,000,000
-- all coordinator nodes: a lot of the statuses were rerouted
-- data nodes of the logging index: a lot of the statuses were waiting_on_primary or primary, and the shard numbers varied
-- other data nodes: few indexing actions, with low running_time_in_nanos
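The check above can be sketched in Python. This is a minimal sketch, not a definitive tool: it assumes the JSON returned by `GET _tasks?detailed=true` has already been loaded into a dict, and that replication tasks report their state under `status.phase`. The helper names (`slow_bulk_tasks`, `phases_per_node`) and the sample data are made up for illustration.

```python
from collections import Counter

# 5 seconds in nanoseconds, the threshold mentioned above.
SLOW_NANOS = 5_000_000_000


def slow_bulk_tasks(tasks_response, threshold_nanos=SLOW_NANOS):
    """Return (node_name, action, phase, nanos) for each slow bulk task.

    `tasks_response` is assumed to be the parsed JSON body of
    GET _tasks?detailed=true (hypothetical input, not fetched here).
    """
    slow = []
    for node in tasks_response.get("nodes", {}).values():
        node_name = node.get("name", "?")
        for task in node.get("tasks", {}).values():
            action = task.get("action", "")
            # Matches bulk, bulk[s], bulk[s][p] and bulk[s][r] actions.
            if not action.startswith("indices:data/write/bulk"):
                continue
            nanos = task.get("running_time_in_nanos", 0)
            if nanos <= threshold_nanos:
                continue
            # Assumption: the phase (rerouted, waiting_on_primary,
            # primary, ...) is reported under status.phase.
            phase = (task.get("status") or {}).get("phase", "unknown")
            slow.append((node_name, action, phase, nanos))
    return slow


def phases_per_node(slow):
    """Count slow tasks per (node name, phase) pair."""
    counts = Counter()
    for node_name, _action, phase, _nanos in slow:
        counts[(node_name, phase)] += 1
    return counts


# Hand-made sample in the assumed response shape, not real cluster output.
sample = {
    "nodes": {
        "n1": {"name": "coord-1", "tasks": {
            "n1:1": {"action": "indices:data/write/bulk[s]",
                     "status": {"phase": "rerouted"},
                     "running_time_in_nanos": 7_000_000_000},
        }},
        "n2": {"name": "log-data-1", "tasks": {
            "n2:5": {"action": "indices:data/write/bulk[s][p]",
                     "status": {"phase": "primary"},
                     "running_time_in_nanos": 6_500_000_000},
            "n2:6": {"action": "indices:data/write/bulk[s]",
                     "status": {"phase": "waiting_on_primary"},
                     "running_time_in_nanos": 9_000_000_000},
            "n2:7": {"action": "indices:data/read/search",
                     "running_time_in_nanos": 8_000_000_000},
        }},
        "n3": {"name": "other-data-1", "tasks": {
            "n3:2": {"action": "indices:data/write/bulk[s][p]",
                     "status": {"phase": "primary"},
                     "running_time_in_nanos": 2_000_000},
        }},
    }
}

slow = slow_bulk_tasks(sample)
for (node_name, phase), count in sorted(phases_per_node(slow).items()):
    print(node_name, phase, count)
```

Grouping by node and phase makes it easier to see whether the slow tasks pile up on the coordinators (rerouted) or on the data nodes holding the primaries.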

After I restarted the containers, it went back to normal.

CPU usage went up to close to 80%
GC was about 20ms
SSD disk I/O was about 450MB/s
no tasks with long running_time_in_nanos

It looked like indexing was stuck somewhere between the indexing requests being accepted and the data being written to the shards. I don't know how to figure this out. Is there any tool I can use?

warkolm · May 13, 2020, 11:39pm

Welcome

Ideally you should split those use cases out, so that they don't impact each other and, in turn, the users of each dataset.

Do you have Monitoring enabled?

111336 · May 14, 2020, 2:27am

Thanks for the reply.

While indexing of the logging data was slow, indexing into the other indices had no problems. It would be good to split the cluster by purpose, but allocating the indices separately was enough.

All metrics are collected using fluentd. I looked into the machine CPU, memory, and I/O of all ES nodes. Since the task management API showed that indexing was slow, I guess logstash was not a likely cause of the problem.

Thanks.

system · June 11, 2020, 2:27am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
