When this happens, users notice that Kibana becomes almost completely unresponsive for minutes at a time. In the case above, it was unresponsive for about 20 minutes.
How do we debug and improve this? Any suggestions?
Is there anything in the logs around that time that might correlate? Can you run the hot threads API when this happens to see what is going on? What is the specification of the cluster? Which version of Elasticsearch are you using?
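In case it helps, here is a rough sketch of capturing hot threads from a script while the slowdown is in progress. It calls the `GET /_nodes/hot_threads` endpoint with the `threads`, `interval` and `type` query parameters. The base URL `http://localhost:9200`, the lack of authentication, and the helper names `hot_threads_url` and `fetch_hot_threads` are assumptions for illustration only, so adjust them for your cluster.

```python
import urllib.parse
import urllib.request


def hot_threads_url(base_url, threads=3, interval="500ms", thread_type="cpu"):
    """Build the hot threads URL for all nodes (helper name is hypothetical)."""
    query = urllib.parse.urlencode(
        {"threads": threads, "interval": interval, "type": thread_type}
    )
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_hot_threads(base_url, timeout=30, **params):
    """Return the plain-text hot threads report.

    Assumes the cluster is reachable at base_url without authentication.
    """
    url = hot_threads_url(base_url, **params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example usage (assumed address; run this while Kibana is unresponsive):
# print(fetch_hot_threads("http://localhost:9200", threads=5))
```

Taking a few captures some seconds apart during the incident usually makes it easier to see which threads stay busy.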
We will try the hot threads API the next time this happens. The ES version is 7.10.
The cluster is running 15 data nodes with 10 TB disks and is ingesting logs from a lot of backend services.