I am looking for help understanding the current state of the Elasticsearch cluster we run in production and optimising its response time. CPU utilisation is high, and both search query time and indexing time have increased.
A few Elasticsearch cluster stats:
Total Number of Nodes : 9
Data + Master : 4
Data : 4
Dedicated Master : 1
We are using i3 instance types from AWS
15.75 GB RAM
475 GB NVMe SSD on each instance
Total Data on Elasticsearch Cluster : 95.6 GB
Number of HTTP connections opened per minute : 4000-4500 (approx. 45,000 requests per 10-minute interval)
Total Heap Allocated on each instance : 8 GB
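For a rough sense of scale, here is some back-of-envelope arithmetic on the figures above. It is only a sketch: it assumes the data is spread evenly over the 8 data-holding nodes (4 Data + Master and 4 Data), which may not match the actual shard layout, and the variable names are my own.

```python
# Figures copied from the stats above; derived values are rough estimates only.
DATA_NODES = 4 + 4                 # "Data + Master" nodes plus "Data" nodes
TOTAL_DATA_GB = 95.6               # total data on the cluster
RAM_GB = 15.75                     # RAM per instance
HEAP_GB = 8.0                      # heap allocated per instance
HTTP_OPENS_PER_MIN = (4000, 4500)  # lower and upper bound


def summarize():
    """Derive per-node and per-second figures from the cluster stats."""
    return {
        # Assumption: data is evenly distributed across data nodes.
        "data_per_node_gb": TOTAL_DATA_GB / DATA_NODES,
        # Share of instance RAM given to the JVM heap.
        "heap_fraction_of_ram": HEAP_GB / RAM_GB,
        # HTTP opens per second (lower bound, upper bound).
        "http_opens_per_sec": tuple(n / 60 for n in HTTP_OPENS_PER_MIN),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(name, value)
```

Under that assumption each data node holds roughly 12 GB, the heap takes slightly more than half of each instance's RAM, and the HTTP open rate is roughly 67 to 75 per second.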
We recently enabled Kibana Monitoring. Here is a screenshot for one of the indices, which has high latency, around 600 ms on average.
I am not sure where to begin or what exactly we are missing here. Can someone point us in the right direction on what needs to change?
Please let us know if any additional data is required.
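If more data would help, we could collect it with something like the sketch below and post the output. It assumes a node is reachable at `http://localhost:9200` without authentication (a placeholder, not our real endpoint), and the helper names `build_urls` and `fetch` are made up for illustration. The paths are standard Elasticsearch diagnostic endpoints; the exact output depends on the cluster version.

```python
import urllib.request

# Assumption: placeholder address, no authentication or TLS.
BASE_URL = "http://localhost:9200"

# Standard Elasticsearch endpoints that are commonly asked for
# when diagnosing high CPU and slow search/indexing.
DIAGNOSTIC_PATHS = [
    "/_cluster/health?pretty",  # overall status and shard counts
    "/_cat/nodes?v",            # per-node CPU, load, heap and roles
    "/_cat/indices?v",          # index sizes and document counts
    "/_cat/shards?v",           # shard placement and sizes per node
    "/_cat/thread_pool?v",      # active, queued and rejected tasks
    "/_nodes/hot_threads",      # what the busiest threads are doing
]


def build_urls(base_url=BASE_URL):
    """Return the full URL for each diagnostic endpoint."""
    return [base_url.rstrip("/") + path for path in DIAGNOSTIC_PATHS]


def fetch(url, timeout=10):
    """Fetch one endpoint and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    for url in build_urls():
        print("=====", url)
        try:
            print(fetch(url))
        except OSError as exc:  # covers URLError, HTTPError and timeouts
            print("request failed:", exc)
```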