What are some key metrics for monitoring master nodes?
We currently have a cluster with 30 nodes. Three of them are master-eligible and also serve as data nodes.
I suspect our cluster is oversharded (~1500 shards per node). Occasionally, actions such as creating or deleting indices and updating index templates fail with a timeout (and even when they succeed, they take quite a long time to process).
I have a plan to reduce the number of shards and to add some dedicated master-only nodes, but I'd love to have objective criteria for success. So my question is: which metrics would let us monitor these issues?
We are running Elasticsearch 6.8.
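
For context, the shards-per-node figure mentioned above can be measured roughly as follows. This is only a minimal sketch: it assumes the rows come from `GET /_cat/shards?format=json&h=index,shard,prirep,state,node`, and the helper names (`shards_per_node`, `fetch_shard_rows`) and the `base_url` value are hypothetical, not part of our setup.

```python
import json
from collections import Counter
from urllib.request import urlopen


def shards_per_node(rows):
    """Count assigned shards per node from _cat/shards JSON rows."""
    counts = Counter()
    for row in rows:
        node = row.get("node")
        if not node:
            # Unassigned shards have no node, so they are skipped.
            continue
        # Assumption: a relocating shard reports "source -> target ...";
        # it is counted against the source node here.
        counts[node.split(" -> ")[0]] += 1
    return dict(counts)


def fetch_shard_rows(base_url):
    """Fetch shard rows; base_url is an assumption, e.g. http://localhost:9200."""
    url = base_url + "/_cat/shards?format=json&h=index,shard,prirep,state,node"
    with urlopen(url) as resp:
        return json.load(resp)


# Hypothetical sample rows, shaped like the _cat/shards JSON output.
sample = [
    {"index": "logs-1", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-1", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-b"},
    {"index": "logs-2", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-2", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "node": None},
]
print(shards_per_node(sample))
```

Recording this number before and after the resharding would give at least one figure to compare.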