I currently have 3 main clusters: 1 with 19 nodes and 2 with 15 nodes. In all 3 cases I've left things mostly in their default state, with every node both data and master-eligible. After a disastrous upgrade followed by a stressful downgrade back to the previous version of Elasticsearch (6.8.3 -> 6.8.12 -> 6.8.3, if you're curious), I'm re-evaluating everything I've set up.
My use case is Graylog for receiving and processing all of our log data.
I have each cluster set to shards = number_of_nodes with 1 replica, so each node hosts 2 shards of every index (one primary and one replica, on average). Clusters have been expanded with more nodes whenever disk space becomes an issue; nothing is CPU-bound.
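For reference, the per-index settings on the 19-node cluster amount to something like this. Graylog actually manages the index template itself, so this is just the ES-side equivalent, and the template name here is a placeholder:

```
PUT _template/graylog-shards-example
{
  "index_patterns": ["graylog_*"],
  "order": 1,
  "settings": {
    "number_of_shards": 19,
    "number_of_replicas": 1
  }
}
```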
Indices are rotated at 200-300 GB, which works out to 2-3 rotations per day.
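Rotation is size-based. In legacy server.conf terms it's roughly the following, though newer Graylog versions configure this per index set in the web UI instead:

```
rotation_strategy = size
# ~250 GB per index, in bytes
elasticsearch_max_size_per_index = 268435456000
```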
Everything is run on bare metal in our datacenters, not hosted in a cloud service.
Would it make sense to change anything, like having dedicated masters or fewer shards? When the servers run low on disk space, memory allocation and garbage collection become a problem; otherwise things are fine. I'm just wondering if I could make better use of my resources.
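From what I've read, dedicated masters on 6.x would mean something like this in elasticsearch.yml on a few small dedicated boxes (a sketch only; the quorum value assumes exactly 3 master-eligible nodes):

```
# elasticsearch.yml on a dedicated master node (6.x role flags)
node.master: true
node.data: false
node.ingest: false

# quorum for 3 master-eligible nodes: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
```

with node.master: false on the existing data nodes and minimum_master_nodes updated everywhere. Does that match what people actually run at this scale?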