I would greatly appreciate it if anyone could share some insight into setting up an Elasticsearch cluster for a production environment.
My first question is about the number of data nodes and the hardware needed to ingest 6 TB of data per day. I can't find any documentation that gives general throughput guidance. For master nodes, three dedicated master-eligible nodes should apparently be enough. What about data nodes? How many would be appropriate, and with what hardware specification?
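For context, here is the back-of-the-envelope math I have so far, as a small Python sketch. It only converts the 6 TB/day target into an average sustained rate and divides by a per-node indexing rate. That per-node rate (`per_node_mb_per_s`) and the `headroom` factor are placeholders I made up, not measured or documented figures; they would have to come from benchmarking real hardware with real documents.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def required_ingest_mb_per_s(daily_tb: float) -> float:
    """Average sustained ingest rate in MB/s (decimal units: 1 TB = 1,000,000 MB)."""
    return daily_tb * 1_000_000 / SECONDS_PER_DAY


def data_nodes_needed(daily_tb: float, per_node_mb_per_s: float, headroom: float = 2.0) -> int:
    """Rough data-node count for ingest only.

    per_node_mb_per_s: ASSUMPTION, the sustained indexing rate one node can
        handle; must be measured, it is not a published number.
    headroom: ASSUMPTION, multiplier for peaks above the daily average.
    Replicas, retention, and search load are not modeled here.
    """
    if per_node_mb_per_s <= 0:
        raise ValueError("per_node_mb_per_s must be positive")
    return math.ceil(required_ingest_mb_per_s(daily_tb) * headroom / per_node_mb_per_s)


rate = required_ingest_mb_per_s(6)
print(f"6 TB/day is about {rate:.1f} MB/s on average")  # about 69.4 MB/s
```

So the cluster has to sustain roughly 69 MB/s of primary indexing on average, before replicas and peaks. What I am missing is a realistic per-node number to plug in.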
My second question is about running multiple coordinating nodes and how to connect Kibana to them. I believe the best practice is to configure multiple coordinating nodes, both for fault tolerance and to handle a large number of concurrent connections. As I understand it, those coordinating nodes do not need a load balancer in front of them, because the clients have built-in load balancing: Logstash and the client libraries for many languages can be given multiple hosts and handle them without a load balancer. However, Kibana can't connect to multiple coordinating nodes, according to the latest doc below:
Based on the doc above, the recommended way is to set up Kibana on the same machine as a coordinating node. However, in that case Kibana depends on a single coordinating node for the whole ES cluster, which does not seem like best practice for a production environment.
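To be clear about what I mean by clients handling multiple hosts without a load balancer, here is a rough Python sketch of the behavior I am picturing: round-robin over a list of coordinating nodes, skipping any that are marked as down. This is only an illustration of the idea, not how Logstash or any Elasticsearch client is actually implemented, and the class name and host names are hypothetical.

```python
from itertools import cycle


class RoundRobinHosts:
    """Illustrative client-side host selection (hypothetical, not a real client API)."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._dead = set()
        self._cycle = cycle(self._hosts)

    def mark_dead(self, host):
        """Exclude a host after a failed request."""
        self._dead.add(host)

    def mark_alive(self, host):
        """Put a host back into rotation."""
        self._dead.discard(host)

    def next_host(self):
        """Return the next live host in round-robin order."""
        # Try each host at most once per call so we never loop forever.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            if host not in self._dead:
                return host
        raise RuntimeError("no live coordinating nodes")


# Hypothetical coordinating-node addresses.
pool = RoundRobinHosts(["coord-1:9200", "coord-2:9200", "coord-3:9200"])
print(pool.next_host())  # coord-1:9200
print(pool.next_host())  # coord-2:9200
pool.mark_dead("coord-3:9200")
print(pool.next_host())  # coord-1:9200 (coord-3 is skipped)
```

What I am looking for is the equivalent of this for Kibana, so that losing one coordinating node does not take Kibana down with it.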
Really appreciate any help!