You need to plan things such as the data retention period and the shard strategy (for example, if you ingest more than 50 GB of data per day, create 2 shards per index).
You can read this documentation.
It is also good to have 2 client/coordinating nodes in your Elasticsearch cluster, in addition to the 3 Elasticsearch nodes.
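To make the shard rule of thumb above concrete, here is a minimal sketch. It assumes the "50 GB per day means 2 shards" advice generalizes to roughly one primary shard per 50 GB of daily data, which is my reading of the advice and not an official formula. The function name and the 50 GB default are hypothetical.

```python
import math

def primary_shards_for_daily_index(daily_gb: float, gb_per_shard: float = 50.0) -> int:
    """Estimate primary shards for a daily index.

    Assumption: about one primary shard per `gb_per_shard` of daily data,
    rounded up, and never fewer than one shard.
    """
    if gb_per_shard <= 0:
        raise ValueError("gb_per_shard must be positive")
    return max(1, math.ceil(daily_gb / gb_per_shard))

# 30 GB/day fits in one shard; 60 GB/day crosses the 50 GB mark and needs two.
small = primary_shards_for_daily_index(30)
large = primary_shards_for_daily_index(60)
```

Treat the result as a starting point only; the right number depends on your mappings, query load and hardware.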
Yes, that is correct. You always want to have three master-eligible nodes in the cluster, and if the nodes have the same specification it makes sense to let them have all roles.
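The reason for three is simple majority arithmetic: a master election needs more than half of the master-eligible nodes. A quick sketch of that arithmetic (the function names are just for illustration):

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1

def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)

# Two nodes tolerate no failures, three tolerate one.
two_nodes = tolerated_failures(2)
three_nodes = tolerated_failures(3)
```

So going from two to three master-eligible nodes is what buys you the ability to lose one node and still elect a master.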
With only three nodes the default is fine. The amount of data each node can handle will generally be driven by the amount of heap you have and how efficiently you use it, and/or by your performance requirements.
I disagree. Adding coordinating nodes can be beneficial, but it is very use-case specific. Adding two data nodes will in my mind make a much bigger difference than adding two coordinating nodes.
Why do you need a single access point? All client libraries, as well as Beats and Logstash, support specifying multiple node addresses. If you add a single client node, it becomes a single point of failure, which is typically what you are trying to avoid when using a cluster. If you have more than one client node for resiliency, the clients need to be able to connect to all of them, and in that case they could just as well connect directly to multiple data nodes.
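To illustrate why a list of node addresses removes the need for a single access point, here is a rough, library-free sketch of client-side failover. It is not how any particular Elasticsearch client is implemented; the host names and the `fake_send` function are made up for the example, and real clients handle this for you once you give them several addresses.

```python
from typing import Callable, Sequence

def send_with_failover(hosts: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each host in order and return the first successful response."""
    failed = []
    for host in hosts:
        try:
            return send(host)
        except ConnectionError:
            # This node is unreachable; move on to the next address.
            failed.append(host)
    raise ConnectionError(f"all nodes unreachable: {failed}")

def fake_send(host: str) -> str:
    """Stand-in for a real request; pretends the first node is down."""
    if host == "es-node-1:9200":
        raise ConnectionError(host)
    return f"ok from {host}"

# Hypothetical addresses of the three nodes in the cluster.
hosts = ["es-node-1:9200", "es-node-2:9200", "es-node-3:9200"]
response = send_with_failover(hosts, fake_send)
```

With one node down the request still succeeds against the next address, which is the resiliency a single client node in front of the cluster would not give you.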