Hi, I am planning to run an Elasticsearch cluster on 4 servers, with one node per server. Three of the nodes are master-eligible data nodes, and the fourth is a coordinating-only node (Kibana also runs on that machine).
Data ingestion would be 2 GB daily. I am planning to have 2 primary shards and 1 replica shard for each index (having too many primary shards would affect performance).
Hardware Specs:
CPU - 4 x 2.2 GHz
Memory - 32 GB
Disk - 600 GB
Please let me know if I am missing anything. Your help is highly appreciated.
It's a wide-open question, so it's no surprise that only a few would answer.
The best architecture is the one that fits your needs.
Here is what I can say from my experience.
Your nodes have everything they need to support an ELK solution.
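As a rough check on that, here is a back-of-the-envelope sizing sketch using the numbers from the question. It is only an estimate under stated assumptions: that the 600 GB disk is per server, that there is one index per day, that on-disk size is roughly equal to raw size, and that you want to stay under the default 85% low disk watermark. The function name `estimate_capacity` is made up for this example.

```python
def estimate_capacity(daily_gb, primaries, replicas, data_nodes,
                      disk_per_node_gb, usable_percent=85):
    """Rough capacity estimate; ignores indexing overhead and compression."""
    # Every primary shard is stored once more per replica.
    stored_per_day_gb = daily_gb * (1 + replicas)
    # ASSUMPTION: one index per day, so a day's data is split across the primaries.
    primary_shard_gb = daily_gb / primaries
    # Total shard copies (primaries + replicas) created per index.
    shard_copies_per_index = primaries * (1 + replicas)
    # ASSUMPTION: keep disk usage under the default 85% low watermark.
    usable_gb = data_nodes * disk_per_node_gb * usable_percent // 100
    retention_days = usable_gb // stored_per_day_gb
    return {
        "stored_per_day_gb": stored_per_day_gb,
        "primary_shard_gb": primary_shard_gb,
        "shard_copies_per_index": shard_copies_per_index,
        "usable_gb": usable_gb,
        "retention_days": retention_days,
    }


# Numbers from the question: 2 GB/day, 2 primaries, 1 replica, 3 data nodes.
# ASSUMPTION: 600 GB of disk on each data node.
estimate = estimate_capacity(daily_gb=2, primaries=2, replicas=1,
                             data_nodes=3, disk_per_node_gb=600)
for key, value in estimate.items():
    print(f"{key}: {value}")
```

With these inputs each primary shard stays around 1 GB and the three data nodes could hold roughly a year of data, so disk is unlikely to be the limiting factor at this ingest rate.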
discovery.zen.ping.unicast.hosts:
network.host:
"the other is Coordinating only"
From the documentation:
Requests like search requests or bulk-indexing requests may involve data held on different data nodes. A search request, for example, is executed in two phases which are coordinated by the node which receives the client request — the coordinating node.
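To make the coordinating role concrete, here is a toy sketch of the scatter and gather idea. It is not Elasticsearch code: the shards are plain dictionaries mapping a document id to a score, and the names `local_top_hits` and `coordinate_search` are invented for this illustration.

```python
import heapq


def local_top_hits(shard, size):
    """Work done on a data node: return the shard's best (score, doc_id) pairs."""
    return heapq.nlargest(size, ((score, doc_id) for doc_id, score in shard.items()))


def coordinate_search(shards, size):
    """Work done on the coordinating node."""
    # Scatter: forward the request to every shard and collect its local results.
    candidates = []
    for shard in shards:
        candidates.extend(local_top_hits(shard, size))
    # Gather: reduce the per-shard results into one global result set.
    return [doc_id for score, doc_id in heapq.nlargest(size, candidates)]


# Two hypothetical shards holding scored documents.
shards = [
    {"a": 0.90, "b": 0.20},
    {"c": 0.70, "d": 0.95},
]
print(coordinate_search(shards, size=2))
```

Any node can do this merging step for the requests it receives; a coordinating-only node simply does nothing else.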
Maybe you are looking for an ingest node:
In my view, 2 GB a day is not enough to warrant that kind of setup; I would use the 4th node in the cluster.
I am not using an ingest node because there is another node running Logstash, which pre-processes the documents before they are indexed. Depending on the data ingestion, I can always add another node as per your suggestion.