I am currently designing an ELK domain via the AWS ES service, and I am trying to decide how many dedicated master, data, ingest, etc. nodes I require.
I am ingesting data from AWS CloudTrail via Lambda, bringing in log data across several different accounts, and will be adding log ingestion from several IIS servers.
Currently, I see nearly 1,000,000 documents being indexed per 24 hours. As of yesterday, my ELK domain had 3 default nodes with 10 GB of storage each. None of them were dedicated to any role.
I am waiting for the domain to finish adding 5 more nodes and increasing the storage of each node to 30 GB.
I am also waiting to see how many logs to expect from IIS, though I would assume another 1,000,000 per day is a reasonable approximation that slightly overestimates.
I am thinking of going by the rule of thumb and including 3 dedicated master nodes alongside the 8 default nodes that are currently being provisioned. I am also thinking of including an ingest node, but I am not sure what the rule of thumb is for that node type.
For that little storage I would recommend going with a basic cluster of 3 identical nodes which all hold data and are master-eligible. At this size there is no point in adding dedicated node types. How large you need to make the nodes depends on how much storage you need.
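As a rough illustration of how storage needs translate into node size, here is a minimal Python sketch. The average document size, retention period, replica count and overhead factor are all assumptions I made up for the example (none of them were given in this thread), and the function name is hypothetical, so plug in your own measured values.

```python
def storage_per_node_gb(docs_per_day, avg_doc_bytes, retention_days,
                        replicas=1, overhead=1.25, nodes=3):
    """Estimate the disk space each data node needs, in GB (10^9 bytes).

    All inputs are assumptions to be replaced with measured values:
    - avg_doc_bytes: average on-disk size of one indexed document
    - retention_days: how long the data is kept before deletion
    - replicas: number of replica copies of each primary shard
    - overhead: multiplier for headroom (merges, OS, free-space watermarks)
    """
    primary_bytes = docs_per_day * avg_doc_bytes * retention_days
    total_bytes = primary_bytes * (1 + replicas) * overhead
    return total_bytes / nodes / 1e9


if __name__ == "__main__":
    # 2,000,000 docs/day (CloudTrail plus the IIS estimate from the question),
    # with an ASSUMED 1,000 bytes per document and 30 days of retention.
    estimate = storage_per_node_gb(2_000_000, 1_000, 30)
    print(f"Roughly {estimate:.1f} GB per node across 3 nodes")
```

The best way to get a realistic value for the average document size is to index a representative sample and divide the resulting index size by its document count.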
Is that 50 indices in total or 50 different time-based indices?
Try to keep the number of indices to a minimum as having lots of small indices and shards in a cluster is very inefficient and can cause performance problems.
There is often no need to have a separate index per log type, so I would recommend you consolidate. Also try to adjust the time period covered by each index so that you get a shard size ideally over 1GB.
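To make the shard-size advice concrete, here is a small Python sketch that estimates how many days a single time-based index should cover so that each primary shard reaches a target size. The average document size is an assumption (it was not stated in this thread) and the function name is hypothetical.

```python
import math


def days_per_index(docs_per_day, avg_doc_bytes,
                   target_shard_bytes=1024**3, primary_shards=1):
    """Return the minimum whole number of days one index should cover
    so that each primary shard is at least target_shard_bytes in size.

    avg_doc_bytes is an ASSUMED average on-disk document size; measure
    it on real data before relying on the result.
    """
    bytes_per_day = docs_per_day * avg_doc_bytes
    needed_bytes = target_shard_bytes * primary_shards
    return max(1, math.ceil(needed_bytes / bytes_per_day))


if __name__ == "__main__":
    # 2,000,000 docs/day with an ASSUMED 200 bytes per document
    # and one primary shard per index.
    print(days_per_index(2_000_000, 200))
```

If the result is several days, a weekly or monthly index with a single primary shard is likely a better fit than daily indices.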