I currently have a single-node setup (a single machine where the ELK stack is installed) into which a couple of gzip files are being ingested. Each of these files is ~600MB and has several thousand lines of logs, and it understandably takes too much time to populate my Kibana dashboards.
The current machine (single node) has 16GB RAM, a 100GB HDD, and 8 cores. I think it may be time to set up a cluster to increase ingestion/indexing speed. I want to know if machines of the above configuration would be enough to set up a cluster, and how many would be needed?
Some notes on setting up a cluster would be helpful as well.
Scaling out may help with indexing speed, but as indexing is often limited by I/O performance, I would recommend switching to SSDs. I would not be surprised if disk performance is your current bottleneck given that you are using spinning disks. This video explains the benefit of using SSDs compared to HDDs quite well.
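For reference, one way to check whether disk I/O really is the bottleneck is to look at the nodes' file system and thread pool statistics. Below is a minimal sketch using the Python client; the host (localhost:9200) and the use of the Python client are assumptions, so adjust to your setup:

```python
# Sketch: check whether indexing is being held back by disk I/O.
# Assumes the elasticsearch Python client and a node at localhost:9200.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# File system stats per node (on Linux this also includes io_stats)
fs_stats = es.nodes.stats(metric="fs")
for node_id, node in fs_stats["nodes"].items():
    print(node["name"], node["fs"]["total"])

# Rejections in the indexing/bulk thread pools indicate requests queuing up
print(es.cat.thread_pool(v=True))
```

Sustained high disk utilisation together with growing thread pool queues or rejections usually points at the disks rather than the CPU.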
Thank you @Christian_Dahlqvist for the suggestion on SSDs. Would machines with 16GB RAM be sufficient as well, and how many hosts would I need? My current setup is a virtual machine.
If you look at resources around capacity planning, e.g. this webinar, you will notice that indexing performance is often limited by CPU and disk I/O, while the total amount of data a node can hold and serve queries for is often limited by heap size. Without knowing data volumes and retention periods it is difficult to give recommendations.
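To put numbers on data volume, heap pressure, and disk usage, the _cat APIs are handy. A rough sketch with the Python client, again assuming a node at localhost:9200:

```python
# Sketch: gather the basic capacity-planning figures mentioned above.
# Host and client choice are assumptions; adjust to your environment.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Heap, RAM and CPU pressure per node
print(es.cat.nodes(v=True, h="name,heap.percent,ram.percent,cpu,load_1m"))

# Disk usage and shard count per node
print(es.cat.allocation(v=True))

# Document count and on-disk size per index, useful for estimating ingest volume
print(es.cat.indices(v=True, h="index,docs.count,store.size,pri.store.size"))
```

Comparing store.size against the raw log volume for a known period gives a rough expansion factor, which makes it easier to size nodes for a given retention.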
Thank you, my JVM heap size is at 4GB now. I'm pasting the current index doc count and storage size that came in from 5 log files, each ~500MB. The retention period for this data would be 6 hours.
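As a side note on the 6-hour retention: with time-based indices this can be enforced with Curator (or ILM in newer versions), or with a small script that deletes indices once they fall outside the window. A minimal sketch using the Python client; the logs-* index pattern and the localhost:9200 host are assumptions:

```python
# Sketch: delete indices older than a 6-hour retention window.
# The "logs-*" pattern and localhost:9200 host are assumptions.
import time
from elasticsearch import Elasticsearch

RETENTION_MS = 6 * 60 * 60 * 1000  # 6 hours in milliseconds
es = Elasticsearch(["http://localhost:9200"])

now_ms = int(time.time() * 1000)
settings = es.indices.get_settings(index="logs-*")

for index_name, data in settings.items():
    created_ms = int(data["settings"]["index"]["creation_date"])
    if now_ms - created_ms > RETENTION_MS:
        print(f"Deleting {index_name} (outside 6 hour retention)")
        es.indices.delete(index=index_name)
```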