Improving log ingestion and Elasticsearch indexing speed


I currently have a single-node setup (one machine with the ELK stack installed) that ingests a couple of gzip files. Each file is about 600 MB and contains several thousand lines of logs, so it understandably takes a long time to populate my Kibana dashboards.

The current machine (single node) has 16 GB of RAM, a 100 GB HDD, and 8 cores. I think it may be time to set up a cluster to increase ingestion and indexing speed. Would machines with the above configuration be enough to set up a cluster, and how many would be needed?

Some notes on setting up a cluster would be helpful as well.

Scaling out may help with indexing speed, but because indexing is often limited by I/O performance, I would recommend switching to SSDs. I would not be surprised if disk performance is your current bottleneck, given that you are using spinning disks. This video explains the benefit of using SSDs compared to HDDs quite well.
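
One crude way to check whether the disk is a plausible bottleneck is to time a synced sequential write on the data volume. The sketch below is only an illustration: the function name `measure_write_throughput` and its parameters are hypothetical, and a sequential write does not reproduce Elasticsearch's real I/O pattern, so treat the number as a rough indicator rather than a benchmark. On a virtual machine the result also depends on the host's storage.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, chunk_mb=4):
    """Return approximate sequential write speed in MB/s for `directory`.

    Writes a temporary file in chunks, forces it to disk with fsync,
    then deletes it. Hypothetical helper for illustration only.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    written_mb = 0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            while written_mb < total_mb:
                f.write(chunk)
                written_mb += chunk_mb
            f.flush()
            os.fsync(f.fileno())  # make sure data reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return written_mb / elapsed
```

Calling `measure_write_throughput("/path/to/elasticsearch/data")` (the path is a placeholder) before and after a move to SSDs would give a rough comparison. Dedicated tools such as `iostat` or `fio` give a more representative picture.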

Thank you @Christian_Dahlqvist for the suggestion on SSDs. Would machines with 16 GB of RAM be sufficient as well, and how many hosts would I need? My current setup is a virtual machine.

If you look at resources around capacity planning, e.g. this webinar, you will notice that indexing performance is often limited by CPU and disk I/O, while the total amount of data a node can hold and serve queries for is often limited by heap size. Without knowing data volumes and retention periods, it is difficult to give recommendations.
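
To show why data volume and retention are the key inputs, here is a back-of-the-envelope storage estimate. It is a minimal sketch under stated assumptions: the function name, the `expansion_factor` (how much larger or smaller the indexed data is than the raw logs), and the example numbers are all hypothetical and should be replaced with values measured on your own data.

```python
def estimate_storage_gb(raw_gb_per_hour, retention_hours,
                        replicas=0, expansion_factor=1.0):
    """Rough on-disk size of the retained data, in GB.

    raw_gb_per_hour  -- uncompressed log volume ingested per hour
    retention_hours  -- how long data is kept before deletion
    replicas         -- number of replica copies per primary shard
    expansion_factor -- assumed ratio of indexed size to raw size;
                        measure this on a sample index, do not guess
    """
    return raw_gb_per_hour * retention_hours * (1 + replicas) * expansion_factor


if __name__ == "__main__":
    # Hypothetical inputs: 2.5 GB/hour of raw logs kept for 6 hours,
    # one replica, indexed data assumed 10% larger than raw.
    print(estimate_storage_gb(2.5, 6, replicas=1, expansion_factor=1.1))
```

The estimate only covers disk space. It says nothing about indexing throughput, which depends on CPU and disk I/O as noted above.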

Thank you :) My JVM heap size is currently 4 GB. I'm pasting the current index doc count and storage size that came from 5 log files, each ~500 MB. The retention period for this data would be 6 hours.

And thank you for the webinar link :)
