I currently have a single-node setup (a single machine where the ELK stack is installed) into which a couple of gzip files are being ingested. Each of these files is ~600MB and has several thousand lines of logs, and it understandably takes too much time to populate my Kibana dashboards.
The current machine (single node) has 16GB RAM, a 100GB HDD, and 8 cores. I think it may be time to set up a cluster to increase ingestion/indexing speed. I want to know if machines of the above configuration would be enough to set up a cluster, and how many would be needed?
Some notes on setting up a cluster would be helpful as well.
Scaling out may help with indexing speed, but as indexing is often limited by I/O performance, I would recommend switching to SSDs. I would not be surprised if disk performance is your current bottleneck given that you are using spinning disks. This video explains the benefit of using SSDs compared to HDDs quite well.
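For reference, one way to check whether disk I/O really is the bottleneck is to look at the nodes' file system and thread pool statistics. Below is a minimal sketch using the Python client; the host (localhost:9200) and the use of the Python client are assumptions, so adjust to your setup:

```python
# Sketch: check whether indexing is being held back by disk I/O.
# Assumes the elasticsearch Python client and a node at localhost:9200.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# File system stats per node (on Linux this also includes io_stats)
fs_stats = es.nodes.stats(metric="fs")
for node_id, node in fs_stats["nodes"].items():
    print(node["name"], node["fs"]["total"])

# Rejections in the indexing/bulk thread pools indicate requests queuing up
print(es.cat.thread_pool(v=True))
```

Sustained high disk utilisation together with growing thread pool queues or rejections usually points at the disks rather than the CPU.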
Thank you @Christian_Dahlqvist for the suggestion on SSDs. Would machines with 16GB RAM be sufficient as well, and how many hosts would I need? My current setup is a virtual machine.
If you look at resources around capacity planning, e.g. this webinar, you will notice that indexing performance is often limited by CPU and disk I/O, while the total amount of data a node can hold and serve queries for is often limited by heap size. Without knowing data volumes and retention periods it is difficult to give recommendations.
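To put numbers on data volume, heap pressure, and disk usage, the _cat APIs are handy. A rough sketch with the Python client, again assuming a node at localhost:9200:

```python
# Sketch: gather the basic capacity-planning figures mentioned above.
# Host and client choice are assumptions; adjust to your environment.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Heap, RAM and CPU pressure per node
print(es.cat.nodes(v=True, h="name,heap.percent,ram.percent,cpu,load_1m"))

# Disk usage and shard count per node
print(es.cat.allocation(v=True))

# Document count and on-disk size per index, useful for estimating ingest volume
print(es.cat.indices(v=True, h="index,docs.count,store.size,pri.store.size"))
```

Comparing store.size against the raw log volume for a known period gives a rough expansion factor, which makes it easier to size nodes for a given retention.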
Thank you, my JVM heap size is at 4GB now. I'm pasting the current index doc count and storage size that came in from 5 log files, each ~500MB. The retention period for this data would be 6 hours.
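As a side note on the 6-hour retention: with time-based indices this can be enforced with Curator (or ILM in newer versions), or with a small script that deletes indices once they fall outside the window. A minimal sketch using the Python client; the logs-* index pattern and the localhost:9200 host are assumptions:

```python
# Sketch: delete indices older than a 6-hour retention window.
# The "logs-*" pattern and localhost:9200 host are assumptions.
import time
from elasticsearch import Elasticsearch

RETENTION_MS = 6 * 60 * 60 * 1000  # 6 hours in milliseconds
es = Elasticsearch(["http://localhost:9200"])

now_ms = int(time.time() * 1000)
settings = es.indices.get_settings(index="logs-*")

for index_name, data in settings.items():
    created_ms = int(data["settings"]["index"]["creation_date"])
    if now_ms - created_ms > RETENTION_MS:
        print(f"Deleting {index_name} (outside 6 hour retention)")
        es.indices.delete(index=index_name)
```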