Optimizing configuration for ingestion

Hi all,

I've been experimenting with my 3-node cluster, with a focus on pushing ingestion performance. The data set I use has 3 million lines (~840MB), going in through Logstash. Although I get decent performance ingesting into an empty index (~6k lines/second), ingestion slows down as the number of documents in the index grows. After a while I see entries in the Logstash log indicating that the Elasticsearch ingestion endpoint is not responding. Looking at the Monitoring tab and running REST calls with Postman, I see the segment count fluctuating constantly, which seems to indicate frequent segment merging. My understanding is that merging can get expensive as segments grow, and that this can cause Elasticsearch to throttle ingestion.
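For reference, this is roughly the kind of thing I've been running in Postman, expressed as curl calls (the `localhost:9200` endpoint and the index name `my-index` are placeholders for my setup):

```shell
# Per-index segment listing; running this repeatedly shows the
# segment count fluctuating as merges happen
curl -s 'localhost:9200/_cat/segments/my-index?v'

# Cumulative merge activity per node, including throttled time
curl -s 'localhost:9200/_nodes/stats/indices/merges?pretty'
```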

I do have my 3 hosts on VMs sharing a spinning disk managed by VMware ESXi, but before I try switching to an SSD datastore, does anybody have suggestions on how I can debug this on the Elasticsearch side and narrow down or confirm the cause of the slowing ingestion performance?
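In case it helps, this is the direction I was planning to look next, sketched as curl calls (again, endpoint and index name are placeholders; on older Elasticsearch versions the thread pool is named `bulk` rather than `write`):

```shell
# Write thread pool stats -- a growing "rejected" count would explain
# Logstash seeing an unresponsive ingestion endpoint
curl -s 'localhost:9200/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected'

# Index-level merge and indexing stats, including throttle times
curl -s 'localhost:9200/my-index/_stats/merges,indexing?pretty'
```

Is checking for rejections and throttle times like this a reasonable way to confirm whether merging is the bottleneck, or is there a better signal to watch?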


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.