I have a production ELK cluster where I have set up a Logstash node to read AWS ELB logs from an S3 bucket via the s3 input plugin, then parse them and ingest them into the Elasticsearch cluster. Last week there was a huge spike in hits to my Elastic Load Balancer (15k/sec), and the Logstash node was not able to process that much data fast enough. Because of this there was a significant lag of events in Elasticsearch, and my dashboards were not displaying the latest graphs. After a day the lag had grown to 24 hours.
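For context, my current pipeline looks roughly like this (bucket name, region, prefix, hosts, and the sincedb path are placeholders, not my real values):

```
input {
  s3 {
    bucket       => "my-elb-logs-bucket"    # placeholder bucket name
    region       => "us-east-1"             # placeholder region
    prefix       => "AWSLogs/"              # placeholder key prefix
    sincedb_path => "/var/lib/logstash/plugins/inputs/s3/sincedb-elb"
  }
}
filter {
  grok {
    # ELB access-log pattern shipped with Logstash's core patterns
    match => { "message" => "%{ELB_ACCESS_LOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://es-node:9200"]        # placeholder ES endpoint
  }
}
```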
To rectify this temporarily, I updated Logstash's sincedb file to the latest timestamp so that it would sync to the latest ELB logs and skip the older ones.
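Concretely, what I did was stop Logstash and overwrite the sincedb with a newer timestamp, along these lines (the path and timestamp format here are illustrative; check the `sincedb_path` option of your s3 input and what your plugin version actually writes):

```shell
# Hypothetical demo path -- in production this is whatever sincedb_path
# points to (by default it lives under Logstash's data directory).
SINCEDB="./sincedb-elb-demo"

# The s3 input's sincedb stores the timestamp of the newest object already
# processed; writing a later timestamp makes the plugin skip older objects.
# (Stop Logstash first so the running pipeline doesn't overwrite the file.)
echo "2021-06-01 00:00:00 UTC" > "$SINCEDB"
cat "$SINCEDB"
```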
My question is: can we have multiple Logstash nodes reading from the same S3 location in parallel, such that each node processes a unique set of files from that location and never a file that another node is already reading?