Filebeat -> Kafka 0.9 -> Logstash 5.0.2 -> Elasticsearch 5.2
We push more than 30k events/sec from Logstash to ES using multiple indexers consuming from Kafka. At peak we see up to 70 million events per 30 minutes. ES is a cluster of 18 data nodes and 2 master nodes, with 9:1 primary:replica for indices. It all worked perfectly for 4 days, and then all of a sudden Kafka lag climbed to 15 million with no processing, even though ES is all green and healthy.
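For a sense of scale, here is a quick sketch in Python that converts the figures above into per-second terms. It uses only the numbers from this post. The drain-time estimate is an assumption: it supposes indexing resumes at the steady 30k/sec rate and no new events arrive in the meantime.

```python
# Figures taken from the post. The drain time is only a rough estimate that
# assumes indexing resumes at the steady rate and no new events arrive.
STEADY_RATE = 30_000          # events/sec (lower bound: "more than 30k/sec")
PEAK_EVENTS = 70_000_000      # events per 30-minute window at peak
WINDOW_SECONDS = 30 * 60
LAG_EVENTS = 15_000_000       # current Kafka consumer lag

peak_rate = PEAK_EVENTS / WINDOW_SECONDS
drain_seconds = LAG_EVENTS / STEADY_RATE

print(f"peak rate: {peak_rate:,.0f} events/sec")
print(f"time to drain lag at steady rate: {drain_seconds / 60:.1f} min")
```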
I restarted one indexer and now I see 429 errors and no more processing in the indexer logs. I stopped Filebeat because the lag keeps growing. Any input is appreciated.
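A 429 from Elasticsearch usually means a node rejected the request because a thread pool queue was full, so one thing worth checking is the bulk thread pool on each data node. Below is a minimal sketch, not a definitive diagnostic. It assumes a node is reachable at `http://localhost:9200` (a placeholder address) and that the `_cat/thread_pool` output uses whitespace-separated columns with a header row. The helper names are hypothetical.

```python
import urllib.request

# Assumption: an ES node reachable at this address; adjust for your cluster.
ES_URL = "http://localhost:9200"
# In ES 5.x the thread pool that handles bulk indexing is named "bulk".
CAT_PATH = "/_cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected"


def parse_cat_table(text):
    """Turn whitespace-separated _cat output (with a header row) into dicts.

    Assumes node names contain no spaces.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def nodes_with_rejections(rows):
    """Return (node_name, queue, rejected) for nodes with rejected requests."""
    return [
        (r["node_name"], int(r["queue"]), int(r["rejected"]))
        for r in rows
        if int(r["rejected"]) > 0
    ]


def fetch_bulk_pool(es_url=ES_URL):
    """Fetch bulk thread pool stats as text (performs a network call)."""
    with urllib.request.urlopen(es_url + CAT_PATH, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling `nodes_with_rejections(parse_cat_table(fetch_bulk_pool()))` against the cluster would list any data nodes whose `rejected` counter is above zero. Since the counter is cumulative since node start, comparing two readings a minute apart shows whether rejections are still happening.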