I'm trying to tune our Logstash jobs that send documents to Elasticsearch. I noticed a number of 429 errors, so we raised the bulk queue size from 50 to 500. The errors have diminished, though they still occur for some jobs.
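For anyone wanting to check the same thing: the 429s should correspond to rejections in the bulk thread pool, which the node stats API (`GET /_nodes/stats/thread_pool`) reports per node. Below is a rough sketch of how those counters could be totalled. It assumes the response has the shape `nodes.<id>.thread_pool.<pool>.rejected` and that the pool is named `bulk` (older versions) or `write` (newer versions). The function name and the sample response are made up for illustration.

```python
from typing import Any, Dict, Tuple


def total_bulk_rejections(
    node_stats: Dict[str, Any],
    pool_names: Tuple[str, ...] = ("bulk", "write"),
) -> int:
    """Sum the `rejected` counters of the bulk/write thread pools across nodes."""
    total = 0
    for node in node_stats.get("nodes", {}).values():
        pools = node.get("thread_pool", {})
        for name in pool_names:
            # Missing pools or counters are treated as zero rejections.
            total += pools.get(name, {}).get("rejected", 0)
    return total


# Hypothetical, trimmed-down response used only to illustrate the shape.
sample = {
    "nodes": {
        "node-a": {"thread_pool": {"bulk": {"queue": 12, "rejected": 40}}},
        "node-b": {"thread_pool": {"bulk": {"queue": 0, "rejected": 2}}},
    }
}

print(total_bulk_rejections(sample))
```

Comparing this total before and after a job would show whether the larger queue is really absorbing the bursts or only delaying the rejections.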
Our Logstash jobs typically run for days before they finish processing all the data from our Kafka topics.
One thing I've noticed is that regardless of the changes I make to the jobs (e.g., running a single Logstash on one machine, multiple Logstashes on one machine, or multiple Logstashes on two machines), the indexing rate starts out high (e.g., 12,000 docs/s) but gradually drifts down until, after about 6 hours, it's between 4,000 and 5,000 docs/s.
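To be clear about what I mean by indexing rate: it is the change in the cumulative indexed-document counter divided by the elapsed time between two samples. A minimal sketch of that arithmetic follows, assuming the counter comes from something like `_all.primaries.indexing.index_total` in the index stats API (`GET /_stats/indexing`). The function name and the sample values are hypothetical, chosen only to mirror the numbers above.

```python
from typing import List, Tuple


def indexing_rate(
    prev_time: float, prev_total: int, cur_time: float, cur_total: int
) -> float:
    """Return docs/s between two samples of a cumulative index counter."""
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (cur_total - prev_total) / elapsed


# Hypothetical (seconds, index_total) samples, one per minute.
samples: List[Tuple[float, int]] = [
    (0.0, 0),
    (60.0, 720_000),
    (120.0, 990_000),
]

# Rate for each consecutive pair of samples.
rates = [
    indexing_rate(t0, c0, t1, c1)
    for (t0, c0), (t1, c1) in zip(samples, samples[1:])
]

print(rates)
```

Logging these per-interval rates over the whole run would show whether the decline is smooth or happens in steps.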
The Kafka hosts, Logstash hosts and Elasticsearch hosts are reportedly not heavily loaded during this time, which is odd.
Is it normal for the indexing rate to "lose momentum" over time?