Index management as it relates to Elasticsearch ingest rate

I'm running into some problems with Elasticsearch ingest.

We're running 5 data nodes on Kubernetes (hosts have 56 cores / 64 GB RAM). Data flows as Beats -> Kafka -> Logstash (also on Kubernetes) -> Elasticsearch.

Ingest rate seems to be topping out at 200-300 events per second (e/s), so we're lagging considerably behind our data feed.

I'm seeing errors like this pretty frequently in the Logstash logs:

[logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [7142251][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[filebeat-2019.08.05][0]] containing [124] requests, target allocation id: aSctrLvyTpSZkzPqVIvE0A, primary term: 2 on EsThreadPoolExecutor[name = elasticsearch-data-1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@8085181[Running, pool size = 1, active threads = 1, queued tasks = 200, completed tasks = 5601415]]"})
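
To make the numbers in that rejection easier to read, here is a minimal Python sketch that pulls the thread pool fields out of the message. The helper name `parse_rejection` and the field list are my own, not part of Logstash or Elasticsearch, and the regex assumes the message format shown above.

```python
import re

# Thread pool part of the rejection above (the rest of the message is omitted).
SAMPLE = (
    "EsThreadPoolExecutor[name = elasticsearch-data-1/write, queue capacity = 200, "
    "org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@8085181"
    "[Running, pool size = 1, active threads = 1, queued tasks = 200, "
    "completed tasks = 5601415]]"
)

# Field names exactly as they appear in the message.
FIELDS = ("queue capacity", "pool size", "active threads", "queued tasks", "completed tasks")


def parse_rejection(message):
    """Hypothetical helper: return the numeric thread pool fields found in a rejection message."""
    stats = {}
    for field in FIELDS:
        match = re.search(re.escape(field) + r" = (\d+)", message)
        if match:
            stats[field] = int(match.group(1))
    return stats


stats = parse_rejection(SAMPLE)
for name, value in stats.items():
    print(f"{name}: {value}")
```

For this message it reports a pool size of 1 with queued tasks equal to the queue capacity (200), i.e. a single write thread on elasticsearch-data-1 with a full queue. My understanding is that the write pool is sized from the number of processors the node detects, so a pool size of 1 on a 56-core host may point at the container CPU settings, but that is an assumption on my part.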

The index in the error above ("filebeat-2019.08.05") is 30 GB with more than 40M docs, and still growing.


  1. Should I start rolling this index over every hour?
  2. My cluster has ~2600 primary shards and ~800 indices. Is that too many?
  3. I've split up my busier inputs into separate pipelines. On several of these I'm seeing "output" values like "1.92k ms/e". Is this latency due to something I have (mis)configured?
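
For context on questions 1 and 2, here is the rough arithmetic behind those numbers as a small Python sketch. The inputs are the approximate figures from this post; the variable names are mine, replicas are not counted, and I'm assuming 1 GB = 10^9 bytes.

```python
# Approximate figures from this post (assumptions: replicas ignored, 1 GB = 10**9 bytes).
primary_shards = 2600
indices = 800
data_nodes = 5
index_size_bytes = 30 * 10**9   # filebeat-2019.08.05, still growing
doc_count = 40_000_000          # lower bound, the index holds ">40M" docs

shards_per_index = primary_shards / indices        # average primaries per index
primaries_per_node = primary_shards / data_nodes   # average primaries per data node
avg_doc_bytes = index_size_bytes / doc_count       # upper bound, since doc_count is a lower bound

print(f"primary shards per index: {shards_per_index:.2f}")    # 3.25
print(f"primary shards per data node: {primaries_per_node:.0f}")  # 520
print(f"average doc size: at most {avg_doc_bytes:.0f} bytes")     # 750
```

So on average each index has about 3 primary shards, each data node holds about 520 primaries, and documents in the busy filebeat index average no more than roughly 750 bytes.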

The cluster is all on 7.3.0: 5 data nodes, 5 ingest nodes and 3 master nodes.