Template continuously submitted possibly causing failed ingestion

Hi!

So, I have an odd question about index template submission through Filebeat. We run Filebeat on containers that scale up based on usage, and it is set up to submit an index template with the overwrite option enabled.

We were under the impression that Filebeat would submit these templates once per startup of the Filebeat process. However, we are noticing that our cluster's pending_tasks show a constant stream of tasks like the ones below.

{
  "tasks": [
    {
      "insert_order": 115026,
      "priority": "URGENT",
      "source": "create-index-template [filebeat-omahaserver-*], cause [api]",
      "executing": true,
      "time_in_queue_millis": 10814,
      "time_in_queue": "10.8s"
    },
    {
      "insert_order": 115027,
      "priority": "URGENT",
      "source": "create-index-template [filebeat-v6.4-20*], cause [api]",
      "executing": false,
      "time_in_queue_millis": 10692,
      "time_in_queue": "10.6s"
    },
    {
      "insert_order": 115028,
      "priority": "URGENT",
      "source": "create-index-template [filebeat-v6.4-20*], cause [api]",
      "executing": false,
      "time_in_queue_millis": 10456,
      "time_in_queue": "10.4s"
    },
    {
      "insert_order": 115029,
      "priority": "URGENT",
      "source": "create-index-template [filebeat-v6.4-20*], cause [api]",
      "executing": false,
      "time_in_queue_millis": 10143,
      "time_in_queue": "10.1s"
    },
    {
      "insert_order": 115030,
      "priority": "URGENT",
      "source": "create-index-template [filebeat-omahaserver-*], cause [api]",
      "executing": false,
      "time_in_queue_millis": 6466,
      "time_in_queue": "6.4s"
    }
  ]
}
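
To see which templates dominate the queue, the output above can be tallied with a small script. This is only a rough sketch: the function name `summarize_pending_tasks` is our own, the sample data is copied from the output above (trimmed to two fields per task), and in practice the response would be fetched from the cluster separately.

```python
from collections import Counter

# Tasks copied from the pending-tasks output above, trimmed to the
# two fields this sketch uses.
SAMPLE_RESPONSE = {
    "tasks": [
        {"source": "create-index-template [filebeat-omahaserver-*], cause [api]",
         "time_in_queue_millis": 10814},
        {"source": "create-index-template [filebeat-v6.4-20*], cause [api]",
         "time_in_queue_millis": 10692},
        {"source": "create-index-template [filebeat-v6.4-20*], cause [api]",
         "time_in_queue_millis": 10456},
        {"source": "create-index-template [filebeat-v6.4-20*], cause [api]",
         "time_in_queue_millis": 10143},
        {"source": "create-index-template [filebeat-omahaserver-*], cause [api]",
         "time_in_queue_millis": 6466},
    ]
}


def summarize_pending_tasks(response):
    """Return (task count per source, longest queue wait in milliseconds)."""
    tasks = response.get("tasks", [])
    counts = Counter(task["source"] for task in tasks)
    longest_wait_ms = max(
        (task["time_in_queue_millis"] for task in tasks), default=0
    )
    return counts, longest_wait_ms


counts, longest_wait_ms = summarize_pending_tasks(SAMPLE_RESPONSE)
for source, count in counts.most_common():
    print(f"{count:3d}  {source}")
print(f"longest wait: {longest_wait_ms} ms")
```

Running something like this periodically would show whether the queue is made up almost entirely of create-index-template tasks or whether other task types are piling up behind them.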

The pending_tasks queue can sometimes grow to 50+ tasks, at which point we start having trouble ingesting logs.

What I'm curious about is the following:

  1. Is the operation to update an index template expensive?
  2. When exactly does the Filebeat process submit its index template to the cluster? Does it continuously attempt to submit templates?
  3. Would a backlog of create-index-template tasks in the pending_tasks queue also cause log ingestion to back up?

Really appreciate the help!
Max

Adding some additional context to this (I'm on Max's team):

When we're experiencing this issue with many URGENT tasks in the pending_tasks queue, we see the write queues associated with our daily indices filling up to capacity (200 tasks queued, with most new requests being rejected), and we begin to lose logs. Additionally, we see many errors like the following in the ES log on the master node, even when we are not experiencing the dropped log issue:

[2019-01-15T03:05:55,889][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [prod-lift-elasticsearch26] failed to put mappings on indices [[[filebeat-v6.4-2019.01.15-000001/kkuVQdpAQ9a5zsSSDn6AoA]]], type [doc]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
