My_index/_stats/indexing (high number of "index_failed" operations)

I'm supporting a cluster with an index that is seeing a high number of "index_failed" operations, as shown by the indexing stats. I'm trying to understand what types of scenarios would cause this counter to increment. What exactly is considered an "index_failed" operation, and what might the causes be?

Logstash is indexing into Elasticsearch 5.6.4, and we're seeing an excessive amount of I/O wait time and poor performance overall.

GET my_index/_stats/indexing 
{
  "_shards" : {
    "total" : 360,
    "successful" : 360,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "indexing" : {
        "index_total" : 39546274,
        "index_time_in_millis" : 32693876,
        "index_current" : 0,
        "index_failed" : 17762193,
        "delete_total" : 1059084,
        "delete_time_in_millis" : 216423,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    },
    "total" : {
      "indexing" : {
        "index_total" : 108377320,
        "index_time_in_millis" : 117903886,
        "index_current" : 6,
        "index_failed" : 17762193,
        "delete_total" : 3072604,
        "delete_time_in_millis" : 965006,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    }
  },
  "indices" : {
    "my_index" : {
      "primaries" : {
        "indexing" : {
          "index_total" : 39546274,
          "index_time_in_millis" : 32693876,
          "index_current" : 0,
          "index_failed" : 17762193,
          "delete_total" : 1059084,
          "delete_time_in_millis" : 216423,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      },
      "total" : {
        "indexing" : {
          "index_total" : 108377320,
          "index_time_in_millis" : 117903886,
          "index_current" : 6,
          "index_failed" : 17762193,
          "delete_total" : 3072604,
          "delete_time_in_millis" : 965006,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      }
    }
  }
}
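
For reference, the counter in question can be pulled out on its own with response filtering (a sketch using filter_path, which 5.x supports; index name as above):

GET my_index/_stats/indexing?filter_path=indices.my_index.primaries.indexing.index_failed
{
  "indices" : {
    "my_index" : {
      "primaries" : {
        "indexing" : {
          "index_failed" : 17762193
        }
      }
    }
  }
}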

Welcome to our community! :smiley:

5.6 has been EOL for 2 years now; you really need to upgrade as a matter of urgency.

Why do you have so many shards?

I couldn't agree with you more that the version is old and that this is a lot of shards. We're actively working on upgrading to the latest version, but we need to solve this problem in order to clear a path to get there. The index has a 937GB pri.store.size with 60 primary shards (and 5 replicas at the moment, for testing).
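
For reference, this is roughly how I'm checking that layout (a sketch using the _cat/indices API with its standard column headers):

GET _cat/indices/my_index?v&h=index,pri,rep,pri.store.size,docs.count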

But what I'm trying to understand is exactly what causes the "index_failed" count to be incremented and how concerned I should be that the index_failed count is nearly 50% of the index_total count.

The exact meaning of the counter isn't documented well anywhere that I can find. It's not documented in 5.4, and the 7.x docs only say that it's the "Number of failed indexing operations," which doesn't help me understand what causes that to happen. I don't see any errors in Logstash, and we don't seem to be missing any documents.

They could be failures that Logstash has automatically retried for you.
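
The bulk request itself can come back 200 while individual items fail; a per-item failure looks roughly like the sketch below (the 429 / es_rejected_execution_exception values are illustrative, not taken from this cluster) and is the kind of error the Logstash elasticsearch output retries silently:

{
  "took" : 3,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "my_index",
        "_type" : "doc",
        "_id" : "1",
        "status" : 429,
        "error" : {
          "type" : "es_rejected_execution_exception",
          "reason" : "..."
        }
      }
    }
  ]
}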

Update: I was able to confirm what was causing the "index_failed" counts. Logstash was configured to use external version numbers, but a flaw in our config was causing it to reuse the same version number multiple times. This caused updates in Elasticsearch to return a 409 error due to a version conflict, which increments the index_failed counter in the index stats.
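
For anyone who hits this later, the behaviour is easy to reproduce by hand (a sketch; my_index/doc/1 and the message field are placeholders): with version_type=external, Elasticsearch only accepts a write whose version is strictly greater than the one already stored, so resending the same version is rejected.

PUT my_index/doc/1?version=5&version_type=external
{ "message" : "first write is accepted" }

PUT my_index/doc/1?version=5&version_type=external
{ "message" : "second write with the same external version returns 409 version_conflict_engine_exception" }

Every such 409 bumps index_failed, which matches the behaviour described above.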


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.