Index lifecycle error: ILM failed to rotate index to warm phase

vector-diags-000005 failed to rotate.
As you can see, 000004 was automatically rotated to 000005.
But 000005 then kept growing in size until it errored out.

vector-diags-000005  3  P  STARTED  1745886 262.8mb 192.168.146.54  es-data-r2-4
vector-diags-000005  1  P  STARTED  2727704 374.5mb 192.168.156.114 es-data-r1-4
vector-diags-000005  5  P  STARTED  2734417 379.9mb 192.168.142.170 es-data-r1-2
vector-diags-000005  0  P  STARTED  2734853 374.7mb 192.168.187.119 es-data-r2-3
vector-diags-000005  4  P  STARTED  2754168 375.1mb 192.168.180.247 es-data-r1-0
vector-diags-000005  2  P  STARTED  2768421 374.8mb 192.168.173.185 es-data-r1-3
vector-diags-000004  5  P  STARTED    86699  10.2mb 192.168.187.119 es-data-r2-3
vector-diags-000004  2  P  STARTED    89166  10.5mb 192.168.173.185 es-data-r1-3
vector-diags-000004  4  P  STARTED    90304  10.7mb 192.168.156.114 es-data-r1-4
vector-diags-000004  3  P  STARTED    90890  10.7mb 192.168.131.218 es-data-r2-5
vector-diags-000004  0  P  STARTED    90953  10.7mb 192.168.163.252 es-data-r2-2

Clicking "Show phase definition" on the screen above shows the following.

    {
      "policy": "daas_standard_policy",
      "phase_definition": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "3d"
          }
        }
      },
      "version": 3,
      "modified_date_in_millis": 1573018277330
    }

I worked around it as follows.

POST /_aliases 
{ 
  "actions": [ 
    { 
      "add": { 
        "index": "vector-diags-000005", 
        "alias": "vector-diags", 
        "is_write_index": false 
      } 
    } 
  ] 
} 

PUT vector-diags-000006
{ 
  "aliases": { 
    "vector-diags":{ 
      "is_write_index": true  
    } 
  } 
} 
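
The same hand-off can also be written as a single `_aliases` request, which Elasticsearch applies atomically, so the alias is never left without a write index. The Python sketch below is a variation on the workaround above, not what was actually run. It only builds the request body for `POST /_aliases`; the helper name `write_alias_swap` is made up for illustration, and it assumes vector-diags-000006 has already been created.

```python
import json

def write_alias_swap(alias, old_index, new_index):
    """Build a POST /_aliases body that moves the write flag in one request."""
    return {
        "actions": [
            # Keep the old index readable through the alias, but stop writing to it.
            {"add": {"index": old_index, "alias": alias, "is_write_index": False}},
            # Point new writes at the new index (it must already exist).
            {"add": {"index": new_index, "alias": alias, "is_write_index": True}},
        ]
    }

body = write_alias_swap("vector-diags", "vector-diags-000005", "vector-diags-000006")
print(json.dumps(body, indent=2))
```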

Hey there Chunan!

Two questions:

  1. What is the output when you run the following in Dev Tools?

(*assuming the index template that points to the 'vector-diags' index is called 'vector-diags')

GET _template/vector-diags

  2. This is a long shot, but I noticed in your second screenshot that the Current action time value is 2019-11-12 15:18:02; what is your current system time?

GET _cat/templates

vector-diags                [vector-diags-*]              0          1
.slm-history                [.slm-history-1*]             2147483647 
.ml-notifications           [.ml-notifications]           0          7040199
vector-custom_ats_2         [vector-custom_ats_2-*]       0          1
.ml-meta                    [.ml-meta]                    0          7040199
vector-teakd                [vector-teakd-*]              0          1
daas_def                    [*]                           999        
.logstash-management        [.logstash]                   0          
.data-frame-internal-2      [.data-frame-internal-2]      0          7040199
.management-beats           [.management-beats]           0          70000
vector-messages             [vector-messages-*]           0          1
.ml-config                  [.ml-config]                  0          7040199
daas_sys                    [.*]                          1000       
.ml-state                   [.ml-state*]                  0          7040199
.ml-anomalies-              [.ml-anomalies-*]             0          7040199
.data-frame-notifications-1 [.data-frame-notifications-*] 0          7040199

GET _template/vector-diags

{
  "vector-diags" : {
    "order" : 0,
    "version" : 1,
    "index_patterns" : [
      "vector-diags-*"
    ],
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "daas_standard_policy",
          "rollover_alias" : "vector-diags"
        },
        "number_of_shards" : "6",
        "number_of_replicas" : "0"
      }
    },
    "mappings" : {
      "dynamic_templates" : [
        {
          "message_field" : {
            "path_match" : "message",
            "mapping" : {
              "norms" : false,
              "type" : "text"
            },
            "match_mapping_type" : "string"
          }
        },
        {
          "string_fields" : {
            "mapping" : {
              "norms" : false,
              "fields" : {
                "keyword" : {
                  "ignore_above" : 128,
                  "type" : "keyword"
                }
              },
              "type" : "text"
            },
            "match_mapping_type" : "string",
            "match" : "*"
          }
        }
      ],
      "properties" : {
        "msg" : {
          "norms" : false,
          "type" : "text"
        },
        "server" : {
          "norms" : false,
          "type" : "text"
        },
        "hostname" : {
          "norms" : false,
          "type" : "text",
          "fields" : {
            "keyword" : {
              "ignore_above" : 128,
              "type" : "keyword"
            }
          }
        },
        "log_type" : {
          "type" : "keyword"
        },
        "site" : {
          "type" : "keyword"
        },
        "@timestamp" : {
          "type" : "date"
        },
        "@version" : {
          "type" : "keyword"
        },
        "beat" : {
          "properties" : {
            "name" : {
              "type" : "keyword"
            },
            "version" : {
              "type" : "keyword"
            }
          }
        },
        "source" : {
          "norms" : false,
          "type" : "text"
        },
        "timestamp" : {
          "norms" : false,
          "type" : "text"
        }
      }
    },
    "aliases" : { }
  }
}

Our system time is UTC and accurately synchronized. As you can see from the following screenshot, all the other indices were properly rotated by ILM (small index size). But vector-diags-000005 was either dropped from ILM or ILM failed on it, so it was never rotated. Instead it kept growing for months, up to about 2gb, until it reached what I guess is a capacity limit and eventually failed. That may be why the action time dates back to 2019-11-12.

This is weird. Our daas_standard_policy has the following setting. So vector-diags-000005 should have been rotated to a new vector-diags-000006 index after 3 days, but it was not. It also should not error out before reaching 50gb, the max_size specified in the hot phase.
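
To put numbers on that expectation: a rollover action fires as soon as any one of its conditions is met, and max_size is measured against the combined size of the primary shards. Here is a rough Python check using the shard sizes from the listing at the top of this thread. The index age is an assumption (the thread only says "months") and the helper name is made up.

```python
from datetime import timedelta

# Primary shard sizes (mb) for vector-diags-000005, copied from the shard listing.
SHARD_SIZES_MB = [262.8, 374.5, 379.9, 374.7, 375.1, 374.8]

def rollover_conditions_met(total_size_gb, index_age,
                            max_size_gb=50, max_age=timedelta(days=3)):
    """Return the names of the rollover conditions that are satisfied.

    Any single satisfied condition is enough to trigger a rollover.
    """
    met = []
    if total_size_gb >= max_size_gb:
        met.append("max_size")
    if index_age >= max_age:
        met.append("max_age")
    return met

total_gb = sum(SHARD_SIZES_MB) / 1024
# Assumed age: 90 days stands in for "months"; the real value is not in the thread.
met = rollover_conditions_met(total_gb, timedelta(days=90))
print(f"total primary size: {total_gb:.2f} gb, conditions met: {met}")
```

With these inputs the total comes to roughly 2.09 gb, far below the 50gb max_size, so max_age is the only condition that should have triggered the rollover.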

Good afternoon Chunan,

Has this index rolled over successfully since you added the alias?

I just found a similar problem that occurred before. There was a case for it.
https://support.elastic.co/customers/s/case/5004M00000Yns4zQAB

Thanks Chunan.

Do you have logging turned on for this cluster, and if so, would it be possible to see when this first happened? There should be a log entry like...

....policy [daas_standard_policy] for index [vector-diags-000005] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step....
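
If the logs have already aged out, the ILM explain API (`GET vector-diags-000005/_ilm/explain`) reports the same state: an index stuck in the ERROR step carries `failed_step` and `step_info` fields. Below is a minimal Python sketch of reading that response. The sample dictionary only illustrates the response shape and is not output from this cluster, and `describe_ilm_state` is a hypothetical helper.

```python
def describe_ilm_state(index_name, explain_response):
    """Summarize one index's entry from a GET <index>/_ilm/explain response."""
    info = explain_response["indices"][index_name]
    if not info.get("managed", False):
        return f"{index_name}: not managed by ILM"
    where = f'{info.get("phase")}/{info.get("action")}'
    if info.get("step") == "ERROR":
        reason = info.get("step_info", {}).get("reason", "no reason reported")
        return (f"{index_name}: ERROR in {where} "
                f'at step {info.get("failed_step")}: {reason}')
    return f'{index_name}: {where}/{info.get("step")}'

# Illustrative response only; type and reason are placeholders.
sample = {
    "indices": {
        "vector-diags-000005": {
            "index": "vector-diags-000005",
            "managed": True,
            "policy": "daas_standard_policy",
            "phase": "hot",
            "action": "rollover",
            "step": "ERROR",
            "failed_step": "check-rollover-ready",
            "step_info": {"type": "<exception type>", "reason": "<reason from your cluster>"},
        }
    }
}
print(describe_ilm_state("vector-diags-000005", sample))
```

Once the underlying cause is fixed, `POST vector-diags-000005/_ilm/retry` asks ILM to run the failed step again.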

I searched all the logs and found the following entries that contain vector-diags-000005. I am pasting them below.

[2020-03-26T14:13:58,162][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [2] requests]"})
[2020-03-26T14:13:58,162][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [2] requests]"})
[2020-03-26T14:13:58,163][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][S4MwF3EBumyRkgKTXhK4], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-22315-02.romeoville.il.chicago.comcast.net\"},\"@version\":\"1\",\"msg\":\"NOTE: Failure threshold met failcount:14 >= threshold:10, http parent proxy odol-atsec-sbe-03.mishawaka.in.sbend.comcast.net:80 marked down\",\"@timestamp\":\"2020-03-26T14:07:50.476Z\",\"hostname\":\"ccdn-ats-tk-22315-02\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 14:07:50.476\",\"server\":\"{0x2b7833c45700}\",\"site\":\"22315\"}]}]]"})
[2020-03-26T14:13:58,235][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][oS0wF3EBauAFzIczXnLU], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-16801-01.beaverton.or.bverton.comcast.net\"},\"@timestamp\":\"2020-03-26T14:07:13.880Z\",\"msg\":\"NOTE: recovery clearing offsets of Vol /dev/sds 229376:146440164 : [702694728192, 702703116800] sync_serial 51310 next 51311\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-16801-01\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 14:07:13.880\",\"server\":\"{0x2b788e545700}\",\"site\":\"16801\"}]}]]"})
[2020-03-26T14:15:04,199][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][R68xF3EBumyRkgKTYAef], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-22315-02.romeoville.il.chicago.comcast.net\"},\"@timestamp\":\"2020-03-26T14:07:56.993Z\",\"msg\":\"ERROR: [72526471] Slow Request: client_ip: 127.0.0.1:52199 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod17/upfaithandfamily.com/UPMV0000161723681302/1583157284/upfaithandfamily.comUPMV0000161723681302.ts status: 206 unique id:  redirection_tries: 0 bytes: 40055 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 0.000 server_connect: 0.000 server_connect_end: -0.001 server_first_read: 13.814 server_read_header_done: 13.814 server_close: 13.826 ua_write: 13.814 ua_close: 13.814 sm_finish: 13.826 plugin_active: 0.000 plugin_total: 0.000\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-22315-02\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 14:07:56.993\",\"server\":\"{0x2b7833f4b700}\",\"site\":\"22315\"}]}]]"})
[2020-03-26T15:01:26,582][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][bnZbF3EBauAFzIcz1fyM], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-16702-03.pittsburgh.pa.pitt.comcast.net\"},\"@version\":\"1\",\"msg\":\"ERROR: [12356995] Slow Request: client_ip: 127.0.0.1:57439 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod7/natgeochannel.com/MOHD4030619320181022/1540200164/natgeochannel.comMOHD4030619320181022.ts status: 206 unique id:  redirection_tries: 0 bytes: 40054 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 14.642 server_connect: 14.642 server_connect_end: -0.001 server_first_read: 14.655 server_read_header_done: 14.655 server_close: 14.747 ua_write: 14.655 ua_close: 14.655 sm_finish: 14.747 plugin_active: 0.000 plugin_total: 0.000\",\"@timestamp\":\"2020-03-26T15:00:22.658Z\",\"hostname\":\"ccdn-ats-tk-16702-03\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:00:22.658\",\"server\":\"{0x2b1ab3e51700}\",\"site\":\"16702\"}]}]]"})
[2020-03-26T15:01:29,554][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][DndbF3EBauAFzIcz4ATE], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-13301-01.lancaster.pa.pitt.comcast.net\"},\"@version\":\"1\",\"msg\":\"ERROR: [34319979] Slow Request: client_ip: 127.0.0.1:51974 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod3/indemand.com/INMV0719204000624041/1561541449/indemand.comINMV0719204000624041.ts status: 206 unique id:  redirection_tries: 0 bytes: 8039 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 14.070 server_connect: 14.070 server_connect_end: -0.001 server_first_read: 14.115 server_read_header_done: 14.115 server_close: 14.149 ua_write: 14.115 ua_close: 14.115 sm_finish: 14.149 plugin_active: 0.000 plugin_total: 0.000\",\"@timestamp\":\"2020-03-26T15:00:22.061Z\",\"hostname\":\"ccdn-ats-tk-13301-01\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:00:22.061\",\"server\":\"{0x2af10e828700}\",\"site\":\"13301\"}]}]]"})

More

[2020-03-26T15:01:32,007][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][PgVbF3EBumyRkgKT6q5y], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-19601-03.scranton.pa.pitt.comcast.net\"},\"@timestamp\":\"2020-03-26T15:00:25.033Z\",\"msg\":\"ERROR: [77309161] Slow Request: client_ip: 127.0.0.1:50906 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod3/noggin.com/XPMV0000000001254263/1578673485/noggin.comXPMV0000000001254263.ts status: 206 unique id:  redirection_tries: 0 bytes: 8044 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 14.270 server_connect: 14.270 server_connect_end: -0.001 server_first_read: 14.290 server_read_header_done: 14.290 server_close: 14.800 ua_write: 14.290 ua_close: 14.291 sm_finish: 14.800 plugin_active: 0.000 plugin_total: 0.000\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-19601-03\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:00:25.033\",\"server\":\"{0x2ab1aca68700}\",\"site\":\"19601\"}]}]]"})
[2020-03-26T15:16:26,903][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][GBNpF3EBumyRkgKTkteO], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-19601-01.scranton.pa.pitt.comcast.net\"},\"@timestamp\":\"2020-03-26T15:15:24.977Z\",\"msg\":\"ERROR: [89208964] Slow Request: client_ip: 127.0.0.1:60544 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod3/cinesony.com/SDSD1186800000003203/1582246765/cinesony.comSDSD1186800000003203.ts status: 206 unique id:  redirection_tries: 0 bytes: 8051 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 14.098 server_connect: 14.098 server_connect_end: -0.001 server_first_read: 14.118 server_read_header_done: 14.118 server_close: 14.120 ua_write: 14.118 ua_close: 14.118 sm_finish: 14.120 plugin_active: 0.000 plugin_total: 0.000\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-19601-01\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:15:24.977\",\"server\":\"{0x2aeb16a52700}\",\"site\":\"19601\"}]}]]"})
[2020-03-26T15:16:29,904][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][fBNpF3EBumyRkgKTnuRq], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-16702-04.pittsburgh.pa.pitt.comcast.net\"},\"@version\":\"1\",\"msg\":\"ERROR: [6696913] Slow Request: client_ip: 127.0.0.1:58333 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod7/indemand.com/INMV1219231000702808/1576735457/indemand.comINMV1219231000702808.ts status: 206 unique id:  redirection_tries: 0 bytes: 8158 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 10.092 server_connect: 10.092 server_connect_end: -0.001 server_first_read: 10.144 server_read_header_done: 10.144 server_close: 10.189 ua_write: 10.144 ua_close: 10.144 sm_finish: 10.189 plugin_active: 0.000 plugin_total: 0.000\",\"@timestamp\":\"2020-03-26T15:15:24.589Z\",\"hostname\":\"ccdn-ats-tk-16702-04\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:15:24.589\",\"server\":\"{0x2aac5fd63700}\",\"site\":\"16702\"}]}]]"})
[2020-03-26T15:17:28,905][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [index {[vector-diags][_doc][9RRqF3EBumyRkgKThMBn], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-19601-01.scranton.pa.pitt.comcast.net\"},\"@timestamp\":\"2020-03-26T15:15:24.977Z\",\"msg\":\"ERROR: [89208964] Slow Request: client_ip: 127.0.0.1:60544 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod3/cinesony.com/SDSD1186800000003203/1582246765/cinesony.comSDSD1186800000003203.ts status: 206 unique id:  redirection_tries: 0 bytes: 8051 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 14.098 server_connect: 14.098 server_connect_end: -0.001 server_first_read: 14.118 server_read_header_done: 14.118 server_close: 14.120 ua_write: 14.118 ua_close: 14.118 sm_finish: 14.120 plugin_active: 0.000 plugin_total: 0.000\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-19601-01\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 26 15:15:24.977\",\"server\":\"{0x2aeb16a52700}\",\"site\":\"19601\"}]}]]"})
[2020-03-31T00:33:48,766][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [335551529][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[vector-diags-000005][3]] containing [index {[vector-diags][_doc][E08CLnEBumyRkgKTNB3c], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-22001-02.49thst.pa.panjde.comcast.net\"},\"@version\":\"1\",\"msg\":\"ERROR: [2650964] Slow Request: client_ip: 127.0.0.1:60789 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod15/indemand.com/INMV0217231000333232/1486410627/indemand.comINMV0217231000333232.index status: 206 unique id:  redirection_tries: 0 bytes: 8057 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.902 dns_lookup_begin: 31.062 dns_lookup_end: 31.062 server_connect: 31.062 server_connect_end: 0.936 server_first_read: 31.072 server_read_header_done: 31.072 server_close: -0.001 ua_write: 31.072 ua_close: 31.073 sm_finish: 31.073 plugin_active: 0.000 plugin_total: 0.000\",\"@timestamp\":\"2020-03-31T00:33:46.876Z\",\"hostname\":\"ccdn-ats-tk-22001-02\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Mar 31 00:33:46.876\",\"server\":\"{0x2afbe6847700}\",\"site\":\"22001\"}]}], target allocation id: TpaPEPUoT0iJXRK53CEHmQ, primary term: 174 on EsThreadPoolExecutor[name = es-data-r2-4/write, queue capacity = 300, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@3ecd851[Running, pool size = 6, active threads = 6, queued tasks = 300, completed tasks = 165686243]]"})
[2020-04-05T19:02:38,158][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [562643113][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[vector-diags-000005][1]] containing [index {[vector-diags][_doc][_A65S3EB0_tjy993KLWH], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-11101-02.exeter.nh.boston.comcast.net\"},\"@timestamp\":\"2020-04-05T19:02:35.921Z\",\"msg\":\"ERROR: [130866440] Slow Request: client_ip: 127.0.0.1:60876 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod19/sho.com/XPMV0001546340358000/1548491102/sho.comXPMV0001546340358000.8x_trick status: 206 unique id:  redirection_tries: 0 bytes: 8053 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.882 dns_lookup_begin: 32.835 dns_lookup_end: 32.835 server_connect: 32.835 server_connect_end: 32.852 server_first_read: 32.870 server_read_header_done: 32.870 server_close: -0.001 ua_write: 32.870 ua_close: 32.870 sm_finish: 32.870 plugin_active: 0.000 plugin_total: 0.000\",\"@version\":\"1\",\"hostname\":\"ccdn-ats-tk-11101-02\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Apr  5 19:02:35.921\",\"server\":\"{0x2b7b4ae3a700}\",\"site\":\"11101\"}]}], target allocation id: 8dWPjYEZQemVtW7D0F5KOA, primary term: 57 on EsThreadPoolExecutor[name = es-data-r1-4/write, queue capacity = 300, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@613b73f4[Running, pool size = 6, active threads = 6, queued tasks = 300, completed tasks = 328410521]]"})
[2020-04-08T00:25:30,899][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][3] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][3]] containing [index {[vector-diags][_doc][jJcsV3EBwrzmcNz-kss5], source[{\"beat\":{\"version\":\"6.5.4\",\"name\":\"ccdn-ats-tk-21441-02.pompanobeach.fl.pompano.comcast.net\"},\"@version\":\"1\",\"msg\":\"ERROR: [104826098] Slow Request: client_ip: 127.0.0.1:54514 protocol: http url: http://mpeg2origin.sys.comcast.net/mp2vod8/sho.com/XPMV0001574940828000/1578448737/sho.comXPMV0001574940828000.-8x_trick status: 206 unique id:  redirection_tries: 0 bytes: 40811 fd: -1 client state: 5 server state: 0 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.894 dns_lookup_begin: 31.950 dns_lookup_end: 31.950 server_connect: 31.950 server_connect_end: 32.010 server_first_read: 32.077 server_read_header_done: 32.077 server_close: -0.001 ua_write: 32.077 ua_close: 32.077 sm_finish: 32.077 plugin_active: 0.000 plugin_total: 0.000\",\"@timestamp\":\"2020-04-07T20:37:36.251Z\",\"hostname\":\"ccdn-ats-tk-21441-02\",\"log_type\":\"vector-diags\",\"source\":\"/opt/trafficserver/var/log/trafficserver/diags.log\",\"timestamp\":\"Apr  7 20:37:36.251\",\"server\":\"{0x2b9d6a03a700}\",\"site\":\"21441\"}]}]]"})
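
To make the pattern in these entries easier to see, here is a rough Python sketch that tallies them by day, response code, exception type and shard. It assumes every entry follows the format shown above. The two sample lines are taken from the entries above, the second one truncated, and the function name is made up.

```python
import re
from collections import Counter

DATE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2})T")
CODE_RE = re.compile(r"response code: (\d+)")
TYPE_RE = re.compile(r'"type"=>"(\w+)"')
# First "[index-NNNNNN][shard]" pair in the line, e.g. [vector-diags-000005][2].
SHARD_RE = re.compile(r"\[([\w.-]+-\d{6})\]\[(\d+)\]")

def summarize_retries(lines):
    """Count retry entries per (date, response code, exception type, index, shard)."""
    counts = Counter()
    for line in lines:
        date, code = DATE_RE.search(line), CODE_RE.search(line)
        exc, shard = TYPE_RE.search(line), SHARD_RE.search(line)
        if not (date and code and exc and shard):
            continue  # skip anything that is not a retry entry in this format
        key = (date.group(1), int(code.group(1)), exc.group(1),
               shard.group(1), int(shard.group(2)))
        counts[key] += 1
    return counts

SAMPLE = [
    '[2020-03-26T14:13:58,162][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[vector-diags-000005][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[vector-diags-000005][2]] containing [2] requests]"})',
    # Truncated copy of the 2020-03-31 entry.
    '[2020-03-31T00:33:48,766][INFO ][l.o.elasticsearch        ] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [335551529][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[vector-diags-000005][3]] containing ..."})',
]

summary = summarize_retries(SAMPLE)
for key, n in sorted(summary.items()):
    print(n, *key)
```

Grouped this way, the entries above fall into 503 unavailable_shards_exception for shard 2 on 2020-03-26 and shard 3 on 2020-04-08, and 429 es_rejected_execution_exception for shard 3 on 2020-03-31 and shard 1 on 2020-04-05.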
