Curator delete_indices fails

Hi,

I'm aware of this topic:

I'm using ES 6.2.4 and Curator 5.5.4.

My config.yml is:

client:
  hosts:
    - es1.corp.net
    - es2.corp.net
    - es3.corp.net
  port: 9200
  timeout: 30
logging:
  loglevel: DEBUG 

My actions.yml is:

actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days, metricbeat
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: metricbeat-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
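
To make the intent of the two filters concrete, here is a minimal sketch of the selection they describe: keep names starting with `metricbeat-`, read the date from the name using the `'%Y.%m.%d'` timestring, and keep those older than 30 days. This is not Curator's code; `date_from_name` and `select_old_indices` are hypothetical helpers, and Curator's exact cutoff arithmetic may differ.

```python
import re
from datetime import datetime, timedelta
from typing import List, Optional

# Regex equivalent of the '%Y.%m.%d' timestring (assumption: 4-2-2 digits).
DATE_RE = re.compile(r"\d{4}\.\d{2}\.\d{2}")


def date_from_name(index_name: str) -> Optional[datetime]:
    """Extract the date embedded in an index name, or None if absent."""
    match = DATE_RE.search(index_name)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y.%m.%d")


def select_old_indices(names: List[str], now: datetime,
                       prefix: str = "metricbeat-", days: int = 30) -> List[str]:
    """Return names that match the prefix and are dated before now - days."""
    cutoff = now - timedelta(days=days)
    selected = []
    for name in names:
        if not name.startswith(prefix):
            continue  # pattern filter: wrong prefix
        stamp = date_from_name(name)
        if stamp is not None and stamp < cutoff:
            selected.append(name)  # age filter: older than the cutoff
    return selected


names = [
    "metricbeat-6.2.3-2018.05.24",
    "metricbeat-6.0.1-2018.05.10",
    ".monitoring-kibana-6-2018.01.21",
]
print(select_old_indices(names, now=datetime(2018, 6, 11, 12, 0)))
```

With the run date from the log (2018-06-11), only the index dated 2018.05.10 falls outside the 30-day window; the `.monitoring-` index is skipped by the prefix check.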

This Curator config runs every day in a fresh Docker container. It has not worked since I upgraded from 6.2.3 to 6.2.4.

import elasticsearch

client = elasticsearch.Elasticsearch(hosts='es1.corp.net')

client.indices.stats(index='metricbeat-6.2.3-2018.05.24', metric='store,docs')
{'_shards': {'total': 2, 'successful': 2, 'failed': 0},
 '_all': {'primaries': {'docs': {'count': 766735, 'deleted': 0},
   'store': {'size_in_bytes': 281425602}},
  'total': {'docs': {'count': 1533470, 'deleted': 0},
   'store': {'size_in_bytes': 563361892}}},
 'indices': {'metricbeat-6.2.3-2018.05.24': {'primaries': {'docs': {'count': 766735,
     'deleted': 0},
    'store': {'size_in_bytes': 281425602}},
   'total': {'docs': {'count': 1533470, 'deleted': 0},
    'store': {'size_in_bytes': 563361892}}}}}

and (partial) debug log:

2018-06-11 12:33:58,867 DEBUG              curator.utils             get_client:803  kwargs = {'url_prefix': '', 'aws_secret_key': None, 'http_auth': None, 'certificate': None, 'aws_key': None, 'aws_sign_request': False, 'port': 9200, 'hosts': ['es1.corp.net', 'es2.corp.net', 'es3.corp.net'], 'timeout': 30, 'aws_token': None, 'use_ssl': False, 'master_only': False, 'client_cert': None, 'ssl_no_validate': False, 'client_key': None}
2018-06-11 12:33:58,870 DEBUG              curator.utils             get_client:878  "requests_aws4auth" module present, but not used.
2018-06-11 12:33:58,879 DEBUG              curator.utils          check_version:689  Detected Elasticsearch version 6.2.4
2018-06-11 12:33:58,879 DEBUG                curator.cli                    run:159  client is <class 'elasticsearch.client.Elasticsearch'>
2018-06-11 12:33:58,879 INFO                 curator.cli                    run:165  Trying Action ID: 1, "delete_indices": Delete indices older than 30 days, metricbeat
2018-06-11 12:33:58,879 DEBUG                curator.cli         process_action:44   Configuration dictionary: {'action': 'delete_indices', 'description': 'Delete indices older than 30 days, metricbeat', 'filters': [{'exclude': False, 'kind': 'prefix', 'filtertype': 'pattern', 'value': 'metricbeat-'}, {'direction': 'older', 'stats_result': 'min_value', 'filtertype': 'age', 'source': 'name', 'epoch': None, 'timestring': '%Y.%m.%d', 'exclude': False, 'unit_count': 30, 'unit': 'days'}], 'options': {}}
2018-06-11 12:33:58,880 DEBUG                curator.cli         process_action:45   kwargs: {'master_timeout': 30, 'dry_run': False}
2018-06-11 12:33:58,880 DEBUG                curator.cli         process_action:50   opts: {}
2018-06-11 12:33:58,880 DEBUG                curator.cli         process_action:62   Action kwargs: {'master_timeout': 30}
2018-06-11 12:33:58,880 DEBUG                curator.cli         process_action:91   Running "DELETE_INDICES"
2018-06-11 12:33:58,881 DEBUG          curator.indexlist          __get_indices:66   Getting all indices
2018-06-11 12:33:58,962 DEBUG              curator.utils            get_indices:644  Detected Elasticsearch version 6.2.4

2018-06-11 12:33:59,017 DEBUG          curator.indexlist     __build_index_info:81   Building preliminary index metadata for .monitoring-kibana-6-2018.01.21
2018-06-11 12:33:59,017 DEBUG          curator.indexlist          _get_metadata:175  Getting index metadata
2018-06-11 12:33:59,017 DEBUG          curator.indexlist       empty_list_check:224  Checking for empty list
2018-06-11 12:34:04,021 DEBUG          curator.indexlist       _get_index_stats:115  Getting index stats
2018-06-11 12:34:04,021 DEBUG          curator.indexlist       empty_list_check:224  Checking for empty list
2018-06-11 12:34:04,021 DEBUG          curator.indexlist           working_list:235  Generating working list of indices
2018-06-11 12:34:04,022 DEBUG          curator.indexlist           working_list:235  Generating working list of indices
2018-06-11 12:34:04,028 ERROR                curator.cli                    run:184  Failed to complete action: delete_indices.  <type 'exceptions.KeyError'>: 'indices'

Do you have an explanation? Thanks in advance.

How was curator installed in the docker container?

We'll see a lot more if you add an empty blacklist to your logging section, as follows:

logging:
  loglevel: DEBUG
  blacklist: []

The default behavior does not show the elasticsearch and urllib3 log traffic.
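
For readers wondering what a logger blacklist does, here is a minimal sketch using only the standard `logging` module. It is an illustration, not Curator's implementation; `NameBlacklist` is a hypothetical class name. Records whose logger name matches a blacklisted prefix are dropped, so an empty blacklist lets everything through.

```python
import logging


class NameBlacklist(logging.Filter):
    """Drop records emitted by any of the named loggers or their children."""

    def __init__(self, *names):
        super().__init__()
        # logging.Filter(name) accepts records from `name` and `name.*`;
        # we invert that below to reject them instead.
        self._filters = [logging.Filter(name) for name in names]

    def filter(self, record):
        return not any(f.filter(record) for f in self._filters)


handler = logging.StreamHandler()
# Roughly the default described above: hide elasticsearch and urllib3 traffic.
handler.addFilter(NameBlacklist("elasticsearch", "urllib3"))
# With `blacklist: []` the equivalent would be NameBlacklist(), which hides nothing.
```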

I use this container in prod: https://hub.docker.com/r/bobrik/curator/

but for testing I created a clean virtualenv on my machine and pip-installed it:

virtualenv -p python2 .venv
source .venv/bin/activate
pip install elasticsearch-curator
.venv/bin/curator --version                                                               
curator, version 5.5.4

The complete log is 306 lines long, so I cannot paste it all here. Which parts are relevant?

2018-06-11 17:02:41,979 DEBUG              curator.utils          check_version:689  Detected Elasticsearch version 6.2.4
2018-06-11 17:02:41,979 DEBUG                curator.cli                    run:161  client is <class 'elasticsearch.client.Elasticsearch'>
2018-06-11 17:02:41,979 INFO                 curator.cli                    run:167  Trying Action ID: 1, "delete_indices": Delete indices older than 30 days, metricbeat
2018-06-11 17:02:41,979 DEBUG                curator.cli         process_action:44   Configuration dictionary: {'action': 'delete_indices', 'description': 'Delete indices older than 30 days, metricbeat', 'filters': [{'exclude': False, 'kind': 'prefix', 'filtertype': 'pattern', 'value': 'metricbeat-'}, {'direction': 'older', 'stats_result': 'min_value', 'filtertype': 'age', 'source': 'name', 'epoch': None, 'timestring': '%Y.%m.%d', 'exclude': False, 'unit_count': 30, 'unit': 'days'}], 'options': {}}
2018-06-11 17:02:41,979 DEBUG                curator.cli         process_action:45   kwargs: {'master_timeout': 30, 'dry_run': False}
0, 'aws_token': None, 'use_ssl': False, 'master_only': False, 'client_cert': None, 'ssl_no_validate': False, 'client_key': None}
2018-06-11 17:02:41,972 DEBUG              curator.utils             get_client:878  "requests_aws4auth" module present, but not used.
2018-06-11 17:02:41,973 DEBUG         urllib3.util.retry               from_int:200  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0, status=None)
2018-06-11 17:02:41,973 DEBUG     urllib3.connectionpool              _new_conn:208  Starting new HTTP connection (1): es1.corp.net
2018-06-11 17:02:41,978 DEBUG     urllib3.connectionpool          _make_request:396  http://es1.corp.net:9200 "GET / HTTP/1.1" 200 435
2018-06-11 17:02:41,979 INFO               elasticsearch    log_request_success:83   GET http://es1.corp.net:9200/ [status:200 request:0.006s]
2018-06-11 17:02:41,979 DEBUG              elasticsearch    log_request_success:85   > None
2018-06-11 17:02:41,979 DEBUG              elasticsearch    log_request_success:86   < {
  "name" : "es1",
  "cluster_name" : "corp-es-cluster",
  "cluster_uuid" : "O0IQki3oQ5KsxBhoYvoYNQ",
  "version" : {
    "number" : "6.2.4",
    "build_hash" : "ccec39f",
    "build_date" : "2018-04-12T20:37:28.497551Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

2018-06-11 17:02:41,980 DEBUG         urllib3.util.retry               from_int:200  Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0, status=None)
2018-06-11 17:02:41,980 DEBUG     urllib3.connectionpool              _new_conn:208  Starting new HTTP connection (1): es3.corp.net
2018-06-11 17:02:42,000 DEBUG     urllib3.connectionpool          _make_request:396  http://es3.corp.net:9200 "GET /_all/_settings?expand_wildcards=open%2Cclosed HTTP/1.1" 200 64398
2018-06-11 17:02:42,003 INFO               elasticsearch    log_request_success:83   GET http://es3.corp.net:9200/_all/_settings?expand_wildcards=open%2Cclosed [status:200 request:0.023s]
2018-06-11 17:02:42,003 DEBUG              elasticsearch    log_request_success:85   > None
2018-06-11 17:02:42,003 DEBUG              elasticsearch    log_request_success:86   < {"metricbeat-6.0.1-2018.05.11":{" BLABLA


2018-06-11 17:02:46,616 WARNING            elasticsearch       log_request_fail:97   GET http://es2.fibrea.net:9200/.monitoring-es-6-2018.01.23,.monitoring-es-6-2018.01.25,.monitoring-es-6-2018.02.01, ETC...
[status:404  request:0.005s]
2018-06-11 17:02:46,616 DEBUG              elasticsearch       log_request_fail:105  > None
2018-06-11 17:02:46,617 DEBUG              elasticsearch       log_request_fail:110  < {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"metricbeat-6.0.1-2018.05.10","index_uuid":"_na_","index":"metricbeat-6.0.1-2018.05.10"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"metricbeat-6.0.1-2018.05.10","index_uuid":"_na_","index":"metricbeat-6.0.1-2018.05.10"},"status":404}
2018-06-11 17:02:46,617 ERROR                curator.cli                    run:186  Failed to complete action: delete_indices.  <type 'exceptions.KeyError'>: 'indices'

For the 404 error, I checked _cluster/health and I have no unassigned shards :frowning:

This indicates that there's something amiss in your cluster. Something says that metricbeat-6.0.1-2018.05.10 exists to Curator (an API call), but then when it issues another API call to delete it, Elasticsearch is responding with {"type":"index_not_found_exception","reason":"no such index". It's not there. Are all three hosts in your config.yml part of the same cluster?

  hosts:
    - es1.corp.net
    - es2.corp.net
    - es3.corp.net

If these are not all members of the same cluster, and that index is not on all members, that would result in the "not found" response. Curator round-robins the requests. The first request hits the first host, and then the delete hits the second. This is the most likely explanation of what is happening.
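
One way to test this hypothesis is to query each host separately and compare what it reports. The sketch below only does the comparison; `compare_host_views` is a hypothetical helper, and the sample data is invented for illustration. It assumes you have already collected, per host, the `cluster_uuid` from `GET /` and the index names from `GET /_all/_settings`.

```python
from typing import Dict, List, Set, Tuple

# (cluster_uuid, set of index names) as reported by one host
HostView = Tuple[str, Set[str]]


def compare_host_views(views: Dict[str, HostView]) -> List[str]:
    """Compare every host against the first one and list disagreements."""
    problems: List[str] = []
    hosts = sorted(views)
    if not hosts:
        return problems
    ref_host = hosts[0]
    ref_uuid, ref_indices = views[ref_host]
    for host in hosts[1:]:
        uuid, indices = views[host]
        if uuid != ref_uuid:
            problems.append("{}: cluster_uuid differs from {}".format(host, ref_host))
        missing = sorted(ref_indices - indices)
        if missing:
            problems.append("{}: missing indices {}".format(host, missing))
        extra = sorted(indices - ref_indices)
        if extra:
            problems.append("{}: extra indices {}".format(host, extra))
    return problems


# Invented example: es2 no longer knows an index that es1 and es3 still list.
views = {
    "es1.corp.net": ("uuid-a", {"metricbeat-6.0.1-2018.05.10", "metricbeat-6.2.3-2018.05.24"}),
    "es2.corp.net": ("uuid-a", {"metricbeat-6.2.3-2018.05.24"}),
    "es3.corp.net": ("uuid-a", {"metricbeat-6.0.1-2018.05.10", "metricbeat-6.2.3-2018.05.24"}),
}
for line in compare_host_views(views):
    print(line)
```

Note that a matching `cluster_uuid` alone does not prove the hosts share the same cluster state, which is why the index lists are compared as well.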

Yes, all these hosts are part of the same cluster, and there are no unassigned shards; that is what bothers me :thinking:

If I retry with just one host, it works... So maybe it's a problem with my ES cluster and not with Curator.

Thanks for your help. My first conclusion was the same as yours, but I could not believe there could be an index inconsistency in the cluster. Apparently there can :frowning:

EDIT: OK... I'm so ashamed of this. One of the hosts dropped out of the cluster and I didn't notice. It didn't raise an alarm in the monitoring system, so it was completely my fault. Sorry.
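
For anyone who lands here with the same symptom, a per-host check along these lines might catch a node that has silently left the cluster. It is only a sketch: `find_unhealthy_hosts` is a hypothetical helper, and it assumes you fetch `_cluster/health` from each host individually (not through a round-robin client) and know how many nodes you expect. `number_of_nodes` and `unassigned_shards` are fields of the `_cluster/health` response.

```python
from typing import Dict, List


def find_unhealthy_hosts(health_by_host: Dict[str, dict],
                         expected_nodes: int) -> Dict[str, List[str]]:
    """Flag hosts whose own view of the cluster looks wrong."""
    problems: Dict[str, List[str]] = {}
    for host, health in sorted(health_by_host.items()):
        reasons = []
        seen = health.get("number_of_nodes")
        if seen != expected_nodes:
            reasons.append("sees {} nodes, expected {}".format(seen, expected_nodes))
        if health.get("unassigned_shards", 0) > 0:
            reasons.append("has unassigned shards")
        if reasons:
            problems[host] = reasons
    return problems


# Invented example: es2 has left the cluster and only sees itself.
health = {
    "es1.corp.net": {"number_of_nodes": 2, "unassigned_shards": 0},
    "es2.corp.net": {"number_of_nodes": 1, "unassigned_shards": 0},
    "es3.corp.net": {"number_of_nodes": 2, "unassigned_shards": 0},
}
print(find_unhealthy_hosts(health, expected_nodes=3))
```

Checking only for unassigned shards, as was done earlier in this thread, would miss this case, because each side can report zero unassigned shards while disagreeing on the node count.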

Cannot find indices... with --runing test..
