Curator: "Invalid epoch received, unable to convert None to int"

Hello,

When I try to delete old indices, I get the following error:

2019-09-07 18:13:34,959 INFO      Preparing Action ID: 2, "delete_indices"
2019-09-07 18:13:34,965 INFO      Trying Action ID: 2, "delete_indices": Delete indices older than 45 days
2019-09-07 18:13:34,983 ERROR     Failed to complete action: delete_indices.  <class 'ValueError'>: Invalid epoch received, unable to convert None to int

It seems like Curator is getting the value 'None' from an Elasticsearch API call:

2019-09-07 18:26:14,275 DEBUG          curator.indexlist _get_field_stats_dates:306  Getting index date by querying indices for min & max value of timestamp field
2019-09-07 18:26:14,278 DEBUG          curator.indexlist _get_field_stats_dates:318  RESPONSE: {'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 1, 'relation': 'eq'}, 'max_score': None, 'hits': []}, 'aggregations': {'min': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}, 'max': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}}}
2019-09-07 18:26:14,278 DEBUG          curator.indexlist _get_field_stats_dates:322  r: {'min': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}, 'max': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}}
2019-09-07 18:26:14,278 DEBUG          curator.indexlist _get_field_stats_dates:326  s: {'creation_date': 1567872058, 'min_value': 1567872175, 'max_value': 1567872175}
2019-09-07 18:26:14,280 DEBUG          curator.indexlist _get_field_stats_dates:318  RESPONSE: {'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 7, 'relation': 'eq'}, 'max_score': None, 'hits': []}, 'aggregations': {'min': {'value': None}, 'max': {'value': None}}}
2019-09-07 18:26:14,280 DEBUG          curator.indexlist _get_field_stats_dates:322  r: {'min': {'value': None}, 'max': {'value': None}}
2019-09-07 18:26:14,280 ERROR                curator.cli                    run:191  Failed to complete action: delete_indices.  <class 'ValueError'>: Invalid epoch received, unable to convert None to int

You can see that Curator gets back a valid time for the first index:

2019-09-07 18:26:14,278 DEBUG curator.indexlist _get_field_stats_dates:322 r: {'min': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}, 'max': {'value': 1567872175000.0, 'value_as_string': '20190907T160255Z'}}

And for the second index it gets None:

2019-09-07 18:26:14,280 DEBUG curator.indexlist _get_field_stats_dates:322 r: {'min': {'value': None}, 'max': {'value': None}}

I cannot figure out where None is coming from, though, as the timestamp field is populated in all of my indices.
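
For illustration, here is a minimal Python sketch of the conversion step that fails. It is not Curator's actual code; the function name `to_epoch_seconds` is made up for this example. An Elasticsearch min/max aggregation returns a null value when no document in the index has a value for the field, which would produce exactly the `{'value': None}` seen in the second response.

```python
from typing import Optional


def to_epoch_seconds(value_ms: Optional[float]) -> int:
    """Convert an aggregation value in epoch milliseconds to whole seconds.

    Hypothetical helper: raises the same kind of error as in the log when
    the aggregation returned no value.
    """
    if value_ms is None:
        raise ValueError(
            f"Invalid epoch received, unable to convert {value_ms} to int"
        )
    return int(value_ms / 1000)


# Aggregation results copied from the debug log above.
first = {"min": {"value": 1567872175000.0}, "max": {"value": 1567872175000.0}}
second = {"min": {"value": None}, "max": {"value": None}}

print(to_epoch_seconds(first["min"]["value"]))  # 1567872175

try:
    to_epoch_seconds(second["min"]["value"])
except ValueError as err:
    print(err)  # Invalid epoch received, unable to convert None to int
```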

My Curator config:

  2:
    action: delete_indices
    description: >-
      Delete indices older than 45 days
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: age
      source: field_stats
      field: 'timestamp'
      direction: older
      unit: minutes
      unit_count: 1
      exclude:

As you can see, the timestamp field is filled:

root@es0-0:~# curl -X GET http://localhost:9200/mail/_search | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   608  100   608    0     0  83642      0 --:--:-- --:--:-- --:--:-- 86857
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "mail",
        "_type": "_doc",
        "_id": "6KpoDG0Bz4cMnJ9oGSRr",
        "_score": 1,
        "_source": {
          "hostname": "mail0.emdmz.cyberfusion.cloud",
          "domain": "example.com",
          "mailbox": "tet@example.com",
          "timestamp": "20190907T174650Z",
          "diskusage": 69
        }
      },
      {
        "_index": "mail",
        "_type": "_doc",
        "_id": "6apoDG0Bz4cMnJ9oHCRH",
        "_score": 1,
        "_source": {
          "hostname": "mail0.emdmz.cyberfusion.cloud",
          "domain": "test.nl",
          "mailbox": "tset@test.nl",
          "timestamp": "20190907T174652Z",
          "diskusage": 523
        }
      }
    ]
  }
}

How can I debug which index is giving me 'None' and why it is doing so? I'm stuck.
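
One way to narrow this down, sketched in Python: run the same kind of min/max aggregation against each index separately and note which responses come back with None. The query body below is an assumption about what Curator sends, based on the debug log. The helper names and the index names `index-a` and `index-b` are hypothetical, since the log does not say which index each response belongs to.

```python
import json
from typing import Dict, List


def min_max_query(field: str) -> str:
    """Build a size-0 search body with min and max aggregations on `field`."""
    body = {
        "size": 0,
        "aggs": {
            "min": {"min": {"field": field}},
            "max": {"max": {"field": field}},
        },
    }
    return json.dumps(body)


def indices_without_values(responses: Dict[str, dict]) -> List[str]:
    """Return the names of indices whose min or max aggregation is None."""
    missing = []
    for name, response in responses.items():
        aggs = response.get("aggregations", {})
        values = [aggs.get(key, {}).get("value") for key in ("min", "max")]
        if any(value is None for value in values):
            missing.append(name)
    return sorted(missing)


# Responses shaped like the two in the debug log, keyed by hypothetical names.
responses = {
    "index-a": {"aggregations": {"min": {"value": 1567872175000.0},
                                 "max": {"value": 1567872175000.0}}},
    "index-b": {"aggregations": {"min": {"value": None},
                                 "max": {"value": None}}},
}

print(min_max_query("timestamp"))
print(indices_without_values(responses))  # ['index-b']
```

The body printed by `min_max_query` can be sent to `/<index>/_search` for each index in turn, for example with curl as above, and the responses collected into a dictionary like `responses`.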

I see only a single filter, age. This means Curator will look for the timestamp field in every index, including Kibana and other system indices, which may not have it. I recommend adding a pattern filter before the age filter to limit the scope to indices you know and expect to have the timestamp field.
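
To show the effect of a pattern filter, here is a small Python sketch that narrows a list of index names with a regular expression before any age check runs. It assumes search-style regex matching and is not Curator's implementation. The index list is illustrative, and `filter_by_pattern` is a made-up helper name.

```python
import re
from typing import Iterable, List


def filter_by_pattern(index_names: Iterable[str], pattern: str) -> List[str]:
    """Keep only the index names that the regular expression matches."""
    regex = re.compile(pattern)
    return [name for name in index_names if regex.search(name)]


# Illustrative index names; ".kibana" stands in for a system index.
names = [".kibana", "mail", "mail-000001", "web"]

print(filter_by_pattern(names, r"^mail$"))  # ['mail']
print(filter_by_pattern(names, r"^mail"))   # ['mail', 'mail-000001']
```

With either pattern, the system index never reaches the age filter, so no min/max lookup is attempted on an index that lacks the field.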

Thank you. I have added a pattern filter like this:

  3:
    action: delete_indices
    description: >-
      Delete mail indices older than 45 days
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: "^mail$"
      exclude:
    - filtertype: age
      source: field_stats
      field: 'timestamp'
      direction: older
      unit: minutes
      unit_count: 1
      exclude:

This prevents the original error, but it does not delete indices older than 1 minute either. The timestamp field is mapped as:

"timestamp": { "type": "date", "format": "basic_date_time_no_millis" }

Curator output:

2019-09-07 20:57:56,228 DEBUG                curator.cli         process_action:99   Doing the action here.
2019-09-07 20:57:56,228 DEBUG          curator.indexlist       empty_list_check:226  Checking for empty list
2019-09-07 20:57:56,228 INFO                 curator.cli                    run:180  Skipping action "delete_indices" due to empty list: <class 'curator.exceptions.NoIndices'>
2019-09-07 20:57:56,228 INFO                 curator.cli                    run:201  Action ID: 3, "delete_indices" completed.
2019-09-07 20:57:56,228 INFO                 curator.cli                    run:202  Job completed.

Index:

root@es0-0:~/es_setup# curl -X GET http://localhost:9200/mail/_search | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   608  100   608    0     0   161k      0 --:--:-- --:--:-- --:--:--  197k
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "mail",
        "_type": "_doc",
        "_id": "8qoSDW0Bz4cMnJ9oyyRT",
        "_score": 1,
        "_source": {
          "hostname": "mail0.emdmz.cyberfusion.cloud",
          "diskusage": 69,
          "mailbox": "tet@example.com",
          "domain": "example.com",
          "timestamp": "20190907T205318Z"
        }
      },
      {
        "_index": "mail",
        "_type": "_doc",
        "_id": "86oSDW0Bz4cMnJ9oyyTE",
        "_score": 1,
        "_source": {
          "hostname": "mail0.emdmz.cyberfusion.cloud",
          "diskusage": 523,
          "mailbox": "tset@example.nl",
          "domain": "example.nl",
          "timestamp": "20190907T205318Z"
        }
      }
    ]
  }
}

Any idea? I have tried setting timestring in the Curator config, but looking at the code and results, this gets ignored when source: field_stats is set.

The debug output should show why indices are selected (or not) by age. I don't see that in your output. Also, the pattern filter is set to match only one index, mail, and not mail-* or any variants. I'm confused. If you only have one index to delete, why use Curator? Or do you need a different pattern?
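
As a rough aid for reading that debug output, the age comparison can be sketched in Python as follows. This is a simplification under stated assumptions, not Curator's code: it assumes the filter compares the field's minimum value against "now minus unit_count units", and the helper names are hypothetical. The thread does not state the time zone of Curator's log timestamps, so both run times below are assumptions used only to show how much the outcome depends on them.

```python
from datetime import datetime, timezone

UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def parse_basic_no_millis(text: str) -> int:
    """Parse a basic_date_time_no_millis string ending in Z to epoch seconds."""
    parsed = datetime.strptime(text, "%Y%m%dT%H%M%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def is_older(stat_epoch: int, now_epoch: int, unit: str, unit_count: int) -> bool:
    """True when the index's timestamp is before now minus the given age."""
    cutoff = now_epoch - UNIT_SECONDS[unit] * unit_count
    return stat_epoch < cutoff


# Timestamp of the documents in the mail index, from the search output.
doc_time = parse_basic_no_millis("20190907T205318Z")

# Assumption 1: the 20:57:56 in the Curator log is UTC.
now_if_utc = parse_basic_no_millis("20190907T205756Z")
# Assumption 2: the log time is UTC+2, i.e. 18:57:56 UTC.
now_if_utc_plus_2 = parse_basic_no_millis("20190907T185756Z")

print(is_older(doc_time, now_if_utc, "minutes", 1))         # True
print(is_older(doc_time, now_if_utc_plus_2, "minutes", 1))  # False
```

If documents are written with local time but labelled with a Z suffix, their timestamps can lie in the future relative to UTC, and an "older than" check would then not select the index.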

Thanks again for responding. I am not too familiar with ES terminology yet. If I understand correctly, I have an index called 'mail' in which I store documents. I want to rotate out documents older than 45 days from that index (I have set it to 1 minute for testing purposes). So, I set ^mail$ as the pattern filter. I guess I need a different pattern, then?

Also, my cleanup config has two delete_indices actions, shown below. I had left out the other one (which has the same issue) to avoid confusion.

  2:
    action: delete_indices
    description: >-
      Delete web indices older than 45 days
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: "^web$"
      exclude:
    - filtertype: age
      source: field_stats
      field: 'timestamp'
      direction: older
      unit: days
      unit_count: 45
      exclude:

  3:
    action: delete_indices
    description: >-
      Delete mail indices older than 45 days
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: "^mail"
      exclude:
    - filtertype: age
      source: field_stats
      field: 'timestamp'
      direction: older
      unit: minutes
      unit_count: 1
      exclude:

Okay. Curator acts on entire indices, not on the documents within an index. You might want to rethink your index plan: have an alias called mail with multiple rolling indices behind it, e.g. mail-000001, mail-000002, etc., and expire those indices as the content within them becomes stale to you.
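
A minimal Python sketch of that layout, for illustration only. Elasticsearch's rollover API handles the naming itself; the helpers below (`next_index_name`, `expired_indices`) are hypothetical and just show how numbered indices behind an alias can be expired as whole units.

```python
import re
from typing import Dict, List


def next_index_name(current: str) -> str:
    """Increment the zero-padded numeric suffix of a rolling index name."""
    match = re.fullmatch(r"(.+)-(\d+)", current)
    if match is None:
        raise ValueError(f"{current!r} does not end in a numeric suffix")
    prefix, number = match.groups()
    return f"{prefix}-{int(number) + 1:0{len(number)}d}"


def expired_indices(newest_doc_epoch: Dict[str, int], now_epoch: int,
                    max_age_days: int = 45) -> List[str]:
    """Return indices whose newest document is older than max_age_days."""
    cutoff = now_epoch - max_age_days * 86400
    return sorted(name for name, newest in newest_doc_epoch.items()
                  if newest < cutoff)


print(next_index_name("mail-000001"))  # mail-000002

# Made-up epochs: the first index stopped receiving data 100 days ago.
now = 100 * 86400
print(expired_indices({"mail-000001": 0, "mail-000002": now}, now))
# ['mail-000001']
```

Because each index covers a bounded time range, the whole index can be dropped once its newest document passes the retention period.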

I see. That makes sense. In that case, I guess I will just use the API to rotate documents, and use Curator for snapshots/backups, as I'm not sure it makes sense to separate indices by time with my dataset. I appreciate your assistance.
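
For reference, deleting old documents through the API could look roughly like the Python sketch below, which builds a request for Elasticsearch's `_delete_by_query` endpoint with a range query on the timestamp field. The function name is made up, the host and index come from the curl commands earlier in the thread, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_delete_by_query(base_url: str, index: str, field: str,
                          max_age: str) -> urllib.request.Request:
    """Build a POST request that deletes documents older than max_age."""
    # Date math such as "now-45d" is evaluated by Elasticsearch.
    body = {"query": {"range": {field: {"lt": f"now-{max_age}"}}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_delete_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_delete_by_query("http://localhost:9200", "mail",
                                "timestamp", "45d")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually run the deletion (destructive):
# urllib.request.urlopen(request)
```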

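The recommendation still stands. You should delete old indices, rather than documents from within indices. It's a matter of efficiency: deleting an index removes its files outright, whereas deleted documents are only marked as deleted until their segments are merged.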
