Using Curator to delete indices older than 5 days

Hello,

I have an index and mapping like this:
{
  "_index": "user",
  "_type": "profile",
  "_id": "2",
  "_score": 1,
  "_source": {
    "full_name": "Elon Musk",
    "bio": "Elon Reeve Musk is a Canadian-American entrepreneur, engineer, inventor and investor. He is the CEO and CTO of SpaceX, CEO and product architect of Tesla Motors, and chairman of SolarCity.",
    "age": 43,
    "location": "37.7749290,-122.4194160",
    "enjoys_coffee": false,
    "created_on": "2015-05-02T15:45:10.000-04:00"
  }
},

I want to delete indices that are older than 5 days using Curator. Below are my config.yml and action.yml.
<config.yml>
client:
  hosts: ["127.0.0.1:9200"]
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

<action.yml>
actions:
  1:
    action: delete_indices
    description: "Delete indices older than 1 days (based on index name), for logstash- prefixed indices.
      Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly"
    options:
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: user-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 1
      exclude:

I ran this command in my Mac terminal: curator --config ~/Desktop/curator-4.2.6/config.yml ~/Desktop/curator-4.2.6/action.yml
It gives me this error: ERROR Unable to complete action "delete_indices". No actionable items in list: <class 'curator.exceptions.NoIndices'>

Can anyone help me, please? I'm a beginner with Curator. Thanks.

The index example provided, which is named "user," does not contain a date string in it, e.g. user-2015.05.02, which is what the created_on date might indicate.

Your age filter can never work against indices such as this, nor can your pattern filter, unless there are other indices that start with user- (note the hyphen).
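To make that concrete, here is a minimal Python sketch of what a prefix pattern and a name-based age filter need from an index name. It is only an illustration, not Curator's actual code; the helper names matches_prefix and date_from_name are hypothetical.

```python
from datetime import datetime
from typing import Optional


def matches_prefix(index_name: str, prefix: str) -> bool:
    """Rough stand-in for a 'kind: prefix' pattern filter (hypothetical helper)."""
    return index_name.startswith(prefix)


def date_from_name(index_name: str, prefix: str, timestring: str) -> Optional[datetime]:
    """Try to read a date from the part of the name after the prefix.

    Returns None when the prefix does not match or no date can be parsed.
    """
    if not matches_prefix(index_name, prefix):
        return None
    try:
        return datetime.strptime(index_name[len(prefix):], timestring)
    except ValueError:
        return None


print(matches_prefix("user", "user-"))                          # False
print(date_from_name("user", "user-", "%Y-%m-%d"))              # None
print(date_from_name("user-2015.05.02", "user-", "%Y.%m.%d"))   # 2015-05-02 00:00:00
```

Note that the time string has to match the separators in the name: in this sketch, '%Y-%m-%d' finds nothing in user-2015.05.02, while '%Y.%m.%d' does.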

To delete indices older than a certain number of days when there is no time string identifier in the index name, you must use either source: creation_date or, if you've been ingesting old data and the creation date would be inaccurate, the most accurate option, source: field_stats.

creation_date is simply the epoch timestamp recorded at the time the index was created. With time-series data this is usually accurate, but it can be inaccurate if you've ingested some old syslog line, and Logstash creates an index for 2017.01.01 on the 20th of the month. Based on the content of the log line, Logstash makes the index name's date in the past, but the creation_date would still be 2017.01.20. To avoid scenarios like this, use source: field_stats to calculate index age.

The field_stats API in Elasticsearch will tell you what the min and max values of a field are in an Elasticsearch index. For Curator, presuming you're using the @timestamp field, the configuration might look like:

# '@timestamp' is the default value for 'field', so the 'field' line can be omitted if that is the case
# 'min_value' is the default for 'stats_result', so the 'stats_result' line can be omitted if that is the case
- filtertype: age
  source: field_stats
  field: '@timestamp'
  stats_result: min_value 
  direction: older
  unit: days
  unit_count: 3

This will calculate an index's age based on the minimum value found for @timestamp in the index. If that is older than 3 days ago, it will remain in the actionable list.

You could also use max_value for stats_result, which would calculate index age based on the "newest" value in the index. It is important to remember that for time calculation, min_value and max_value are going to be evaluating epoch time. As such, a bigger value indicates a more recent time stamp.
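As a rough illustration of min_value versus max_value, the Python sketch below uses the three created_on values from the documents posted above. It assumes, purely for the example, that created_on is the date field being evaluated; it does not call Elasticsearch, and age_in_days and the fixed "now" are hypothetical.

```python
from datetime import datetime, timezone

# The three created_on values from the sample documents in this thread.
created_on = [
    "2015-05-02T15:45:10.000-04:00",
    "2015-05-02T14:45:10.000-04:00",
    "2015-05-02T16:45:10.000-04:00",
]


def to_epoch(ts: str) -> float:
    """Convert an ISO 8601 timestamp with a UTC offset to epoch seconds."""
    return datetime.fromisoformat(ts).timestamp()


epochs = [to_epoch(ts) for ts in created_on]
min_value = min(epochs)  # oldest document in the index
max_value = max(epochs)  # newest document in the index (bigger epoch = more recent)


def age_in_days(value_epoch: float, now_epoch: float) -> float:
    """Age of a timestamp relative to 'now', in days (hypothetical helper)."""
    return (now_epoch - value_epoch) / 86400


# A fixed "now" chosen only to keep the example reproducible.
now = datetime(2015, 5, 6, tzinfo=timezone.utc).timestamp()
print(age_in_days(min_value, now) > age_in_days(max_value, now))  # True
```

Using min_value makes the index look older than using max_value does, so max_value is the more conservative choice when the index must not be deleted until all of its data has aged out.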

I do not know if you have a timestamp in every document/record in the indices you want to delete. If you do not, you may be compelled to use source: creation_date, as that is otherwise the only way to tell the age of an index.

Thanks buddy, one quick question: do I need to install Logstash? I only installed ES 5.2.0 and Curator 4.2.6 on my local machine for testing. Thank you.

No, Logstash being mentioned above is just a quoted example. Curator is stand-alone.

Hi Aaron,

My case is that I created an index called user and ingested data yesterday, so I can see my index "user" in Kibana by running the GET /user/profile/_search?q=* command. Running it returns:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "user",
        "_type": "profile",
        "_id": "2",
        "_score": 1,
        "_source": {
          "full_name": "Elon Musk",
          "bio": "Elon Reeve Musk is a Canadian-American entrepreneur, engineer, inventor and investor. He is the CEO and CTO of SpaceX, CEO and product architect of Tesla Motors, and chairman of SolarCity.",
          "age": 43,
          "location": "37.7749290,-122.4194160",
          "enjoys_coffee": false,
          "created_on": "2015-05-02T15:45:10.000-04:00"
        }
      },
      {
        "_index": "user",
        "_type": "profile",
        "_id": "1",
        "_score": 1,
        "_source": {
          "full_name": "Andrew Puch",
          "bio": "My name is Andrew. I am an agile DevOps Engineer who is passionate about working with Software as a Service based applications, REST APIs, and various web application frameworks.",
          "age": 26,
          "location": "41.1246110,-73.4232880",
          "enjoys_coffee": true,
          "created_on": "2015-05-02T14:45:10.000-04:00"
        }
      },
      {
        "_index": "user",
        "_type": "profile",
        "_id": "3",
        "_score": 1,
        "_source": {
          "full_name": "Some Hacker",
          "bio": "I am a haxor user who you should end up deleting.",
          "age": 1000,
          "location": "37.7749290,-122.4194160",
          "enjoys_coffee": true,
          "created_on": "2015-05-02T16:45:10.000-04:00"
        }
      }
    ]
  }
}

My question is: since this index (the only index I have created so far) was created yesterday, how should I modify my action.yml to delete this user index? How should I set those settings based on my mapping?
Please let me know if you need more information, thanks.

Curator does not care about index mappings. It only looks at the index name, and index metadata settings, where it finds size and date information. To delete only the user index, you would have to set a regex to match only the user index:

  - filtertype: pattern
    kind: regex
    value: '^user$'

This will match only indices (well, a single index) named user.
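If it helps to see the anchors at work, here is a small Python sketch using the standard re module. It only illustrates how the ^ and $ anchors narrow the match; the list of index names is made up for the example, and filter_by_regex is a hypothetical helper, not part of Curator.

```python
import re


def filter_by_regex(index_names, pattern):
    """Keep only the names that match the given regular expression."""
    regex = re.compile(pattern)
    return [name for name in index_names if regex.search(name)]


# Made-up index names for illustration only.
indices = ["user", "user-2017.01.01", "users", "superuser", ".kibana"]

print(filter_by_regex(indices, r"^user$"))  # ['user']
print(filter_by_regex(indices, r"^user"))   # ['user', 'user-2017.01.01', 'users']
```

Without the trailing $, the pattern would also pick up every other index whose name merely begins with user.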

Thanks Aaron, it works, much appreciated :slight_smile: May I add you on LinkedIn? I am currently working with ES, Kibana, and Curator. I may ask you some questions later and hope you can help me :slight_smile:

Feel free to ask questions here in the community forums. I will answer as time permits. I am on LinkedIn, so feel free to look me up there. I won't be answering questions there, though. I keep my answers here for the benefit of the community.

Sure, I will post my questions here. One curious question: when you delete indices that are older than 5 days, for example, how does the system know the creation date of each index?

When an index is created, there is a creation_date set in the index metadata settings. This information is used when you set source: creation_date.

If you name an index in Logstash or beats, it will usually be logstash-YYYY.MM.DD or ...beat-YYYY.MM.DD. Curator parses the name for a given time string and calculates when the index was created. This is source: name.

The most powerful is source: field_stats, which I already explained above. It calculates the age of the index based on a timestamp field, usually @timestamp.

This is how Curator calculates the age of an index.

Hi Aaron,

Just one thing to make sure. Based on my previous user index example, I want to understand how Curator works; please correct me if I'm wrong. First, it searches for index names that start with u and end with r. Second, it checks the creation_date from the index metadata settings. Setting unit_count to 1 just means deleting indices older than 1 day. And if I change creation_date to field_stats, it works as well. Am I right? Thanks.

filters:
  - filtertype: pattern
    kind: regex
    value: '^user$'
    exclude:
  - filtertype: age
    source: creation_date
    direction: older
    timestring: '%Y-%m-%d'
    unit: days
    unit_count: 1
    exclude:

I just found that it doesn't apply the second filtertype; in other words, it doesn't delete indices based on their age in days. How can I solve this for my previous example? Thanks.

The user index may not be older than 1 day yet, so it isn't being deleted. You could change unit to hours and unit_count to 12 and run with the --dry-run flag to see if it finds the user index then.

Also, timestring is only needed when source is name. It can be omitted.

I think I made a typo there. What I found was that it just deleted the index without checking the settings in the second filtertype, shown here. It is supposed to delete only indices older than 1 day.

  - filtertype: age
    source: creation_date
    direction: older
    unit: days
    unit_count: 1
    exclude:

It's a straight epoch time comparison from the time of execution to the time as determined by source. If the difference is more than 86400 seconds, then the index is more than one day old.
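A minimal Python sketch of that comparison, under stated assumptions: is_older and SECONDS_PER_UNIT are hypothetical names, only a few units are listed, and the strict less-than at the cutoff is my simplification rather than a quote of Curator's code.

```python
import time

# Seconds per unit; a subset chosen for the example.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def is_older(index_epoch, unit, unit_count, now_epoch=None):
    """Return True if index_epoch is more than unit_count units before now."""
    if now_epoch is None:
        now_epoch = time.time()
    # Anything with a timestamp before this cutoff counts as "older".
    cutoff = now_epoch - unit_count * SECONDS_PER_UNIT[unit]
    return index_epoch < cutoff


now = 1_000_000  # arbitrary fixed "now" so the example is reproducible
print(is_older(now - 120, "days", 1, now))         # False: created 2 minutes ago
print(is_older(now - 86401, "days", 1, now))       # True: just over one day old
print(is_older(now - 13 * 3600, "hours", 12, now)) # True: 13 hours is more than 12
```

The same 13-hour-old index is not older than 1 day, which is why switching unit to hours is a handy way to test a filter against a freshly created index.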

So if I recreate that user index and use the following filters, the user index should not be deleted, right? I created it just 2 minutes ago, and the filter says to delete indices created more than one day ago. Thanks.

filters:
  - filtertype: pattern
    kind: regex
    value: '^user$'
    exclude:
  - filtertype: age
    source: creation_date
    direction: older
    unit: days
    unit_count: 1
    exclude:

That is correct. You can see the timings and comparisons if you run with loglevel: DEBUG and with the --dry-run flag.

Thanks Aaron, I have another question. If I don't know the index names and I want to delete indices that were created more than 10 days ago, how should I write the filters? For example, as below. Thanks.

filters:
  - filtertype: pattern
    kind: ?
    value: ?
    exclude:
  - filtertype: age
    source: creation_date
    direction: older
    timestring: '%Y-%m-%d'
    unit: days
    unit_count: 10
    exclude:

You have the right idea with the age filter. You wouldn't use the pattern filter if you don't know the pattern they'll be. I highly recommend employing the kibana filtertype here at the very least, to prevent deleting the .kibana index by accident. Some other manual exclusions might be wise here.

Hi Aaron, you mentioned employing the kibana filtertype to prevent deleting .kibana. So I have to set exclude: False to keep it, am I right? Otherwise, exclude: True will remove .kibana once it finds this index? Thanks.

  - filtertype: kibana
    exclude: False