Use greedy regexes in Curator filter


#1

I have set up Curator to delete old indexes via this filter:

(...)
filters:
- filtertype: pattern
  kind: regex
  value: '^xyz-us-(prod|preprod)-(.*)-'
  exclude:
- filtertype: age
  source: name
  direction: older
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 7
  exclude:
(...)

However, I realized that Curator uses non-greedy regexes, because this filter catches the index xyz-us-prod-foo-2018.10.11 but not xyz-us-prod-foo-bar-2018.10.11.

How can I modify the filter to catch both indexes?

Thanks in advance.


(Aaron Mildenstein) #2

You shouldn't need the second set of parentheses.

Also, have you tried fully anchoring the end of the line? This example is untested, but should work.

'^xyz-us-(prod|preprod)-.*-\d{4}\.\d{2}\.\d{2}$'

Being more precise with this should allow the .* to truly capture anything between (prod|preprod) and the date.
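
If you want to sanity-check the pattern outside of Curator, here is a minimal sketch using Python's `re` module. It assumes Curator hands the `value` to `re` more or less unchanged; the index names beyond the two from the question are made up for illustration.

```python
import re

# The fully anchored pattern suggested above.
PATTERN = re.compile(r'^xyz-us-(prod|preprod)-.*-\d{4}\.\d{2}\.\d{2}$')

index_names = [
    "xyz-us-prod-foo-2018.10.11",      # from the question
    "xyz-us-prod-foo-bar-2018.10.11",  # from the question
    "xyz-us-prod-2018.10.11",          # hypothetical: no segment before the date
    "xyz-eu-prod-foo-2018.10.11",      # hypothetical: different region
]

# .* is greedy, so it spans any number of hyphen-separated segments
# ("foo" or "foo-bar") as long as a "-<date>" still follows.
matched = [name for name in index_names if PATTERN.search(name)]
print(matched)
```

Note that the pattern requires at least two hyphens between the environment and the date, so a name like `xyz-us-prod-2018.10.11` would not be selected.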


#3

Thanks for your answer. I've tried it but the change doesn't have any effect.


(Aaron Mildenstein) #4

Seems to work just fine for me. With this config:

---
actions:
  1:
    action: delete_indices
    filters:
    - filtertype: pattern
      kind: regex
      value: '^xyz-us-(prod|preprod)-.*-\d{4}\.\d{2}\.\d{2}$'
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7

I created these indices:

PUT xyz-us-prod-foo-2018.10.11
PUT xyz-us-prod-foo-bar-2018.10.11

and it matched both of them:

2018-10-29 20:02:29,091 DEBUG          curator.indexlist        filter_by_regex:425  Filtering indices by regex
2018-10-29 20:02:29,091 DEBUG          curator.indexlist       empty_list_check:225  Checking for empty list
2018-10-29 20:02:29,091 DEBUG          curator.indexlist           working_list:236  Generating working list of indices
2018-10-29 20:02:29,091 DEBUG          curator.indexlist        filter_by_regex:446  Filter by regex: Index: xyz-us-prod-foo-2018.10.11
2018-10-29 20:02:29,091 DEBUG          curator.indexlist           __actionable:35   Index xyz-us-prod-foo-2018.10.11 is actionable and remains in the list.
2018-10-29 20:02:29,091 DEBUG          curator.indexlist        filter_by_regex:446  Filter by regex: Index: xyz-us-prod-foo-bar-2018.10.11
2018-10-29 20:02:29,092 DEBUG          curator.indexlist           __actionable:35   Index xyz-us-prod-foo-bar-2018.10.11 is actionable and remains in the list.
2018-10-29 20:02:29,092 DEBUG          curator.indexlist        iterate_filters:1185 Post-instance: ['xyz-us-prod-foo-2018.10.11', 'xyz-us-prod-foo-bar-2018.10.11']
...
2018-10-29 20:02:29,097 INFO               curator.utils           show_dry_run:918  DRY-RUN MODE.  No changes will be made.
2018-10-29 20:02:29,097 INFO               curator.utils           show_dry_run:921  (CLOSED) indices may be shown that may not be acted on by action "delete_indices".
2018-10-29 20:02:29,097 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-prod-foo-2018.10.11 with arguments: {}
2018-10-29 20:02:29,098 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-prod-foo-bar-2018.10.11 with arguments: {}
2018-10-29 20:02:29,098 INFO                 curator.cli                    run:196  Action ID: 1, "delete_indices" completed.
2018-10-29 20:02:29,098 INFO                 curator.cli                    run:197  Job completed.

(Aaron Mildenstein) #5

For good measure, I added these two indices:

PUT xyz-us-preprod-foo-2018.10.12
PUT xyz-us-preprod-foo-bar-2018.10.12

and ran it again. It still works with both patterns:

2018-10-29 20:08:28,120 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-preprod-foo-2018.10.12 with arguments: {}
2018-10-29 20:08:28,120 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-preprod-foo-bar-2018.10.12 with arguments: {}
2018-10-29 20:08:28,120 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-prod-foo-2018.10.11 with arguments: {}
2018-10-29 20:08:28,120 INFO               curator.utils           show_dry_run:928  DRY-RUN: delete_indices: xyz-us-prod-foo-bar-2018.10.11 with arguments: {}


#6

Thanks, it appears to work now. I probably made a mistake before.

One question: how does Curator manage to find the timestring in the index name? Does it automatically search for it at the end of the name?


(Aaron Mildenstein) #7

Nope. It’s pretty much ^.*(\d{4}\.\d{2}\.\d{2}).*$ (or whatever your provided timestring extrapolates to).
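
To illustrate the idea, here is a rough sketch in Python. It is not Curator's actual code: `timestring_to_regex`, `extract_date`, and the `DIRECTIVE_PATTERNS` table are hypothetical names, and only the three directives used in this thread are handled.

```python
import re
from datetime import datetime

# Hypothetical mapping from strftime directives to regex fragments.
# Only the directives used in this thread are covered.
DIRECTIVE_PATTERNS = {"Y": r"\d{4}", "m": r"\d{2}", "d": r"\d{2}"}


def timestring_to_regex(timestring):
    """Turn a timestring such as '%Y.%m.%d' into a regex fragment."""
    parts = []
    i = 0
    while i < len(timestring):
        char = timestring[i]
        if char == "%" and i + 1 < len(timestring):
            # Raises KeyError for directives this sketch does not know.
            parts.append(DIRECTIVE_PATTERNS[timestring[i + 1]])
            i += 2
        else:
            # Literal characters (like '.') are escaped.
            parts.append(re.escape(char))
            i += 1
    return "".join(parts)


def extract_date(index_name, timestring):
    """Find the date anywhere in the index name, or return None."""
    pattern = re.compile(r"^.*(" + timestring_to_regex(timestring) + r").*$")
    match = pattern.match(index_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), timestring)


print(timestring_to_regex("%Y.%m.%d"))
print(extract_date("xyz-us-prod-foo-bar-2018.10.11", "%Y.%m.%d"))
```

Because the date group is surrounded by `.*` on both sides, the date does not have to sit at the end of the index name.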

