Curator - need help to delete old logs with timestamp


(Deirdre Storck) #1

I am trying to delete old logs using Curator in my Kubernetes cluster. A sample log document in Elasticsearch looks like this:

        "_index" : "logstash-2017.09.19",
        "_type" : "flb_type",
        "_id" : "sample_id",
        "_score" : 0.6599215,
        "_source" : {
          "@timestamp" : "2017-09-19T16:09:04",
          "log" : "2017-09-19 16:08:08,521 INFO      Preparing Action ID: 1, \"delete_indices\"\n",
          "stream" : "stdout",
          "time" : "2017-09-19T16:08:08.522064224Z",
          "kubernetes" : {
            "pod_name" : "curator-1505781600-8b4mv",
            "namespace_name" : "default",
            "container_name" : "curator",
            "docker_id" : "sample_docker_id",
            "pod_id" : "sample_pod_id",
            "labels" : {
              "controller-uid" : "sample_controller_id",
              "job-name" : "curator-15057816345"
            }
          }
        }
      },

I am trying to run a CronJob that would fire once a day and delete any logs older than two weeks. For testing, I have set the cron job to run every two minutes and delete logs older than 10 minutes. I am trying to use filtertype: age with source: field_stats to accomplish this, but I am clearly doing something wrong, because every time the cron job runs it deletes everything. Can someone help me figure out what the bug is and why the cron job deletes everything, or suggest a better way to delete everything older than two weeks based on the logs I am collecting?
Here are my configs:

apiVersion: "batch/v2alpha1"
kind: CronJob
metadata:
  name: curator
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: curator
            image: bobrik/curator
            args: ["--config", "/etc/config/config.yml", "/etc/config/action_file.yml"]
            volumeMounts:
              - name: config-volume
                mountPath: /etc/config
          volumes:
            - name: config-volume
              configMap:
                name: curator-config
          restartPolicy: OnFailure
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: curator-config
data:
  action_file.yml: |-
    ---
    # Remember, leave a key empty if there is no value.  None will be a string,
    # not a Python "NoneType"
    #
    # Also remember that all examples have 'disable_action' set to True.  If you
    # want to use this action as a template, be sure to set this to False after
    # copying it.
    actions:
      1:
        action: delete_indices
        description: >-
          Delete indices whose oldest '@timestamp' value is more than 10 minutes
          old (based on field_stats). No pattern filter is applied, so every
          index is evaluated.
        options:
          timeout_override:
          continue_if_exception: False
          disable_action: False
        filters:
        - filtertype: age
          source: field_stats
          direction: older
          unit: minutes
          unit_count: 10
          field: '@timestamp'
          stats_result: min_value

  config.yml: |-
    ---
    # Remember, leave a key empty if there is no value.  None will be a string,
    # not a Python "NoneType"
    client:
      hosts:
        - elasticsearch
      port: 9200
      url_prefix:
      use_ssl: False
      certificate:
      client_cert:
      client_key:
      ssl_no_validate: False
      http_auth:
      timeout: 30
      master_only: False

    logging:
      loglevel: INFO
      logfile:
      logformat: default
      blacklist: ['elasticsearch', 'urllib3']

(Aaron Mildenstein) #2

Using field_stats in this way means that if the smallest value of @timestamp in the index is more than 10 minutes old, it will delete the entire index. It doesn't sound like that is what you want.

Also, as this is the only filter, it will delete all indices matching this criterion. You probably don't want to delete your .kibana index, or perhaps other, similar indices. It would be best to limit the list to indices matching a given pattern first, and then apply the age filter.
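To make the behaviour concrete, here is a minimal Python sketch of the comparison described above. It is not Curator's actual code: the function name `is_older`, the index `app-events`, and the minimum timestamp values are hypothetical and only there for illustration.

```python
from datetime import datetime, timedelta

def is_older(min_timestamp, now, unit_count, unit):
    """Return True if the oldest @timestamp in an index is before the cutoff.

    `unit` must be a keyword accepted by timedelta (seconds, minutes, hours,
    days, weeks); this sketch does not handle months or years.
    """
    oldest = datetime.strptime(min_timestamp, "%Y-%m-%dT%H:%M:%S")
    cutoff = now - timedelta(**{unit: unit_count})
    return oldest < cutoff

# Hypothetical minimum @timestamp values per index (illustration only).
min_values = {
    "logstash-2017.09.19": "2017-09-19T00:00:03",
    "app-events": "2017-09-01T12:00:00",
}
now = datetime(2017, 9, 19, 16, 9, 4)

# With no pattern filter, every index whose oldest document is more than
# 10 minutes old is selected, including the index still being written to.
matched = [name for name, ts in min_values.items()
           if is_older(ts, now, 10, "minutes")]
print(matched)  # ['logstash-2017.09.19', 'app-events']

# Restricting to a prefix first keeps unrelated indices out of the list.
print([name for name in matched if name.startswith("logstash-")])
```

The point of the sketch is that the test is applied to the whole index, using its oldest document, so an index that has been receiving logs for more than 10 minutes always qualifies.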


(Deirdre Storck) #3

Really the goal was to delete all logs older than two weeks. I had looked into the age filter and tried this one I found:

        filters:
        - filtertype: pattern
          kind: prefix
          value: logstash-
          exclude: False
        - filtertype: age
          source: name
          direction: older
          timestring: '%Y.%m.%d'
          unit: minutes
          unit_count: 10
          exclude: False

but was running into the same problem. Based on what I have available in my logs, what would you suggest is the best way to achieve this goal?


(Aaron Mildenstein) #4

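The same thing applies here, though the age is derived from the timestamp in the index name instead. Any index whose name resolves to a time more than 10 minutes in the past would be deleted. With a timestring of '%Y.%m.%d', the name only carries a date, so (as I understand it) it is treated as the start of that day, and even today's index matches shortly after midnight.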
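A rough Python sketch of the name-based calculation, assuming the date in the index name is parsed with the timestring and compared against the current time. The function `name_age_matches` is hypothetical and only approximates what Curator does internally.

```python
from datetime import datetime, timedelta

def name_age_matches(index, prefix, timestring, now, max_age):
    """Return True if the date embedded in the index name is older than max_age."""
    # strptime fills in 00:00:00 when the timestring has no time component.
    stamp = datetime.strptime(index[len(prefix):], timestring)
    return stamp < now - max_age

now = datetime(2017, 9, 19, 16, 9, 4)

# Today's index is already "older than 10 minutes", because its name
# resolves to 2017-09-19 00:00:00.
print(name_age_matches("logstash-2017.09.19", "logstash-", "%Y.%m.%d",
                       now, timedelta(minutes=10)))  # True

# With a 14 day threshold, today's index is kept and an old one matches.
print(name_age_matches("logstash-2017.09.19", "logstash-", "%Y.%m.%d",
                       now, timedelta(days=14)))     # False
print(name_age_matches("logstash-2017.09.01", "logstash-", "%Y.%m.%d",
                       now, timedelta(days=14)))     # True
```

For daily indices, a unit smaller than days does not give any finer control, since the name carries no time of day.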

        filters:
        - filtertype: pattern
          kind: prefix
          value: logstash-
          exclude: False
        - filtertype: age
          source: field_stats
          direction: older
          unit: days
          unit_count: 14
          field: '@timestamp'
          stats_result: min_value

Something like this is more likely to be what you want. This example will delete logstash- prefixed indices whose oldest @timestamp value is more than 14 days old.


(Deirdre Storck) #5

This seems like it's working, thanks so much.


(Deirdre Storck) #6

Follow-up (I'm happy to create a new topic, but this seems related):
I got the filter working, and when I ran it yesterday, it deleted all logs from index logstash-2017.09.19. (I ran this on the 20th, so it deleted all the older logs perfectly.) However, it didn't seem to actually delete the index, as I noticed this morning when I ran /_cat/indices and saw:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2017.09.20 vQ06hfLuSRmrctDt-Ejkyg   5   1    2264044            0      3.4gb          1.7gb
green  open   logstash-2017.09.21 lHspQ9QoR7eAaX4H_IhTdg   5   1    3142314            0      5.9gb          2.9gb
green  open   logstash-2017.09.19 kuRXxBRLSJywe1Qi9DsguA   5   1          0            0      1.2kb           650b

This caused a problem for Curator, which is now failing with: Failed to complete action: delete_indices. <class 'curator.exceptions.ActionError'>: Field "@timestamp" not found in index "logstash-2017.09.19". Because of that error, nothing else will run. I have already tried including ignore_empty_list: True in my config options, but I'm still getting the same error.


(Aaron Mildenstein) #7

Curator cannot delete only documents and leave an index behind because it only makes use of the delete index API call (in this instance). Something else seems to have happened here. Without more information, I cannot explain what transpired. I can tell you, though, that some other process created that index, or something went wrong in your cluster and it attempted to recover from whatever that was, and tried to recreate the index, but only managed to make the structure (i.e., index, but no documents in it).

Do you make use of a process that creates indices on a time schedule?
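In the meantime, it may help to check whether any other zero-document indices exist, since an index with no documents has no @timestamp values for field_stats to report. A minimal Python sketch that scans the /_cat/indices output posted above; it assumes the output was requested with column headers and that all listed indices are open (closed indices have blank columns, which would break the simple whitespace split used here). The function name `empty_indices` is hypothetical.

```python
from typing import List

def empty_indices(cat_output: str) -> List[str]:
    """Return the names of indices whose docs.count column is 0."""
    lines = [ln for ln in cat_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("index")
    count_col = header.index("docs.count")
    empty = []
    for line in lines[1:]:
        cols = line.split()
        if int(cols[count_col]) == 0:
            empty.append(cols[name_col])
    return empty

cat_output = """
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2017.09.20 vQ06hfLuSRmrctDt-Ejkyg   5   1    2264044            0      3.4gb          1.7gb
green  open   logstash-2017.09.21 lHspQ9QoR7eAaX4H_IhTdg   5   1    3142314            0      5.9gb          2.9gb
green  open   logstash-2017.09.19 kuRXxBRLSJywe1Qi9DsguA   5   1          0            0      1.2kb           650b
"""

print(empty_indices(cat_output))  # ['logstash-2017.09.19']
```

Any index this reports would need to be removed or excluded by some other means before the field_stats based age filter can run cleanly.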

