Index rollover malfunctioning

Hi Team,

We run a CronJob for index rollover with the max_age condition set to 1d, as shown below, but it is not working as expected. Rollover does happen, but a new index is not created every day; some days are skipped.

We have not lost any data, but we have a 90-day retention period for the indices, and data is being kept for longer than that.

Please find below the spec of the cronjob

spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 2
      completions: 1
      parallelism: 1
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - args:
            - -c
            - |
              curl -X POST "${domain}:9200/${index}/_rollover?pretty" -H 'Content-Type: application/json' -d'
               {
                 "conditions": {
                   "max_age":   "1d"
                 }
               }
               '
            command:
            - /bin/sh
            env:
            - name: index
              value: events_alias
            - name: domain
              value: elastic-tenant-client.thirdparty.svc
            image: curlimages/curl:7.69.1
            imagePullPolicy: IfNotPresent
            name: events-rollover
            resources:
              limits:
                cpu: 100m
                memory: 36M
              requests:
                cpu: 10m
                memory: 10M
            securityContext:
              allowPrivilegeEscalation: false
              readOnlyRootFilesystem: true
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: 0 1 * * *
  successfulJobsHistoryLimit: 1

Below is the list of indices. The indices for the 24th and 27th are missing (there is also none for the 21st):

health status index                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   events-2022.07.19-000285 vTny2nFdRBmugW4tylVYVw   5   1   23997052         2382     96.9gb         48.4gb
green  open   events-2022.07.20-000286 ky_epNN1Sn-6ansPwe0SKQ   5   1   47890056         4435    192.1gb           96gb
green  open   events-2022.07.22-000287 Qp3LjyPTTFG-UTiRBm3nAw   5   1   22492827         1791     89.9gb         44.9gb
green  open   events-2022.07.23-000288 G0ISbSBZQ-2W24Cth7OV0Q   5   1   32661721         4225    129.6gb         64.8gb
green  open   events-2022.07.25-000289 FpBBsQMdT8aoSZvZefYGxA   5   1   23133754         2086       93gb         46.5gb
green  open   events-2022.07.26-000290 LM2tRPalSvOIIzL4RvjDsA   5   1   48011063         4448    193.1gb         96.6gb
green  open   events-2022.07.28-000291 YbQ7VfVXQxSW_wSnyNdokA   5   1    4768287         1442     21.3gb         10.4gb

Note: We have another CronJob that deletes indices once they are past the 90-day retention period.

If you want exactly one index per day, why are you using rollover? The whole point of rollover is to let Elasticsearch create new backing indices when required, based on age or size, which means giving up control over exactly when they are created. It would be a lot easier for you to generate the index name from the date (event time or current time) and write directly to that index instead of using rollover in this unnatural way.
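
As a rough illustration of the date-based naming suggested above, the sketch below builds a daily index name in POSIX sh. The `daily_index_name` helper is hypothetical, and the `events` prefix is an assumption borrowed from the index names in the listing; how documents actually get written to that index depends on your ingest pipeline.

```sh
#!/bin/sh
# Hypothetical helper: join a prefix and a date into a daily index name.
daily_index_name() {
  printf '%s-%s' "$1" "$2"
}

# Current UTC date in the same yyyy.MM.dd form as the existing index names.
today="$(date -u +%Y.%m.%d)"

# Index that today's events would be written to (prefix is an assumption).
target_index="$(daily_index_name events "$today")"
echo "$target_index"
```

With this approach there is no rollover job at all: each day's events land in an index whose name already carries the date.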

If you still want to keep doing this (which I do not recommend), you could set the max_age threshold to less than a day to force a rollover each time the job runs. That way there is no risk that the previous index was created slightly less than a day earlier, in which case the condition is not met and the rollover does not take place.
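
To make the timing issue concrete, here is a minimal sketch in POSIX sh. The 2-second delay and the 23h threshold are made-up values for illustration, and `would_roll` and `rollover_body` are hypothetical helpers, not part of Elasticsearch. It shows why an index created a moment after 01:00 is still younger than 1d when the job fires at 01:00 the next day, and what a request body with a shorter threshold could look like.

```sh
#!/bin/sh
# Hypothetical helper: succeed if an index of the given age (seconds)
# meets a max_age threshold (seconds).
would_roll() {
  [ "$1" -ge "$2" ]
}

# Hypothetical helper: build the rollover request body for a max_age value.
rollover_body() {
  printf '{"conditions":{"max_age":"%s"}}' "$1"
}

day=86400          # 1d in seconds
age=$((day - 2))   # assumed: index was created 2s after the previous run

if would_roll "$age" "$day"; then echo "1d: rollover"; else echo "1d: skipped"; fi
if would_roll "$age" $((23 * 3600)); then echo "23h: rollover"; else echo "23h: skipped"; fi

# Body that could be passed to curl -d in the CronJob instead of "1d".
rollover_body 23h
echo
```

With these assumed numbers the 1d check is skipped and the 23h check passes, which matches the pattern in the listing: a skipped day is followed by an index holding roughly two days of data.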
