Can't get ILM to work with filebeat / metricbeat 8.4

Dear all

I'm struggling to get ILM working on Filebeat and Metricbeat, and I have spent the last couple of days reading through the docs and blogs. I think I must be missing something basic, but I can't figure out what.

I would like to configure different index retention settings for our different environments (local, dev, test, production), and I use Docker. I tried to set up a local test configuration that should delete the indices after 1 - 2 days, but I can't get it working.

I'm not sure which configurations you need, so I have put what I think might be relevant at the end of the post.

I have the following ILM filebeat policy configured:

When I execute an ILM Explain it seems to manage the index:

GET filebeat-8.4.2/_ilm/explain

{
  "indices": {
    ".ds-filebeat-8.4.2-2022.09.26-000005": {
      "index": ".ds-filebeat-8.4.2-2022.09.26-000005",
      "managed": true,
      "policy": "filebeat",
      "index_creation_date_millis": 1664222026640,
      "time_since_index_creation": "22.6h",
      "lifecycle_date_millis": 1664222026640,
      "age": "22.6h",
      "phase": "hot",
      "phase_time_millis": 1664222027781,
      "action": "rollover",
      "action_time_millis": 1664222028381,
      "step": "check-rollover-ready",
      "step_time_millis": 1664222028381,
      "phase_execution": {
        "policy": "filebeat",
        "phase_definition": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_primary_shard_size": "50gb",
              "max_age": "1d"
            }
          }
        },
        "version": 3,
        "modified_date_in_millis": 1663961698491
      }
    },
    ".ds-filebeat-8.4.2-2022.09.25-000003": {
      "index": ".ds-filebeat-8.4.2-2022.09.25-000003",
      "managed": true,
      "policy": "filebeat",
      "index_creation_date_millis": 1664135025745,
      "time_since_index_creation": "1.94d",
      "lifecycle_date_millis": 1664222026578,
      "age": "22.6h",
      "phase": "hot",
      "phase_time_millis": 1664135026721,
      "action": "complete",
      "action_time_millis": 1664222028381,
      "step": "complete",
      "step_time_millis": 1664222028381,
      "phase_execution": {
        "policy": "filebeat",
        "phase_definition": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_primary_shard_size": "50gb",
              "max_age": "1d"
            }
          }
        },
        "version": 3,
        "modified_date_in_millis": 1663961698491
      }
    }
  }
}

But when I check my indices, they do not seem to get deleted:

I would greatly appreciate any help in understanding what I am doing wrong or misunderstanding.

Kindly
Tom

Filebeat agent:

filebeat.inputs:

filebeat.autodiscover:
  # List of enabled autodiscover providers
  providers:
    - type: docker
      hints.enabled: true

# Enable filebeat config reloading
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: true


processors:
  - add_docker_metadata:
      host: "unix:///var/run/docker.sock"

  # The following example enriches each event with host metadata.
  - add_host_metadata: ~

  - add_process_metadata:
      match_pids: [ "system.process.ppid" ]
      target: system.process.parent

  - decode_json_fields:
      fields: [ "message" ]
      target: "json"
      overwrite_keys: true

output.elasticsearch:
  indices:
    - index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"

  # Boolean flag to enable or disable the output module.
  enabled: true

  hosts: ${ELK_MONITORING_ELASTIC_HOSTS}
  username: ${ELK_MONITORING_ELASTIC_USER}
  password: ${ELK_MONITORING_ELASTIC_USER_PASSWORD}
  ssl:
    enabled: ${ELK_MONITORING_ELASTIC_SSL_ENABLED}
    verification_mode: certificate
    certificate_authorities: ${ELK_MONITORING_ELASTIC_SSL_CERTIFICATEAUTHORITIES}

setup.dashboards.enabled: true

setup.dashboards.retry.enabled: true

# Duration interval between Kibana connection retries.
setup.dashboards.retry.interval: 60

# Maximum number of retries before exiting with an error, 0 for unlimited retrying.
setup.dashboards.retry.maximum: 0

# ====================== Index Lifecycle Management (ILM) ======================

setup.ilm.enabled: true

#setup.ilm.policy_name: ${ELK_MONITORING_FILEBEAT_ILM_POLICY_NAME}
setup.ilm.check_exists: true
setup.ilm.overwrite: false

setup.kibana:

  host: "${ELK_MONITORING_KIBANA_URL}"
  username: "${ELK_MONITORING_KIBANA_USERNAME}"
  password: "${ELK_MONITORING_KIBANA_USER_PASSWORD}"
  ssl:
    enabled: ${ELK_MONITORING_KIBANA_SSL_ENABLED}
    verification_mode: certificate
    certificate_authorities: ${ELK_MONITORING_KIBANA_SSL_CERTIFICATEAUTHORITIES}

# ================================== Logging ===================================
logging.json: true

logging.metrics.enabled: true

logging.selectors: [ "*" ]

logging.to_files: true
logging.level: ${ELK_MONITORING_FILEBEAT_LOG_LEVEL}

monitoring.enabled: true
monitoring.elasticsearch:

Docker Compose for Filebeat:

  filebeat-agent:
    image: docker.elastic.co/beats/filebeat:8.4.2
    container_name: filebeat-agent
    labels:
      # specify the monitoring labels we want
      - "co.elastic.metrics/module=docker"
      - "co.elastic.metrics/hosts='unix:///var/run/docker.sock'"
      - "co.elastic.logs/enabled=true"

    # must be root to access container logs and /var/log
    user: root
    depends_on:
      elkMonitoringKib01:
        condition: service_healthy
      elkMonitoring01:
        condition: service_healthy
    healthcheck:
      disable: true
    environment:
      - ELK_MONITORING_KIBANA_URL=${ELK_MONITORING_KIBANA_URL}
      - ELK_MONITORING_KIBANA_USERNAME=elastic
      - ELK_MONITORING_KIBANA_USER_PASSWORD=${ELK_MONITORING_ELASTIC_USER_PASSWORD:?Please define the variable ELK_MONITORING_ELASTIC_USER_PASSWORD in your secrets file}
      - ELK_MONITORING_KIBANA_SSL_ENABLED=true
      - ELK_MONITORING_KIBANA_SSL_CERTIFICATEAUTHORITIES=${ELK_MONITORING_LIVIT_CA}

      - ELK_MONITORING_ELASTIC_HOSTS=${ELK_MONITORING_ELASTIC_HOSTS}
      - ELK_MONITORING_ELASTIC_USER=elastic
      - ELK_MONITORING_ELASTIC_USER_PASSWORD=${ELK_MONITORING_ELASTIC_USER_PASSWORD:?Please define the variable ELK_MONITORING_ELASTIC_USER_PASSWORD in your secrets file}
      - ELK_MONITORING_ELASTIC_SSL_ENABLED=true
      - ELK_MONITORING_ELASTIC_SSL_CERTIFICATEAUTHORITIES=${ELK_MONITORING_ELASTIC_CA}

      - ELK_MONITORING_FILEBEAT_LOG_LEVEL=${ELK_MONITORING_FILEBEAT_LOG_LEVEL}
      # Note: ILM is currently disabled as it was not yet successfully tested
      - ELK_MONITORING_FILEBEAT_ILM_POLICY_NAME=${ELK_MONITORING_FILEBEAT_ILM_POLICY_NAME}
    volumes:
      - ./config/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml
      - ./config/filebeat/modules.d:/usr/share/filebeat/modules.d
      - ./config/filebeat/${ELK_MONITORING_STAGE_SPECIFIC_CERT_FOLDER}:/usr/share/filebeat/certs
      # required to access docker socket for api's
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # keep transaction and processing stats in volume so they outlive container
      - fileBeatData:/usr/share/filebeat/data:rw
      - ${ELK_MONITORING_LOCAL_LOG_MOUNT}/filebeatLogs:/usr/share/filebeat/logs

      # This is needed for filebeat to load logs for system and auth modules
      - /var/log/:/var/log/:ro

      # This is needed for filebeat to load logs for auditd module. you might have to install audit system
      # on ubuntu first (sudo apt-get install -y auditd audispd-plugins)
      - /var/log/audit/:/var/log/audit/:ro

      # This is needed for filebeat to load container log path as specified in filebeat.yml
      - /var/lib/docker/containers/:/var/lib/docker/containers/:ro
    command: --strict.perms=false

Docker Compose for Elasticsearch:

  elkMonitoring01:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.4.2
    container_name: elkMonitoring01
    labels:
      # specify the monitoring labels we want
      - "co.elastic.metrics/module=elasticsearch"
      - "co.elastic.metrics/module=docker"
      - "co.elastic.metrics/hosts='unix:///var/run/docker.sock'"
      - "co.elastic.logs/enabled=true"
      - "co.elastic.logs/module=elasticsearch"
    healthcheck:
      test: [ "CMD-SHELL", "curl --silent --fail --cacert config/${ELK_MONITORING_ELASTIC_CA} -u elastic:${ELK_MONITORING_ELASTIC_USER_PASSWORD} https://elkMonitoring01:9200/_cluster/health || exit 1" ]
      interval: 30s
      timeout: 30s
      start_period: 60s
      retries: 3
    environment:
      - cluster.initial_master_nodes=elkMonitoring01,elkMonitoring02
      - discovery.seed_hosts=elkMonitoring02
      - node.name=elkMonitoring01
      - cluster.name=monitoring-cluster
      - bootstrap.memory_lock=true
      - path.repo=${ELK_MONITORING_SNAPSHOT_DIRECTORY}

      # stack monitoring enabled - can only be enabled on stack (with basic license)
      - xpack.monitoring.collection.enabled=true
      - xpack.monitoring.history.duration=${ELK_MONITORING_MONITORING_RETENTION}

      # Security
      - xpack.security.enabled=true

      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.http.ssl.key=${ELK_MONITORING_ELASTIC_NODE1_KEY}
      - xpack.security.http.ssl.key_passphrase=${ELK_MONITORING_ELASTIC_NODE1_PASSWORD:?Please define the variable ELK_MONITORING_ELASTIC_NODE1_PASSWORD in your secrets file}
      - xpack.security.http.ssl.certificate=${ELK_MONITORING_ELASTIC_NODE1_CERTIFICATE}
      - xpack.security.http.ssl.certificate_authorities=${ELK_MONITORING_ELASTIC_CA}

      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.key=${ELK_MONITORING_ELASTIC_NODE1_KEY}
      - xpack.security.transport.ssl.key_passphrase=${ELK_MONITORING_ELASTIC_NODE1_PASSWORD:?Please define the variable ELK_MONITORING_ELASTIC_NODE1_PASSWORD in your secrets file}
      - xpack.security.transport.ssl.certificate=${ELK_MONITORING_ELASTIC_NODE1_CERTIFICATE}
      - xpack.security.transport.ssl.certificate_authorities=${ELK_MONITORING_ELASTIC_CA}

      # General options
      - ES_LOG_STYLE=${ES_LOG_STYLE}
      - ES_JAVA_OPTS=${ES_JAVA_OPTS}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - ./config/elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./config/elasticsearch/${ELK_MONITORING_STAGE_SPECIFIC_CERT_FOLDER}:/usr/share/elasticsearch/config/certs/
      # keep elastic data in volume to outlive container
      - ${ELK_MONITORING_LOCAL_DATA01}:/usr/share/elasticsearch/data
      - ${ELK_MONITORING_LOCAL_LOG_MOUNT}/elkMonitoring01Logs:/usr/share/elasticsearch/logs
      - ${ELK_MONITORING_LOCAL_SNAPSHOT_DIRECTORY}:${ELK_MONITORING_SNAPSHOT_DIRECTORY}
    ports:
      - 9300:9200

Filebeat log (all entries matching ILM):

{"log.level":"info","@timestamp":"2022-09-27T18:14:18.254Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":231},"message":"Auto ILM enable success.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.258Z","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":118},"message":"ILM policy filebeat exists already.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.258Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":366},"message":"Set settings.index.lifecycle.name in template to {filebeat {\"policy\":{\"phases\":{\"hot\":{\"actions\":{\"rollover\":{\"max_age\":\"30d\",\"max_primary_shard_size\":\"50gb\"}}}}}}} as ILM is enabled.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.282Z","log.logger":"template_loader","log.origin":{"file.name":"template/load.go","file.line":115},"message":"Template \"filebeat-8.4.2\" already exists and will not be overwritten.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.282Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":267},"message":"Loaded index template.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.356Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":147},"message":"Connection to backoff(elasticsearch(https://elkMonitoring01:9200)) established","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.363Z","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":291},"message":"Attempting to connect to Elasticsearch version 8.4.2","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.363Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":231},"message":"Auto ILM enable success.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.368Z","log.logger":"index-management.ilm","log.origin":{"file.name":"ilm/std.go","file.line":118},"message":"ILM policy filebeat exists already.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.368Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":366},"message":"Set settings.index.lifecycle.name in template to {filebeat {\"policy\":{\"phases\":{\"hot\":{\"actions\":{\"rollover\":{\"max_age\":\"30d\",\"max_primary_shard_size\":\"50gb\"}}}}}}} as ILM is enabled.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.407Z","log.logger":"template_loader","log.origin":{"file.name":"template/load.go","file.line":115},"message":"Template \"filebeat-8.4.2\" already exists and will not be overwritten.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-09-27T18:14:18.407Z","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":267},"message":"Loaded index template.","service.name":"filebeat","ecs.version":"1.6.0"}

Hi @thomas_huesler,

Thanks for providing such detailed information. Your setup looks fine to me at first glance, with one exception: overriding the output index name as you do in

output.elasticsearch:
  indices:
    - index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"

would interfere with the ILM mechanism, which relies on managing the read/write indices behind an alias. Events written to an explicitly named daily index bypass that managed write target. Could you try removing the indices override from your Filebeat config?
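
For reference, a minimal sketch of how the relevant part of filebeat.yml might look with the override removed. It assumes everything else stays exactly as posted; all setting names and environment variables are taken from the original config, and nothing new is introduced:

```yaml
# Sketch only: the posted output section without the "indices" override,
# so Filebeat writes to its default ILM-managed target.
output.elasticsearch:
  enabled: true
  hosts: ${ELK_MONITORING_ELASTIC_HOSTS}
  username: ${ELK_MONITORING_ELASTIC_USER}
  password: ${ELK_MONITORING_ELASTIC_USER_PASSWORD}
  ssl:
    enabled: ${ELK_MONITORING_ELASTIC_SSL_ENABLED}
    verification_mode: certificate
    certificate_authorities: ${ELK_MONITORING_ELASTIC_SSL_CERTIFICATEAUTHORITIES}

# ILM settings unchanged from the original post.
setup.ilm.enabled: true
setup.ilm.check_exists: true
setup.ilm.overwrite: false
```

After restarting Filebeat, the same GET filebeat-8.4.2/_ilm/explain request used above should show whether newly created indices are picked up by the policy.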

Dear weltenwort

Thank you very much for your help. I changed the configuration (removing the indices element) as you suggested, then tested and monitored the behavior for several days.

It works like a charm now.

Thanks very much and kind regards,
tom