Index Lifecycle Management "does not point to index" error

Long story short, I had to set setup.ilm.enabled: false to stop my beats from getting 404s when pushing data to my Elasticsearch stack, which sits behind an Apache reverse proxy. (Apache is set up to use LDAP auth, which is why I don't just use the basic security built into ES.)

So I have attempted to manually configure ILM.

After creating policies and telling my indices to use them, they all complain with errors like this:

Index lifecycle error
illegal_argument_exception: index.lifecycle.rollover_alias [filebeat] does not point to index [filebeat-7.3.2-2019.11.06]

This is my Filebeat policy:

PUT _ilm/policy/filebeat-lane-custom
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_size": "50gb"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "actions": {
          "allocate": {
            "number_of_replicas": 1,
            "include": {},
            "exclude": {}
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0,
            "include": {},
            "exclude": {}
          },
          "freeze": {},
          "set_priority": {
            "priority": 0
          }
        }
      }
    }
  }
}

While writing this post, I noticed that I had not created a filebeat-* template, but I had done so for metricbeat. Looking at some of my metricbeat indices, they have this error:

Index lifecycle error
illegal_argument_exception: Rollover alias [metricbeat] can point to multiple indices, found duplicated alias [[metricbeat]] in index template [metricbeat]

My metricbeat template can be found here: https://gist.github.com/jerrac/be045f7c2cc6de4c564fca4cb875243a

My metricbeat policy is the same as filebeat, just replace "file" with "metric".

I've read most of the docs and am working my way through them again now, but I think everything boils down to these steps:

  • Create a life cycle policy.
  • Initialize a -00001 index for auto incrementing.
  • Create a template for your index pattern, like filebeat-*.
  • Apply the policy to existing indices via index.lifecycle.name and index.lifecycle.rollover_alias
    • In my case I set name to the name of the policy, and rollover_alias to the first part of the incrementing index.

I'm obviously missing something. Anyone have a clue? A section of the docs I haven't grasped yet?

Is it possible to have more than one index template for a single index? As in the template managed by filebeat, plus a template that applies only the lifecycle settings? I tried that just now for filebeat, but it doesn't seem to have changed anything yet. I'm not sure if that's just because it takes time for it to be triggered, or something else...

You need to remove the alias from the index template, ie. remove this section:

  "aliases": {
    "filebeat": {}
  },

You provide the alias when you kickstart ILM, by manually creating the first index. You also need to specify that this index is the write index for the alias:

PUT filebeat-1
{
  "aliases": {
    "filebeat": {
      "is_write_index": true
    }
  }
}

The write index will automatically be switched to the new index when the index rolls over.
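
If you would rather script the bootstrap step than paste it into the console, a minimal sketch could look like the following. It only builds the request and does not send it. The URL (a local node on port 9200) and the index name filebeat-000001 are assumptions for illustration; the name ends in a number so that later rollovers can increment it.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local, unauthenticated node


def bootstrap_request(alias: str, first_index: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that creates the first index and
    registers it as the write index for the rollover alias."""
    body = {"aliases": {alias: {"is_write_index": True}}}
    return urllib.request.Request(
        url=f"{ES_URL}/{first_index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = bootstrap_request("filebeat", "filebeat-000001")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Behind a reverse proxy with authentication, the request would also need whatever credentials the proxy expects, which this sketch leaves out.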

I removed the aliases from my custom templates, though I'm not clear on why, or what that does.

Also, um, how will this interact with the templates <file/metric/journal/etc>beat create?

Each time there is a new version of the beat, it creates a new index template for that version. This is something I'm pretty sure I want, since I don't want to manually update mappings and other settings every time I upgrade my beats...

The beats also create new version and date based indices each day. Like filebeat-7.3.2-2019.11.06.

Will this template make the version and date based indices get rolled into filebeat-0000N after 30 days?

PUT _template/filebeat-lane-custom
{
  "index_patterns": [
    "filebeat-*"
  ],
  "settings": {
    "index": {
      "lifecycle": {
        "name": "filebeat-lane-custom",
        "rollover_alias": "filebeat"
      }
    }
  }
}

Or will the version/date based indices require a version/date based rollover? Like filebeat-7.4.2-2019.11.27-000001?

That seems to be what this error is indicating:

illegal_argument_exception: index.lifecycle.rollover_alias [filebeat] does not point to index [filebeat-7.4.2-2019.11.27]
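
One way to see where this error probably comes from: the custom template's pattern filebeat-* matches the daily indices as well as the numbered ones, so a daily index picks up rollover_alias: filebeat when it is created, even though the alias never points at it. The sketch below is a simplified wildcard matcher (only `*` is treated as special, which is an assumption about how template patterns behave), and the alias membership is taken from the alias listing later in this thread.

```python
import re


def matches(pattern: str, index_name: str) -> bool:
    """Simple wildcard match where only '*' is special (a simplification
    of how index template patterns are matched)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


template_pattern = "filebeat-*"
alias_members = {"filebeat-000001"}  # indices the "filebeat" alias points to

for name in ["filebeat-000001", "filebeat-7.4.2-2019.11.27"]:
    gets_settings = matches(template_pattern, name)
    in_alias = name in alias_members
    print(name, "| template applies:", gets_settings, "| in alias:", in_alias)
```

An index where the template applies but the alias does not point to it is exactly the combination the error message complains about.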

Which all makes me think I need to be telling beats to use the beat-00000N indices in the first place, but how do they know which one to use? And what happens if I end up with beat version 7.5.0 on one server, and beat version 7.6.0 on another?

Ok, all that makes me think I need to get a better handle on how indices actually work. So I'm going to sleep on it and find time tomorrow to read the docs. Any pointers in the meantime would be appreciated. :slight_smile:

Lots of great questions. Let me try to give you some pointers.

ILM is typically used in combination with rollovers. The rollover API causes new indexes to be created continuously. Initially, all of your data goes into an index called something like logs-000001. After that index reaches a certain age or a certain size, it "rolls over" to a new index logs-000002 and all new data will go to that new index. After a while, that index will roll over to a new index logs-000003, and so on.
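
The naming rule can be sketched in a few lines. This is a minimal illustration, not Elasticsearch's own code: when no explicit target name is given, rollover expects the current name to end in a dash and a number, increments that number, and pads it to six digits.

```python
import re


def next_rollover_name(index_name: str) -> str:
    """Return the name a rollover would generate when no explicit target
    name is given: increment the trailing number, zero-padded to 6 digits."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError(f"{index_name!r} does not end in '-<number>'")
    prefix, counter = match.groups()
    return f"{prefix}{int(counter) + 1:06d}"


print(next_rollover_name("logs-000001"))  # logs-000002
print(next_rollover_name("filebeat-1"))   # filebeat-000002
```

A daily name such as filebeat-7.3.2-2019.11.06 does not fit this shape, so the function raises instead of guessing a successor.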

Now, your applications (Beats) do not need to know what the name of the current active index is. That's because all these indexes will sit behind an alias (call it logs-alias). When querying the data, all you need to query is that alias, and the query will automatically hit the underlying indexes. When you write data, you write the data to that alias, and the data will end up in the active index. This works because the alias will have one "write index", which is always pointing to the current active index. When an index rolls over from my_index-000002 to my_index-000003, the write index is automatically adjusted.
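
As a toy model of that behaviour (the class and method names here are made up for illustration and are not an Elasticsearch API), an alias can be thought of as a list of indices plus a pointer to the one that receives writes:

```python
from dataclasses import dataclass, field


@dataclass
class Alias:
    """Toy model of an alias: the indices behind it and the one write index."""
    name: str
    indices: list = field(default_factory=list)
    write_index: str = ""

    def rollover(self, new_index: str) -> None:
        # The new index joins the alias and becomes the only write target.
        self.indices.append(new_index)
        self.write_index = new_index

    def target_for_write(self) -> str:
        # Writes sent to the alias land in the current write index.
        return self.write_index

    def targets_for_search(self) -> list:
        # Searches sent to the alias fan out to every index behind it.
        return list(self.indices)


logs = Alias("logs-alias", indices=["my_index-000001"], write_index="my_index-000001")
logs.rollover("my_index-000002")
logs.rollover("my_index-000003")
print(logs.target_for_write())    # my_index-000003
print(logs.targets_for_search())  # all three indices
```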

The error you were seeing has to do with the alias. It was not properly set up: no write index had been configured. This would cause problems indexing the data, because Elasticsearch would not know what index to write the documents to. It also caused problems with the rollover API. An index can only roll over if it is the current write index for an alias that's pointing to multiple indexes.
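
A rough sketch of the kind of check that fails here, under simplifying assumptions: alias membership is modelled as a plain dict, the write flag is always an explicit boolean, and the wording of the second message is illustrative rather than quoted from this thread.

```python
def check_rollover_ready(index: str, rollover_alias: str, aliases: dict) -> str:
    """Toy version of the sanity checks described above.

    `aliases` maps alias name -> {index name: is_write_index flag}.
    Returns "ok" or a message modelled on the errors quoted in this thread.
    """
    members = aliases.get(rollover_alias, {})
    if index not in members:
        return (f"index.lifecycle.rollover_alias [{rollover_alias}] "
                f"does not point to index [{index}]")
    if not members[index]:
        # Illustrative wording for the "not the write index" case.
        return f"index [{index}] is not the write index for alias [{rollover_alias}]"
    return "ok"


aliases = {"filebeat": {"filebeat-000001": True}}
print(check_rollover_ready("filebeat-000001", "filebeat", aliases))
print(check_rollover_ready("filebeat-7.3.2-2019.11.06", "filebeat", aliases))
```

The second call reproduces the message from the original question, because the daily index carries the rollover_alias setting but is not a member of the alias.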

To prevent those problems, don't configure aliases in the index template. It's fine to use index templates for settings and mappings, but not for the alias. You configure the alias when you create the first index. After that, ILM will take care of adding new indexes to the alias and flipping the write index for you.
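
If the template body is kept as JSON in a file or script, dropping the alias section before uploading it is a small transformation. A minimal sketch, with a hypothetical template body shaped like the ones in this thread:

```python
import copy


def without_aliases(template: dict) -> dict:
    """Return a copy of an index template body with its "aliases" section removed."""
    cleaned = copy.deepcopy(template)
    cleaned.pop("aliases", None)
    return cleaned


# Hypothetical template body for illustration.
template = {
    "index_patterns": ["filebeat-*"],
    "settings": {
        "index": {
            "lifecycle": {"name": "filebeat-lane-custom", "rollover_alias": "filebeat"}
        }
    },
    "aliases": {"filebeat": {}},
}

print(sorted(without_aliases(template)))  # ['index_patterns', 'settings']
```

The settings (including the lifecycle block) stay in the template; only the alias definition goes away.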

Thank you @abdon for the detailed explanation.
But I have seen a feature to skip rollover and just manage index retention with lifecycles.
Here is the link

I have been trying to use the skip rollover action, but ILM still tries to roll over even after this setting is configured.
I have a post on this

Can you please help with this?

Thank you for the clarification on how the beats should interact with the alias rather than the actual index name. One question, should I reconfigure my beats to use <beatname> instead of <beatname>-<version>-<date>?

So, set something like:

output.elasticsearch:
  index: "metricbeat"

instead of

output.elasticsearch:
  index: "metricbeat-%{[agent.version]}-%{+yyyy.MM.dd}"

If so, how will the beats and Elasticsearch deal with different versions? Say one VM has 7.5.0 and another has 7.4.2 of metricbeat?
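
For reference, this is roughly how the date-based setting expands. It is a sketch of the string substitution only, not of how Beats implements it, and the function name is made up. With that pattern, two beat versions running on the same day write to two different indices:

```python
from datetime import date


def daily_index_name(beat: str, version: str, day: date) -> str:
    """Expand "<beat>-%{[agent.version]}-%{+yyyy.MM.dd}" for one event."""
    return f"{beat}-{version}-{day.strftime('%Y.%m.%d')}"


day = date(2019, 11, 27)
print(daily_index_name("metricbeat", "7.4.2", day))  # metricbeat-7.4.2-2019.11.27
print(daily_index_name("metricbeat", "7.5.0", day))  # metricbeat-7.5.0-2019.11.27
```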

-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Ok, here is where I'm at now:

I think part of my confusion was that I was trying to apply a new ILM policy to existing data. I thought applying the policy would tell Elasticsearch to move data from <beat>-<version>-<date> indices into the new <beat>-0000N index.

A more careful reading of https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-with-existing-periodic-indices.html revealed that that is not what happens.

I did try to follow https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-reindexing-into-rollover.html but I don't think I was doing a very good job...

So I followed your suggestion and removed the alias from the template. One interesting thing I discovered was that a bunch of my daily indices were attached to the <beat> alias. So I went ahead and removed those alias entries. Then I also recreated my <beat>-0000N indices.

Next I followed https://www.elastic.co/guide/en/elasticsearch/reference/current/ilm-with-existing-periodic-indices.html to deal with my existing indices. I copied my custom ILM policy, removed the rollover section, and saved it with -existing on the end. From there I updated all my current indices to use it via:

curl -X PUT "localhost:9200/metricbeat-7.4.3-*/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index": {
    "lifecycle": {
      "name": "metricbeat-lane-custom-existing",
      "rollover_alias": null
    }
  }
}
'

(Modified to select smaller numbers of indices at a time and whichever beat I needed to target.)
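
The two manual steps described above (copying the policy without its rollover action, and building the settings body that clears rollover_alias) could be scripted along these lines. This is a sketch with made-up function names; it only produces the JSON and leaves sending it to curl or the console.

```python
import copy
import json


def without_rollover(policy_body: dict) -> dict:
    """Copy an ILM policy body and drop the rollover action from the hot phase."""
    derived = copy.deepcopy(policy_body)
    hot_actions = derived["policy"]["phases"].get("hot", {}).get("actions", {})
    hot_actions.pop("rollover", None)
    return derived


def existing_index_settings(policy_name: str) -> str:
    """JSON body that assigns a policy and clears rollover_alias (None -> null)."""
    return json.dumps(
        {"index": {"lifecycle": {"name": policy_name, "rollover_alias": None}}}
    )


# Hot phase of the custom policy from earlier in the thread.
policy = {
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "rollover": {"max_age": "30d", "max_size": "50gb"},
                    "set_priority": {"priority": 100},
                },
            }
        }
    }
}

print(json.dumps(without_rollover(policy), indent=2))
print(existing_index_settings("metricbeat-lane-custom-existing"))
```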

With those changes, the number of indices with lifecycle errors has changed.

Hopefully all the errors will go away now. I just have to wait for all the tasks triggered by applying the policy to finish up. :slight_smile: Then I'll check for errors again.

There are no more pending tasks, but I still have that life cycle error on 540 indices.

When viewing the filebeat-7.4.2-2019.11.27 index in Kibana, the ILM section looks like this:

Index lifecycle management
Index lifecycle error
illegal_argument_exception: index.lifecycle.rollover_alias [filebeat] does not point to index [filebeat-7.4.2-2019.11.27]

Lifecycle policy
    filebeat-lane-custom-existing
Current action
    rollover
Failed step
    check-rollover-ready

Current phase
    hot
Current action time
    2019-11-27 14:44:15

That specific index has this in its settings:

"index.lifecycle.name": "filebeat-lane-custom-existing",

It doesn't have any rollover_alias or other alias.

The only aliases that exist:

.kibana              .kibana_3              - - - -
.kibana_task_manager .kibana_task_manager_2 - - - -
metricbeat           metricbeat-000001      - - - true
heartbeat            heartbeat-000001       - - - true
filebeat             filebeat-000001        - - - true
journalbeat          journalbeat-000001     - - - true
logstash             logstash-000001        - - - true
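
A listing like this is easy to check mechanically. The sketch below assumes six whitespace-separated columns with is_write_index last, as in the output above, and returns the write index per alias. The function name is made up for illustration.

```python
CAT_ALIASES = """\
.kibana              .kibana_3              - - - -
.kibana_task_manager .kibana_task_manager_2 - - - -
metricbeat           metricbeat-000001      - - - true
heartbeat            heartbeat-000001       - - - true
filebeat             filebeat-000001        - - - true
journalbeat          journalbeat-000001     - - - true
logstash             logstash-000001        - - - true
"""


def write_indices(cat_output: str) -> dict:
    """Map alias -> write index from `_cat/aliases`-style text.

    Assumes six whitespace-separated columns with the last one being
    is_write_index, as in the listing above.
    """
    result = {}
    for line in cat_output.splitlines():
        columns = line.split()
        if len(columns) != 6:
            continue  # skip blank or unexpected lines
        alias, index, *_rest, is_write = columns
        if is_write == "true":
            result[alias] = index
    return result


print(write_indices(CAT_ALIASES)["filebeat"])  # filebeat-000001
```

None of the daily indices appear in the result, which is consistent with the daily index having no alias at all.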

This is the filebeat-lane-custom-existing policy:

PUT _ilm/policy/filebeat-lane-custom-existing
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "actions": {
          "allocate": {
            "number_of_replicas": 1,
            "include": {},
            "exclude": {}
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0,
            "include": {},
            "exclude": {}
          },
          "freeze": {},
          "set_priority": {
            "priority": 0
          }
        }
      }
    }
  }
}

Um, on the specific index I listed above, I see this under the error message: Current action time 2019-11-27 14:44:15

Is that saying that the index hasn't tried to apply an ILM policy since 11/27?

There is one other lingering issue that I don't think is actually related to the life cycle error message, but I figured I'd better mention it. Just in case.

When viewing my policy in Kibana, I see this message:

No node attributes configured in elasticsearch.yml
You can't control shard allocation without node attributes.

I'm not sure what to do about that message since I'm running a test cluster on a single desktop via Docker Compose. My nodes look something like the following in my docker-compose.yml file.

  esnode3:
    container_name: esnode3
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
    environment:
      - cluster.name=reagan3-cluster
      - node.name=esnode3
      - node.master=true
      - discovery.seed_hosts=esnode1
      - cluster.initial_master_nodes=esnode1,esnode2,esnode3
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms6000m -Xmx6000m"
      - http.cors.enabled=true
      - http.cors.allow-origin="*"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - r3_cluster_esdata3:/usr/share/elasticsearch/data
      - r3_cluster_snapshots:/opt/elasticsearch/snapshots
    healthcheck:
      test: ["CMD", "curl","-s" ,"-f", "http://localhost:9200/_cat/health"]
    ports:
      - 127.0.0.1:9203:9200
    networks:
      - elknet
    restart: always

As you can see, I'm only using environment variables to configure ES. If I recall correctly, when I first saw that message, I thought it had something to do with not configuring node.master, but as you can see, adding node.master to the environment section didn't help...

Anyway, I think I'm stumped on this for the day. So I'll end here.