ILM and alias confusion

Hi,

I have a 5 node cluster. 3 ingestion nodes, 1 coordinator node and a cold node. All ingestion goes via my Logstash server and the output of each index is dated as so:

elasticsearch {
    index => "firewall-traffic-%{+YYYY.MM.dd}"
    hosts => ["10.10.100.1", "10.10.100.2", "10.10.100.3"]
}

Everything works and has been for some time. I'm now looking at moving indices to the cold storage node. As I'm running 7.4.0 across the board, I'm using Kibana's index management features to set up ILM. So using the above example, I have an index named:

firewall-traffic-2019.11.20

I've set up a few index templates, trying to keep the tasks separate for ease of altering anything in the future. So I have the default wildcard (*) index template that applies to all incoming indices and sets up the number of replicas as so:

{
  "index": {
    "number_of_replicas": "2"
  }
}

This works fine and allocates the index to 3 ingest nodes.
The next in the processing line is the index template that ensures the indices are placed on the hot nodes and not the cold node. This template is applied to each index (I add new ones as I find something new to ingest):

{
  "index": {
    "routing": {
      "allocation": {
        "include": {
          "data": "hot"
        }
      }
    }
  }
}

Again, this works as expected.

Next is my index template that I have set up specifically for this firewall index so that I can apply specific settings to this index. Currently it has no settings/mappings applied to it.

Now I set up my ILM with a rollover at 50GB or 7 days. I skip the warm phase and set up the Cold phase, assign the cold node and ensure it is pointing at the cold node. Number of replicas are reduced to zero and timing for cold phase is set to 1 hour from rollover. Finally, deletion is set for 365 days after rollover. I name it Hot2Cold
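
For reference, the policy described above can be written out as the JSON body that a `PUT _ilm/policy/<name>` request would take. This is only a sketch in Python that builds and prints that body. The function name is made up for the example, the attribute value `cold` is an assumption (only `data: hot` appears in the templates above), and `include` is used for the cold allocation on the assumption that it should replace the `include` filter set by the hot routing template.

```python
import json


def build_hot2cold_policy(max_size="50gb", max_age="7d",
                          cold_after="1h", delete_after="365d"):
    """Build an ILM policy body: rollover in hot, move to cold, then delete."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either condition is met.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "cold": {
                    # For rolled-over indices, min_age counts from rollover.
                    "min_age": cold_after,
                    "actions": {
                        "allocate": {
                            "number_of_replicas": 0,
                            # Assumption: the cold node is tagged data=cold.
                            "include": {"data": "cold"},
                        }
                    },
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_hot2cold_policy(), indent=2))
```

Calling `build_hot2cold_policy(max_size="20mb")` gives the smaller test variant of the same policy.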

Now this is where I'm struggling. I open up the ILM list and assign the ILM policy Hot2Cold to the firewall-traffic index template. I'm asked for the alias for rollover index - I've tried all sorts here to get this to work but for the sake of explaining the problem, I enter "firewall_rollover".
This changes the settings of the firewall-traffic index template to:

{
  "index": {
    "lifecycle": {
      "name": "50GB_move_to_cold",
      "rollover_alias": "firewall_rollover"
    }
  }
}

Now I'm expecting to see the index moved to the cold node after it hits 50GB or is 7 days old with the name "firewall_rollover" appended to it somewhere. Instead, within the 10 minute default timeframe for Elastic checking the index to see if it needs moving, I get the error message:

illegal_argument_exception: index.lifecycle.rollover_alias [firewall-traffic] does not point to index [firewall-traffic-2019.11.20]

So, from this error message, I'm assuming it, as it says, is missing an alias so I check:

GET _alias

This shows the following:

{
  ".kibana_1" : {
    "aliases" : {
      ".kibana" : { }
    }
  },
  "userinfo" : {
    "aliases" : { }
  },
  ".kibana_task_manager_1" : {
    "aliases" : {
      ".kibana_task_manager" : { }
    }
  },
  ".apm-agent-configuration" : {
    "aliases" : { }
  },
  ".monitoring-kibana-7-2019.11.20" : {
    "aliases" : { }
  },
  "firewall-traffic-2019.11.20" : {
    "aliases" : { }
  },
  ".monitoring-es-7-2019.11.20" : {
    "aliases" : { }
  },
  ".monitoring-logstash-7-2019.11.20" : {
    "aliases" : { }
  },
  "nps-2019.11.20" : {
    "aliases" : { }
  }
}

So clearly the alias I have given has not been added in Elasticsearch and tied to the "firewall-traffic-2019.11.20" index.
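
The condition the error message complains about can be sketched against the `GET _alias` output. This is an illustration in Python of that condition, not Elasticsearch's actual code. The helper name `alias_points_to_index` and the trimmed `sample` response are made up for the example.

```python
def alias_points_to_index(alias_response, alias, index):
    """Return True if `alias` is attached to `index` in a GET _alias response."""
    entry = alias_response.get(index, {})
    return alias in entry.get("aliases", {})


# Trimmed copy of the GET _alias output shown above.
sample = {
    ".kibana_1": {"aliases": {".kibana": {}}},
    "firewall-traffic-2019.11.20": {"aliases": {}},
}

if __name__ == "__main__":
    # The dated index has no aliases at all, so the check fails.
    print(alias_points_to_index(sample, "firewall_rollover",
                                "firewall-traffic-2019.11.20"))
```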

The thing I'm missing is that I can't see how or where I configure Elasticsearch to update its aliases. I do know that on another test stack we have, using Winlogbeat pushing directly into Elasticsearch, we don't see this issue. The alias for the Winlogbeat indices is updated and the moving off to the cold node seems to work well. The aliases seem to be added here on the fly as the new daily indices are created.

Can someone point out what I'm missing? There is obviously a simple way to do this that I'm not seeing. I'm guessing that I need to add additional info to the output of Logstash so that an alias is created at the time of ingestion, but I've not found anything that points to that.

I'll try...

Using the date pattern in the Logstash index name is incompatible with ILM rollover. The transition is something like this:

  • Create a new index following this if you want the create date in the name. (I think there is a typo in that doc: the name looks like it ends in -1 when it should end in -000001, which I think it did the last time I cited this section.) Use "firewall-traffic" for the part before the date AND for the alias, don't include "-rollover".

  • Ensure that index has is_write_index: true

  • Change the Logstash index to just "firewall-traffic". It will write to the ONE index with "is_write_index: true" by referencing the alias.

  • You should be able to reference both old and new indices with "firewall-traffic-*" or you can add the alias to all the old indices.

  • Note that if logstash got data for prior dates before, it would update the index for that date. With ILM, those logs always go into the currently writing index.

  • Note that the date in an ILM index name is the date the index was created, and the index can contain data (in your case) up to 7 days newer.

Hope this helps. This is pretty much what we are doing. You have to "bootstrap" ILM indices before you can use them. In our case, we have other variables in the Logstash index output statement, so we have to create several new and empty indices before we convert a pipeline.
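
If you do want the create date in the bootstrap index name, the name is a date math expression such as `<firewall-traffic-{now/d}-000001>`, and its special characters have to be URI-encoded when it goes into the request path. A small Python sketch of that encoding (the helper name `bootstrap_index_path` is made up for the example):

```python
from urllib.parse import quote


def bootstrap_index_path(prefix, counter=1):
    """Return the URL-encoded date math name for a first rollover index."""
    # Doubled braces emit literal { and } around the date math part.
    date_math_name = f"<{prefix}-{{now/d}}-{counter:06d}>"
    # safe="" so that the "/" inside {now/d} is encoded as well.
    return quote(date_math_name, safe="")


if __name__ == "__main__":
    print(bootstrap_index_path("firewall-traffic"))
```

The resulting string is what would follow `PUT ` in the create-index request. Elasticsearch resolves the date math at creation time, so the index ends up with a concrete date in its name.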

Thanks for your reply.

I'm happy enough to not use dates in my index for the sake of simplicity and ease of integration. So based on that, and from what you've said, it seems like I'm going to have to create an alias manually on each new index setup? This might be the step I was missing then. I was under the impression that this would happen automagically.
For the sake of getting a process down for myself and anyone else in this situation, can you confirm that the process is as follows:

  1. Logstash server outputs:

    elasticsearch {
    index => "firewall-traffic"
    hosts => ["10.10.100.1", "10.10.100.2", "10.10.100.3"]
    }

  2. End up with an index named:

firewall-traffic

  3. Ensure there is a wildcard index template set up to sort the replicas as so:
{
  "index": {
    "number_of_replicas": "2"
  }
}
  4. Ensure there is an index template that will push the new index to the hot nodes:
{
  "index": {
    "routing": {
      "allocation": {
        "include": {
          "data": "hot"
        }
      }
    }
  }
}
  5. Now run the following to set up an alias for my new index:

curl -X POST "localhost:9200/_aliases?pretty" -H 'Content-Type: application/json' -d'
{
"actions" : [
{ "add" : { "index" : "firewall-traffic", "alias" : "firewall-traffic-ro-" } }
]
}
'

  6. This adds the new alias and points it to my new index. Next, set up my test ILM with a rollover at 20MB or 7 days. I skip the warm phase and set up the Cold phase to move to Cold 1 hour after rollover. Number of replicas are reduced to zero. Finally, deletion is set for 365 days after rollover. I name it Hot2Cold

  7. Open up the ILM list and assign the ILM policy Hot2Cold to the firewall-traffic index template. Add the name "firewall-traffic-ro" for the rollover alias.
    This changes the settings of the firewall-traffic index template to:

{
  "index": {
    "lifecycle": {
      "name": "50GB_move_to_cold",
      "rollover_alias": "firewall-traffic-ro"
    }
  }
}

So 2 questions -

a). Does this look right?

b). Can you confirm that with the setup above, the steps below are what is happening? I'm not seeing the expected additional index with the rollover alias.

  • Every 10 mins the index is checked to see if any aspect of it would trigger the rollover template that is assigned to it. In the case of my test, as soon as it hits 20MB, it should rollover and I should have a new index called "firewall-traffic-ro-00001". I don't see these additional indices so I'm not sure this is actually what happens as a:
    GET _cat/indices
    shows no additional indices.
  • After 1 hour, I should see a whole bunch of indices get moved to the Cold node as the data flow into this index is high. I can hit 20MB in under 1 min at peak times so there should be multiple rollover indices to move.

Still feel like I'm missing a step....

I think you may be making it harder than it needs to be, but I may not understand all of your requirements.

Your indices need to be named firewall-traffic-00001, with the suffix incrementing at each rollover. You should be able to assign the policy via the template, you shouldn't have to do it manually.
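
To make the naming rule concrete, here is a rough Python sketch of how the next index name is derived at rollover when no explicit name is given. It mirrors the documented behaviour (the name must match `^.*-\d+$`, and the new counter is zero-padded to 6 digits regardless of the old name). It is not the actual Elasticsearch code and does not handle date math names.

```python
import re

# A rollover-able name ends in "-" followed by digits only.
ROLLOVER_NAME = re.compile(r"^(.*)-(\d+)$")


def next_rollover_name(index_name):
    """Return the name rollover would generate after `index_name`."""
    match = ROLLOVER_NAME.match(index_name)
    if match is None:
        raise ValueError(
            f"index name [{index_name}] does not match pattern '^.*-\\d+$'"
        )
    prefix, counter = match.groups()
    # Increment and pad to 6 digits, whatever the old width was.
    return f"{prefix}-{int(counter) + 1:06d}"


if __name__ == "__main__":
    print(next_rollover_name("firewall-traffic-000001"))
```

A dated name such as `firewall-traffic-2019.11.20` fails the pattern, which is why the daily indices cannot be rolled over as they are.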

When the template is created, do this to create the index and alias:

PUT firewall-traffic-000001
{
  "aliases": {
    "firewall-traffic": {
      "is_write_index": true
    }
  }
}

Doc here, second code block.

I don't think rollover will work without the -000nn suffix, but you can do

GET firewall-traffic*/_ilm/explain

Or maybe use Kibana's Index management to look for problems.

A properly configured Logstash Elasticsearch output should create the bootstrap index, but getting that configuration right is difficult (and it won't work for us anyway). That's why I prefer to get at least the first one working this way.

I don't know how frequently ILM checks, but when it is working, it should do what your policy requests. I'll bet when you check Kibana you will see "ILM Errors" or some warning message.
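
If you want to pull the problems out of the `_ilm/explain` output rather than read it by eye, something like this Python sketch would do. The `sample` below is hand-written and trimmed to the fields used here, so treat its exact shape as an assumption. `ilm_errors` is a made-up helper name.

```python
def ilm_errors(explain_response):
    """Return {index: reason} for managed indices stuck in the ERROR step."""
    errors = {}
    for name, info in explain_response.get("indices", {}).items():
        if info.get("managed") and info.get("step") == "ERROR":
            reason = info.get("step_info", {}).get("reason", "unknown")
            errors[name] = reason
    return errors


# Hand-written example of an explain response (assumed shape, trimmed).
sample = {
    "indices": {
        "firewall-traffic-2019.11.20": {
            "index": "firewall-traffic-2019.11.20",
            "managed": True,
            "policy": "50GB_move_to_cold",
            "step": "ERROR",
            "step_info": {
                "type": "illegal_argument_exception",
                "reason": "index.lifecycle.rollover_alias [firewall-traffic] "
                          "does not point to index [firewall-traffic-2019.11.20]",
            },
        },
        "nps-2019.11.20": {"index": "nps-2019.11.20", "managed": False},
    }
}

if __name__ == "__main__":
    for index, reason in ilm_errors(sample).items():
        print(f"{index}: {reason}")
```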

"I think you may be making it harder than it needs to be"

I think you're right, that's certainly the way it's feeling!

"Your indices need to be named firewall-traffic-00001, with the suffix incrementing at each rollover"

So if I have Logstash outputting that index name, will the rollover then take care of incrementing of the index number?

"When the template is created, do this to create the index and alias:"

So you are saying that I need to pre-stage the index before I start ingesting data? I assumed this would be something that I could put in the Logstash config itself (as an output option in Logstash for the index). Off the back of that thought, I have found this: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html

output {
  elasticsearch {
    ilm_rollover_alias => "custom"
    ilm_pattern => "000001"
    ilm_policy => "custom_policy"
  }
}

which I assume will update the alias index as well as set up the rollover details on the index at point of ingestion. This also applies the ILM policy to the index providing you've created the policy already. This was more how I expected it to work (providing I've >finally< understood it correctly). :)

Will test this tomorrow at work.

Thanks for taking the time to reply.

That example should work. In your case ilm_rollover_alias is probably a better understood option than "index". From the doc:

If both index and ilm_rollover_alias are specified, ilm_rollover_alias takes precedence.

If Logstash is "perfectly" configured, it will create the initial index, or you can do it manually, whichever works first :)

This is getting frustrating...

I have an ILM called "20MB_move_to_cold"

I have an index template called number_of_replicas_2 that looks like:

{
  "index": {
    "number_of_replicas": "2"
  }
}

I have another index template called route_to_host to take care of routing all indices to the hot nodes that is applied to all indices:

{
  "index": {
    "routing": {
      "allocation": {
        "include": {
          "data": "hot"
        }
      }
    }
  }
}

The Logstash output:

elasticsearch {
ilm_rollover_alias => "firewall-traffic"
ilm_pattern => "000001"
ilm_policy => "20MB_move_to_cold"
hosts => ["10.10.100.1", "10.10.100.2", "10.10.100.3"]
}

I have the index routing into my stack but it's named "firewall-traffic" and it's missing the -000001 appended to it. Checking the _aliases in Elasticsearch, I see that it hasn't created an alias at all, so the output definition above does nothing.

Ok, if there is an index "firewall-traffic", it was probably created when the config wasn't "perfect". It won't let an alias exist with that name. You'll have to delete that index or use a different alias. If the index is empty, just delete it.

You'll probably need to restart logstash after removing the index--alias conflict.
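
A quick way to check for both problems (a plain index squatting on the alias name, and no write index behind the alias) is to look at the `GET _alias` output. A rough Python sketch, with a made-up helper name and a hand-written `sample`:

```python
def bootstrap_problems(alias_response, rollover_alias):
    """List reasons why `rollover_alias` is not ready for ILM rollover."""
    problems = []
    # An index and an alias cannot share the same name.
    if rollover_alias in alias_response:
        problems.append(
            f"an index named [{rollover_alias}] exists, "
            "so no alias can use that name"
        )
    # Collect indices that carry the alias with is_write_index set.
    write_indices = [
        index
        for index, entry in alias_response.items()
        if entry.get("aliases", {}).get(rollover_alias, {}).get("is_write_index")
    ]
    if not write_indices:
        problems.append(f"no write index behind alias [{rollover_alias}]")
    return problems


# Hand-written example: Logstash created a plain index instead of the alias.
sample = {"firewall-traffic": {"aliases": {}}}

if __name__ == "__main__":
    for problem in bootstrap_problems(sample, "firewall-traffic"):
        print(problem)
```

Once the stray index is deleted and the bootstrap index exists with the write alias, the function should return an empty list.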

I've already tried that, unfortunately. Logstash is restarted after each config change to Logstash. I can delete the index in Elastic and wait a few seconds only to see it arrive again, still missing the -00001 appended to it.
I'm totally out of ideas. I'm starting to question if this isn't a bug when I read things like this: