Setting up Index Lifecycle Management

I use Logstash grok patterns to parse my logs, which are written to Elasticsearch in daily indices. I want to create an index template that attaches an ILM policy to delete these logs after 60 days.

When creating the index template, I was a little confused about the following sections and whether I need to set them up:

  • Component templates
  • Index settings
  • Mappings
  • Aliases

Since all I want to do is attach a policy to these indices (I use wildcards to group them together, i.e. log1*, log2*, log3*), do I need to set up any of the other settings?

ILM wants to manage indices by their age since "rollover". Since your old-style daily indices don't roll over, ILM may not be a good fit, if it's even possible.

Your Logstash output may be using the @timestamp found in the event as part of the index name. If you ever ingest events with old or incorrect dates, you can suddenly find new indices for old dates. I once discovered winlogbeat indices for most dates over the past several years.
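
For what it's worth, a delete-only policy can at least be written down for indices that never roll over. As far as I understand the ILM docs, min_age is then counted from index creation rather than from rollover, and never from the date in the index name, so stray indices created for old dates would still live the full 60 days. A minimal sketch I haven't run against your setup; the policy and template names are made up, and log1* is the pattern from the question:

```console
# Hypothetical policy name. Deletes an index 60 days after it was created
# (or 60 days after rollover, if it ever rolls over).
PUT _ilm/policy/delete-after-60d
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

# Hypothetical template name. Attaches the policy to every new log1* index.
PUT _index_template/log1-ilm
{
  "index_patterns": ["log1*"],
  "priority": 100,
  "template": {
    "settings": {
      "index.lifecycle.name": "delete-after-60d"
    }
  }
}
```

In this sketch only the index settings section is used; component templates, mappings and aliases are left out. A template only affects indices created after it exists, so existing indices would need index.lifecycle.name set on them through the update settings API.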

ILM uses aliases: some-index-00001 through some-index-00xxx all have the alias "some-index", but only one has "is_write_index": true. Any stale dated data will always be written to your newest index.

I'd recommend a clean start with some test or non-important index. Get ILM rollover by size/age working, get an understanding of ILM, then move away from all your date-based indices.
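
To see which index behind an alias is currently the write index, something like the following should work. "some-index" is just the placeholder alias name from the paragraph above:

```console
# Lists every index behind the alias, including its is_write_index flag.
GET _alias/some-index

# The same information as a table, with an is_write_index column.
GET _cat/aliases/some-index?v
```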

You can still use the curator package to manage your legacy indices.

If you started with Elastic before 7.8, there were only templates, and they had index pattern specs. All templates matching the index name were merged together in priority order, and the result was used as the pattern for the new index.

Those are now called legacy templates, at least in docs new enough to know there is a difference. Be careful with non-Elastic information, and when using the Elastic docs, note the version they apply to and when they were written.

Now there are index templates and component templates, both confusingly referred to as composable. I find it almost better to pretend you never heard the word "composable".

Now an index template can include multiple component templates. But only one index template will be used at index creation: the one with the highest priority. The docs say to avoid overlapping index patterns, so I'm not sure they even know what happens when multiple matching index templates share the highest priority.
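
A rough sketch of how the pieces fit together, with made-up names throughout (some-index-settings, some-index) and an arbitrary priority. The last request uses the simulate index API, which reports which template and resolved settings a new index name would get without actually creating the index:

```console
# Hypothetical component template holding shared settings.
PUT _component_template/some-index-settings
{
  "template": {
    "settings": {
      "index.number_of_shards": 1
    }
  }
}

# Hypothetical index template that pulls in the component template.
# The component template has to exist before this request is sent.
PUT _index_template/some-index
{
  "index_patterns": ["some-index-*"],
  "priority": 200,
  "composed_of": ["some-index-settings"]
}

# Show what a new index with this name would end up with.
POST _index_template/_simulate_index/some-index-000001
```

Running the simulate request against a real index name is probably the quickest way to check which of several matching templates wins.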

I'm still not convinced that the new way isn't more work for us than the old way, but when you design a workflow and something this major forces it to change, it can cause problems.

Here's an example: we want to have a runtime field for winlog.user_data.FilePath called winlog.user_data.OurFilePath.

  1. I create a component template winlog.ourfilepath with this one runtime field.
  2. To deploy this, we're going to install winlogbeat 7.17.0, so I configure winlogbeat.yml for our environment to load the template and set up ILM. This will create the first index, winlogbeat-7.17.0-00001.
  3. I use Kibana to add the winlog.ourfilepath component template to the index template winlogbeat-7.17.0
  4. I issue a rollover for winlogbeat-7.17.0 to create a new index that now has the mapping for our runtime field.
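
Steps 1 and 4 above can be sketched in Console roughly as follows. The runtime field script is only a placeholder that copies the source field unchanged, since the real transformation isn't shown here, and it assumes winlog.user_data.FilePath is mapped as a keyword:

```console
# Step 1: component template with the single runtime field.
# Placeholder script: emits the original value as-is.
PUT _component_template/winlog.ourfilepath
{
  "template": {
    "mappings": {
      "runtime": {
        "winlog.user_data.OurFilePath": {
          "type": "keyword",
          "script": {
            "source": "if (doc['winlog.user_data.FilePath'].size() != 0) { emit(doc['winlog.user_data.FilePath'].value); }"
          }
        }
      }
    }
  }
}

# Step 4: roll the write alias over so the next index picks up the new mapping.
POST winlogbeat-7.17.0/_rollover
```

Step 3 is left to Kibana here, as in the list; doing it through the API would mean re-sending the whole index template with the extra entry in composed_of.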

When I again upgrade winlogbeat, I'll have to repeat the steps of configuring the runtime field.

If I don't have a component template, beats can pretty much "self update" the templates, but custom component templates seem to cause extreme complications.

This was very comprehensive, thank you! I am using Elastic 8.0.0.

Regarding this, I had a couple questions on getting started with this. I could start with a staging instance I'm reading from, but how would I go about moving away from date based indices?

Your logstash Elasticsearch filter is probably setting @timestamp and the output section probably has something like this:

index => "filebeat-%{[agent.version]}-%{[some-field]}-%{+YYYY.MM.dd}"

You need to know all possible indices this will reference, since the index names are created dynamically from the values of the fields.

For example, if some-field = "testing-ilm":

First set up your desired index template to assign settings, mappings and an ILM policy, then create the first index. This is one way:

PUT /%3Cfilebeat-7.17.1-testing-ilm-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "filebeat-7.17.1-testing-ilm": {
      "is_write_index": true
    }
  }
}

(This puts the creation date, in yyyy.MM.dd form, in the index name, even though the index may hold data for other dates. It's optional, but I like it as an easy way to know when the index was created.)
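
The index template and ILM policy from the first step, which need to exist before the bootstrap request above is sent, might look something like this. The policy name and the rollover thresholds are placeholders I picked for illustration; the 60-day delete comes from the original question:

```console
# Hypothetical policy: roll over at 30 days or 50 GB per primary shard,
# then delete 60 days after rollover.
PUT _ilm/policy/filebeat-testing-ilm
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

# Template that ties new indices to the policy and to the write alias.
PUT _index_template/filebeat-7.17.1-testing-ilm
{
  "index_patterns": ["filebeat-7.17.1-testing-ilm-*"],
  "priority": 200,
  "template": {
    "settings": {
      "index.lifecycle.name": "filebeat-testing-ilm",
      "index.lifecycle.rollover_alias": "filebeat-7.17.1-testing-ilm"
    }
  }
}

# After bootstrapping, check which phase and step each index is in.
GET filebeat-7.17.1-testing-ilm-*/_ilm/explain
```

Because the delete phase counts from rollover, the oldest document in an index can be older than 60 days by up to the rollover max_age. Also, since only one index template applies to a new index, if another template (for example one loaded by Beats) matches the same names, it's probably safer to add the two lifecycle settings to that template rather than override it with a separate one.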

Then just change your logstash output to:

index => "filebeat-%{[agent.version]}-%{[some-field]}"

which is the alias for your bootstrapped index.

Elastic is now pushing data streams, which I haven't evaluated yet. I'm retiring at the end of the year and am trying to pass knowledge along on our current environment, so data streams are just an untimely complication for me with no apparent benefit in our environment. (i.e. if it's not broke, don't fix it.)

However, if I was starting a new stack, I probably would try to see if data streams hold some benefit. IMHO Elastic is somewhat prone to deprecating things that work (and that I use) just for the sake of "improvements", so there may be some risk in not using the latest techniques.

Perfect thank you! This was incredibly helpful, and hope you have a great retirement at the end of the year :slight_smile:

How I have my logs set up is as such:

index => "application1-logdata=%{+YYYY.MM.dd}"

I am reading logs from multiple applications, so when I am creating the Data Views in Kibana, I separate them out by doing application1*, application2* and so on. I'll dig a little deeper into setting up ILM, I'll keep this open as I'll probably have more questions later on.
