After deleting an index, a new one is created with invalid naming scheme

mdebord · December 22, 2021, 11:48pm

Logstash 7.16.2
Elasticsearch 7.16.2

Summary

In a scenario where beats are sending to logstash, and logstash has ILM configured, if the current write index is deleted, a new one is created (as expected), but it does not follow the naming convention.

I realize there should not be a reason for an index to be deleted in this way, but if it did happen, I would expect the system to create a new index with the appropriate name. Things like index patterns can break when index names do not match the expected format.

Problem Scenario:

Given this logstash config:

input {
  beats {
    port => 5000
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    ilm_enabled => true
    ilm_rollover_alias => "filebeat"
    ilm_pattern => "{now/d}-000001"
    ilm_policy => "30-days-default"
  }
}

When the beats publisher starts to push into logstash, the index name format is automatically created as expected (ex: filebeat-2021.12.20-000001)
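For reference, a minimal Python sketch of how a name like filebeat-2021.12.20-000001 can fall out of the config above. It assumes the output combines ilm_rollover_alias and ilm_pattern into the date-math name <filebeat-{now/d}-000001>, and that {now/d} resolves to the current day in the default yyyy.MM.dd format. The helper names are hypothetical, and only the {now/d} expression is handled.

```python
from datetime import date


def rollover_target(alias: str, pattern: str) -> str:
    """Build the date-math index name (assumed form: <alias-pattern>)."""
    return f"<{alias}-{pattern}>"


def resolve_day_pattern(name: str, today: date) -> str:
    """Resolve only the {now/d} expression, using a yyyy.MM.dd day format."""
    inner = name.strip("<>")  # drop the date-math angle brackets
    return inner.replace("{now/d}", today.strftime("%Y.%m.%d"))


target = rollover_target("filebeat", "{now/d}-000001")
print(target)
print(resolve_day_pattern(target, date(2021, 12, 20)))
```

The resolved name is only computed when the index is first created, which is why later rollovers keep the original date and just increment the counter.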

With the above scenario prepared, steps to reproduce:

1. Delete the current index that was created (ex: filebeat-2021.12.20-000001)
2. Wait for the beats publisher to push more messages to logstash
3. A new index is automatically created

Expected behavior:

The new index that was created should follow the same naming scheme (ex: filebeat-2021.12.20-000001, filebeat-2021.12.20-000002, etc)

Actual behavior:

The new index is named after the rollover alias only (ex: filebeat), with no pattern applied

stephenb · December 23, 2021, 12:08am

Yes that is the expected behavior because once you delete the actual index there is no longer an index that has the correct write alias.

When you deleted the index, you deleted the alias as well.

It's not best practice to just delete a managed index.

What happens then is that filebeat / logstash just keeps writing to the alias name as if it were the actual index, because there is no longer an alias pointing it to a real index.

The rollover needs to happen first!

You either need to

A) Force the rollover before you delete the index. This is a good approach because if you just delete it while multiple beats are writing, they will immediately write again and you will see the behavior described above.

If you force the rollover first, the data is then written to the new index, and you can delete the old one.

POST filebeat/_rollover

Otherwise, you will have to stop all instances of logstash / filebeat

Or manually create the index with the write alias, known as the initial managed index
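As a rough illustration of the "manually create the index with the alias" option, here is a small Python sketch that only builds the create-index request and does not send it. It assumes the alias and pattern from the original config, and that any concrete index already named filebeat has been deleted first, since an alias cannot share a name with an existing index. The function name is hypothetical.

```python
import json
from urllib.parse import quote


def bootstrap_request(alias: str, pattern: str):
    """Build (method, path, body) for a bootstrap index carrying the write alias."""
    # Date-math names must be URL-encoded when used in the request path.
    name = f"<{alias}-{pattern}>"
    path = "/" + quote(name, safe="")
    # The alias is marked as the write index so that rollover can use it.
    body = {"aliases": {alias: {"is_write_index": True}}}
    return "PUT", path, json.dumps(body)


method, path, body = bootstrap_request("filebeat", "{now/d}-000001")
print(method, path)
print(body)
```

The printed path and body can be pasted into whatever HTTP client is at hand. The index template and lifecycle policy still have to exist beforehand for the new index to be managed.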

There is also the question of how / if you initially ran filebeat setup. That is what loads the templates and creates the initial managed index.

So you can also run filebeat setup again...

Corrected: Logstash creates that initial managed index as well

And of course, if you don't want to set all that in logstash, you can basically use a pass-through output and let filebeat manage it.

It would look something like this

################################################
# beats->logstash->es default config.
################################################
input {
  beats {
    port => 5044
  }
}

output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => "http://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      pipeline => "%{[@metadata][pipeline]}" 
      user => "elastic"
      password => "secret"
    }
  } else {
    elasticsearch {
      hosts => "http://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      user => "elastic"
      password => "secret"
    }
  }
}

mdebord · December 23, 2021, 2:36am

Thanks so much for the response, that seems to clear it up, but I have a few follow ups based on your reply:

In my stack, filebeat has no direct access to elasticsearch, and the automatic setup (filebeat setup) was never run; everything was done manually. The first index was created automatically when logstash pushed the first message (and no index existed), and it followed the correct scheme and attached the correct lifecycle policy.

When I restart the logstash service, the index is recreated in proper format - so it must be logstash, right? I don't know what else would have created the index. Lifecycles and everything seem to be working fine with the auto created index.

In other words, if I stop logstash, delete the "bad" index, and restart logstash: a new index is created with the correct naming convention. What could have done that other than logstash? This is actually the initial reason I thought this behavior of the wrong index name was a bug.

Manual setup. Imported the template and used one of the existing lifecycle policies. Everything seems to be working fine regarding lifecycles and the index picking up the template schema.

For sure, I was thinking in more of a DR scenario that the index for some reason was deleted, corrupted, etc. Just testing how the system would respond in that scenario.

That didn't work, but hitting the alias did: POST filebeat/_rollover

stephenb · December 23, 2021, 3:11am

Yup, makes sense. Yes, it appears logstash does it as well, good to know. I didn't have my lab to test it, but now we know. Corrected above.

Yup typo on my end... Fixed

Glad you got it working..

Bottom line, don't delete that index while it's running. Roll it over first or create a new bootstrap index with the write alias.

mdebord · December 28, 2021, 5:32pm

Why is this solved? I never marked a solution because the original report was not addressed, yet it's marked as solved. I just spent the first few posts explaining how it's actually Logstash creating and naming the index, and then the post is marked as solved?

I realize that you're not supposed to just go around deleting indexes like that, but DR events are neither "expected" nor "best practice" events.

Even when something is unexpected or not best practice, software systems are expected to behave in a reasonable way. It's reasonable to assume that Logstash would create a new index using the same naming scheme that it used when it first created the index. If this could be fixed in Logstash, it would eliminate a spiderweb of problems in the "unexpected" and "unplanned" event that an index was deleted before rotating.

It's fine if this is a "will not fix" judgement by the Logstash team, but it's definitely not a "solved" issue. One would reasonably expect whatever process creates indexes to be responsible for naming them properly, even in the event that an index needs to be unexpectedly recreated.

yaauie · December 28, 2021, 7:09pm

This is a community discussion forum, and you left a reply indicating that the original problem was understood so a mod marked it as a solution:

If you have an issue that needs escalation to the developers managing a project, opening an issue on the relevant project's issue tracker is a way to track the issue through to resolution.

That said, the Elasticsearch output plugin for Logstash doesn't have a way of detecting an Elasticsearch index being deleted. Manual manipulation of policy-managed assets, especially when those assets are in use, is a recipe for edge-cases, and as a Logstash dev with knowledge of the APIs available, I don't see a way to prevent this issue from occurring.

When a Logstash pipeline starts, and it has an Elasticsearch output configured with ILM, it ensures that the ILM policy provided is bootstrapped, creating a write alias as appropriate. After it has done so, it pushes events to the write index using the BULK API. As the pipeline processes events, it expects the alias to be in place (because it was there before we started pushing events).

When Elasticsearch receives a directive to write to an index that does not exist, it helpfully creates the index (unless policy dictates that it should reject the request). Because Logstash has no way of knowing that the write alias has been deleted, in-flight events that were intended to be routed to the write index are interpreted by Elasticsearch to be routed to an index with the same name, creating the index.
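The sequence described above can be sketched with a toy in-memory model. This is not Elasticsearch code; the class and method names are made up for illustration, and it only mimics three behaviors from this thread: aliases resolve to an index, deleting an index removes its aliases, and writing to a missing name auto-creates a concrete index.

```python
class ToyCluster:
    """Tiny stand-in for a cluster: concrete indices plus write aliases."""

    def __init__(self):
        self.indices = {}  # index name -> list of documents
        self.aliases = {}  # alias name -> concrete write index name

    def create_index(self, name, write_alias=None):
        self.indices[name] = []
        if write_alias:
            self.aliases[write_alias] = name

    def delete_index(self, name):
        del self.indices[name]
        # Aliases that pointed at the deleted index disappear with it.
        self.aliases = {a: i for a, i in self.aliases.items() if i != name}

    def bulk_write(self, target, doc):
        # Resolve the alias if one exists; otherwise treat target as an index name.
        name = self.aliases.get(target, target)
        if name not in self.indices:
            self.indices[name] = []  # auto-create, as when no policy rejects it
        self.indices[name].append(doc)
        return name


cluster = ToyCluster()
cluster.create_index("filebeat-2021.12.20-000001", write_alias="filebeat")
print(cluster.bulk_write("filebeat", {"msg": "before delete"}))
cluster.delete_index("filebeat-2021.12.20-000001")
print(cluster.bulk_write("filebeat", {"msg": "after delete"}))
```

The writer sends the same target name both times; only the cluster-side state changed, which is why the pipeline cannot notice the difference without being restarted.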

If you need to manipulate policy-managed assets manually, you first need to make sure that all activity is shut down (including pipelines writing to the ILM indices).

stephenb · December 28, 2021, 9:21pm

Apologies ... Removed solved.

Solved on the community forum typically means the question is answered / understood; it does not necessarily mean the software itself has been changed to solve the issue. But I understand your perspective.

I misunderstood your response as understanding that it works as designed today.

Agree with @yaauie above that if you would like to see additional features, please feel free to file a feature request.

To me, this behavior is very much like having a soft link in Linux that points to an actual concrete file.

If you remove the soft link and then continue to write, it will create a new hard / concrete file with the name of the soft link.

That's pretty much the same behavior in this case. You have an alias pointing to a concrete index, and when the alias is deleted, a concrete index with that alias name is created.
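The soft link analogy can be tried out with a short Python sketch. It assumes a POSIX system where symlinks can be created; the file names are placeholders chosen to mirror the thread.

```python
import os
import tempfile


def write_after_link_removed(workdir):
    """Write through a symlink, remove the link, then write to the same name."""
    target = os.path.join(workdir, "filebeat-000001.log")  # the "concrete index"
    link = os.path.join(workdir, "filebeat")               # the "alias"
    open(target, "w").close()
    os.symlink(target, link)

    with open(link, "a") as f:
        f.write("event 1\n")  # follows the link and lands in the target file

    os.remove(link)           # removes only the link, not the target

    with open(link, "a") as f:
        f.write("event 2\n")  # no link left, so a regular file is created

    with open(target) as f:
        target_content = f.read()
    return os.path.islink(link), os.path.isfile(link), target_content


with tempfile.TemporaryDirectory() as d:
    print(write_after_link_removed(d))
```

After the link is gone, the second write ends up in a regular file named filebeat and the original target never sees it, much like the concrete filebeat index in this thread.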

We encourage you to open a feature request, as we highly value our community's input.

system · January 25, 2022, 9:22pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
