ELK and ILM - change of behaviour

Elad_Shmitanka · September 24, 2019, 6:00am

Hi,

We recently upgraded the ELK stack to 7.2.0 and started using ILM.
Since adopting ILM we have run into multiple issues, several of which can be traced back to a single trigger.
The architecture is quite simple: Beats send to Kafka, and Logstash reads from Kafka, processes the events, and indexes them into a Hot-Warm-Cold cluster.

The flow is this:

1. The number of logs increases drastically due to an error in the system.
2. The volume is too high, and ES goes read-only because of disk space.
3. A lag builds up.

And now the chain of issues begins:

1. The operator tries to move indices, but they are read-only (the same happens with Curator).
2. If the operator deletes an index, ILM gets stuck, because the index is auto-created again.
3. The lag is large, and the event's timestamp is not taken into account when writing: events go to whichever index is active at indexing time, not to the index matching the event's timestamp (unlike without ILM).
4. Since the lag is being consumed and the transition is time-based (not the initial rollover, but the move to warm), the hot servers get congested again because they absorb the entire lag.
5. I cannot explain why yet, but in those cases we also see indices grow far beyond the ILM settings: the initial rollover is set to 100G, yet the index grows to 1T (500G primary).
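
To make the timestamp point above concrete, here is a minimal Python sketch of the two naming behaviours as I understand them. The prefix `logs`, the alias name, and both helper functions are hypothetical, purely for illustration; this is not Logstash code.

```python
from datetime import datetime, timezone

WRITE_ALIAS = "logs"  # hypothetical rollover (write) alias name


def date_based_index(prefix: str, event_ts: datetime) -> str:
    """Pre-ILM style: the index name is derived from the event's own timestamp."""
    return f"{prefix}-{event_ts.strftime('%Y.%m.%d')}"


def ilm_target(alias: str, event_ts: datetime) -> str:
    """ILM/rollover style: every event is sent to the write alias,
    so the event timestamp plays no part in choosing the index."""
    return alias


# An event produced days ago but only consumed now because of the Kafka lag.
late_event = datetime(2019, 9, 20, 23, 15, tzinfo=timezone.utc)

print(date_based_index("logs", late_event))  # index chosen by event date
print(ilm_target(WRITE_ALIAS, late_event))   # always the current write index
```

With date-based names the late event lands in the index for its own day; with the write alias it lands in whatever index is currently hot, which is why the whole backlog ends up on the hot nodes.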

So questions are:

1. Is there a recommendation I've missed to disable index auto-creation when using ILM? If so, since ILM is configured on the Logstash side, I assume I would have to rely on a naming convention to do it properly (?)
2. Is there a way (or a plan) to add more criteria to the rollover, based on disk space, for moving from Hot to Warm or Cold? We run physical servers, so I need the retention to be flexible, not the disk space.
3. Is there a way to tell Logstash, with ILM, to take the event's timestamp into account, or is it intentionally ignored?
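
For reference, this is roughly the kind of configuration I have in mind for the auto-create question, sketched in Python as request bodies only. It assumes the `action.auto_create_index` cluster setting and the ILM `rollover` action with `max_size` / `max_age`; the `logs-*` pattern, the `1d` age, and the helper names are my own placeholders, not our real values (only the 100G rollover size is from our setup).

```python
import json


def auto_create_settings(blocked_pattern: str) -> dict:
    """Body for PUT _cluster/settings: deny auto-creation for indices
    matching blocked_pattern, allow it for everything else."""
    return {
        "persistent": {
            "action.auto_create_index": f"-{blocked_pattern},+*"
        }
    }


def rollover_policy(max_size: str, max_age: str) -> dict:
    """Body for PUT _ilm/policy/<name>: hot phase with a rollover action."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                }
            }
        }
    }


# Hypothetical pattern and age; 100gb mirrors the rollover size mentioned above.
print(json.dumps(auto_create_settings("logs-*"), indent=2))
print(json.dumps(rollover_policy("100gb", "1d"), indent=2))
```

Whether blocking auto-creation this way is the recommended approach with ILM is exactly what I am unsure about, hence the question.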

Thanks in advance

system · October 22, 2019, 6:00am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
