Index Lifecycle Management (ILM) based on a document field

fefontana · May 21, 2021, 8:36pm

Hello, I would like to know if there is any way to partition an index based on a field of the documents within it. I don't want to use the index size (like "max_size"), the index age (like "max_age"), or the maximum number of documents in the index. What I need is to partition the index according to the values of a field in the mapping. For example, I have an index mapping like this:

{
  "mappings": {
    "_doc": {
      "properties": {
        "AccountId": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "AccountServicePointId": {
          "type": "long"
        },
        "ActivityDate": {        <────────
          "type": "date"
        },
        ... and more

What I would like to achieve is an ILM policy (or other automatic procedure) that lets me create partitioned indices based on the "ActivityDate" field from the mapping. The result should be something like this:

index-000001 (January 2021 docs according to ActivityDate field)
index-000002 (February 2021 docs according to ActivityDate field)
index-000003 (March 2021 docs according to ActivityDate field)
index-000004 (April 2021 docs according to ActivityDate field)
.
.
.
index-000012 (December 2021 docs according to ActivityDate field)

Is this possible?

warkolm · May 24, 2021, 5:29am

No, that's not possible with ILM. You'd need to manage this yourself.

Christian_Dahlqvist · May 24, 2021, 5:41am

This is basically what you do when you create indices with a date pattern, e.g. YYYYMM, in the index name based on a specific date field. This is how all time-based indices were created before rollover was introduced. It is, however, incompatible with rollover, but you should still be able to use ILM to manage retention even if rollover is not used. Logstash supports creating indices this way, but it does so based on the @timestamp field, so you may need to copy your date field over.
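For clients other than Logstash, the same date-pattern idea can be applied at write time. The following is only a minimal sketch, assuming Node.js and documents that carry an "ActivityDate" value parseable by `Date`; the helper names `monthlyIndexName` and `toBulkOperations` and the "index" prefix are hypothetical, and months are computed in UTC.

```javascript
// Hypothetical helper: derive a monthly index name (YYYYMM suffix)
// from a date value, using UTC so the result does not depend on
// the local time zone of the machine doing the indexing.
function monthlyIndexName(prefix, dateValue) {
  const d = new Date(dateValue);
  if (Number.isNaN(d.getTime())) {
    throw new Error(`Invalid date: ${dateValue}`);
  }
  const yyyy = d.getUTCFullYear();
  const mm = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${prefix}-${yyyy}${mm}`;
}

// Hypothetical helper: build the alternating action/document array
// used by the bulk API, routing each document to the index that
// matches the month of its ActivityDate field.
function toBulkOperations(prefix, docs) {
  return docs.flatMap((doc) => [
    { index: { _index: monthlyIndexName(prefix, doc.ActivityDate) } },
    doc,
  ]);
}
```

The resulting array would then be handed to the client's bulk call; the exact parameter name depends on the client version, so check the documentation for the version in use. ILM can still be attached to the monthly indices for retention, as noted above, just without rollover.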

warkolm · May 24, 2021, 7:49am

Yeah, this occurred to me as a workaround as well.

fefontana · May 24, 2021, 2:13pm

Thank you Christian and Warkolm for your answers, but in this case I'm not using Logstash. I'm currently using the node.js "@elastic/elasticsearch" library to insert new documents into ES indices. From the previous comments, I deduce that it is not possible to automatically partition an index using the values of a field from the mapping (like ActivityDate in my example) as a reference. I understand that the best approach, when using the node.js library to interact with ES, is to manage the partitioning myself using a reindex by query or something similar to:

Let me know if I'm wrong or if there is a better approach. Thank you!
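As an illustration of the reindex-by-query idea, here is a rough sketch (not the snippet from the original post) that builds a reindex request body copying one month of documents, selected by a range query on "ActivityDate", into a monthly index. The helper name `monthlyReindexBody`, the index names, and the YYYYMM suffix are assumptions made for the example.

```javascript
// Hypothetical helper: build a _reindex request body that copies the
// documents whose ActivityDate falls in the given month (1-12) from
// sourceIndex into an index named `${destPrefix}-YYYYMM`.
function monthlyReindexBody(sourceIndex, destPrefix, year, month) {
  const pad = (n) => String(n).padStart(2, "0");
  // The range is half-open: first day of the month (inclusive) up to
  // the first day of the following month (exclusive).
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  return {
    source: {
      index: sourceIndex,
      query: {
        range: {
          ActivityDate: {
            gte: `${year}-${pad(month)}-01`,
            lt: `${nextYear}-${pad(nextMonth)}-01`,
          },
        },
      },
    },
    dest: { index: `${destPrefix}-${year}${pad(month)}` },
  };
}
```

Such a body could be sent through the client's reindex API once per month to be split out. Routing documents to the right monthly index when they are first written avoids this extra copy step, so reindexing is probably best kept for data that already sits in a single index.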

system · June 21, 2021, 2:13pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
