Document ID scope when Index Lifecycle Management (ILM) and index patterns are in play

I'm using the Bulk API with the index action on each entry.
Suppose I have an index pattern "myindex-*".
ILM is in play, creating "myindex-00001", "myindex-00002", etc.
On day one, a document is POSTed to the rollover alias "myindex_insert" with a user-supplied document ID of "0000000000001".
Since this is the first POST, "0000000000001" will be stored in "myindex-00001", as I understand it.
On day two, ILM "fires" a rollover and "myindex_insert" now points to "myindex-00002".
What will happen when document ID "0000000000001" is used again against "myindex_insert"?
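
For reference, the request described above could be built roughly like this. It is only a sketch of the NDJSON body for the `_bulk` endpoint; the helper name `build_bulk_body` and the `message` field are made up for illustration.

```python
import json

def build_bulk_body(alias, docs):
    """Build an NDJSON _bulk payload with one `index` action per (doc_id, source) pair."""
    lines = []
    for doc_id, source in docs:
        # Action line: which alias/index to write to and which _id to use.
        lines.append(json.dumps({"index": {"_index": alias, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # A bulk body has to end with a trailing newline.
    return "\n".join(lines) + "\n"

# Hypothetical document; only the alias and the ID come from the question.
body = build_bulk_body("myindex_insert", [("0000000000001", {"message": "day one"})])
print(body, end="")
```

The same body would be sent on day two; only the index behind the alias changes.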

I believe I'll get two entries with document ID "0000000000001", one in each index. An index pattern is not an index but a way to search a list of indexes, and the scope of uniqueness for a document ID is the index itself, not the index pattern covering a list of indexes.
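
To make that reasoning concrete, here is a toy in-memory model of the scenario. It is not the Elasticsearch client; `TinyCluster` and its methods are invented for illustration, and it only assumes that each index keeps its own map of IDs and that the write alias points at exactly one index at a time.

```python
class TinyCluster:
    """Toy model: every index is its own dict keyed by document ID."""

    def __init__(self, first_index, write_alias):
        self.indices = {first_index: {}}
        self.write_alias = write_alias
        self.write_index = first_index

    def rollover(self, new_index):
        # What ILM does at rollover: create a new index and repoint the write alias.
        self.indices[new_index] = {}
        self.write_index = new_index

    def index(self, target, doc_id, source):
        # Writes through the alias land in the current write index only.
        name = self.write_index if target == self.write_alias else target
        self.indices[name][doc_id] = source  # overwrites only within this one index

    def search_by_id(self, doc_id):
        # Stand-in for searching the whole "myindex-*" pattern.
        return [(name, docs[doc_id])
                for name, docs in sorted(self.indices.items())
                if doc_id in docs]

cluster = TinyCluster("myindex-00001", "myindex_insert")
cluster.index("myindex_insert", "0000000000001", {"day": 1})
cluster.rollover("myindex-00002")
cluster.index("myindex_insert", "0000000000001", {"day": 2})

hits = cluster.search_by_id("0000000000001")
print(hits)
# [('myindex-00001', {'day': 1}), ('myindex-00002', {'day': 2})]
```

In this model the second write does not touch the copy in "myindex-00001", so a search over the pattern returns both.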



Welcome to our community! :smiley:

If you want to keep your document IDs unique across indices, then ILM is not the best approach for your index structure. Suggesting a better option would require a bit more information on your use case.

We chose ILM because we want to be able to manage document retention policies. After a document ages out, we want to be able to delete it. If I have one mammoth index, any delete action will only mark the document as deleted, and the disk space it used will not be reclaimed, leaving terabytes of space consumed with no useful data in it.
With an ILM policy that rolls over on size and age, a whole set of documents can be deleted at once after a certain age, because they all live in an index that is old enough to be removed.
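
A policy along those lines might look something like the sketch below. The thresholds are placeholders, not our real values, and the exact rollover option names can differ between Elasticsearch versions, so treat this as an assumption to check against the ILM documentation.

```python
import json

def retention_policy(max_age, max_size, delete_after):
    """Return an ILM policy body: roll over on age or size, delete old indices whole."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over to a new index when either limit is reached.
                        "rollover": {"max_age": max_age, "max_size": max_size}
                    }
                },
                "delete": {
                    # Measured from rollover; the whole index is removed at once.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }

# Placeholder thresholds for illustration only.
policy = retention_policy(max_age="1d", max_size="50gb", delete_after="30d")
print(json.dumps(policy, indent=2))
```

The resulting JSON would be the body of a `PUT _ilm/policy/<policy-name>` request.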

That's not entirely accurate. Merging handles that automatically: deleted documents are dropped when segments are merged, which reclaims the disk space over time.

But you also want a unique ID?
You can do this with ILM, but you will need a secondary filter at query time to pick the newest document for each ID across all your indices.
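
One way such a filter could look is sketched below, under some assumptions: each document carries its ID in a keyword field (called `doc_id` here, a hypothetical name, since as far as I know field collapsing needs a regular keyword or numeric field rather than `_id`) and a timestamp field (`@timestamp` here, also an assumption) in a consistent ISO-8601 format. The first function builds a search body using field collapsing; the second does the same reduction client-side on a list of hits.

```python
def newest_per_id_query(id_field, time_field):
    """Search body that keeps one hit per ID: the top one under the given sort."""
    return {
        "query": {"match_all": {}},
        "collapse": {"field": id_field},
        "sort": [{time_field: {"order": "desc"}}],
    }

def newest_per_id(hits, id_field, time_field):
    """Client-side fallback: keep only the newest hit for each ID."""
    newest = {}
    for hit in hits:
        src = hit["_source"]
        key = src[id_field]
        # ISO-8601 strings in the same format and timezone compare chronologically.
        if key not in newest or src[time_field] > newest[key]["_source"][time_field]:
            newest[key] = hit
    return list(newest.values())

# Hypothetical hits, shaped like entries in a search response.
sample_hits = [
    {"_index": "myindex-00001",
     "_source": {"doc_id": "0000000000001", "@timestamp": "2024-01-01T00:00:00Z"}},
    {"_index": "myindex-00002",
     "_source": {"doc_id": "0000000000001", "@timestamp": "2024-01-02T00:00:00Z"}},
]
latest = newest_per_id(sample_hits, "doc_id", "@timestamp")
print([h["_index"] for h in latest])
```

The query body would be sent to a search over "myindex-*"; the older copies stay on disk until their index ages out.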
