With rollover, is it possible to update documents in old indices?

Nikunj210903 · February 17, 2021, 6:24am

Hello community. I have implemented the rollover feature of Elasticsearch, which creates a new index every day. My question is below.

Is it possible to update documents in old indices? I have read somewhere that old indices are set to read-only. Can I prevent that? I want to update documents in both the current index and the old indices.

warkolm · February 17, 2021, 7:15am

Welcome to our community!

Can you elaborate on where you read that? It's not part of our docs.

Nikunj210903 · February 17, 2021, 7:47am

@warkolm What I meant is: I don't know the index name (because indices are created dynamically), so how can I update a doc?

warkolm · February 17, 2021, 7:48am

Sorry, which doc do you want to update?

Christian_Dahlqvist · February 17, 2021, 7:49am

How are you updating old documents? Do you assign a unique ID that you can use? What type of data are you indexing?

Nikunj210903 · February 17, 2021, 9:02am

@Christian_Dahlqvist Yes, I assign a unique ID that I can use. But suppose I have 100 old indices. Do I need to fire an update query 100 times? That is, try to update the document in the 1st index, and if the document is missing, try the 2nd index, and so on up to the 100th? Is there any way to fire only one query to update a document that is in any one of those 100 indices?
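One way to avoid an update attempt per index, sketched under assumptions: search the whole index pattern once with an `ids` query, read the concrete index name from the hit's `_index` field, then send a single update to that index. This assumes the unique ID is used as the document `_id`. The helper names, the `conversations-*` pattern, and the sample response are hypothetical; only the request bodies are built here, nothing is sent to a cluster.

```python
from typing import Optional


def build_locate_query(doc_id: str) -> dict:
    """Body for POST /<index-pattern>/_search matching one document by _id."""
    return {
        "query": {"ids": {"values": [doc_id]}},
        "_source": False,  # only the hit metadata (_index, _id) is needed
        "size": 1,
    }


def extract_index(search_response: dict) -> Optional[str]:
    """Return the concrete index of the first hit, or None if nothing matched."""
    hits = search_response.get("hits", {}).get("hits", [])
    return hits[0]["_index"] if hits else None


def build_update_path(index: str, doc_id: str) -> str:
    """Path for the single-document update API on a concrete index."""
    return f"/{index}/_update/{doc_id}"


# Hypothetical search response, trimmed to the fields used above.
sample_response = {
    "hits": {"hits": [{"_index": "conversations-000002", "_id": "3"}]}
}

index = extract_index(sample_response)
if index is not None:
    print(build_update_path(index, "3"))
```

This costs two requests instead of up to 100, and the second one is an ordinary update against a known index.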

Christian_Dahlqvist · February 17, 2021, 9:08am

Do you know the timestamp of the documents to update when you issue the update?

Nikunj210903 · February 17, 2021, 9:11am

@Christian_Dahlqvist You mean the document creation time? I will not have the document creation time. I only have the unique ID, the alias name, and the index pattern.

Christian_Dahlqvist · February 17, 2021, 9:49am

It does not necessarily sound like your data naturally fits with time-based indices. Do the documents have a retention period? What type of updates are you performing? What type of data is it?

Nikunj210903 · February 17, 2021, 1:24pm

@Christian_Dahlqvist Below is my document structure. Here id is unique.
{
  "id": 1,
  "name": "John",
  "company": "TCS"
}
I have set max_docs: 2 in rollover, so a new index is created after every 2 documents inserted. Suppose I have added 11 documents; that creates 6 indices. Now I need to update document no. 3 (id: 3), which is in the 2nd index. If I use the index alias to update it, I get a document missing error, because the current write index is the 6th one, which contains only one document (id: 11). So I used update_by_query with the index pattern in the Elasticsearch URL, so that it tries the update on all indices matching the pattern. Is this the right approach, or is there an alternative?
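A rough sketch of the situation described above, assuming a rollover happens exactly every `max_docs` documents as in the post. The first helper reproduces the arithmetic (document 3 lands in the 2nd index, document 11 in the 6th); the second builds a body for `POST /<index-pattern>/_update_by_query` that targets the document by its `id` field. The function names and the choice of the `company` field are illustrative only.

```python
import math


def index_number_for(doc_position: int, max_docs: int) -> int:
    """1-based index number holding the Nth inserted document,
    assuming a rollover after every `max_docs` documents."""
    return math.ceil(doc_position / max_docs)


def build_update_by_query(doc_id: int, new_company: str) -> dict:
    """Body for POST /<index-pattern>/_update_by_query.

    The term query restricts the update to documents whose `id` field
    equals doc_id; the script overwrites the `company` field.
    """
    return {
        "query": {"term": {"id": doc_id}},
        "script": {
            "lang": "painless",
            "source": "ctx._source.company = params.company",
            "params": {"company": new_company},
        },
    }


print(index_number_for(3, 2))   # document no. 3 with max_docs: 2
print(index_number_for(11, 2))  # document no. 11 with max_docs: 2
print(build_update_by_query(3, "Infosys"))
```

The request still fans out to every index matching the pattern, which is the extra cost raised later in the thread.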

Christian_Dahlqvist · February 17, 2021, 1:58pm

If you need to perform lots of updates efficiently, it might be better to use a single index and avoid rollover. If you can tell us more about the data and use case, we may be able to help.

Nikunj210903 · February 18, 2021, 3:37am

Thanks @Christian_Dahlqvist for your support.

Nikunj210903 · February 19, 2021, 3:31pm

@Christian_Dahlqvist Here are my detailed requirements.
I am making a chat application. Please refer to the 3 terms below:

bot: one who replies automatically when someone sends it a message
visitor: one who can chat with a bot
conversation: a chat between a visitor and a bot

Relationship: A visitor can have multiple conversations, so the mapping between visitor and conversation is one-to-many.

I have the 3 approaches below for the index structure:

1. I create only one index, conversation, and keep a visitor object in each conversation.
Advantages: Easy to search based on visitor and conversation properties.
Disadvantages: Suppose one visitor has 5 conversations. If the visitor updates a property, I need to update all 5 conversation documents. I am using the rollover index feature of Elasticsearch, so if a document is in an old index I need to update it using the index pattern, because I don't know which index it is in. Using the index pattern instead of the index name is more time-consuming.

2. I create only one index, visitor, and keep a list of conversation objects in each visitor.
Advantages: Easy to search based on visitor and conversation properties.
Disadvantages: I need pagination over the total conversations, which is not possible in this approach.

3. I create 2 indices, visitor and conversation, and implement rollover in both.
Advantages: I don't need to update conversations when a visitor property is updated, because the visitor object has its own index and is not kept inside the conversation object.
Disadvantages: Same as in the 1st approach: if I update a visitor property and the visitor is in an old index, I need to update it using the index pattern instead of the index name, which takes more time. The second disadvantage is that searching on both conversation and visitor properties requires 2 Elasticsearch queries, one per index, which also takes more time.

Please suggest a better index structure that fulfils the requirements below:

I can search on both visitor and conversation properties
I can paginate over the total conversations
I can update a document using the index name instead of the index pattern

Christian_Dahlqvist · February 19, 2021, 3:39pm

Do you have a defined retention period for your data? How much data do you expect to index every day?

Nikunj210903 · February 19, 2021, 3:44pm

@Christian_Dahlqvist No, I have not defined a retention period. I have 5000 conversations per day from 3000 visitors.

Christian_Dahlqvist · February 20, 2021, 8:56am

Given that a single shard can hold over a billion documents, I do not see why you would need time-based indices, which are better suited to immutable data with a limited retention period. Instead, create a single index with 3 or 4 primary shards; that would probably last you a long time. If the shards get too large you can always use the split index API, although you would probably want to set up an alias pointing to the index so you can reindex and switch over easily without affecting the application.
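A minimal sketch of that setup, under assumptions: one helper builds a body for `PUT /<index-name>` with a fixed number of primary shards and an alias, and another checks whether a shard count is a reachable split target when `index.number_of_routing_shards` is left at its default (which, as I understand the defaults, allows splitting by factors of 2 up to 1024 shards). The alias name `conversations`, the replica count, and the function names are hypothetical.

```python
def build_create_index_body(alias: str, primary_shards: int = 3, replicas: int = 1) -> dict:
    """Body for PUT /<index-name>: fixed primary shard count plus an alias
    the application can use instead of the concrete index name."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        },
        "aliases": {alias: {}},
    }


def valid_default_split_target(source_shards: int, target_shards: int) -> bool:
    """True if target_shards is source_shards times a power of two (> 1)
    and at most 1024, i.e. reachable with default routing shards (assumed)."""
    if target_shards <= source_shards or target_shards > 1024:
        return False
    if target_shards % source_shards != 0:
        return False
    factor = target_shards // source_shards
    return factor & (factor - 1) == 0  # power-of-two check


print(build_create_index_body("conversations", primary_shards=3))
print(valid_default_split_target(3, 6))
print(valid_default_split_target(3, 9))
```

Because the application only ever talks to the alias, a later split or reindex into a new index can be followed by repointing the alias, with no change on the application side.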

Nikunj210903 · February 22, 2021, 3:57am

@Christian_Dahlqvist
Thanks for your support. I will go with a single index and use the split API once my index holds enough data.
