With rollover, is it possible to update documents in old indices?

Hello community. I have implemented the rollover feature of Elasticsearch. It creates a new index every day. Here is my question:

  1. Is it possible to update documents in old indices? I have read somewhere that old indices are set to read-only. Can I prevent that? I want to update documents in both the current index and the old indices.

Welcome to our community! :smiley:

Can you elaborate where you read that? It's not part of our docs.

@warkolm What I meant is: I don't know the index name (because indices are created dynamically), so how can I update a doc?

Update which doc sorry?

How are you updating old documents? Do you assign a unique ID that you can use? What type of data are you indexing?

@Christian_Dahlqvist Yes, I assign a unique ID that I can use. But suppose I have 100 old indices. Do I need to fire an update query 100 times? That is, try to update the document in the 1st index, if the document is missing try the 2nd index, and so on up to the 100th index. Is there a way to fire only one query to update a document that is in any one of those 100 indices?
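One way to avoid trying each index in turn, sketched here in Python under some assumptions: search the index pattern for the unique ID once, read the concrete index name from the `_index` field of the hit, then send a single update to that index. The pattern `chat-*`, the field name `id`, and both helper names are hypothetical, and the `/_update/<id>` path assumes Elasticsearch 7 or later. The functions only build request tuples; sending them is left to whatever HTTP client is in use.

```python
from typing import Any, Dict, Optional, Tuple

Request = Tuple[str, str, Dict[str, Any]]  # (HTTP method, path, JSON body)


def build_lookup(index_pattern: str, doc_id: int) -> Request:
    """Search every index matching the pattern for the document with this id."""
    body = {
        "size": 1,
        "_source": False,  # only the hit metadata (_index, _id) is needed
        "query": {"term": {"id": doc_id}},  # "id" is the field from the example doc
    }
    return ("POST", f"/{index_pattern}/_search", body)


def build_update_from_hit(search_response: Dict[str, Any],
                          changes: Dict[str, Any]) -> Optional[Request]:
    """Turn the first search hit into a partial update against its own index."""
    hits = search_response.get("hits", {}).get("hits", [])
    if not hits:
        return None  # the document is not in any matching index
    hit = hits[0]
    return ("POST", f"/{hit['_index']}/_update/{hit['_id']}", {"doc": changes})
```

This costs two requests instead of up to 100, but the search still has to visit every index that matches the pattern.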

Do you know the timestamp of the documents to update when you issue the update?

@Christian_Dahlqvist You mean the document creation time? I will not have the document creation time. I just have the unique ID, the alias name, and the index pattern.

It does not necessarily sound like your data naturally fits with time-based indices. Do the documents have a retention period? What type of updates are you performing? What type of data is it?

@Christian_Dahlqvist Below is my document structure. Here id is unique.
{
"id": 1,
"name":"John",
"company": "TCS"
}
I have set max_docs: 2 in rollover, so a new index is created after every 2 documents. Suppose I have added 11 documents, which created 6 indices. Now I need to update document no. 3 (id: 3), which is in the 2nd index. If I use the index alias to update document no. 3, I get a document missing error, because the current index is the 6th index, which contains only one document (id: 11). I used update_by_query with the index pattern in the Elasticsearch URL, so it tries to update all indices that match the pattern. Is this the right approach, or is there an alternative?
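For reference, the update_by_query call described above might look roughly like the following sketch. The index pattern `chat-*` and the helper name are hypothetical, and the script only changes the `company` field from the example document. The function builds the request without sending it.

```python
from typing import Any, Dict, Tuple


def build_update_by_query(index_pattern: str, doc_id: int,
                          new_company: str) -> Tuple[str, str, Dict[str, Any]]:
    """Build an _update_by_query request that targets every index in the pattern."""
    body = {
        # Select the document by the unique "id" field from the example doc.
        "query": {"term": {"id": doc_id}},
        # Painless script applied to each matching document.
        "script": {
            "lang": "painless",
            "source": "ctx._source.company = params.company",
            "params": {"company": new_company},
        },
    }
    return ("POST", f"/{index_pattern}/_update_by_query", body)


# Example: update document id 3, wherever it lives under chat-*.
method, path, body = build_update_by_query("chat-*", 3, "Infosys")
```

A single request is enough, but the query runs against every index matching the pattern, so the cost grows with the number of rolled-over indices.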

If you need to perform lots of updates efficiently, it might be better to use a single index and avoid rollover. If you can tell us more about the data and use case we may be able to help.

Thanks @Christian_Dahlqvist for your support.

@Christian_Dahlqvist Here are my detailed requirements:
I am making a chat application. Please refer to the 3 terms below:

  1. bot: one who replies automatically when someone sends it a message
  2. visitor: one who can chat with a bot
  3. conversation: a chat between a visitor and a bot

Relationship: A visitor can have multiple conversations, so the mapping between visitor and conversation is one-to-many.

I have the following 3 approaches for the index structure:

  1. I create only one index, conversation, and keep a visitor object in each conversation.
    Advantages: Easy to search on both visitor and conversation properties.
    Disadvantages: Suppose one visitor has 5 conversations. If the visitor updates a property, I need to update all 5 conversation documents. I am using the rollover feature of Elasticsearch, so if a document is in an old index I need to update it using the index pattern, because I don't know which index it is in. Using the index pattern instead of the index name is more time-consuming.

  2. I create only one index, visitor, and keep a list of conversation objects in each visitor.
    Advantages: Easy to search on both visitor and conversation properties.
    Disadvantage: I need pagination over all conversations, which is not possible with this approach.

  3. I create 2 indices, visitor and conversation, and implement rollover for both.
    Advantages: I don't need to update conversations when a visitor property changes, because the visitor object lives in its own index and is not embedded in the conversation object.
    Disadvantages: Same as in the 1st approach: if I update a visitor property and the visitor is in an old index, I need to update it using the index pattern instead of the index name, which takes more time. Also, if I want to search on both conversation and visitor properties, I need to run 2 ES queries, one per index, which takes more time.

Please suggest a better way to structure my indices so that it fulfils the requirements below:

  1. I can search on both visitor and conversation properties
  2. I can paginate over all conversations
  3. I can update a document using the index name instead of the index pattern
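To make the first two requirements concrete, here is a rough sketch of a paginated search against a single conversation index that embeds the visitor object (approach 1 without rollover). The field names `visitor.id`, `message`, and `started_at`, and the helper names, are assumptions for illustration and not part of the thread. The code only builds the request body.

```python
import math
from typing import Any, Dict, Optional, Tuple


def build_conversation_page(index: str, visitor_id: int, page: int,
                            page_size: int = 20,
                            text: Optional[str] = None
                            ) -> Tuple[str, str, Dict[str, Any]]:
    """Build a from/size paginated search over conversations of one visitor."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    # Visitor property (hypothetical field embedded in each conversation).
    clauses = [{"term": {"visitor.id": visitor_id}}]
    if text is not None:
        # Conversation property (hypothetical full-text field).
        clauses.append({"match": {"message": text}})
    body = {
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,
        "track_total_hits": True,        # exact total for page counting
        "query": {"bool": {"filter": clauses}},
        "sort": [{"started_at": "desc"}],
    }
    return ("POST", f"/{index}/_search", body)


def total_pages(total_hits: int, page_size: int) -> int:
    """Number of pages needed to show total_hits results."""
    return math.ceil(total_hits / page_size)
```

Note that `from` + `size` is capped by `index.max_result_window` (10,000 by default), so very deep pagination would need a different mechanism such as `search_after`.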

Do you have a defined retention period for your data? How much data do you expect to index every day?

@Christian_Dahlqvist No, I have not defined a retention period. I have 5000 conversations per day from 3000 visitors.

Given that a single shard can hold over a billion documents I do not see why you would need to use time based indices, which are better suited for immutable data with limited retention period. Instead create a single index with 3 or 4 primary shards and that would probably last you a long time. If the shards get too large you can always use the split index API, although you probably would like to set up an alias linked to the index so you can reindex and switch this easily without affecting the application.
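The setup suggested above could be sketched as the following requests. The index names `conversations-v1` and `conversations-v2`, the alias `conversations`, and the helper names are hypothetical. The split helper only checks that the target shard count is a multiple of the source count; the target must also be compatible with the index's `number_of_routing_shards`, which is not checked here.

```python
from typing import Any, Dict, List, Tuple

Request = Tuple[str, str, Dict[str, Any]]  # (HTTP method, path, JSON body)


def build_create_index(index: str, alias: str, primary_shards: int = 3) -> Request:
    """Create one index with a fixed shard count and an alias for the application."""
    body = {
        "settings": {"index": {"number_of_shards": primary_shards}},
        "aliases": {alias: {}},
    }
    return ("PUT", f"/{index}", body)


def build_split(source: str, target: str,
                source_shards: int, target_shards: int) -> List[Request]:
    """Block writes on the source index, then split it into more primary shards."""
    if target_shards <= source_shards or target_shards % source_shards != 0:
        raise ValueError("target shard count must be a larger multiple of the source")
    return [
        # The split API requires the source index to be read-only.
        ("PUT", f"/{source}/_settings", {"settings": {"index.blocks.write": True}}),
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": target_shards}}),
    ]


def build_alias_switch(alias: str, old_index: str, new_index: str) -> Request:
    """Move the alias from the old index to the new one in a single atomic call."""
    actions = [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]
    return ("POST", "/_aliases", {"actions": actions})
```

Because the application only ever talks to the alias, the index behind it can be split or reindexed later without changing application code.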

@Christian_Dahlqvist
Thanks for your support. I will go with only one index and use the split API when my index has enough data.
