Daily index maintenance

ankitachow · May 7, 2018, 8:40pm

Hi,

In my current application, I'm rolling over data with a doc limit. I'm planning to implement the logic of creating daily indices in my platform. I would like to know how daily indices would affect the following:

Processing old data. For example, in my application I expect old data to arrive with a latency of XX hrs. So once the new index is created for the current date, I can still expect some of yesterday's data. But since the current day's index is the active one, how can I maintain yesterday's data in my current index? Retention will also be a problem in this scenario.
Search response time. How will creating daily indices affect the response time for search? How do I ensure a request goes only to the few shards of the day for which the request was issued?

I'm planning to follow the below link:

Thanks
Ankita

warkolm · May 8, 2018, 6:40am

You'd need to use time-based indices and then have your processing layer send the delayed data to the older index.
That depends on how you are querying the indices.
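As a rough illustration of the first point, the processing layer can derive the target index from the event's own timestamp rather than from the time of processing. This is only a sketch: the `logs-` prefix, the `YYYY.MM.DD` suffix, and the function name `index_for_event` are assumptions made for the example, not anything prescribed in this thread.

```python
from datetime import datetime, timezone


def index_for_event(event_time: datetime, prefix: str = "logs-") -> str:
    """Return the daily index name for the day the event occurred (UTC).

    The prefix and date format are hypothetical; use whatever naming
    convention your daily indices actually follow.
    """
    # Treat naive timestamps as UTC so the same event always maps to the same day.
    if event_time.tzinfo is None:
        event_time = event_time.replace(tzinfo=timezone.utc)
    return prefix + event_time.astimezone(timezone.utc).strftime("%Y.%m.%d")


# An event generated late on 5/9 but processed on 5/10 still targets 5/9's index.
late_event = datetime(2018, 5, 9, 23, 50, tzinfo=timezone.utc)
print(index_for_event(late_event))  # logs-2018.05.09
```

With this approach the index name, not a write alias, decides where a document lands, so late data ends up alongside the rest of that day's data and retention can be applied per day.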

Christian_Dahlqvist · May 8, 2018, 6:59am

Kibana used to limit the indices being queried, first by using date match based on the timestamp in the index name and later based on field stats. Improvements in Elasticsearch have meant that this is no longer required, and Kibana now sends the query to all shards matching the index pattern, so if you are on version 6.x you may not need to worry about this.
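If you are querying from your own application and still want to narrow a request to specific days, one option is to build the list of daily index names for the requested date range yourself. The sketch below assumes a hypothetical `logs-YYYY.MM.DD` naming convention; the helper name `indices_for_range` is made up for the example.

```python
from datetime import date, timedelta
from typing import List


def indices_for_range(start: date, end: date, prefix: str = "logs-") -> List[str]:
    """List the daily index names covering start..end, inclusive.

    The prefix and date format are assumptions for this example.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [
        prefix + (start + timedelta(days=i)).strftime("%Y.%m.%d")
        for i in range(days + 1)
    ]


# Elasticsearch accepts a comma-separated list of index names in a search request.
target = ",".join(indices_for_range(date(2018, 5, 9), date(2018, 5, 10)))
print(target)  # logs-2018.05.09,logs-2018.05.10
```

As noted above, on 6.x this may be unnecessary, since querying the whole index pattern is reported to be handled efficiently.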

ankitachow · May 10, 2018, 2:44pm

@warkolm: How can the processing layer be made to process old data and send it to the older index? From what I understand, the daily indices are created based not on the contents of the data but on the time when the data was processed.
For example, if data is processed on 5/10 it will go to 5/10's index even when it contains 5/9's data.
Also, when I'm using an alias with rollover on the daily index, the write alias points to the current index, so writing data to the old index is also a challenge. How can I overcome that?

warkolm · May 11, 2018, 5:16am

If you are using Logstash then it'll automatically pick the right day's index as long as you have a date filter taking the event date from the event.

Christian_Dahlqvist · May 11, 2018, 6:30am

If you are using rollover, events that are processed late and indexed into the write alias will indeed end up in the current index. Based on index statistics, you could determine which index or indices the data should go to and index directly into them rather than through the alias, but there is no automatic way to do so.
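A minimal sketch of that manual routing, assuming you have already collected the minimum and maximum event timestamp held by each rollover index (for example with a min/max aggregation on your timestamp field). The index names, the `pick_index` helper, and the `logs-write` alias are all hypothetical.

```python
from datetime import datetime
from typing import Dict, Tuple


def pick_index(
    event_time: datetime,
    ranges: Dict[str, Tuple[datetime, datetime]],
    write_alias: str,
) -> str:
    """Return the index whose timestamp range covers the event.

    Falls back to the write alias when no existing index covers it,
    which is the normal case for current data.
    """
    for name, (oldest, newest) in ranges.items():
        if oldest <= event_time <= newest:
            return name
    return write_alias


# Hypothetical ranges gathered from two already rolled-over indices.
ranges = {
    "logs-000001": (datetime(2018, 5, 8, 0, 0), datetime(2018, 5, 9, 6, 30)),
    "logs-000002": (datetime(2018, 5, 9, 6, 31), datetime(2018, 5, 10, 4, 0)),
}

print(pick_index(datetime(2018, 5, 9, 2, 0), ranges, "logs-write"))   # logs-000001
print(pick_index(datetime(2018, 5, 10, 9, 0), ranges, "logs-write"))  # logs-write
```

The ranges have to be refreshed as indices roll over, and ranges of rollover indices can overlap when late data has already been written, which is part of why this is harder to operate than fixed time-based index names.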

If you have data coming in very late, it may be better for you to stick with indices matching fixed time periods based on the index name.

system · June 8, 2018, 6:32am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
