Elasticsearch document-wise persistence - Need solution

Rahul · December 1, 2016, 9:11pm

We have a requirement that documents stored in a single index be retained for different periods depending on the value of a particular field. A few documents with field value=XYZ need to be kept for only 7 days, while the other documents need to be kept for 1 year or so. As we understand it, Elasticsearch can set a TTL on each document through the Java API. But we have also read on various expert blogs that TTL is not a performance-friendly solution, since ES continually checks documents for expiration in the background, which causes a performance bottleneck to some extent. Setting TTL per index/type would have the same problem, even if the data were split into independent types based on retention requirement.

So in such a situation, what is the best solution to go for as per best practices/standards?

warkolm · December 1, 2016, 10:04pm

Why not put the short term documents into their own index?

Rahul · December 2, 2016, 7:02pm

So you mean splitting the data into different indices based on persistence requirement and setting TTL for each index as per requirement?

nik9000 · December 2, 2016, 7:44pm

Don't use _ttl at all. It is deprecated and being removed eventually because it doesn't perform well. Mostly because deleting all documents in an index is just much less efficient than deleting the index. Instead I'd have an index per day for each of the last seven days. Every day you nuke the oldest one and make a new one (or turn on dynamic index creation and use templates).
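
To illustrate the daily rotation described above, here is a minimal Python sketch. It only computes index names and works out which ones have aged out; it does not talk to a cluster. The prefix `events-7days`, the `YYYY.MM.DD` date suffix, and both function names are assumptions made for the example, not anything Elasticsearch requires.

```python
from datetime import date, datetime, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Build a per-day index name such as 'events-7days-2016.12.02'."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_daily_indices(existing, prefix, today, keep_days=7):
    """Return the names in `existing` that are older than the retention window.

    `existing` is any iterable of index names (for example, a listing
    fetched from the cluster). Names that do not match
    '<prefix>-YYYY.MM.DD' are ignored rather than treated as expired.
    """
    cutoff = today - timedelta(days=keep_days - 1)  # oldest day still kept
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue
        suffix = name[len(prefix) + 1:]
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not one of our daily indices
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    today = date(2016, 12, 2)
    # Pretend the cluster currently holds the last ten daily indices.
    existing = [
        daily_index_name("events-7days", today - timedelta(days=i))
        for i in range(10)
    ]
    for name in expired_daily_indices(existing, "events-7days", today):
        # Each of these would be dropped with a delete-index request.
        print("would delete", name)
```

A scheduled job could run something like this once a day and issue a delete-index request for each returned name, which drops the whole index at once instead of deleting documents one by one.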

For your one-year documents you can do an index per day, but searching that many indexes (around 365) can get expensive. So it might make more sense to do an index per month or per week.

Also have a look at the shrink and rollover APIs. They are new in 5.0 so we don't really have any best practices around them but they were meant for time based indexes like the ones you are describing. They may not help you, but they are worth looking at.

Rahul · December 5, 2016, 10:34pm

Thank you very much for the detailed explanation. But we have data that needs to be represented through a single index pattern in Kibana visualizations. Hence, at most, we can split the data into multiple types under the same index. So in this case, if not TTL, is there a better way to maintain different retention requirements for each type under a single index?

warkolm · December 5, 2016, 10:37pm

Why not just use patterns like `indexname-7days-$TIMESTAMP` and `indexname-1year-$TIMESTAMP`?
Then you can set an index pattern of `indexname-*` in Kibana.
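
As a rough sketch of how an indexer might pick the target index under this naming scheme: the field name `category`, the daily/monthly split, and the `target_index` helper below are all assumptions for illustration, and `fnmatchcase` only approximates how a wildcard index pattern matches names.

```python
from datetime import date
from fnmatch import fnmatchcase

# Field values whose documents only need to live for 7 days (from the question).
SHORT_LIVED_VALUES = {"XYZ"}


def target_index(doc: dict, day: date, base: str = "indexname",
                 field: str = "category") -> str:
    """Choose the index a document should be written to.

    Short-lived documents go to a per-day index so the oldest day can be
    dropped; everything else goes to a per-month index kept for about a year.
    `field` is a hypothetical name for the field that drives retention.
    """
    if doc.get(field) in SHORT_LIVED_VALUES:
        return f"{base}-7days-{day:%Y.%m.%d}"
    return f"{base}-1year-{day:%Y.%m}"


if __name__ == "__main__":
    day = date(2016, 12, 5)
    short = target_index({"category": "XYZ"}, day)
    long_ = target_index({"category": "ABC"}, day)
    print(short, long_)
    # Both names fall under the single wildcard pattern used in Kibana.
    print(fnmatchcase(short, "indexname-*"), fnmatchcase(long_, "indexname-*"))
```

Because both families share the `indexname-` prefix, a single wildcard pattern covers them, while each family can be expired on its own schedule by deleting whole indices.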

system · January 2, 2017, 10:37pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
