Delete old documents

Is there a way to delete old documents based on a field that holds a timestamp?
What I want to do is keep only the latest document per id.

Example:
id  timestamp
1   1pm
1   2pm
1   3pm
1   4pm
2   3pm
2   4pm
2   6pm
2   7pm

I want to keep only the 4pm document for id 1 and delete all the others. Similarly, for id 2 I want to keep only the 7pm document and delete the rest.

Thank you!

If you stored the hour of the day as a field in your document and have the id field as well, you can write a query that selects the documents you want to remove. Run it through the Search API first to make sure it matches exactly those documents.

Then use that query in a Delete By Query API call.
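A minimal sketch of what such a Delete By Query body could look like, built in Python so it is easy to inspect before sending. The field names `id.keyword` and `Time` are taken from the aggregation posted in this thread. The helper name `build_delete_older_query` and the sample timestamp are hypothetical, and `Time` is assumed to be mapped as a date.

```python
import json

def build_delete_older_query(doc_id, latest_time):
    """Body for POST /<index>/_delete_by_query.

    Matches every document with the given id whose Time is strictly
    older than latest_time. Field names are assumptions, see above.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"id.keyword": doc_id}},
                    {"range": {"Time": {"lt": latest_time}}},
                ]
            }
        }
    }

# Hypothetical values: id "1", whose newest document is from 16:00.
body = build_delete_older_query("1", "2024-01-01T16:00:00Z")
print(json.dumps(body, indent=2))
```

The same body can be sent to the Search API first to check what it matches. Because the range uses `lt`, two documents that share the latest timestamp would both be kept.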

Thank you, David. I have a query that gives me the most recent document for each id. Now I want to delete all the other documents for that id, keeping only the latest one. I am really new to Elasticsearch and not sure what the query should look like. Could you please point me in the right direction?

{
  "aggs": {
    "top_tags": {
      "terms": {
        "field": "id.keyword"
      },
      "aggs": {
        "top_sales_hits": {
          "top_hits": {
            "sort": [
              {
                "Time": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

I believe you need to write some kind of script to do that: collect every _id that needs to be deleted, then call the Delete document API for each one.
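Roughly, the selection step of such a script might look like the sketch below. It only works on hits already fetched from a search (no client calls are made), and the shape of the hits, the numeric `Time` values, the index name `my-index` and both helper names are assumptions for illustration. It also builds a Bulk API payload with one delete action per stale `_id`, as an alternative to one Delete call per document.

```python
import json

def ids_to_delete(hits):
    """Return the _id of every hit except the newest one per logical id."""
    newest = {}  # logical id -> (Time, _id) of the newest hit seen so far
    for hit in hits:
        key = hit["_source"]["id"]
        candidate = (hit["_source"]["Time"], hit["_id"])
        # Tuple comparison: later Time wins, ties are broken by _id.
        if key not in newest or candidate > newest[key]:
            newest[key] = candidate
    keep = {es_id for _, es_id in newest.values()}
    return [hit["_id"] for hit in hits if hit["_id"] not in keep]

def bulk_delete_body(index, es_ids):
    """Build the NDJSON payload for POST /_bulk, one delete action per _id."""
    return "".join(
        json.dumps({"delete": {"_index": index, "_id": es_id}}) + "\n"
        for es_id in es_ids
    )

# Hypothetical hits mirroring the example table (Time as hour of day).
hits = [
    {"_id": "a", "_source": {"id": "1", "Time": 13}},
    {"_id": "b", "_source": {"id": "1", "Time": 14}},
    {"_id": "c", "_source": {"id": "1", "Time": 15}},
    {"_id": "d", "_source": {"id": "1", "Time": 16}},
    {"_id": "e", "_source": {"id": "2", "Time": 15}},
    {"_id": "f", "_source": {"id": "2", "Time": 16}},
    {"_id": "g", "_source": {"id": "2", "Time": 18}},
    {"_id": "h", "_source": {"id": "2", "Time": 19}},
]
stale = ids_to_delete(hits)
print(stale)
print(bulk_delete_body("my-index", stale), end="")
```

For a large index the hits would have to be paged through rather than held in one list, which this sketch does not cover.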

But why are you in that situation? Why not use the id field of your document as the _id of the Elasticsearch document?

In that case, any time you write a new "version" of the document (a more recent one, I mean), the old version is overwritten.
That way, there is nothing else to do.
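To illustrate the idea, here is a small sketch that builds Bulk API index actions using each document's own id as the `_id`, plus a plain dictionary that imitates the resulting "last write wins" behaviour. The index name `my-index`, the sample documents and the helper name are made up for the example, and the dictionary is only a stand-in for an index, not how Elasticsearch stores data.

```python
import json

def bulk_index_lines(index, docs):
    """Build NDJSON for POST /_bulk, using each document's id as its _id."""
    lines = []
    for doc in docs:
        # Action line first, then the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc["id"])}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# Hypothetical documents, written in the order they arrive.
docs = [
    {"id": 1, "Time": 13},
    {"id": 1, "Time": 16},
    {"id": 2, "Time": 15},
    {"id": 2, "Time": 19},
]

# Stand-in for the index: writing the same _id again replaces the old document.
store = {}
for doc in docs:
    store[str(doc["id"])] = doc

print(bulk_index_lines("my-index", docs), end="")
print(store)
```

Note that this keeps whichever document was written last, so it matches "latest by timestamp" only if documents arrive in timestamp order.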
