If you stored the hour of the day as a field in your document, alongside the id field, you can write a query that selects those documents and check with the Search API that it returns what you expect.
Then use that query in a Delete By Query API call.
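As a rough sketch of what such a query could look like, here is a request body built in Python. The field names `id` and `hour`, the sample values, and the helper name `build_query` are assumptions for illustration, and the `term` clauses assume both fields are mapped for exact matching (for example `keyword` and an integer type). The same body can be sent first to `POST /<index>/_search` to check the matches, and then to `POST /<index>/_delete_by_query`.

```python
import json


def build_query(doc_id, hour):
    """Build a request body matching documents with the given id and hour.

    Hypothetical helper: the field names "id" and "hour" are assumptions.
    """
    return {
        "query": {
            "bool": {
                # filter clauses match exactly and do not affect scoring
                "filter": [
                    {"term": {"id": doc_id}},
                    {"term": {"hour": hour}},
                ]
            }
        }
    }


body = build_query("device-42", 13)
print(json.dumps(body, indent=2))
```

Running the body through the Search API first is worthwhile because Delete By Query removes everything the query matches.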
Thank you, David. I have a query that gives me the most recent document for each id. Now I want to delete all the other documents for that id, keeping only the latest one. I am really new to Elasticsearch and not sure what the query should look like. Could you please point me in the right direction?
I believe you need to write some kind of manual script to do that. Something like collecting every _id that needs to be deleted and calling the Delete document API for each one?
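A minimal sketch of that kind of script, under some assumptions: the search hits have already been fetched, each hit carries an `id` field and a numeric `timestamp` field in `_source` (the `timestamp` name is hypothetical), and the index is called `my-index`. It only works out which `_id` values are stale and which `DELETE /<index>/_doc/<_id>` requests to send; actually sending them is left to whatever HTTP client you use.

```python
from urllib.parse import quote


def stale_ids(hits):
    """Return the _id of every hit that is not the latest for its id field."""
    latest = {}
    for hit in hits:
        src = hit["_source"]
        best = latest.get(src["id"])
        if best is None or src["timestamp"] > best["_source"]["timestamp"]:
            latest[src["id"]] = hit
    keep = {hit["_id"] for hit in latest.values()}
    return [hit["_id"] for hit in hits if hit["_id"] not in keep]


def delete_requests(index, ids):
    """Build one (method, path) pair per document to delete."""
    return [("DELETE", f"/{index}/_doc/{quote(_id, safe='')}") for _id in ids]


# Sample hits in the shape returned by the Search API (values are made up).
hits = [
    {"_id": "a1", "_source": {"id": "A", "timestamp": 100}},
    {"_id": "a2", "_source": {"id": "A", "timestamp": 300}},
    {"_id": "a3", "_source": {"id": "A", "timestamp": 200}},
    {"_id": "b1", "_source": {"id": "B", "timestamp": 50}},
]

to_delete = stale_ids(hits)
requests_to_send = delete_requests("my-index", to_delete)
```

For large numbers of documents, issuing one request per document is slow, so batching the deletes would be worth considering.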
But why are you in that situation? Why not use the id field of your document as the _id of the Elasticsearch document?
In that case, any time you write a new, more recent "version" of the document, the old version is overwritten.
That way, there is nothing else to do.
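To illustrate the idea, here is a small sketch that builds `PUT /<index>/_doc/<_id>` requests using the document's own `id` field as the `_id`. The index name `my-index`, the `value` field, and the helper name `index_request` are assumptions, and the `store` dictionary is only an in-memory stand-in for Elasticsearch, used to show that a second write to the same `_id` replaces the first.

```python
from urllib.parse import quote


def index_request(index, doc):
    """Build an index request that reuses the document's id field as _id."""
    doc_id = quote(str(doc["id"]), safe="")
    return ("PUT", f"/{index}/_doc/{doc_id}", doc)


# In-memory stand-in for the index, keyed by request path (not real Elasticsearch).
store = {}

versions = [
    {"id": "A", "value": "old"},
    {"id": "A", "value": "new"},
]

for doc in versions:
    method, path, body = index_request("my-index", doc)
    store[path] = body  # same path, so the newer document replaces the older one
```

After both writes, `store` holds a single entry for `/my-index/_doc/A`, containing the newer document.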