I am collecting data from an IoT device every second and sending it to Elasticsearch. I want to keep only the last 7 days of data in my index pattern. How do I delete the older documents from my index?
I have a field called "event_ts", which is the time field, and I want to perform the delete operation based on it.
I don't want to create an index pattern for every day, because I am building a dashboard on top of the index pattern and I can't keep rebuilding the dashboard. Is there any way I can search all the records in my index pattern and delete records based on the time parameter?
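For reference, a delete-by-query against the `event_ts` field would look something like this (the index name `iot-data` is illustrative, not from the original post):

```
POST /iot-data/_delete_by_query
{
  "query": {
    "range": {
      "event_ts": {
        "lt": "now-7d/d"
      }
    }
  }
}
```

`now-7d/d` uses Elasticsearch date math: seven days ago, rounded down to the start of the day. Run on a schedule (e.g. daily via cron), this maintains a rolling seven-day window, though the discussion below explains why this is the expensive way to do it.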
@dadoonet is correct: there is no reason not to use a different index per day for your use case. Because Kibana builds visualizations and dashboards on top of an Index Pattern rather than a single index, it handles new daily indices transparently.
If you click on Index Patterns, you will be taken to another screen where you can Create Index Pattern. There you can define patterns that follow the foo-* example shared by @dadoonet.
Once created, when you go to build a new visualization in Kibana, your Index Pattern will appear in the list you select from.
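As a sketch of how the daily-index approach looks on the write side (index and field names here are illustrative), each document is written to a date-suffixed index, and a single pattern such as iot-* covers all of them in Kibana:

```
PUT /iot-2021.06.01/_doc/1
{
  "event_ts": "2021-06-01T09:15:00Z",
  "temperature": 21.4
}
```

Ingest tools such as Logstash can generate the date suffix from the event timestamp for you, so the writer doesn't need any special logic.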
To further round out the discussion: you can still perform a delete_by_query, but it is very inefficient for deleting data from indices compared with deleting entire indices. The difference is similar to the difference between these SQL pseudo-statements:

```sql
DELETE FROM table WHERE timestamp < now-7days
```

and

```sql
DROP TABLE table
```
The DELETE FROM statement has to run a query, perform a comparison on every document, and then set up a series of atomic DELETE operations for each match found, while the DROP statement is over and done in a single operation. The analogy isn't perfect, because SQL databases are designed to handle this sort of thing better than Elasticsearch is.
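In Elasticsearch terms, the DROP TABLE side of the analogy is a plain index deletion (index name illustrative):

```
DELETE /iot-2021.05.25
```

This is a single, near-instant operation that removes the whole day's data at once, with none of the per-document query, match, and delete work that delete_by_query incurs.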
Elasticsearch makes things worse still, because deleting a document doesn't immediately free resources; it only marks the document as deleted. The document isn't actually removed until the next segment merge, which involves yet another scan of the documents to decide which to keep and which to discard. That's at least two scans over all of your documents just to free the resources. Mismatched segment sizes, which is what document deletes produce, also make Lucene a bit less efficient.
Elasticsearch handles these scenarios well, but if you didn't have to delete documents from an index, it would be much, much more efficient, which is why @dadoonet recommended using daily indices—as do I.