Delete documents from an Elasticsearch index based on a date/timestamp field, using Python

I have several indices, each with a field named "date" or "timestamp". I want to delete only the documents in those indices that are older than a certain number of days or weeks. I do not want to delete the entire index.

Ideally I want to do this from Python, but if someone can suggest how to do it from Kibana, that is also fine for now.

Thanks in advance.

Kibana is mostly a tool to view your data, not to modify it, so you can't do this from within, say, Discover. Your best bet there is "Dev Tools" (https://www.elastic.co/guide/en/kibana/current/console-kibana.html), which lets you talk to the Elasticsearch REST APIs directly, so moving a request from Dev Tools to Python is not a big jump.

To delete only certain documents, _delete_by_query seems like the right choice: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
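As a rough illustration of what such a request could look like, here is a minimal sketch that builds a _delete_by_query call with only the Python standard library. The host http://localhost:9200, the index name my-index, the field name timestamp and the helper name build_delete_request are all assumptions made for the example. The request is only built and printed, not sent.

```python
import json
import urllib.request


def build_delete_request(base_url, index, field, older_than):
    """Build (but do not send) a _delete_by_query request.

    older_than is an Elasticsearch date-math offset such as "30d" or "4w".
    """
    # Range query: match documents whose date field is older than the offset.
    body = {"query": {"range": {field: {"lt": f"now-{older_than}"}}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_delete_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed host, index and field names; replace with your own.
request = build_delete_request("http://localhost:9200", "my-index", "timestamp", "30d")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually run the deletion: urllib.request.urlopen(request)
```

The same method, path and JSON body can be pasted into the Dev Tools console to try the query there first. Running the range query through a search before deleting is a cheap way to check which documents it matches.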

For Python, it is generally recommended to use the official Python client: https://elasticsearch-py.readthedocs.io/en/7.9.1/
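With that client, the call might look roughly like the sketch below. It is written against the 7.x client's delete_by_query(index=..., body=...) method (newer client versions prefer passing query= instead of body=). The helper name delete_older_than, the index my-index and the field timestamp are assumptions, and DryRunClient is a hypothetical stand-in that only records the call so the example runs without a cluster.

```python
def delete_older_than(client, index, field, days):
    """Delete documents whose `field` is more than `days` days old.

    `client` is expected to be an elasticsearch.Elasticsearch instance.
    Returns the number of documents the cluster reports as deleted.
    """
    response = client.delete_by_query(
        index=index,
        body={"query": {"range": {field: {"lt": f"now-{days}d"}}}},
        conflicts="proceed",  # skip version conflicts instead of aborting
    )
    return response["deleted"]


class DryRunClient:
    """Hypothetical stand-in for a real client: records calls, deletes nothing."""

    def __init__(self):
        self.calls = []

    def delete_by_query(self, **kwargs):
        self.calls.append(kwargs)
        return {"deleted": 0}


client = DryRunClient()
deleted = delete_older_than(client, "my-index", "timestamp", 30)
print(deleted, client.calls[0]["body"])

# Against a real cluster (assumed to listen on localhost:9200):
#   from elasticsearch import Elasticsearch
#   deleted = delete_older_than(Elasticsearch("http://localhost:9200"),
#                               "my-index", "timestamp", 30)
```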


We also have a free feature called Index Lifecycle Management which lets you configure your indices to automatically get deleted after a certain time, details here: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-lifecycle-management.html

Specifically, you can define a delete phase that triggers once an index reaches the age you configure. Note that this removes whole indices, not individual documents inside an index.
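For illustration, a policy with only a delete phase could be built like this. This is a sketch, not a recommended configuration: the policy name delete-after-30d, the 30d age, the host and the helper name build_ilm_policy_request are assumptions. The request is built and printed but not sent.

```python
import json
import urllib.request


def build_ilm_policy_request(base_url, policy_name, min_age):
    """Build (but do not send) a PUT _ilm/policy request with a delete phase."""
    policy = {
        "policy": {
            "phases": {
                "delete": {
                    "min_age": min_age,         # e.g. "30d"
                    "actions": {"delete": {}},  # delete the whole index
                }
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/_ilm/policy/{policy_name}",
        data=json.dumps(policy).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed host and policy name; replace with your own.
ilm_request = build_ilm_policy_request("http://localhost:9200", "delete-after-30d", "30d")
print(ilm_request.get_method(), ilm_request.full_url)
print(ilm_request.data.decode("utf-8"))
# To actually create the policy: urllib.request.urlopen(ilm_request)
```

A policy only takes effect on indices that reference it through the index.lifecycle.name setting, typically set in an index template.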


Just to add another comment: you are far better off using time-based indices than deleting documents from an index based on a timestamp. Dropping a whole index is far more efficient than deleting its documents one by one.
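To make the idea concrete, here is a small sketch of retention with time-based indices. It assumes a hypothetical daily naming scheme of the form logs-YYYY.MM.DD and a helper named expired_indices. It only works out which index names fall outside the retention window; those indices could then be deleted whole.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, prefix, today, keep_days):
    """Return the daily indices named <prefix>YYYY.MM.DD older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date, so leave this index alone
        if day < cutoff:
            expired.append(name)
    return expired


# Hypothetical index names for illustration.
names = ["logs-2020.09.01", "logs-2020.10.10", "logs-2020.10.20", "metrics-2020.09.01"]
print(expired_indices(names, "logs-", date(2020, 10, 21), 7))
# prints ['logs-2020.09.01', 'logs-2020.10.10']
```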


This served my purpose, just what I needed.
Thank you very much.
