How to delete data from Elasticsearch according to index time


(lizhen) #1

Hi all,
I am indexing data with Spark, but the job failed after running for a few hours, so the data is incomplete. I want to delete the documents that were indexed during those few hours. How can I do it? Thanks.


(Mark Walkom) #2

Is all the data in one single index?
Is it mixed with other data?


(David Pilato) #3

If you set the document ids in your job, you can just reindex again. It will overwrite existing docs.
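As a minimal sketch of what "setting the document ids" could look like, the snippet below derives a repeatable ID from each document's content, so that re-running the job overwrites earlier copies instead of duplicating them. The function name `stable_doc_id` and the choice of a SHA-1 content hash are assumptions for illustration, not something from this thread. If the job uses the elasticsearch-hadoop connector, its `es.mapping.id` setting can point at the field that holds such an ID.

```python
import hashlib
import json


def stable_doc_id(doc):
    """Return a repeatable ID derived from the document's content.

    Hypothetical helper: hashing the content is only one way to get a
    deterministic ID; a natural key from the source data works as well.
    """
    # sort_keys makes the serialized form independent of key order,
    # so the same document always hashes to the same ID.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


doc = {"user": "alice", "action": "login"}
doc_id = stable_doc_id(doc)  # use this as the document _id when indexing
```

One caveat with a content hash: two documents with identical content get the same ID and collapse into one, which is only acceptable if exact duplicates are not meaningful in the data.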

If you have a timestamp or a range of ids, you can use the delete by query plugin.
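A rough sketch of a delete by query over a time range follows. It assumes the documents carry a date field, here called `timestamp`, and uses a placeholder index name and host; all of these are hypothetical. With the delete-by-query plugin (Elasticsearch 2.x) the request is `DELETE /<index>/_query`; in Elasticsearch 5.0 and later the built-in equivalent is `POST /<index>/_delete_by_query`. The code only builds the request and does not send it.

```python
import json
import urllib.request


def build_delete_by_query(base_url, index, field, start, end, plugin_style=True):
    """Build (but do not send) a delete-by-query request for a time range.

    plugin_style=True  -> DELETE /<index>/_query            (2.x plugin)
    plugin_style=False -> POST   /<index>/_delete_by_query  (5.0 and later)
    """
    # Match documents whose field value falls in [start, end).
    body = {"query": {"range": {field: {"gte": start, "lt": end}}}}
    data = json.dumps(body).encode("utf-8")
    if plugin_style:
        url, method = f"{base_url}/{index}/_query", "DELETE"
    else:
        url, method = f"{base_url}/{index}/_delete_by_query", "POST"
    return urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, index, field name, and time window.
req = build_delete_by_query(
    "http://localhost:9200", "myindex", "timestamp",
    "2016-01-01T00:00:00", "2016-01-01T05:00:00",
)
# urllib.request.urlopen(req) would send it; run the same query as a
# search first to check which documents it matches.
```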

David.


(lizhen) #4

All the data is in one single index.


(lizhen) #5

Hi David, I use automatically generated IDs and there is no timestamp in the data, so I don't know how to filter the data with the delete-by-query plugin.


(Mark Walkom) #6

Is there other data in the same index? If not then just delete the index.


(lizhen) #7

Yes, there is a lot of other data in the index.


(Mark Walkom) #8

You are probably better off putting it into its own index, then.

For this, you will need to use the delete-by-query plugin.
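For future loads, the "own index" suggestion could be sketched roughly as follows: each job run writes to an index named after the run, so a failed run can be dropped with a single `DELETE /<index>` without touching other data. The prefix `sparkload`, the host, and both helper names are hypothetical. The code only builds the delete request and does not send it.

```python
import urllib.request
from datetime import datetime


def run_index_name(prefix, started_at):
    """Return a lowercase index name that is unique per job run."""
    # Index names must be lowercase; a digits-only timestamp keeps it valid.
    return f"{prefix}-{started_at.strftime('%Y%m%d%H%M%S')}"


def build_delete_index(base_url, index):
    """Build (but do not send) a request that deletes a whole index."""
    return urllib.request.Request(f"{base_url}/{index}", method="DELETE")


# Hypothetical run start time and host.
index = run_index_name("sparkload", datetime(2016, 3, 1, 8, 30, 0))
req = build_delete_index("http://localhost:9200", index)
# If the run fails, urllib.request.urlopen(req) removes only that run's data.
```

An alias that spans the per-run indices lets searches treat them as one index.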


(lizhen) #9

Mark, thanks for your reply, but I don't know how to use the delete-by-query plugin here: the data has no distinguishing feature except the time it was indexed.

