Index Trimming


(orenmazor) #1

hey all,

given a cluster of around 200m large-ish uniform documents, how would you
go about trimming the index based on a certain mapping property?

right now I run a scroll query on my cluster, and for all the matching
documents*, I issue a delete command**. this can be a fairly lengthy
process. are there any better/faster mechanisms? this is on ES 0.19.2, by
the way.

*I check a date property. I used to use TTL to trim these out, but that
stopped working after the 0.19.2 update. we'll be updating to 0.19.9 in the
next few weeks, but this trim would help shrink the index in the short term.
**it's actually a routed bulk delete of 500-1000 documents.
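
For readers following along, here is a rough sketch of the batching step described above: splitting the matched documents into chunks and building a routed bulk-delete body for each chunk. The index name, type name, and the (id, routing) pairs are hypothetical, and the `_routing` metadata key is an assumption about the bulk format of that era (newer releases spell it `routing`), so check it against the bulk API reference for your version.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_delete_body(pairs, index, doc_type):
    """Build a newline-delimited bulk body with one delete action per document.

    `pairs` is a list of (doc_id, routing) tuples collected from the scroll.
    The `_routing` key is an assumption; verify it for your ES version.
    """
    lines = []
    for doc_id, routing in pairs:
        action = {
            "delete": {
                "_index": index,
                "_type": doc_type,
                "_id": doc_id,
                "_routing": routing,
            }
        }
        lines.append(json.dumps(action))
    # The bulk format expects every action line, including the last, to end in a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical ids and routing values standing in for scroll results.
    matched = [(str(i), "user-%d" % (i % 3)) for i in range(1200)]
    for batch in chunked(matched, 500):
        body = bulk_delete_body(batch, "docs", "doc")
        print(len(batch), "actions,", len(body), "bytes")
```

Each body would then be POSTed to the `_bulk` endpoint. The cost of this approach is one scroll round trip plus one bulk request per batch, which is why it gets slow at 200m documents.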

--


(Clinton Gormley) #2

Hi Oren

Have you tried delete-by-query ?

http://www.elasticsearch.org/guide/reference/api/delete-by-query.html

clint
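
A minimal sketch of what such a delete-by-query request could look like for the date-based trim in the question. It only builds the request (method, URL, JSON body) and does not send it. The host, index name, field name, and cutoff are hypothetical. The `DELETE /{index}/_query` endpoint shape and the question of whether the body must be wrapped in a top-level `"query"` key both depend on the Elasticsearch version, so treat them as assumptions and check the reference page linked above.

```python
import json
from urllib.parse import quote


def delete_by_query_request(base_url, index, date_field, cutoff, wrap_query=False):
    """Return (method, url, body) for deleting documents older than `cutoff`.

    wrap_query=False sends the bare query as the body; wrap_query=True nests
    it under a top-level "query" key. Which form is required is
    version-dependent and is an assumption here.
    """
    query = {"range": {date_field: {"lt": cutoff}}}
    body = {"query": query} if wrap_query else query
    url = "%s/%s/_query" % (base_url.rstrip("/"), quote(index, safe=""))
    return "DELETE", url, json.dumps(body)


if __name__ == "__main__":
    method, url, body = delete_by_query_request(
        "http://localhost:9200",  # hypothetical host
        "docs",                   # hypothetical index name
        "created_at",             # hypothetical date field
        "2012-06-01",             # hypothetical cutoff date
    )
    print(method, url)
    print(body)
```

Compared with scroll plus bulk delete, this pushes the matching and deleting into the cluster in a single request, so no document ids travel back to the client.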

--


(orenmazor) #3

awesome. that's exactly what I needed.

On Monday, September 24, 2012 11:51:29 AM UTC-4, Clinton Gormley wrote:

Have you tried delete-by-query ?

--

