Delete By Query in ES 5.0.1 Slower than Expected

This is my first time testing delete-by-query, and I am not sure what to expect in terms of performance. Can anyone comment on whether this seems right, or whether it can be optimized to perform better?

Single index: logstash_asa_2016.12.31
Total documents: 179,345,189
Documents tagged with "vpn": 5,303,648

The delete_by_query command was issued at around 10:45 or 11:19 CST and is still running as of 13:41 CST.

Search rate on the index is between 20 and 40 docs/second.

curl -XPOST 'localhost:9200/logstash_asa_2016.12.31/_delete_by_query?pretty=true' -d '
 {
     "query" : {
         "bool" : {
             "must_not" :{
                 "term" : { "tags": "vpn" }
             }
         }
     }
 }'

It is worth looking at the _tasks API to see if you can find the task. That should tell you at least something about what it is up to.
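As a minimal sketch of that check, the task list can be filtered down to delete-by-query actions. This assumes the cluster is on localhost:9200, as in the original command; `check_delete_tasks` is a hypothetical helper name used only for illustration.

```shell
# Assumption: Elasticsearch is reachable on localhost:9200, as in the original command.
ES_HOST='localhost:9200'

# Restrict the task list to delete-by-query actions and ask for detailed status.
TASKS_URL="${ES_HOST}/_tasks?detailed=true&actions=*/delete/byquery&pretty=true"

# Hypothetical helper: wraps the curl call so it can be repeated easily.
check_delete_tasks() {
    curl -s -XGET "$TASKS_URL"
}
```

Running `check_delete_tasks` a few minutes apart and comparing the `deleted` count against `total` in each task's status should give a rough idea of the deletion rate and how much work remains.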

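The must_not query matches every document that is not tagged "vpn", which is roughly 174 million of the 179 million documents. In general it tends to be faster to build a new index than to delete 97% of an existing one.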
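One way to build the new index is the _reindex API, copying only the "vpn" documents and then dropping the original index. The following is a sketch under assumptions, not a tested procedure: it assumes the goal is to keep only the "vpn" documents, the destination name `logstash_asa_vpn_2016.12.31` is hypothetical, and the two helper function names are for illustration only.

```shell
# Assumptions: same host and source index as the original command;
# DEST_INDEX is a hypothetical name chosen for this example.
ES_HOST='localhost:9200'
SOURCE_INDEX='logstash_asa_2016.12.31'
DEST_INDEX='logstash_asa_vpn_2016.12.31'

# Copy only the documents tagged "vpn" (the ~3% worth keeping).
REINDEX_BODY=$(cat <<EOF
{
  "source": {
    "index": "${SOURCE_INDEX}",
    "query": { "term": { "tags": "vpn" } }
  },
  "dest": { "index": "${DEST_INDEX}" }
}
EOF
)

# Hypothetical helper: start the copy into the new index.
# The Content-Type header is optional on 5.x but required from 6.0 onward.
reindex_vpn_docs() {
    curl -s -XPOST "${ES_HOST}/_reindex?pretty=true" \
         -H 'Content-Type: application/json' -d "$REINDEX_BODY"
}

# Hypothetical helper: remove the original index.
# Only run this after verifying the document count in DEST_INDEX.
drop_source_index() {
    curl -s -XDELETE "${ES_HOST}/${SOURCE_INDEX}?pretty=true"
}
```

Deleting a whole index removes its files from disk directly, whereas delete-by-query only marks documents as deleted until segments are merged, so this route should also free disk space sooner.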

Thanks for the feedback and direction. I agree: with the tagged data so skewed, it would be much quicker to split it into a separate index. That also makes it easier to schedule Curator to delete the other index pattern and reclaim disk space.
