Bulk Delete operation on Elastic Search


(Srinath Mkce) #1

I am using Elasticsearch 2.2.

Here is the document count:

curl 'xxxxxxxxx:9200/_cat/indices?v'

yellow open   app                 5   1   28019178         5073     11.4gb         11.4gb

In the "app" index we have two document types:

 1. "log"
 2. "syslog"

Now I want to delete all the documents of type "syslog".

So I tried the following command:

 curl -XDELETE "http://xxxxxx:9200/app/syslog"

But I get the following error:

No handler found for uri [/app/syslog]

I have installed the delete-by-query plugin as well. Is there any way I can do a bulk delete operation?

For now, I am deleting records one at a time by ID:

curl -XDELETE "http://xxxxxx:9200/app/syslog/A121312"

It took around 5 minutes to delete 10000 records, and I have more than 1000000 docs that need to be deleted. Please help.


(Zachary Tong) #2

If you want to use the Delete-By-Query plugin, you can just formulate a query which uses the type query to delete all the docs of a certain type:

DELETE /app/_query
{
  "query": { 
    "type": {
      "value": "syslog"
    }
  }
}

Or you could do this, which is equivalent:

DELETE /app/syslog/_query
{
  "query": { 
    "match_all": {}
  }
}

Note: I haven't tested this. The syntax should be correct, but I'd double-check on a QA cluster or index first, just to make sure you don't delete everything :)

If you don't end up using the DBQ plugin, you can formulate bulk requests which list large batches of deletes. That'll be considerably faster than issuing individual deletes.
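
To illustrate the bulk approach, here is a minimal, untested sketch in Python that builds the newline-delimited body the `_bulk` endpoint expects, with one `delete` action line per document. The helper names `build_bulk_delete_body` and `chunked`, the batch size, and the second ID are made up for illustration; only the index, type, and the ID `A121312` come from this thread.

```python
import json


def build_bulk_delete_body(index, doc_type, ids):
    """Return a _bulk request body with one delete action per document ID.

    Each action is a single JSON line, and the body must end with a newline.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": doc_id}})
        for doc_id in ids
    ]
    return "".join(line + "\n" for line in lines)


def chunked(ids, size):
    """Yield successive batches of at most `size` IDs (hypothetical helper)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    # "A121313" is an invented ID, used only to show a two-line body.
    doc_ids = ["A121312", "A121313"]
    for batch in chunked(doc_ids, 1000):  # the batch size is an assumption; tune it
        print(build_bulk_delete_body("app", "syslog", batch), end="")
```

Each batch would then be sent as the body of a POST to `/_bulk`, for example by saving it to a file and using curl's `--data-binary` option so the newlines are preserved. You still need the document IDs up front, e.g. from a scan/scroll search over the "syslog" type.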

Lastly, if you separate your data into multiple indices (one index for "syslog", another for "log"), you can just delete the entire index. This will be the fastest option, since deleting an entire index is very quick: it is just a file system delete, whereas deleting documents requires a complicated set of tombstoning and merging out dead docs later.


(Srinath Mkce) #3

Thanks for replying. However, when I run the first command that you gave, I get this result:

No handler found for uri [/app/_query]

And when I run the second one, I get this error:

{"found":false,"_index":"app","_type":"syslog","_id":"_query","_version":1,"_shards":{"total":2,"successful":1,"failed":0}}

Neither is working. Can you please tell me how to do that bulk request?

