Hello,
I am trying to remove a large amount of data from Elasticsearch using _delete_by_query. I've tried many options to get this to complete, but typically only several hundred records (out of millions) actually get deleted. The most common error response is:
{
  "took": 3307,
  "timed_out": false,
  "total": 140739907,
  "deleted": 102,
  "batches": 1,
  "version_conflicts": 77,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [
    {
      "index": "logstash-2017.02.28",
      "type": "fluentd",
      "id": "AVqCEZwbJhYoxRiSZL1-",
      "cause": {
        "type": "es_rejected_execution_exception",
        "reason": "rejected execution of org.elasticsearch.transport.TransportService$7@1ac5f094 on EsThreadPoolExecutor[bulk, queue capacity = 50, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@795ac56a[Running, pool size = 2, active threads = 2, queued tasks = 50, completed tasks = 464576]]"
      },
      "status": 429
    }
  ]
}
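The 429 seems to point at the bulk thread pool queue (capacity 50) filling up. To check whether that queue really is the bottleneck, this is the kind of request I've been running against the stock _cat/thread_pool endpoint (the column list is just what I find useful):

GET _cat/thread_pool/bulk?v&h=node_name,name,active,queue,queue_size,rejected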
Here is an example query; I've tried many different settings on it to no avail. Does anyone know how to get this to just slowly crawl through and remove all matching records instead of erroring out?
POST logstash-*/_delete_by_query?conflicts=proceed
{
  "query": {
    "query_string": {
      "default_field": "log",
      "analyze_wildcard": true,
      "query": "DEV-*"
    }
  }
}
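For reference, the throttled variants I've tried look something like the sketch below. scroll_size, requests_per_second, and wait_for_completion are standard _delete_by_query URL parameters; the specific values here are just examples of what I've been adjusting:

POST logstash-*/_delete_by_query?conflicts=proceed&wait_for_completion=false&scroll_size=500&requests_per_second=100
{
  "query": {
    "query_string": {
      "default_field": "log",
      "analyze_wildcard": true,
      "query": "DEV-*"
    }
  }
}

With wait_for_completion=false this returns a task ID, which I can poll with GET _tasks/<task_id>, but the task still ends up accumulating the same 429 bulk rejections.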