Delete by query took more time than expected

kish · July 25, 2018, 3:02pm

Hello -

I made a query to delete an entry from an index at 13:49:48, but ES took almost 40-50 sec to complete the delete. In the /elasticsearch/log/*.log files I am unable to find any details about it.

From the available debug log I can see that the time taken is 49511 ms, which is very high.
Delete query response: &{49511 <nil> false 1 0 0 1 1 0 0 {0 0} 0 -1 0 []}

So could you kindly explain the reason?

Thanks
-Kishan

dadoonet · July 25, 2018, 5:06pm

You tried to delete the complete content of an index? Is that right?

If so, just drop the index. That's much, much more efficient.

DELETE index_name

kish · July 25, 2018, 5:19pm

No, I just tried to delete an entry in the index.

dadoonet · July 25, 2018, 9:53pm

Don't use delete by query for this.

kish · July 26, 2018, 5:42am

Why should we not use delete by query? It always succeeds, and this is the 2nd time we have faced this scenario.

The code sample follows...

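	termQuery := elastic.NewTermQuery("Uuid", Uuid)
	if registerExistsByQuery(termQuery) {
		// Delete every document whose Uuid field matches the term query.
		put1, err := elasticSearchClient.DeleteByQuery().
			Index(RegisterIndex).
			Type(RegisterDocType).
			Query(termQuery).
			Do(elasticSearchContext)
		// Check the error; an unused err does not compile in Go.
		if err != nil {
			log.Println("Delete query failed: ", err)
		}
		log.Println("Delete query response: ", put1)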

Kindly let us know why it took so much time to delete.

Christian_Dahlqvist · July 26, 2018, 6:25am

When you use delete-by-query you delete the documents one-by-one. A delete is basically equal to an update, which means that it is I/O intensive and can be slow. If you are deleting all data in an index, it is much more efficient to delete the full index and then recreate it.

dadoonet · July 26, 2018, 7:11am

Adding to this: it's also probably more efficient to use the reindex API to reindex 20% of the data than to remove 80% with delete by query.

kish · July 26, 2018, 7:38am

There is some misunderstanding.
I am not deleting a huge amount of data, such as 80%. In the present setup the index has some 10 docs, out of which I am deleting just one doc. So it took that much time for a single doc.
Technically, the flow and the issue look like this:

-> del x doc
after ~10sec
-> add x doc
executed in fractions of seconds
<- ack for 'add x doc' :  doc x added
after some time
<- ack for 'del x doc':    doc x deleted

Because the delete operation completed late, there is a discrepancy in the final data.

Thanks
-Kishan

dadoonet · July 26, 2018, 8:41am

I tried a simple test:

DELETE test 
PUT test/_doc/1
{ "id": 1 }
PUT test/_doc/2
{ "id": 2 }
PUT test/_doc/3
{ "id": 3 }
PUT test/_doc/4
{ "id": 4 }
PUT test/_doc/5
{ "id": 5 }
PUT test/_doc/6
{ "id": 6 }
PUT test/_doc/7
{ "id": 7 }
PUT test/_doc/8
{ "id": 8 }
PUT test/_doc/9
{ "id": 9 }
PUT test/_doc/10
{ "id": 10 }
POST test/_refresh
POST test/_delete_by_query
{
  "query": { 
    "match": {
      "id": 10
    }
  }
}

The last call takes 32ms on my side:

{
  "took": 32,
  "timed_out": false,
  "total": 1,
  "deleted": 1,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}

Not sure what you are talking about then.

kish · July 26, 2018, 8:58am

As I mentioned, along with the ES dump: this is the response to the delete query I issued. I guess we can compare it with the delete response format.

Delete query response: &{49511 <nil> false 1 0 0 1 1 0 0 {0 0} 0 -1 0 []}

The ES dump itself says that it took 49511. I just want to know what the possible reason could be. And yes, the remaining operations took very little time.
As of now the logs under /elasticsearch/logs/* have very little info. Is there any way to increase the debug log level?
Is there any queuing? Say, for one particular index, if I send 3 operations one after the other, like add, add, delete, will ES execute them all in sequence, or how?
If in sequence, then if the 1st operation fails, will the remaining ones be discarded or still processed? In which log can we see this?

Thanks
-Kishan

dadoonet · July 26, 2018, 9:41am

Can you start Kibana, open the developer Console, copy and paste the full script I shared, execute it all, and then share the last response as I did?

kish · July 26, 2018, 1:22pm

Sorry, we don't use Kibana here. Is there any other way to get it?
If you are looking for the last response, I have shared it already.

dadoonet · July 26, 2018, 1:31pm

Then do:

curl -XDELETE "http://127.0.0.1:9200/test"
curl -XPUT "http://127.0.0.1:9200/test/_doc/1" -H 'Content-Type: application/json' -d'
{ "id": 1 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/2" -H 'Content-Type: application/json' -d'
{ "id": 2 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/3" -H 'Content-Type: application/json' -d'
{ "id": 3 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/4" -H 'Content-Type: application/json' -d'
{ "id": 4 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/5" -H 'Content-Type: application/json' -d'
{ "id": 5 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/6" -H 'Content-Type: application/json' -d'
{ "id": 6 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/7" -H 'Content-Type: application/json' -d'
{ "id": 7 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/8" -H 'Content-Type: application/json' -d'
{ "id": 8 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/9" -H 'Content-Type: application/json' -d'
{ "id": 9 }'
curl -XPUT "http://127.0.0.1:9200/test/_doc/10" -H 'Content-Type: application/json' -d'
{ "id": 10 }'
curl -XPOST "http://127.0.0.1:9200/test/_refresh"
curl -XPOST "http://127.0.0.1:9200/test/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": { 
    "match": {
      "id": 10
    }
  }
}'
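
If you would rather issue the last call from Go, here is a minimal sketch of the same request using only the standard library. `newDeleteByQueryRequest` and `timedDo` are hypothetical helper names; the address `http://127.0.0.1:9200`, the index `test` and the match on `id` are taken from the curl script above. Measuring the wall-clock time on the client lets you compare it with the `took` value in the response.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// newDeleteByQueryRequest builds POST <baseURL>/<index>/_delete_by_query with
// a match query on the "id" field, like the last curl call above.
func newDeleteByQueryRequest(baseURL, index string, id int) (*http.Request, error) {
	body, err := json.Marshal(map[string]interface{}{
		"query": map[string]interface{}{
			"match": map[string]interface{}{"id": id},
		},
	})
	if err != nil {
		return nil, err
	}
	url := baseURL + "/" + index + "/_delete_by_query"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// timedDo sends the request and returns the response body together with the
// wall-clock time measured on the client.
func timedDo(client *http.Client, req *http.Request) ([]byte, time.Duration, error) {
	start := time.Now()
	resp, err := client.Do(req)
	if err != nil {
		return nil, time.Since(start), err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	return data, time.Since(start), err
}

func main() {
	req, err := newDeleteByQueryRequest("http://127.0.0.1:9200", "test", 10)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	// The generous timeout is an assumption so that a slow delete still returns.
	data, elapsed, err := timedDo(&http.Client{Timeout: 2 * time.Minute}, req)
	if err != nil {
		fmt.Println("request failed after", elapsed, ":", err)
		return
	}
	fmt.Println("client-side time:", elapsed)
	fmt.Println(string(data))
}
```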

system · August 23, 2018, 1:31pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
