Spike in response time during deleteByQuery

stephenb · January 31, 2022, 4:48pm

I think perhaps you're approaching this a bit backwards. You're telling it to delete everything that is not one of the good entries, which means a huge query that is then negated.

The usual approach is to determine the IDs that need to be deleted and use those as the query for the delete by query.
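
To make that concrete, here is a rough sketch of building small delete by query request bodies from a list of IDs, using Elasticsearch's `ids` query. The function name `buildDeleteByIdsBodies` and the chunk size are made up for illustration, and it assumes the document `_id` in Elasticsearch matches the key you delete in the source database.

```typescript
// Body of a _delete_by_query request that matches documents by _id.
interface DeleteByIdsBody {
  query: { ids: { values: string[] } };
}

// Hypothetical helper: split the IDs into chunks so that each
// delete by query request stays small.
function buildDeleteByIdsBodies(ids: string[], chunkSize = 1000): DeleteByIdsBody[] {
  if (chunkSize < 1) {
    throw new Error("chunkSize must be at least 1");
  }
  const bodies: DeleteByIdsBody[] = [];
  for (let i = 0; i < ids.length; i += chunkSize) {
    bodies.push({ query: { ids: { values: ids.slice(i, i + chunkSize) } } });
  }
  return bodies;
}
```

Each body would be sent as its own delete by query request, so the query only ever names the documents to remove rather than listing every document to keep.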

Perhaps I'm not understanding.

bradptp · January 31, 2022, 4:54pm

I think you're right, but the issue is that we're deleting the IDs from MSSQL first, then using deleteByQuery to sync those deletes to Elasticsearch.

To do what you propose we'd have to somehow record what got deleted and then feed that to delete by query.

Perhaps we delete them from MSSQL first, then write the deleted IDs to a separate MSSQL table. The NodeJS application reads that table and feeds the IDs to deleteByQuery, which would be a much smaller query.
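
A minimal sketch of that flow, just to show the shape of it. Everything named here (`TombstoneStore`, `SearchIndex`, `syncDeletes`, the batch size) is hypothetical: in a real application the store would be backed by the separate MSSQL table and the index by an Elasticsearch client call. The IDs are only acknowledged after the Elasticsearch delete succeeds, on the assumption that re-running a delete for an already removed document is harmless.

```typescript
// Hypothetical view of the MSSQL table that records deleted IDs.
interface TombstoneStore {
  // Return up to `limit` IDs that have not been synced yet.
  readBatch(limit: number): Promise<string[]>;
  // Mark the given IDs as synced so they are not returned again.
  acknowledge(ids: string[]): Promise<void>;
}

// Hypothetical wrapper around the delete by query call.
interface SearchIndex {
  // Delete the documents with these IDs; resolve to the number removed.
  deleteByIds(ids: string[]): Promise<number>;
}

// Drain the tombstone table in small batches and return the total
// number of documents removed from the index.
async function syncDeletes(
  store: TombstoneStore,
  index: SearchIndex,
  batchSize = 500
): Promise<number> {
  let total = 0;
  for (;;) {
    const ids = await store.readBatch(batchSize);
    if (ids.length === 0) {
      return total;
    }
    total += await index.deleteByIds(ids);
    // Acknowledge only after the index delete succeeded, so a failure
    // leaves the IDs in place for the next run.
    await store.acknowledge(ids);
  }
}
```

Note that the loop relies on `acknowledge` actually removing or flagging the rows; otherwise `readBatch` would keep returning the same IDs.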

stephenb · January 31, 2022, 4:59pm

Yes, you're entering the world of change data capture...

Tracking and propagating changes definitely requires some thought about the design. Sometimes a little extra work at the capture point makes propagating changes easier, whether the target is Elasticsearch or any other data store.

bradptp · February 1, 2022, 12:05am

Thanks for all the help. I think our dev team needs to rethink how we handle deletes, but this has been very helpful nonetheless.

system · March 1, 2022, 12:06am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
