Spike in response time during deleteByQuery

I think perhaps you're approaching this a bit backwards. You're telling it to delete everything that is not one of the good entries.

That means building a huge query of everything you want to keep and then negating it.

The usual approach to this is to determine the IDs that need to be deleted and use those in the delete-by-query.
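For example (a minimal sketch only, assuming the v8 `@elastic/elasticsearch` client and a made-up index name `my-index`), deleting by known IDs can use an `ids` query instead of negating a query over everything you want to keep:

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Delete only the documents whose IDs were removed upstream, rather than
// negating a huge query over every ID that should be kept.
async function deleteByIds(deletedIds: string[]) {
  return client.deleteByQuery({
    index: 'my-index', // assumed index name
    query: { ids: { values: deletedIds } },
  });
}
```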

Perhaps I'm not understanding.

I think you're right, but the issue is that we're deleting the rows from MSSQL first, then using deleteByQuery to sync those deletes to Elasticsearch.

To do what you propose, we'd have to somehow record what got deleted and then feed that to deleteByQuery.

Perhaps we delete them from MSSQL first, write the deleted IDs to a separate MSSQL table, and have the NodeJS application read that table and feed the IDs to deleteByQuery, which would be a much smaller query.
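Something along these lines might work (a rough sketch only; the `dbo.DeletedDocs` table, its `Id` column, the index name, and the connection details are all assumptions; it uses the `mssql` package and the v8 Elasticsearch client):

```ts
import sql from 'mssql';
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

async function syncDeletes() {
  // Assumed: MSSQL deletes are recorded in a small dbo.DeletedDocs (Id) table.
  const pool = await sql.connect({
    server: 'localhost',
    database: 'mydb',
    user: 'app',
    password: 'secret',
    options: { trustServerCertificate: true },
  });

  const { recordset } = await pool
    .request()
    .query('SELECT Id FROM dbo.DeletedDocs');
  const ids = recordset.map((row) => String(row.Id));

  // Delete in modest batches so each delete-by-query stays small.
  const batchSize = 1000;
  for (let i = 0; i < ids.length; i += batchSize) {
    const batch = ids.slice(i, i + batchSize);
    await es.deleteByQuery({
      index: 'my-index', // assumed index name
      query: { ids: { values: batch } },
    });
  }

  // Clear the log once the deletes have been propagated. In practice you'd
  // delete only the rows you actually read, to avoid losing IDs logged
  // while this sync was running.
  await pool.request().query('TRUNCATE TABLE dbo.DeletedDocs');
  await pool.close();
}
```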


Yes, you're entering the world of change data capture...

Tracking and propagating changes definitely needs some thought about the design for those kinds of things... Sometimes a little extra work at the capture point makes propagating changes easier, whether the target is Elasticsearch or any other data store.
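For instance (again a hedged sketch with assumed table and column names), the capture-point work can be as small as an `OUTPUT ... INTO` clause on the existing MSSQL delete, so the removed IDs land in the log table in the same statement:

```ts
import sql from 'mssql';

// Assumed schema: dbo.Docs (Id, UpdatedAt, ...) and a log table dbo.DeletedDocs (Id).
// The WHERE clause is just a placeholder for whatever delete criteria the app uses.
async function deleteDocs(pool: sql.ConnectionPool, olderThan: Date) {
  await pool
    .request()
    .input('cutoff', sql.DateTime2, olderThan)
    .query(`
      DELETE FROM dbo.Docs
      OUTPUT deleted.Id INTO dbo.DeletedDocs (Id)
      WHERE UpdatedAt < @cutoff
    `);
}
```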

Thanks for all the help. I think our dev team needs to rethink how we handle deletes, but this has been very helpful nonetheless.
