I think you're right but the issue is we're deleting the IDs from MSSQL first, then using deleteByQuery to sync those deletes to Elasticsearch.
To do what you propose we'd have to somehow record what got deleted and then feed that to delete by query.
Perhaps we delete them from MSSQL first, then write the deleted IDs to a separate MSSQL table. The NodeJS application would read that table and feed the IDs to deleteByQuery, which would be a much smaller query.
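Roughly what I have in mind on the NodeJS side, as a minimal sketch only. It assumes the Elasticsearch document `_id` is the same value as the MSSQL ID, and that the IDs have already been read out of the tracking table into an array. The names `DeleteRequest`, `chunk`, `buildDeleteRequests`, and the `products` index are all made up for illustration; the `ids` query is the only Elasticsearch-specific part.

```typescript
// Hypothetical shape of one delete-by-query request: an index name plus
// an `ids` query listing the document IDs to remove.
interface DeleteRequest {
  index: string;
  query: { ids: { values: string[] } };
}

// Split a list into fixed-size batches so no single query carries too many IDs.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Turn the IDs recorded in the (assumed) MSSQL tracking table into one
// request per batch. Duplicate IDs are dropped first.
function buildDeleteRequests(
  index: string,
  deletedIds: number[],
  batchSize = 1000 // arbitrary default, tune to taste
): DeleteRequest[] {
  const unique = Array.from(new Set(deletedIds));
  return chunk(unique, batchSize).map((batch) => ({
    index,
    // Assumption: Elasticsearch _id equals the MSSQL ID rendered as a string.
    query: { ids: { values: batch.map(String) } },
  }));
}

// Example: six recorded deletes (one duplicate) in batches of two.
const requests = buildDeleteRequests("products", [1, 2, 2, 3, 4, 5], 2);
console.log(JSON.stringify(requests, null, 2));
```

Each request object would then be handed to the client's deleteByQuery call, and the matching rows cleared from the tracking table once the call succeeds. The exact parameter shape depends on the client version, so treat the object layout here as a starting point.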
Yes, you're entering the world of change data capture...
Tracking and propagating changes definitely requires some thought about the design... Sometimes a little extra work at the capture point makes propagating changes easier, whether the target is Elasticsearch or any other data store.