Delta load in Elasticsearch

Hi
I have two Elasticsearch indexes, say Index_A and Index_B. Index_A gets daily updates and inserts.
I need to do a delta load into Index_B on a weekly basis, based on the data in Index_A.
For Docy_Type = A, the Modified Date should be considered, and for Docy_Type = B, the Created Date.

The current state of my indexes:

image

image

So, after the delta refresh, the following should hold:

  1. DocID 1: no update should happen in Index_B, as there is no change in Modified Date.
  2. DocID 2: should be deleted from Index_B and the new record from Index_A inserted, as there is a change in Modified Date.
  3. DocID 3: should be deleted from Index_B and the new record from Index_A inserted, as there is a change in Created Date.
  4. DocID 4: no update should happen in Index_B, as there is no change in Created Date.
  5. DocID 5: no update should happen in Index_B, as there is no change in Created Date.
  6. DocID 6: should be inserted into Index_B, as it is new.
  7. DocID 7: should be deleted from Index_B, as it is obsolete.
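
To make the rules above concrete, here is a minimal Python sketch of the comparison logic only (no Elasticsearch calls). The field names `Docy_Type`, `Modified_Date` and `Created_Date`, and all the sample dates, are assumptions for illustration; the real values are only in the screenshots. It treats "delete and re-insert" as a single re-index, since indexing a document under an existing `_id` replaces it.

```python
from typing import Dict, List, Tuple

# Assumed field names: which date decides "changed" for each Docy_Type.
DATE_FIELD_BY_TYPE = {"A": "Modified_Date", "B": "Created_Date"}


def plan_delta(index_a: Dict[str, dict],
               index_b: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Return (ids to re-index from Index_A, ids to delete from Index_B)."""
    to_index = []
    for doc_id, doc_a in index_a.items():
        doc_b = index_b.get(doc_id)
        # The relevant date field is chosen from the Index_A document's type.
        field = DATE_FIELD_BY_TYPE[doc_a["Docy_Type"]]
        if doc_b is None or doc_b.get(field) != doc_a.get(field):
            to_index.append(doc_id)  # new, or the relevant date changed
    # Anything left in Index_B that no longer exists in Index_A is obsolete.
    to_delete = [doc_id for doc_id in index_b if doc_id not in index_a]
    return sorted(to_index), sorted(to_delete)


# Made-up sample data mirroring the seven cases in the list.
sample_a = {
    "1": {"Docy_Type": "A", "Modified_Date": "2023-01-01"},
    "2": {"Docy_Type": "A", "Modified_Date": "2023-02-10"},
    "3": {"Docy_Type": "B", "Created_Date": "2023-02-11"},
    "4": {"Docy_Type": "B", "Created_Date": "2023-01-04"},
    "5": {"Docy_Type": "B", "Created_Date": "2023-01-05"},
    "6": {"Docy_Type": "A", "Modified_Date": "2023-02-12"},
}
sample_b = {
    "1": {"Docy_Type": "A", "Modified_Date": "2023-01-01"},
    "2": {"Docy_Type": "A", "Modified_Date": "2023-01-02"},
    "3": {"Docy_Type": "B", "Created_Date": "2023-01-03"},
    "4": {"Docy_Type": "B", "Created_Date": "2023-01-04"},
    "5": {"Docy_Type": "B", "Created_Date": "2023-01-05"},
    "7": {"Docy_Type": "A", "Modified_Date": "2023-01-07"},
}

ids_to_index, ids_to_delete = plan_delta(sample_a, sample_b)
```

With this sample data, documents 2, 3 and 6 end up in `ids_to_index` and document 7 in `ids_to_delete`; documents 1, 4 and 5 are left alone.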

So, Index_B after the delta refresh should look like this:

image

Right now, my approach is to use Python scripting to achieve this.
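
As a rough sketch of what that script could do once it knows which documents changed: the generator below builds action dictionaries in the shape accepted by the `elasticsearch.helpers.bulk` helper of the Python client. The function name `delta_actions`, the sample document and the index name are hypothetical, and the block deliberately stops short of connecting to a cluster.

```python
from typing import Dict, Iterable, Iterator


def delta_actions(target_index: str,
                  docs_to_index: Dict[str, dict],
                  ids_to_delete: Iterable[str]) -> Iterator[dict]:
    """Yield bulk actions: re-index changed/new docs, delete obsolete ones."""
    for doc_id, source in docs_to_index.items():
        # "index" replaces any existing document with the same _id,
        # so no separate delete is needed for changed documents.
        yield {
            "_op_type": "index",
            "_index": target_index,
            "_id": doc_id,
            "_source": source,
        }
    for doc_id in ids_to_delete:
        yield {"_op_type": "delete", "_index": target_index, "_id": doc_id}


# Hypothetical input; real index names must be lowercase, hence "index_b".
actions = list(delta_actions(
    "index_b",
    {"6": {"Docy_Type": "A", "Modified_Date": "2023-02-12"}},
    ["7"],
))

# In a real run these could be sent with, for example:
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch("http://localhost:9200"), actions)
```

Sending everything through one bulk request keeps the weekly refresh to a small number of round trips instead of one request per document.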

Can anyone from the Elasticsearch team shed some light on how such a delta refresh can be achieved in an optimal way? Is there any ES tool available for this?
