Efficient and fault-tolerant way to "merge" two indexes

gabriel.goncalves · May 24, 2020, 9:33pm

Hi there. I'd like an efficient and fault-tolerant (connection problems, failed operations, etc.) way to do the following:

I need to "merge" two indexes. Let's call them A and B, which are merged into C.

For every entry in A, there are one or more entries in B associated with it. We could think of A as flights and B as their passengers, for instance. Let's describe an example of such a group by (A', [B'1, B'2, ..., B'n]).

Each entry in C will consist of:
-Full data of an entry in A
-Full data of an entry in B
-A few more computed fields (let's call them Cf)
Each entry in A will appear multiple times in C, once for each entry in B associated with it.
For the group (A', [B'1, B'2, ..., B'n]), the entries in C would be:
(A',B'1,Cf1), (A',B'2,Cf2), ..., (A',B'n,Cfn)
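
To make the target shape concrete, here is a minimal Python sketch of how one group could be expanded into entries for C. The field names, the `compute_fields` helper, the layout of each C entry (`a`, `b`, `computed`), and the `_id` scheme are all hypothetical; none of them come from the real mappings.

```python
from typing import Any, Callable, Dict, Iterator, List

Doc = Dict[str, Any]


def build_c_entries(
    a_id: str,
    a_source: Doc,
    b_hits: List[Doc],
    compute_fields: Callable[[Doc, Doc], Doc],
) -> Iterator[Doc]:
    """Yield one C entry per B entry associated with the given A entry."""
    for b in b_hits:
        yield {
            # Hypothetical deterministic ID: the same (A, B) pair always maps
            # to the same document ID, so a re-run cannot create a duplicate.
            "_id": f"{a_id}:{b['_id']}",
            "_source": {
                "a": a_source,            # full data of the entry in A
                "b": b["_source"],        # full data of the entry in B
                "computed": compute_fields(a_source, b["_source"]),  # Cf
            },
        }


# Usage with made-up flight/passenger data.
flight = {"number": "XY123"}
passengers = [
    {"_id": "p1", "_source": {"name": "Ann"}},
    {"_id": "p2", "_source": {"name": "Bob"}},
]
entries = list(
    build_c_entries(
        "f1",
        flight,
        passengers,
        lambda a, b: {"label": f"{a['number']}/{b['name']}"},
    )
)
```

A deterministic `_id` is one way to make retries safe; whether it suits the real data depends on whether (A, B) pairs are unique.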

Things I'm considering:
-The best way to scroll and search through A and B to produce the entries for C.
-Whether it would be useful in this case to create auxiliary fields in A and/or B to record which items were already processed, which ones are being processed, etc., and so speed up the scrolling.
-Whether it's worth sorting by _doc when scrolling for the speed, or whether the fact that I won't be able to use "search_after" with '_doc' makes it not worth it.
-How to organize the process so that the work can be divided among threads and so that failed create/index operations made within bulks are dealt with automatically.
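
One possible shape for these points, as a hedged Python sketch rather than a tested solution. It assumes Elasticsearch's sliced scroll (a `slice` object with `id` and `max` in the search body) is used to give each worker thread its own partition of an index, and that the Bulk API response lists per-item results in `items`, in request order, each with a `status`. `send_bulk` is a hypothetical callable standing in for whatever client call actually sends the bulk request, and it is assumed to raise Python's built-in `ConnectionError` on connection problems; the retry limits are arbitrary.

```python
import time
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def sliced_scroll_bodies(num_slices: int, page_size: int = 500) -> List[Doc]:
    """Build one search body per worker; each slice is a disjoint part of the index."""
    if num_slices < 2:
        raise ValueError("sliced scroll needs at least 2 slices")
    return [
        {
            "slice": {"id": i, "max": num_slices},
            "sort": ["_doc"],  # index order, no scoring or field sorting
            "size": page_size,
            "query": {"match_all": {}},
        }
        for i in range(num_slices)
    ]


def bulk_with_retry(
    actions: List[Doc],
    send_bulk: Callable[[List[Doc]], Doc],  # hypothetical sender
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
) -> List[Doc]:
    """Send actions, resending only the items that failed; return those never accepted."""
    pending = list(actions)
    rejected: List[Doc] = []
    for attempt in range(max_attempts):
        if not pending:
            break
        if attempt:
            # Exponential backoff before every attempt after the first.
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
        try:
            response = send_bulk(pending)
        except ConnectionError:
            continue  # whole request failed: resend everything still pending
        retry: List[Doc] = []
        for action, item in zip(pending, response["items"]):
            status = next(iter(item.values()))["status"]
            if status == 429 or status >= 500:
                retry.append(action)      # overload or server error: try again
            elif status >= 400:
                rejected.append(action)   # e.g. a mapping error: retrying won't help
        pending = retry
    return rejected + pending
```

Each worker thread would take one body from `sliced_scroll_bodies`, scroll its slice, and pass the resulting bulk actions through `bulk_with_retry`; whatever the function returns could be logged for a later pass instead of being tracked in auxiliary fields on A or B.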

The indexes A and B have around 20 million and 30 million entries, respectively. Each entry is approximately 10 KB in size.
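
For a rough sense of scale, here is a back-of-the-envelope calculation in Python. It assumes 10 KB means 10,000 bytes of source per entry, that every entry in B belongs to exactly one entry in A (so C gets one entry per entry in B), and it ignores the computed fields, replicas, and any index overhead or compression.

```python
KB = 1000
GB = 1000 ** 3

entries_a = 20_000_000
entries_b = 30_000_000
entry_size = 10 * KB  # assumed bytes of source per entry in A or B

size_a_gb = entries_a * entry_size / GB   # raw source held in A
size_b_gb = entries_b * entry_size / GB   # raw source held in B

# Each C entry carries one full A entry plus one full B entry.
entries_c = entries_b
size_c_gb = entries_c * (2 * entry_size) / GB

print(size_a_gb, size_b_gb, size_c_gb)
```

Under these assumptions A and B hold about 200 GB and 300 GB of source, and C would hold about 600 GB.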

system · June 21, 2020, 9:33pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
