Elasticsearch compare two indices


#1

How does one compare two (or more) indices in Elasticsearch, especially on specific fields rather than on entire documents?


(David Pilato) #2

You can't do anything like a join.

You need to solve that at index time or on the client side IMO.


#3

Just to reiterate: among other fields, each index's docs have at least the following fields: city, state, and zip.

Like the following:

index-0: _id, retailName, city, state, zip, etc
index-1: _id, otherName, city, state, zip, etc
index-2: _id, otherName, city, state, zip, etc
index-3: _id, otherName, city, state, zip, etc

My goal is to compare the docs in index-1 through index-3 against index-0 on the fields city, state, and zip, and to find what percentage of each index (1-3) matches index-0.


(David Pilato) #4

You can't compare documents. You can search for documents or aggregate some fields from the result set, but you can't join documents to compare them, which I think is what you asked for.
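
For illustration, the client-side route could look roughly like the sketch below. It is only a sketch under assumptions: the documents have already been pulled out of each index as plain dicts, the field names `city`, `state`, and `zip` are taken from the post above, and the helper names `location_key` and `match_percentage` are made up for this example. Lowercasing and trimming the values is also an assumption about what should count as a match.

```python
from typing import Dict, Iterable, Tuple

FIELDS = ("city", "state", "zip")  # field names taken from the question


def location_key(doc: Dict[str, object]) -> Tuple[str, str, str]:
    """Build a normalized (city, state, zip) key for one document.

    Assumption: values that differ only in case or surrounding
    whitespace should be treated as the same location.
    """
    city, state, zip_code = (str(doc.get(f, "")).strip().lower() for f in FIELDS)
    return (city, state, zip_code)


def match_percentage(
    reference_docs: Iterable[Dict[str, object]],
    candidate_docs: Iterable[Dict[str, object]],
) -> float:
    """Percentage of candidate docs whose key also occurs in the reference docs."""
    reference_keys = {location_key(doc) for doc in reference_docs}
    total = 0
    matched = 0
    for doc in candidate_docs:
        total += 1
        if location_key(doc) in reference_keys:
            matched += 1
    # An empty candidate set is reported as 0% rather than raising.
    return 100.0 * matched / total if total else 0.0
```

Calling `match_percentage(index0_docs, index1_docs)` once per index (1-3) would give the per-index figure. The set of reference keys is held in memory, so this only fits when index-0's distinct (city, state, zip) combinations are reasonably small.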


(Frederik Mortensen) #5

Hi @dadoonet @alexus.

First of all, I am new to the forum and new to Elastic, but I used the search function and found this thread, so I decided not to open yet another one. I am not sure whether my situation is quite the same as alexus's (regarding the definition of "document"), but I am having a similar problem.

I am synchronizing from MySQL to Elastic, originally into Index1. Later I changed to Index2: same MySQL table, same index with the same fields. Now I would like to compare the two indices. I know that many entries should be in both indices, but some are only in Index1, some are only in Index2, and some differ in content between the two. Is there a way to do this in Elastic?

Thanks for your help


(David Pilato) #6

I don't think there is an easy way to do that.


(Frederik Mortensen) #7

OK, thanks for the fast answer @dadoonet. I think this could be a useful addition to Elasticsearch. Maybe you could add it as a feature request? And how would you try to compare the indices now? With scrolling and using an external script like Java? Or simply replace the indices with full takes? It's quite a big index and takes some hours to synchronize.


(David Pilato) #8

I don't think it will be supported unless we support distributed joins, which we don't.

"With scrolling and using an external script like Java?"

Yes, most likely.
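
A rough sketch of what such an external script might do, written in Python here rather than Java. Everything in it is an assumption for illustration: both indices are assumed to use the same `_id` for the same MySQL row, each hit is assumed to have the usual `_id` and `_source` keys, and the names `index_by_id` and `diff_indices` are invented for this example. The hits could be streamed from each index with a scroll, for instance via `elasticsearch.helpers.scan` from the official Python client, but that part is left out so the sketch runs without a cluster.

```python
from typing import Dict, Iterable, List

Hit = Dict[str, object]


def index_by_id(hits: Iterable[Hit]) -> Dict[str, object]:
    """Map each document's _id to its _source.

    Assumption: hits look like scroll results, i.e. dicts with
    "_id" and "_source" keys.
    """
    return {hit["_id"]: hit["_source"] for hit in hits}


def diff_indices(hits_a: Iterable[Hit], hits_b: Iterable[Hit]) -> Dict[str, List[str]]:
    """Classify ids as only in A, only in B, or present in both with different content."""
    docs_a = index_by_id(hits_a)
    docs_b = index_by_id(hits_b)
    shared_ids = docs_a.keys() & docs_b.keys()
    return {
        "only_in_a": sorted(docs_a.keys() - docs_b.keys()),
        "only_in_b": sorted(docs_b.keys() - docs_a.keys()),
        "changed": sorted(i for i in shared_ids if docs_a[i] != docs_b[i]),
    }
```

This keeps both indices in memory at once, which may not be realistic for an index that takes hours to synchronize. A variation would be to store only a hash of each `_source` per id, or to scroll both indices sorted by the same key and compare them in a merge-like pass.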

