How to model a many-to-many scenario in Elasticsearch

(Sandeepvja) #1


I have a scenario where I need to search for a person whose employments are listed within their JSON document. The person JSON looks like this:

{
  "name": "David",
  "workedAt": [
    { "name": "google" },
    { "name": "microsoft", "site": "" }
  ]
}

Now let's say the company information is also available to us as company JSON:

{
  "name": "Google",
  "description": "Google, or Google Inc., is an internet company that focuses primarily on its search engine and internet-based ads. Google also develops and owns the popular Android platform."
}

Now, when I search for a person, my query provides a list of companies they have worked at. Evidently there are hundreds of Davids, but there may be only one David who worked at both Google and Microsoft, and that one needs to come at the top of the search results.

To fine tune the results I want to use the company JSON as well, which contains more information about the company.

The problem here is that I can't:

  1. Model a many-to-many relationship in Elasticsearch
  2. Do an application-side join, because I think we cannot reuse the relevance scores from two different queries.

Since I do not have any past experience with search, I am finding it difficult to find an approach to this.

Any help on this will be deeply appreciated.

(Mark Walkom) #2

Treat the company name as a unique value, or add an ID value for the company to that workedAt array, so that when you look it up you don't have to search.
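A minimal sketch of the ID idea, assuming each company record carries a hypothetical `id` and an `aliases` list (neither field is in the company JSON posted above). The application resolves whatever names the user typed into company IDs first, so the person documents only need to store IDs. In a real setup the lookup would likely be a query against the company index rather than an in-memory list.

```python
# Hypothetical company records; "id" and "aliases" are assumed fields,
# not part of the company JSON shown in the question.
COMPANIES = [
    {"id": "c1", "name": "Google", "aliases": ["Google Inc"]},
    {"id": "c2", "name": "Microsoft", "aliases": []},
    {"id": "c3", "name": "International Business Machines", "aliases": ["IBM"]},
]


def build_alias_index(companies):
    """Map every lower-cased name and alias to the company's ID."""
    index = {}
    for company in companies:
        for label in [company["name"], *company["aliases"]]:
            index[label.strip().lower()] = company["id"]
    return index


def resolve_company_ids(names, alias_index):
    """Return IDs for the known names, in input order, without duplicates."""
    ids = []
    for name in names:
        company_id = alias_index.get(name.strip().lower())
        if company_id is not None and company_id not in ids:
            ids.append(company_id)
    return ids


alias_index = build_alias_index(COMPANIES)
query_ids = resolve_company_ids(["IBM", "Google", "microsoft"], alias_index)

# A person document that stores company IDs instead of free-text names.
person = {
    "name": "David",
    "workedAt": [{"id": "c1"}, {"id": "c2"}, {"id": "c3"}],
}
```

With this shape, two different spellings of the same company resolve to the same ID before the person search runs.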

You will probably need to do an application-side join though. You can't do many-to-many without structuring your data to cover the views you want.
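One way the second half of such an application-side join could look, as a sketch rather than a definitive recipe: once the application has a list of company IDs, it builds a single `bool` query against the person index. The name goes in `must`, and each company becomes an optional `should` clause wrapped in `constant_score`, so every matched company adds the same fixed amount to the score and a person who matches more of the requested companies ranks higher. The field name `workedAt.id`, its keyword mapping, and the helper name `build_person_query` are assumptions for illustration.

```python
def build_person_query(person_name, company_ids):
    """Build a search body: the name must match, each company is optional."""
    should = [
        {
            "constant_score": {
                # Assumes workedAt.id is mapped as a keyword field.
                "filter": {"term": {"workedAt.id": company_id}},
                "boost": 1.0,
            }
        }
        for company_id in company_ids
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": person_name}}],
                # With a must clause present, should clauses are optional
                # and only contribute to the score when they match.
                "should": should,
            }
        }
    }


# Hypothetical IDs, e.g. the result of a lookup against the company data.
body = build_person_query("David", ["c1", "c2", "c3"])
```

The resulting `body` can be sent as the request body of a search against the person index. Because all the ranking happens in this one query, there is no need to combine relevance scores from two separate queries.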

(Sandeepvja) #3

Hi @warkolm,

First of all thanks for replying.

Assume that in the above example the person named David worked at IBM as well, and that his JSON has the value "International Business Machines" rather than "IBM".

Now, someone who searches for this person formulates the query as follows:

"List all the persons whose name is David and who worked at IBM, Google and Microsoft"

Now, unless we join the company index, we cannot know that IBM and International Business Machines are the same. Here the application-side join should mostly help.

The second problem is ranking. I may get another document named "David" who worked only at Oracle and Microsoft, and his name might appear first in the list.

So I still do not understand how application-side joins handle ranking.

(Sandeepvja) #4

Can someone please reply to my question above?

Thanks in advance

(Sandeepvja) #5

Can anyone help me with the above problem?

Thanks in advance
