Can Elasticsearch replace a “foreign key” with an existing “child document” during document save?

Elasticsearch can store related data as embedded objects inside a document (like the comments in document 1):

//Document 1
{
  "title": "Nest eggs",
  "body":  "Making your money work...",
  "tags":  [ "cash", "shares" ],
  "comments": [ 
    {
      "name":    "John Smith",
      "comment": "Great article",
      "age":     28,
      "stars":   4,
      "date":    "2014-09-01"
    },
    {
      "name":    "Alice White",
      "comment": "More like this please",
      "age":     31,
      "stars":   5,
      "date":    "2014-10-22"
    }
  ]
}

Is it possible, instead of passing nested objects, to pass only ids and somehow ask Elasticsearch to fill in the corresponding data (assuming that documents for the comments, with their ids and other data, already exist)?

So generally I want to send something like document 2 but have document 1 saved (so that the "John Smith" comment is stored instead of comment id 123, and I can search by the comment author's name).

//Document 2
    {
      "title": "Nest eggs",
      "body":  "Making your money work...",
      "tags":  [ "cash", "shares" ],
      "comments": [ 
        123, 124
      ]
    }
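For illustration, the transformation from document 2 to document 1 can be sketched on the client side in Python. This is only a sketch under assumptions: `embed_comments` and the in-memory `comments_by_id` lookup are hypothetical names, standing in for wherever the comment documents are actually stored.

```python
from typing import Any, Dict, Mapping


def embed_comments(post: Mapping[str, Any],
                   comments_by_id: Mapping[int, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of `post` with comment ids replaced by full comment objects."""
    denormalized = dict(post)
    # A KeyError is raised if an id has no matching comment.
    denormalized["comments"] = [dict(comments_by_id[cid]) for cid in post["comments"]]
    return denormalized


# Hypothetical lookup table standing in for the already stored comments.
comments_by_id = {
    123: {"name": "John Smith", "comment": "Great article",
          "age": 28, "stars": 4, "date": "2014-09-01"},
    124: {"name": "Alice White", "comment": "More like this please",
          "age": 31, "stars": 5, "date": "2014-10-22"},
}

document_2 = {
    "title": "Nest eggs",
    "body": "Making your money work...",
    "tags": ["cash", "shares"],
    "comments": [123, 124],
}

# document_1 now carries the comment objects, so the author name is searchable.
document_1 = embed_comments(document_2, comments_by_id)
```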

The reason for this question is that in my particular case the "comment" is actually stored in a different schema (a different microservice). So, if possible, I don't want to make a request to that microservice to find each comment by id before saving the whole document in Elasticsearch.

I'm new to Elasticsearch, so sorry for possible beginner questions, but I haven't found anything like this in the documentation.

Thanks

That sounds like the parent/child relationship feature. See https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html

I'm not a big fan of it, as I prefer denormalizing my data, but maybe that's the way to go.

I guess that your use case is about searching a post by looking either in the post itself or the comments, right?

Thank you for the answer. I also want to store denormalized data. I'll try to give a more detailed example.

I have a microservice that updates the child object. This microservice holds the parent id, but not the whole parent object, which is stored in a different microservice. My goal is to be able to query child objects using some field from the parent as a filter. As far as I understand, the best way to do this is to store a denormalized child object that embeds the whole parent object. Something like

    {
    "childeField1":"value1",
    "parent":{
    "parent_id": 1,
    "parent_field":"parent_field_value"
    }
    }

So, returning to my question, what I wanted to do is:

  1. I send in something like

    {
    "childeField1":"value1",
    "parent_id": 1
    }

  2. Elasticsearch internally finds the parent document by id = 1
    "parent":{
    "parent_id": 1,
    "parent_field":"parent_field_value"
    }

  3. It substitutes the requested "parent_id": 1 with the whole parent object found in step 2

  4. It saves the denormalized object
    {
    "childeField1":"value1",
    "parent":{
    "parent_id": 1,
    "parent_field":"parent_field_value"
    }
    }
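To make the intended behavior concrete, steps 2 to 4 can be sketched in Python as a plain function. This is only an illustration of the substitution itself, not something Elasticsearch does on its own: `denormalize_child`, `find_parent`, and the `parents` dictionary are hypothetical names, and the lookup is an in-memory stand-in for finding the parent document by id.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical lookup signature: parent id in, parent document (or None) out.
ParentLookup = Callable[[int], Optional[Dict[str, Any]]]


def denormalize_child(child: Dict[str, Any], find_parent: ParentLookup) -> Dict[str, Any]:
    """Replace `parent_id` in `child` with the full parent object."""
    parent_id = child["parent_id"]
    parent = find_parent(parent_id)  # step 2: find the parent by id
    if parent is None:
        raise LookupError(f"no parent document with id {parent_id}")
    # Step 3: drop the bare id and embed the whole parent object instead.
    result = {key: value for key, value in child.items() if key != "parent_id"}
    result["parent"] = dict(parent)
    return result  # step 4: this is the object that would be saved


# In-memory stand-in for the stored parent documents.
parents = {1: {"parent_id": 1, "parent_field": "parent_field_value"}}

doc = denormalize_child({"childeField1": "value1", "parent_id": 1}, parents.get)
```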

The reason for this question is that, if possible, I don't want to make a request to the other microservice; I would rather delegate filling in this data to Elasticsearch, since it already holds a document with that data.

The parent/child relationship looks similar, but I think it has the following limitations:

  1. It performs worse than storing denormalized data. In my case parents can also have parents, so it would be even worse.
  2. It looks like I won't be able to have more than one parent, considering this limitation:

Only one join field mapping is allowed per index.

I think the best way is just to get the parent document from Elasticsearch and build the whole denormalized object myself.
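A minimal Python sketch of that approach, assuming the parent documents are reachable through the document GET API (`GET /<index>/_doc/<id>`), whose response carries `found` and `_source` fields. The function names, the base URL, and the index name `parents` are assumptions made for illustration; the HTTP call is injectable so the logic can be tried without a running cluster.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Callable, Dict, Optional


def http_get_json(url: str) -> Dict[str, Any]:
    """Fetch `url` and decode the JSON body; a 404 body is decoded as well."""
    try:
        with urllib.request.urlopen(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return json.load(err)
        raise


def fetch_parent(base_url: str, index: str, parent_id: int,
                 get_json: Callable[[str], Dict[str, Any]] = http_get_json
                 ) -> Optional[Dict[str, Any]]:
    """Return the parent's `_source`, or None if the document was not found."""
    body = get_json(f"{base_url}/{index}/_doc/{parent_id}")
    return body["_source"] if body.get("found") else None


def build_child_document(child: Dict[str, Any], parent: Dict[str, Any]) -> Dict[str, Any]:
    """Embed `parent` into a copy of `child`, dropping the bare `parent_id`."""
    document = {key: value for key, value in child.items() if key != "parent_id"}
    document["parent"] = parent
    return document


# Usage against a hypothetical local cluster and index name:
#   parent = fetch_parent("http://localhost:9200", "parents", 1)
#   if parent is not None:
#       document = build_child_document({"childeField1": "value1", "parent_id": 1}, parent)
```

If parents can themselves have parents, the same lookup would have to be repeated for each level before indexing.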
