What are the disadvantages of having 6 levels of nesting in an index?


(Sankalp Sood) #1

I need to document the cons of having 6 levels of nesting in my index. How can it affect my queries and response time? And how can it affect the shards and the response level?


(David Pilato) #2

The more nested docs you have, the more complex the query will be to write and to execute.

You need to make sure you absolutely need that for your use case.

In a former job, for example, I had a structure like this (all nested):

{
  "a": [ {
    "b": [ {
      "c": [ {
        "name": "1",
        "d": true
      },{
        "name": "2",
        "d": false
      } ]
    },{
      "c": [ {
        "name": "3",
        "d": false
      },{
        "name": "4",
        "d": false
      } ]
    } ]
  } ]
}

And so on. So, collections of collections of attributes.
One of my use cases was to display whether any a.b.c.d is equal to true.

I first implemented it as a complex nested query.
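The original query is not shown in this thread. As a rough sketch only, assuming `a`, `a.b` and `a.b.c` are each mapped as `nested`, a query body for "any a.b.c.d equal to true" could be built like this. The helper name `nested_term_query` is hypothetical, and the resulting dict would be sent as the body of a search request:

```python
def nested_term_query(field, value):
    """Wrap a term query in one `nested` clause per level of the field's path.

    Assumes every parent of `field` (a, a.b, a.b.c) is mapped as `nested`.
    """
    parts = field.split(".")
    query = {"term": {field: value}}
    # Wrap from the deepest nested path outwards: a.b.c, then a.b, then a.
    for depth in range(len(parts) - 1, 0, -1):
        path = ".".join(parts[:depth])
        query = {"nested": {"path": path, "query": query}}
    return {"query": query}


# Search body matching top-level documents that contain any a.b.c.d == true.
body = nested_term_query("a.b.c.d", True)
```

Each extra level adds another `nested` wrapper, which is the kind of complexity mentioned above.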

And then I stepped back and thought about the use case.
I decided to compute this value in my application and put it within the top level object, such as:

{
  "has_d": true,
  "a": [ {
    "b": [ {
      "c": [ {
        "name": "1",
        "d": true
      },{
        "name": "2",
        "d": false
      } ]
    },{
      "c": [ {
        "name": "3",
        "d": false
      },{
        "name": "4",
        "d": false
      } ]
    } ]
  } ]
}

That reduced the query time a lot.
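A minimal sketch of computing that flag in the application before indexing, assuming the document is available as a plain Python dict shaped like the example above. The function name `compute_has_d` is made up for illustration:

```python
def compute_has_d(doc):
    """Return True if any a[].b[].c[].d in the document is true."""
    return any(
        c.get("d") is True
        for a in doc.get("a", [])
        for b in a.get("b", [])
        for c in b.get("c", [])
    )


doc = {
    "a": [{
        "b": [
            {"c": [{"name": "1", "d": True}, {"name": "2", "d": False}]},
            {"c": [{"name": "3", "d": False}, {"name": "4", "d": False}]},
        ]
    }]
}

# Store the precomputed flag on the top-level object before indexing.
doc["has_d"] = compute_has_d(doc)

# The search then only needs a plain term query on the top-level field.
search_body = {"query": {"term": {"has_d": True}}}
```

The trade-off is that the application has to recompute `has_d` whenever any `d` value in the document changes.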

So this is just an example, and probably not what you are dealing with, but I wanted to share my thoughts. When you come from a relational database model, you are tempted to use the exact same model in both systems without thinking about the use case. If you can simplify the implementation in Elasticsearch because your use case allows it, it will be a great win for you and your users, IMHO.

My 0.05 cents.


(Sankalp Sood) #3

Thank you, David.
I need to clarify one more thing. Suppose "a" has a one-to-many relationship with "b", and for one "a" ID I have 10 "b" IDs. If I follow the flat structure by combining both unique IDs, then instead of a single document holding the 10 "b" IDs under one "a" ID, I will have 10 different documents with the same "a" ID and different "b" IDs. In this case, is the flat structure preferable?
And if yes, then whenever I have to update the name of "a", I will have to update it in 10 documents rather than in just 1 document as with the nested structure. So what is the better approach for this use case?


(Christian Dahlqvist) #4

In Elasticsearch a separate document is stored for each nested object within the document. When you make any change to any part of this document, all these parts are reindexed behind the scenes. This means that if you are frequently modifying documents and have large documents, this can quickly become a bottleneck. Updating a thousand flat documents may therefore actually be quicker in some cases as only the changed parts need to be updated.
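To get a feel for how much is reindexed on each update, here is a rough sketch that counts the underlying documents for one source document: one for the root plus one per nested object. It assumes the fields named in `nested_fields` are all mapped as `nested`. The function name and the default field names (taken from David's example above) are illustrative only.

```python
def count_lucene_docs(doc, nested_fields=("a", "b", "c")):
    """Count the root document plus one hidden document per nested object."""

    def count_nested(obj):
        total = 0
        for key, value in obj.items():
            if key in nested_fields and isinstance(value, list):
                for child in value:
                    if isinstance(child, dict):
                        # One hidden document for this object, plus its children.
                        total += 1 + count_nested(child)
        return total

    return 1 + count_nested(doc)


example = {
    "a": [{
        "b": [
            {"c": [{"name": "1", "d": True}, {"name": "2", "d": False}]},
            {"c": [{"name": "3", "d": False}, {"name": "4", "d": False}]},
        ]
    }]
}

# 1 root + 1 "a" + 2 "b" + 4 "c" objects
print(count_lucene_docs(example))  # prints 8
```

With six levels and larger arrays at each level, this count grows multiplicatively, and changing any single field rewrites all of those documents.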