Deep nesting and recommendation for its usage

abhigyan · June 12, 2019, 4:14am

Hello ES community!

We are building a cloud application whose main data store is a high-scale RDBMS. We are trying to move the searchable relational data into ES, a problem that, from what I've seen online, many people face. It turned out that we had to use nested objects (documents), and the nesting went around 4 levels deep in almost every document.

The ES docs say that nesting should be used only in special cases. We also cannot find good references for querying nested data properly. We plan to use elasticsearch-dsl for Python, but could not find any mention of searching nested data there either. Does this high-level Python client have less community support? What does its future look like, and should we use it in production?

So my question is: is it a good idea to use nesting in ES or not?
If not, can we use the RDBMS to resolve the relations, build query objects from them, and send those to ES purely to search documents (RDBMS + ES together)?

If you are curious what our data looks like, it is similar to this document structure:

{
    car_doc {
        car_name,
        car_suppliers [
             {
                  supplier_name,
                  supplier_services [
                        {
                               service_type,
                               service_offerings [   {  ...   }   ]
                        }
                  ]
             }
        ],
    }
}

Thanks!
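For reference, a nested query against a structure like the one above could be sketched as below. This is only an illustration, assuming that `car_suppliers` and `supplier_services` are both mapped with the `nested` field type and that `service_type` is an exact-match (keyword) field; none of this is confirmed by the thread. The helper name `nested_service_query` is hypothetical. Each nesting level needs its own `nested` clause, with `path` given as the full dotted path.

```python
def nested_service_query(service_type):
    """Build a search body matching cars that have at least one
    supplier offering a service of the given type.

    Assumes car_suppliers and car_suppliers.supplier_services are
    mapped as `nested` fields (an assumption, not from the thread).
    """
    return {
        "query": {
            "nested": {
                "path": "car_suppliers",
                "query": {
                    "nested": {
                        # Inner levels use the full dotted path.
                        "path": "car_suppliers.supplier_services",
                        "query": {
                            "term": {
                                "car_suppliers.supplier_services.service_type": service_type
                            }
                        },
                    }
                },
            }
        }
    }


body = nested_service_query("repair")
```

The resulting dict is a plain search request body. elasticsearch-dsl can express the same clause with `Q("nested", path=..., query=...)`, but the verbosity at two levels already hints at why four levels becomes hard to work with.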

Christian_Dahlqvist · June 12, 2019, 5:58am

Elasticsearch is a document store, which requires you to change how you model your data when you move from a relational database. Trying to mimic a relational structure using parent-child or nested documents is generally not the way to approach this, as they are not a replacement for joins. Instead, think about what entities you want to search for and denormalize your data into documents that match this. This will result in a flat model that is generally a lot easier to query.
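As a rough sketch of what such denormalization could look like for the structure in the question: the code below assumes the entity being searched for is a supplier service, so it emits one flat document per (car, supplier, service) combination. The choice of entity, the helper name `flatten_car`, and the sample values are all assumptions for illustration; the right entity depends on the real data and queries.

```python
def flatten_car(car):
    """Yield one flat document per (car, supplier, service) combination.

    Parent fields (car_name, supplier_name) are copied into every
    document, which is the duplication that denormalizing implies.
    """
    for supplier in car.get("car_suppliers", []):
        for service in supplier.get("supplier_services", []):
            yield {
                "car_name": car["car_name"],
                "supplier_name": supplier["supplier_name"],
                "service_type": service["service_type"],
            }


# Made-up sample data shaped like the structure in the question.
car = {
    "car_name": "example-car",
    "car_suppliers": [
        {
            "supplier_name": "supplier-a",
            "supplier_services": [
                {"service_type": "repair"},
                {"service_type": "paint"},
            ],
        }
    ],
}

flat_docs = list(flatten_car(car))
```

Each resulting document can then be matched with ordinary `term` or `match` queries, with no `nested` clauses involved.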

This means data will be duplicated, and if you make changes at the higher levels of the hierarchy you will need to update multiple documents. This is however often a reasonable tradeoff: doing a bit more work at index or update time in exchange for faster and simpler queries is generally worth it if updates are infrequent.
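To make the update cost concrete, here is a minimal sketch of a request body for the `_update_by_query` API that rewrites a duplicated parent field across every flat document that carries it. It assumes flat documents with a `supplier_name` field indexed for exact matching (keyword); the field name and the helper `rename_supplier_body` are hypothetical.

```python
def rename_supplier_body(old_name, new_name):
    """Build an _update_by_query body that renames a supplier in
    every denormalized document that duplicates its name.

    Assumes supplier_name is an exact-match (keyword) field.
    """
    return {
        # Select all documents that carry the old value.
        "query": {"term": {"supplier_name": old_name}},
        # Rewrite the duplicated field in each matching document.
        "script": {
            "lang": "painless",
            "source": "ctx._source.supplier_name = params.new_name",
            "params": {"new_name": new_name},
        },
    }


update_body = rename_supplier_body("supplier-a", "supplier-b")
```

One logical change fans out to as many document updates as there are copies, which is why the tradeoff pays off mainly when such changes are infrequent.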

abhigyan · June 12, 2019, 6:46am

Okay, sounds right, thanks! We are now denormalizing and flattening our data. Regarding duplication, wouldn't the index store only a reference to the textual data in the documents? I think that would reduce the duplication.

In our documents, we will need to query related data fields that are numerous and spread across multiple documents; we cannot duplicate them, which may force us to query multiple documents. Any suggestions for querying multiple documents at the same time efficiently? I mean, would it be better to query twice or to perform the join inside ES (putting it all in one query)?

Thanks for the support!

Christian_Dahlqvist · June 12, 2019, 7:14am

I can’t tell as I do not know your data. Making recommendations based on simplified sample data can often lead to important aspects being missed or overlooked.

system · July 10, 2019, 7:14am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
