Queries on left join on a single index in elasticsearch


Is it possible to use left join on single index.
i have a table Sample, loaded into elasticsearch with index name as index_sample.

So, i had a query left join applied to same table Sample, can i frame queries in elasticsearch with left join for the index index_sample as done in sql query.

Waiting for your response.

Thanks inadvance.

Elasticsearch does not support joins at all. There are a few ways to represent relationships between documents, e.g. parent-child, but that is not to be confused with joins. If you try to model data in a relational way in Elasticsearch you are likely to run into problems.

When you model data in Elasticsearch you generally need to leave the relational mind set behind and start denormalising data into documents instead.

1 Like

Hi @Christian_Dahlqvist,

Thanks a lot for your response,

do parent-child relationships means like primary key and foreign key data in database.
can you please share some knowledge on parent child relationship and for which scenarios we create them.

Thanks inadvance

It allows you to set up a relationship between types of documents within a single index, but is not equivalent of primary-foreign key. It is useful when you have a parent entity that is updated frequently and you do not want to update all the children. It does however come with a number of limitations and performance penalty.

I would therefore recommend to instead model your data without relations to as great extent as possible.

It might help us give better advice and examples if you could describe your data and what you are looking to achieve at a high level.

1 Like

Thanks for your response,

I have a query as shown below in SQL,

select A.*, revenue AS REV, month_1 AS month, B.m_branch, B.y_brnach
SELECT DISTINCT b_no, SUM(current) AS m_branch, SUM(c_year) AS y_brnach
WHERE id=@id
) AS B
ON A.b_no=B.b_no
WHERE id=@id
ORDER BY name, b_no, c_year DESC

I want to show the equivalent in elasticsearch query.

As informed by you, there is no possibility for joins in Elasticsearch. But can i split the data in parent-child and achieve this scenario.

Waiting for your response.
Thanks inadvance.

I am not sure I understand exactly what that does. It might be easier if you could describe the data and what you want to do at a higher level and not based on your current relational data model.

hi christian,

Can you please give some sample examples to understand indetail about parent-child relationships, so that i will acquire some knowledge that in which scenarios can i use them.

After going through this concept in internet, observed like

With Elasticsearch 6.0, there are some fundamental changes which prevent parent/child relationships.

  1. One index cannot contain more than one type. Read more here.
  2. Parent/child relationships have been removed, and hence the _parent field is also removed. You have to use join field instead of parent/child.

Parent/child relationships required that there were two distinct types and both types were defined in the same index. Now that you can't have multiple types in one index, there is no way parent/child relationships can be supported in the same way that they were supported in 5.x and prior releases.

You can refer to the join field documentation to see how to do similar things to parent/child relationships. But now, you have to define both kinds of documents within a single Elasticsearch index, within the same type. Please see the example which explains how to model "1 to many" kind of relationships (1 question, multiple answers related to that question) using a join field here.

for example, an institute will have different number of students.
so institute data is parent, students data is child.
How to give mappings for this relation(institute & student details) and load data from database.
Waiting for your response,
thanks inadvance,

If you look at the page about the join field it states the following:

The join field shouldn’t be used like joins in a relation database. In Elasticsearch the key to good performance is to de-normalize your data into documents. Each join field, has_child or has_parent query adds a significant tax to your query performance.

Parent-child relationships are as as I described earlier useful when you have a parent entity that is very large and/or frequently updated and you do not want to update a very large number of documents in a denormalised data model.

In many cases there the parent is infrequently updated or modified, it is often better to denormalise and perform the extra work when the parent is updated than pay the performance penalty for every query.

If you could describe your data and use-case rather than discuss in terms of simplistic examples it might be easier for others to provide better guidance. I personally rarely use parent-child relationships as they generally are suitable for a relatively limited number of scenarios, so am probably not the right person to help you with its use.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.