How to fetch data from multiple index using join like sql

Hi Team,

I am using below query to fetch data from sql.

select W.WALLET_LOCATION_ID, W.Region, UWL.HIERARCHY
from Wallet_Location_Dim W
inner join User_Wallet_Location UWL
on W.WALLET_LOCATION_ID = UWL.WALLET_LOCATION_ID
inner join dv_wallet_location dv
on dv.WALLET_LOCATION_ID = UWL.WALLET_LOCATION_ID
ORDER BY TO_NUMBER(REGEXP_SUBSTR(UWL.HIERARCHY, '[^.]+', 1, 1)) NULLS first,
TO_NUMBER(REGEXP_SUBSTR(UWL.HIERARCHY, '[^.]+', 1, 2)) NULLS first,
TO_NUMBER(REGEXP_SUBSTR(UWL.HIERARCHY, '[^.]+', 1, 3)) NULLS first

But my requirement is to create index for each table and after that I want to apply join on multiple index while fetching data from index in elasticsearch.

Please let me know how we can achieve it.

You can’t do joins with elasticsearch.

Better to think your model differently and denormalize your data.

Parent/child feature is kind of 1-n relationship but I’d only use it if absolutely necessary.

1 Like

thanks David, I read elastic doc for modeling data.I used like below:
I created one index(join_query_logic) for two table (Wallet_Location_Dim,User_Wallet_Location) and differentiate tables by using type(Wallet_Location_Dim,User_Wallet_Location).but when i am trying to get records using below query getting no records.

GET /join_query_logic/_search
{
"query": {
"bool": {
"must": [
{ "match": { "Wallet_Location_Dim.wallet_location_id": "User_Wallet_Location.wallet_location_id" }}
]
}
}
}

{ "match": { "Wallet_Location_Dim.wallet_location_id": "User_Wallet_Location.wallet_location_id" }}

Is like joining documents. No you can't do that. May be explain the use case, share some sample documents and what a typical user would like to search for?

Hi David, I prefer this example of normalization.

PUT /my_index/user/1
{
  "name":     "John Smith",
  "email":    "john@smith.com",
  "dob":      "1970/10/24"
}

PUT /my_index/blogpost/2
{
  "title":    "Relationships",
  "body":     "It's complicated...",
  "user":     {
    "id":       1,
    "name":     "John Smith" 
  }
}

for fetching records used:

GET /my_index/blogpost/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title":     "relationships" }},
        { "match": { "user.name": "John"          }}
      ]
    }
  }
}

sorry if I am asking wrong query but i am the new beginner for elasticsearch.

I tried to use {"match": { "Wallet_Location_Dim.wallet_location_id": "User_Wallet_Location.wallet_location_id" }}
at d place of this { "match": { "user.name": "John" }}

Please format your code using </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

I edited your post.

In you example, what the user is searching for? users? blogposts?

Let say the later. Then index your documents as:

PUT /my_index/blogpost/1
{
  "title":    "Relationships",
  "body":     "It's complicated...",
  "user":     {
    "name":     "John Smith",
    "email":    "john@smith.com",
    "dob":      "1970/10/24"
  }
}

And you're done.

I just send you example which i followed.I want the records from join_sql_logic index for Wallet_Location_Dim index type where wallet location id from both type(Wallet_Location_Dim,User_Wallet_Location) should match.I showed you in post.

Yes but I already told you that you can't do joins in elasticsearch.
You can use parent/child feature but I'd not use it unless it's absolutely needed. But I already said that.

Again, if you need help about modelling please answer this.

May be explain the use case, share some sample documents and what a typical user would like to search for?

1 Like

I have created parent child relationship between two index type.but while searching it is giving no records.
my code is:

GET /join_query_logic/Wallet_Location_Dim/_search
{
"query": {
"has_child": {
"type": "User_Wallet_Location",
"query": {
"constant_score" : {
"filter" : {
"term" : {
"wallet_location_id" : 1
}
}
}
}
}
}
}

where index: join_query_logic, parent: Wallet_Location_Dim, child: User_Wallet_Location.
I want to get only those records from parent type which has wallet_location_id presented in child type.

Writing it again:

Please format your code using </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

Could you provide a full recreation script as described in

It will help to better understand what you are doing.
Please, try to keep the example as simple as possible.

Hi,

I have created parent child relationship between two index type.but while searching it is giving no records.

my code is:

GET /join_query_logic/Wallet_Location_Dim/_search
{
"query": {
"has_child": {
"type": "User_Wallet_Location",
"query": {
"constant_score" : {
"filter" : {
"term" : {
"wallet_location_id" : 1
}
}
}
}
}
}
}

No results are found.But in User_Wallet_Location index type "wallet_location_id" : 1 is present.

Last chance. Please read:

I think that in order for you to accomplish what it is you are trying to do, with the technologies you have chosen, you first really need to consider clearing your thoughts and removing your relational database thinking. I am not going to get real technical on this comment but rather gloss over a few points. I do not know how much time you've spent with elasticsearch, but I can tell you, it likes it much better when you play nice with it. It is sensitive.. It doesn't even like when you voice gets a bit loud. That being said, I think you will find that you can accomplish your goals by rewiring a few ways you think. I am going to try to keep this short. Im going to make a couple suggestions much like those of @dadoonet, which you should have been followed; embrace these freebies as the process of earning them is lengthy :wink:

Before I offer what I think will likely put you down a better path, consider the one you are on. You are trying to use techniques to retrieve data in a relational centric way on a storage engine that is an index. This is not a subtle difference and is a point that you must understand well in order to be able to efficiently work with it. Behind elastic search is an extremely fast and efficient indexing technology. No relational database that I know of could come close to touching it, when doing the things it does best. Much the same, I do not know of any non relational databases or index that can navigate relationally stored data the way relational db can.

You really should go ever that documentation, thoroughly, like 30 or 40 more times. Im pretty sure when I learned it, i had no less than 6 browser instances open for sorting the 80 or so open pages each had open on it. When you do this, pay really close attention to the way they write about the various aspects. Look for the function / methods / techniques that they talk about that elasticsearch and cassandra do really really well and then think on them with an out of the box, resourceful mind. Try and figure out how you can leverage one or more of your options ( which even the worst options are still like 12 timex faster than what your doing ) Keep in mind that sql has like 90k words ( maybe 30 that you use ) elastic search may have 100 maybe not even. It is much less of a task to thoroughly learn elasticsearch than it would just to become aware of all that sql.

The first and most important was the little gem that @dadoonet gave you in the beginning of this conversation regarding using the the parent / child relationship with caution. I would agree with him and add the little bit of, just forget that exists and use one of the other two "relational" means to store data. The way you have chose to store your "to be indexed data " is the slowest and least efficient as well as resource heavy way you can go about this task. I am sure there is a good reason for them having it in there, I have just not come across or heard about it.

Elasticsearch is an index. Take the idea of normalization and throw it out the window if your to work with the index. Accept that it is perfectly ok to have data, the same data, repeat its self across several models. Aside from going beyond reason and fighting scope, your not likely going to pay in the long run. The index will have no problem working with it, especially if your not exposing all of it and are working with in the meta of its models correctly.

Since your still feeling icky at the thought of denormalizing data ( and yes that means scribbling all over your domain and rearranging it so that it looks like a 4 year old with chewed up crayons designed it ) while your thinking of how your data is going to be used rather than how neatly you can store it, find a way to accept that sometimes you just need to do some of the processing in the code. If those domain objects were pulled from a relational db that layer would probably serve you up some nice arrays of data with some awesome caching levels to the the transient data bla bla bla... to get that effect for data that is relational, especially if your index is not designed with how the data is to be used but rather how to make the foot print small, your just going to have to create those arrays in the code your self.

The way this index stores data and in particular the way you are asking it to persist the index, being aware of and understanding the data is scoped and that the way you store the data determines its scope. The scoped data will impact what and how you use things or do not use them. You will write code differently because of the scoped nature of the persisted data. (ugh this is such a lengthy subject, with so many things that need to be gone over) I would be sure to touch on the scope of the child data in in relation to any data not in the relationship. It has strong benefits, but I never really found a need to leverage it, especially because of what it cost. You will find later that your likely going to be using a bunch of business logic and memory to further relate anything you would derive from it....

Learn to live and love aggregation and other tools that elasticsearch has and has specialized to very impressive level. Elasticsearh tends to go a little easier on ya and may even forgive you for you trying to cram relational data down its throat in a normalized way all while expecting a the index to persist it as such. Its just not how it works.

In all my years working with data, and model this or that..... the many years before OOP really was the main stream, I have never known a storage engine that was the end all be all that is perfect for persisting every imaginable type of data as well as accessing it. Expand the mind a bit and think of systems which may leverage relational databases for some aspects, indexes for other, in memory db / caches for other. Think on the ways you can leverage each tool you have so as to maintain a healthy balance of cost / gain in efficiency as well as time needed to implement the various aspect. Image things like persisting the results of data pulled from a relational database as entities of their own in one object to be indexed for analysis. Big data is big, maybe you cant store all the answers so you decide the analysis results could be offloaded to a schema less or nosql in mem db to hold on to, keeping the index a health size with no real cost in time to get it back out of mem.

Let go of all you know embrace the difference, it is after all why your using something different right? Because its different: RTFM :slight_smile: Good luck to ya brother. Elasticsearch is fun to learn. They do some really cool things in very clever ways which is going to make you abetter programmer for the broadened exposure to mindsets beyond the borders of relational data.

21 Likes

Wow, I wish I would have read all of what you said... I think I said the same thing but with one or two or 300 more words lol

1 Like

Thanks!!!!!!!!!!!!!!1

Haha! Nice try.

It would deserves a better split in paragraphs though IMO for clarity.

:slight_smile:

1 Like

Hi dadoonet,
I satisfied with you responses.I haven't read that much big paragraph because nobody has that much time.

2 Likes

Yes, you are right. I will do that.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.