Can we use SQL Join in elastic query?


Can we use join function in elastic query ?
I want to join two indexes with one common field . How to do that ?


Elasticsearch does not support joins at all.

Hi Christian ,

What is workaround for JOIN in ES ? Is it possible in some other way ?

Most common workaround is to denormalize the data into a form that matches what you want returned when you search.

Hi Christian ,

how to do denormalisation ?

That depends on the data so it would help to know a bit about it. This article provides some background.

Hi Christian ,

Take a example :

If username in both index is same then add both indexes and create new one

t _index windows-6.5.4-2019.08
t windows.event_data.TargetLogonId 0xhhg9a
t windows.event_data.TargetUserName sajal
t windows.event_data.TargetUserSid 1-2-3-4
t windows.event_data.TokenElevationType %%8765

t _index mac-ls-2019.08.*
t s-action Pipe
t username sajal
t x-http-connect-host
t x-http-connect-port 443
t x-virus-details -

How would you join that given that you probably could have lots of entries in each index for a single user name? Just on user name? Would the timestamp be relevant given that you have 2 time based indices?

yes i have two different indexes with timestamp . You can take username as it won't repeat. Just add both indexes and create new one

Then merge the documents into a single one at index time.

i am new to this . can you please help me how to do this ?

How are you indexing your data into Elasticsearch? What is the source of these different documents? Are they generated close in time? Is it always a 1-to-1 mapping?

logs are coming from filebeat. They are generated close in time and yes always 1 to1 mapping

To do this at index time you probably need to send the data through Logstash as that offers greater flexibility. If it is truly a 1:1 mapping and the user name does not appear in multiple documents, you could set the document ID to the user name and perform an upsert (think Logstash supports this but have not tried) for both documents. You should then end up with a single document containing all the fields. Another option might be to use the aggregate filter, similar to what is described in this thread.

I would recommend opening a separate thread on this in the Logstash section.

Just to update the correct requirement now .

I have two indexes with particular date (Eg 23 August 2019) , when username in both index gets matched then get only last 10 logs only in new index

use Apache Hive Or Apache Spark

Is this possible with nested query ?

No, I don’t think so.

can we apply nested query in two indexes ? If yes , what is the query syntax ?

No that is not possible. Nested queries requires all the parts to be in the same document.