Matching two fields with the same value across two indices

Hi, I'm trying to do a simple match of two fields from two indices but I'm not having much luck.

So I've got two indices A and B and two fields with different names but the same values. How do I return all results from either index where the two values match?

I can do it pretty easily with MySQL but I'm having issues with ES!

Thanks for any help

1 Like

Elasticsearch does not support JOIN queries which is I believe somehow what you are trying to do.

Thanks for the reply!

Yeah, I guess it would be something like a JOIN which I understand is computationally impracticable with ES. But apart from pre-processing all the data into a simplified form and stuffing it all in one index is there anyway to compare the values of two fields and return matching documents?

Or are we best using ELK for the really simple stuff and using MySQL for the complex work?

I don't believe you can compare documents.

You can introduce some kind of relationship between documents with parent/child feature though but I don't believe this is what you want here.

IMHO it's always better to think about the use case, adapt the model to the datastore you are using. May be share an example of the documents you have in both indices and someone can come with a better solution?

I'm basically visualising perimeter logs (Firewall, Apache etc.) and internals (Audit, Syslog etc.). These logs are easy to ingest, visualise, and interpret. The sticking point is mod_security logging as it produces a confused mess of logs with data all over the place (and even not always logged correctly). It's matching the multiple results from this with the equivalent entry in the Apache logs that's causing me heartache!

I did have a look at child\parent (and nested) relationships but it seems to require you to know the relationship beforehand so that you can tag to the two together during ingestion which isn't a practical option here.

I think I'll go for a bespoke solution for this as it's not something that ELK seems to be suited to. I was just trying out the options :slight_smile:

Thanks for the input!

I think I'll go for a bespoke solution for this as it's not something that ELK seems to be suited to.

That's probably the first time ever I'm hearing that elastic stack (we don't say ELK anymore as beats is feeling alone) is not suitable for logs.
I'm pretty sure there is a way to solve the problem you want to solve with elastic.

Note that you can search in multiple fields, multiple indices at the same time. So if you have something like, let say an IP, in firewall logs and apache logs, I'm sure you can get all the logs related to this IP.
And you can probably visualize all that from Kibana BTW.

:smiley: I've been referring to it as FELK at work ( or more accurately FELKP with nmap and other things). I'm sure we'll still be using it to visualise the standard logs to get all the pretty pictures that management loves so much. But as they make my life easier I like them too...

I have tried lots of different ways (multiple indices, same index, same names, parent\child, nested et al) to match the data together but all have failed so far, hence the bespoke solution would probably be quicker and easier for this particular job. I can already output the results as JSON for VegaLite and use either Grafana or Kibana to visualise them. I would just prefer to have it all under one roof so to speak.

Kibana couldn't handle the modsecurity logs at all, nested arrays etc. I already had to flatten JSON version to get it into ES for Kibana to even look at.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Hi @cuttlefishjones, in case you haven't solved it yet, one way to do it would be to use Vega (not VegaLite), and do the work client-side. Basically add both data sources for the current time range, properly filtered and aggregated as needed to reduce the number of data points to a manageable size, and use the lookup transformation to combine them.