How to Join 2 fields?

image

So..lets say i want to see how many IP xxx.xxx.xxx.xxx attacks my server. And in that table also included the reputation of the IP. How i could do that in one table? So.. the table will look like this:

| IP Address | Count | IP Reputation|

create a scripted field, with script: doc['fieldA'].value + ' ' + doc['fieldB'].value

That would sort of work, and then you can use that combination as the field to aggregate/bucket data by.

Elasticsearch does not do joins, it's a document database, not a relational one. What you're trying to do is take all the documents you have and roll them up (aggregate them) based on one of the fields. In this case, you want to roll up all the values by IP Address and get the total some of documents for each discrete value. You can do this using a Table visualization in Kibana, with a count metric and a terms aggregation on the address field.

The issue you're going to run into is the additional rollup of the "IP Reputation" field. A scripted field would basically do the same thing as doing a terms agg on the Address and the Reputation... You're still going to potentially end up with multiple records for the same IP (one for each IP and Reputation pair), but you'll still mostly have the info you need. If you keep your denormalized data in sync (ie. every document for a given address has exactly the same reputation value), then you'll get the results you want. If that data is not all synchronized, you won't.

What would work better is to assign a kind of "risk level" value instead of a string representation, and then do a min/max on that value. This way you'd only have a single aggregation (the IP Address) and then 2 metrics (the count and the min/max risk value). The drawback of that is then knowing what a value of, say, "4", really means.

Here's an example of what I mean. Take the following documents:

address | reputation
------- | ----------
185.107.80.183 | null
185.107.80.183 | null
185.107.80.183 | known attacked
185.107.80.183 | null

If you aggregate on both fields, you're going to get:

address | reputation | count
------- | ---------- | -----
185.107.80.183 | null | 3
185.107.80.183 | known attacker | 1

You have 2 records for the same address, with different counts for each, and you have to mentally add up all the counts and decide which reputation is the correct one. You have the information you were looking for, kind of, but not in exactly the format you wanted.

If instead, "known attacker" was a numeric value, say 5, and no reputation was a 0, you'd have:

address | reputation
------- | ----------
185.107.80.183 | 0
185.107.80.183 | 0
185.107.80.183 | 5
185.107.80.183 | 0

Now you can aggregate on the address, take the max value of the reputation, and you are going to be closer to the info you want:

address | reputation | count
------- | ---------- | -----
185.107.80.183 | 5 | 4

You have the right data (a single record) with the total count, but you need to know what a reputation of "5" means. Like before, this still isn't exactly what you wanted, but it's closer, and scales better if you're showing multiple addresses at the same time. It also means that as soon as "escalate" an addresses reputation, it shows up immediately in that table with the new value. Of course, you could always do this backwards too (making the most offensive addresses have a score of 0), it just comes down to what makes more sense for you.

You could attempt to use a field formatter here to make the value more useful, mapping "5" to "known attacker" somehow. The only real option is the URL formatter, which you could use to make Kibana display an image instead of a number there. It's not ideal, but it would work.

The problem with the number, instead of the string, is still kind of the same though; neither ES nor Kibana do joins. So, there's no simple way to map "5" to "known attacker". FWIW, Canvas does allow this, but it's still currently in technical preview. You can find out more about Canvas in this blog post and at http://canvas.elastic.co/.

1 Like

Wow, thats will need a lot of works...thank you for your answer.
Last, how about this one :

Thats from 1 index but differennt log/source. I want to combine that 2 in one chart.. I used this scripted field but didnt works:
doc['connection.protocol.keyword'].value + ' ' +doc['protocol..keyword'].value

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.