Grouping values?

Hi,

I was wondering if it's possible to group similar values together in Kibana?

Example:
Facebook and Google use many different hosts, so if I create a simple pie chart (metric: sum of total bytes; bucket: destination_host) with 10 entries, I just get a bunch of different Google and Facebook hosts.

Is it possible to write some kind of query that pulls destination_host values matching fb* or facebook* together into a new entry called facebook?

I understand this is probably a long shot but I thought I'd ask.

I believe a filters aggregation with a regexp filter will get you what you are looking for: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html (even though examples here show use in query, this should also work in a filter context)

Thanks. I've made a query that returns the results I'm looking for, but how do I combine that with a visualization in Kibana? I get all sorts of errors if I simply copy/paste the query, and even if I use the query builder and copy that under pie chart -> buckets -> filter aggregation -> filter 1, it still throws errors.

The docs don't mention what a filter should look like.

GET test-netflow-*/_search
{
  "size": 0,
  "aggs": {
    "messages": {
      "filters": {
        "filters": {
          "facebook": { "regexp": { "dst_host": ".*facebook.*" }},
          "fb":       { "regexp": { "dst_host": ".*fb.*" }},
          "google":   { "regexp": { "dst_host": ".*1e100.*" }}
        }
      }
    }
  }
}

Can you share what errors you are seeing?

Maybe this example would help. I have filebeat data which has some messages that include words like disconnecting and disconnected and maybe some other variations.
So I created a new horizontal bar graph visualization and selected the Filters aggregation.
Then I created several filters. Each filter is a query, and you can use wildcards, AND, OR, etc.

Hi @Bill_McConaghy

This is what I get if I use "facebook" : { "regexp" : { "dst_host" : ".*facebook.*" }} as the filter (error screenshot not included). I've tried other combinations as well, but they all give similar errors.

@LeeDr

Thanks. Such a filter indeed works but the problem is that I only want to group e.g. google and facebook, but still display the other values.

For example:
Normally if I create a pie chart with the top 10 destination hosts based on data usage, I get 10 entries related to facebook. So I want to group those together but still display the other values as normal.

image

Instead of this, use the syntax:
dst_host: *facebook*

for each filter. You can add a label for each filter by clicking the pencil icon.

Maybe you should create a scripted field that combines the facebook and google values and leaves all the rest as they are, and then base your chart on that scripted field.

For example, I have some test data that has a referer field like this;

[image: sample referer values]

If a scripted field is even the least bit complex I like to take an intermediate step just to make sure I'm getting the expected results. So first I created this number type scripted field;

(I try to always set the popularity up to at least 1 so that it appears at the top of the list in Discover which just makes it easier to find)

So in this case Discover shows -1 if facebook is not found and 7 if it's part of http://facebook...
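The intermediate script itself isn't reproduced above. A minimal sketch consistent with the output described (-1 when facebook is absent, 7 for http://facebook...) would be a number-type scripted field along these lines. Treat it as an assumed reconstruction, not necessarily the exact script that was used:

```painless
// Assumed reconstruction: position of 'facebook' within the referer value,
// or -1 when the substring does not occur.
doc['referer'].value.indexOf('facebook')
```

Checking the raw indexOf result in Discover first makes it easier to see which documents would match before adding any if/else logic.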

Now that I see that works, I created this string type scripted field;

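// Return 'FACEBOOK' when the referer contains 'facebook'; otherwise return the referer unchanged.
// indexOf returns -1 when the substring is absent and 0 when the value starts with it.
if (doc['referer'].value.indexOf('facebook') >= 0) {
  return 'FACEBOOK';
} else {
  return doc['referer'].value;
}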

image

Now I can use that scripted field in my visualization;
[image: visualization using the scripted field]

Refer to https://www.elastic.co/blog/using-painless-kibana-scripted-fields

Hi,

I've tried doc['dst_host'].value.indexOf('facebook'). At first this resulted in -1 etc. because facebook doesn't exist and it should be a wildcard I think.

How can I do this from the Kibana interface? I'm using Amazon's Elastic service so no direct access to the config files.

For some reason now that script also broke the index and it just results in failed shards with no data appearing anywhere. Delete the field and all is well again.

I think you should be able to OR a couple of indexOf tests together to check for several different variations of fb, facebook, etc.

Something like;

if ( (doc['dst_host'].value.indexOf('facebook') >= 0) || (doc['dst_host'].value.indexOf('fb') >= 0) )
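Filled out into a complete string-type scripted field, that idea might look like the sketch below. It compares against >= 0 rather than > 0, because indexOf returns 0 when the host starts with the search string (for example facebook.com). The 'facebook' label is just an illustrative choice:

```painless
// Sketch only: groups any host containing 'facebook' or 'fb' under one label
// and passes every other host through unchanged.
if (doc['dst_host'].value.indexOf('facebook') >= 0 || doc['dst_host'].value.indexOf('fb') >= 0) {
  return 'facebook';
} else {
  return doc['dst_host'].value;
}
```

Note that a bare 'fb' test is loose and will also match unrelated hosts that happen to contain those two letters.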

Thanks for all the help but I just can't get this to work.

If I paste that code, the scripted field won't even show up in Discover or visualizations. I read the docs on scripted fields, but they clearly aren't written with non-programmers in mind. I also tried looking up how to use the JSON filter, but there doesn't appear to be any documentation on that at all.

If you look in the Management section of Kibana, and then in Index Patterns, and then select your index pattern you should see the list of fields in your index.
If you look for dst_host, is it searchable and aggregatable? If not, is there something like dst_host.keyword or dst_host.raw which is searchable and aggregatable?

To use a field in a scripted field it has to be searchable and aggregatable.
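As a rough sketch of what that means in practice, the scripted field below reads from an aggregatable sub-field and guards against documents that have no value for it, since a script that throws on some documents is one possible cause of shard failures. The name dst_host.keyword, the 'unknown' fallback, and the group labels are assumptions; '1e100' is borrowed from the google filter earlier in the thread:

```painless
// Assumption: 'dst_host.keyword' is the searchable + aggregatable sub-field.
// Skip documents that have no value for it instead of letting the script fail.
if (doc['dst_host.keyword'].size() == 0) {
  return 'unknown';
}
def host = doc['dst_host.keyword'].value;
// Illustrative patterns; adjust them to the hosts actually seen in the data.
if (host.contains('facebook') || host.contains('fb')) {
  return 'facebook';
}
if (host.contains('google') || host.contains('1e100')) {
  return 'google';
}
// Everything else keeps its original host name.
return host;
```

A terms aggregation on this scripted field would then show one slice each for facebook and google, alongside the remaining hosts as normal.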

Let me know and we'll go step by step to get this working for you.

Hi LeeDr,

Thanks for all the help, really appreciate all the effort you are putting in.

The field created by my mapping is only searchable but the .keyword field is searchable and aggregatable.
If I use your script with the field names replaced, Kibana throws this error.

Sorry I missed this for a few days. When you see the error, where is the ^--HERE pointing?

Can you paste your script here?

Thanks,
Lee

LeeDr, no worries. I'm sure you've got better things to do, and I appreciate the help. I can't check right now, but I will check next week.
