Grouping values?

Sjaak01 · January 25, 2018, 8:44am

Hi,

I was wondering if it's possible to group similar values together in Kibana?

Example:
Facebook and Google use many different hosts, so if I create a simple pie chart (metric SUM total bytes, bucket destination_host) with 10 entries, I just get a bunch of different Google and Facebook hosts.

Is it possible to do some kind of query that pulls destination_host values matching fb* or facebook* together into a new entry called facebook?

I understand this is probably a long shot but I thought I'd ask.

Bill_McConaghy · January 25, 2018, 1:37pm

I believe a filters aggregation with a regexp filter will get you what you are looking for: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html (even though examples here show use in query, this should also work in a filter context)

Sjaak01 · January 26, 2018, 1:35am

Thanks. I've made a query that returns the results I'm looking for, but how do I combine that with a visualization in Kibana? I get all sorts of errors if I simply copy/paste the query, and even if I use the query builder and copy that under pie chart -> buckets -> filter aggregation -> filter 1, it still throws errors.

The docs don't mention what a filter should look like.

GET test-netflow-*/_search
{
  "size": 0,
  "aggs" : {
    "messages" : {
      "filters" : {
        "filters" : {
          "facebook" : { "regexp" : { "dst_host" : ".*facebook.*" }},
          "fb" :       { "regexp" : { "dst_host" : ".*fb.*" }},
          "google" :   { "regexp" : { "dst_host" : ".*1e100.*" }}
        }
      }
    }
  }
}

Bill_McConaghy · January 26, 2018, 1:25pm

Can you share what errors you are seeing?

LeeDr · January 26, 2018, 3:46pm

Maybe this example would help. I have filebeat data which has some messages that include words like disconnecting and disconnected and maybe some other variations.
So I created a new horizontal bar graph visualization and selected the Filters aggregation.
Then I created several filters. Each filter is a query and you can use wildcards, AND OR, etc.

Sjaak01 · January 29, 2018, 12:35am

Hi @Bill_McConaghy

Is what I get if I use "facebook" : { "regexp" : { "dst_host" : ".*facebook.*" }} as the filter. I've tried other combinations as well but they all give similar errors.

@LeeDr

Thanks. Such a filter indeed works but the problem is that I only want to group e.g. google and facebook, but still display the other values.

For example:
Normally if I create a pie chart with the top 10 destination hosts based on data usage, I get 10 entries related to facebook. So I want to group those together but still display the other values as normal.

Bill_McConaghy · January 29, 2018, 1:16pm

Instead of this, use the syntax:
dst_host: *facebook*

for each filter. You can add a label for each filter by clicking the pencil icon.

LeeDr · January 29, 2018, 4:18pm

Maybe you should create a scripted field that combines the facebook and google values and leaves all the rest, and then base your chart on that scripted field.

For example, I have some test data that has a referer field like this;

If a scripted field is even the least bit complex I like to take an intermediate step just to make sure I'm getting the expected results. So first I created this number type scripted field;

(I try to always set the popularity up to at least 1 so that it appears at the top of the list in Discover which just makes it easier to find)

So in this case Discover shows -1 if facebook is not found and 7 if it's part of http://facebook...

Now that I see that works, I created this string type scripted field;

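// indexOf returns -1 when 'facebook' is absent, so >= 0 also catches a match at position 0
if (doc['referer'].value.indexOf('facebook') >= 0) {
  return 'FACEBOOK';
} else {
  return doc['referer'].value;
}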

Now I can use that scripted field in my visualization;

Refer to https://www.elastic.co/blog/using-painless-kibana-scripted-fields

Sjaak01 · January 30, 2018, 1:10am

Hi,

I've tried doc['dst_host'].value.indexOf('facebook'). At first this resulted in -1 etc. because facebook doesn't exist and it should be a wildcard I think.

How can I do this from the Kibana interface? I'm using Amazon's Elastic service so no direct access to the config files.

For some reason that script also broke the index: it resulted in failed shards with no data appearing anywhere. After I deleted the field, all was well again.

LeeDr · January 30, 2018, 7:18pm

I think you should be able to OR a couple of indexOf tests together to check for several different variations of fb, facebook, etc.

Something like;

if ( (doc['dst_host'].value.indexOf('facebook') > 0 ) || (doc['dst_host'].value.indexOf('fb') > 0) )

Sjaak01 · February 9, 2018, 5:01am

Thanks for all the help but I just can't get this to work.

If I paste that code, the scripted field won't even show up in Discover or in visualizations. I read the docs on scripted fields, and they clearly aren't written with non-programmers in mind. I tried looking up how to use the JSON filter as well, but there doesn't appear to be any documentation on that at all.

LeeDr · February 12, 2018, 9:21pm

If you look in the Management section of Kibana, and then in Index Patterns, and then select your index pattern you should see the list of fields in your index.
If you look for dst_host, is it searchable and aggregatable? If not, is there something like dst_host.keyword or dst_host.raw which is searchable and aggregatable?

To use a field in a scripted field it has to be searchable and aggregatable.

Let me know and we'll go step by step to get this working for you.

Sjaak01 · February 14, 2018, 5:25am

Hi LeeDr,

Thanks for all the help, really appreciate all the effort you are putting in.

The field created by my mapping is only searchable but the .keyword field is searchable and aggregatable.
If I use your script with the field names replaced, Kibana throws this error.

Error: Request to Elasticsearch failed: {"error":{"root_cause":[{"type":"script_exception","reason":"compile error","script_stack":["... alue.indexOf('fb') > 0) )"," ^---- HERE"],"script":"if ( (doc['dst_addr_host.keyword'].value.indexOf('facebook') > 0 ) || (doc['dst_addr_host.keyword'].value.indexOf('fb') > 0) )","lang":"painless"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"logstash-netflow-test-2018.06","node":"1SzZYb8ZQ2iyTAlNHwe9jA","reason":{"type":"script_exception","reason":"compile error","script_stack":["... alue.indexOf('fb') > 0) )"," ^---- HERE"],"script":"if ( (doc['dst_addr_host.keyword'].value.indexOf('facebook') > 0 ) || (doc['dst_addr_host.keyword'].value.indexOf('fb') > 0) )","lang":"painless","caused_by":{"type":"illegal_argument_exception","reason":"unexpected end of script.","caused_by":{"type":"no_viable_alt_exception","reason":null}}}}]},"status":500}
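
The compile error above ends in "unexpected end of script": the pasted script is only the if condition, with no branch saying what to return. A minimal sketch of a complete string-type scripted field might look like the following. It assumes the dst_addr_host.keyword field from the error message and reuses the fb, facebook, and 1e100 patterns from earlier in the thread; the 'unknown' label is a hypothetical placeholder for documents that have no host value. It tests >= 0 rather than > 0 so that a host starting with the search term also matches.

```painless
// Assumption: dst_addr_host.keyword is the searchable, aggregatable field from the error above.
// Guard against documents where the field is missing; 'unknown' is a hypothetical label.
if (doc['dst_addr_host.keyword'].size() == 0) {
  return 'unknown';
}

String host = doc['dst_addr_host.keyword'].value;

// indexOf returns -1 when the substring is absent, so >= 0 means "contains".
if (host.indexOf('facebook') >= 0 || host.indexOf('fb') >= 0) {
  return 'facebook';
}
if (host.indexOf('1e100') >= 0) {
  return 'google';
}

// Every other host is returned unchanged, so it still gets its own bucket.
return host;
```

Bucketing the pie chart with a Terms aggregation on this scripted field should then give one facebook slice and one google slice alongside the remaining hosts. Note that 'fb' is a loose match and will also catch unrelated hosts that happen to contain those two letters.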

LeeDr · February 22, 2018, 6:05pm

Sorry I missed this for a few days. When you see the error, where is the ^--HERE pointing?

Can you paste your script here?

Thanks,
Lee

Sjaak01 · February 23, 2018, 9:01am

LeeDr no worries, I'm sure you got better things to do and I appreciate the help. I can't check right now but I will check next week.

system · March 23, 2018, 9:01am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
