Count occurrences of a field excluding a pattern


#1

Hi, I have some occurrences of a field like this (it's part of a SIP message):

<sip:john.doe@kibana.co>;tag=446463957-1474248204740
<sip:mark.grey@elastic.co>;tag=48768q74638634376
<sip:rose.mary@logstash.co>;tag=098974309852376086387

Is it possible to create a visualization in Kibana that counts the occurrences of values in this field while excluding part of the message, for example everything before @ and everything after >? In my example I would like to obtain this Data Table:

               Count
kibana.co      1
elastic.co     1
logstash.co    1

I've tried to use the Include or Exclude pattern, but I think I'm not using that feature properly or I'm getting something wrong.

Thanks :slight_smile:


(Shaunak Kashyap) #2

Typically this sort of tokenization is done at ingest time. If you are using Logstash to ingest your data into Elasticsearch, you could use the grok filter to parse out the different tokens from the message and index them as separate fields in Elasticsearch. Then visualizing them in Kibana becomes not only easy but also very fast, using the Terms aggregation.

So if performing tokenization at ingest time is an option for you, I would highly recommend doing that.
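
To make the idea concrete, here is a minimal Python sketch of the extraction a grok filter would perform at ingest time, followed by the per-value count that a Terms aggregation would then report. It is only an illustration and not Logstash configuration: the regular expression and the helper name `extract_domain` are assumptions made for this example, and the sample values are the ones from the question.

```python
import re
from collections import Counter

# Sample values copied from the question above.
values = [
    "<sip:john.doe@kibana.co>;tag=446463957-1474248204740",
    "<sip:mark.grey@elastic.co>;tag=48768q74638634376",
    "<sip:rose.mary@logstash.co>;tag=098974309852376086387",
]

# Capture everything between "@" and the closing ">".
DOMAIN_RE = re.compile(r"@([^>]+)>")


def extract_domain(value):
    """Return the domain part of a SIP address, or None if it is absent."""
    match = DOMAIN_RE.search(value)
    return match.group(1) if match else None


# Tokenize once (the "ingest time" step), then count the extracted values.
domains = [extract_domain(v) for v in values]
counts = Counter(d for d in domains if d is not None)

for domain, count in counts.items():
    print(domain, count)
```

Once the domain is stored as its own field, counting becomes a simple lookup on that field, which is why the index-time approach is fast in Kibana.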


(Shaunak Kashyap) #3

If you are on Elasticsearch (and Kibana) 5.0, you have a couple of other options:

If you don't use Logstash, you could look into using Elasticsearch's new Ingest node to perform parsing/tokenization of the message field into separate fields in Elasticsearch itself right before indexing. This is similar to the Logstash option I suggested in the previous comment in that the tokenization happens before indexing.

Alternatively, you could create a scripted field in Kibana, say domain, and use Elasticsearch's new Painless scripting language to parse out the domain names into a separate field at query time. This will be slower than parsing/tokenizing at index time, as in the Logstash or Ingest node approaches described earlier.
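
For the Ingest node option described above, the following Python sketch builds a pipeline definition that uses the grok processor. Treat it as a starting point under stated assumptions: the source field name `sip_from` is hypothetical (substitute the field that holds the SIP address), and the grok pattern is an untested guess at the format shown in the question. The printed JSON is what you would send as the request body when creating the pipeline with `PUT _ingest/pipeline/<name>`.

```python
import json

# Hypothetical name of the field that holds the SIP address.
SOURCE_FIELD = "sip_from"

# Pipeline with a single grok processor that splits the value into
# user, domain, and tag fields. The pattern below is an assumption
# based on the three sample values in the question.
pipeline = {
    "description": "Extract the domain from a SIP address field",
    "processors": [
        {
            "grok": {
                "field": SOURCE_FIELD,
                "patterns": [
                    "<sip:%{DATA:user}@%{HOSTNAME:domain}>;tag=%{GREEDYDATA:tag}"
                ],
            }
        }
    ],
}

# Serialize to JSON for use as the request body.
body = json.dumps(pipeline, indent=2)
print(body)
```

Documents indexed through this pipeline should then carry a separate domain field that a Terms aggregation can count directly.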


#4

Hello @shaunak,

Thanks for your interesting replies! I'm able to manipulate the input via Logstash, so I'm following your first piece of advice!
However, I also found your second post interesting :slight_smile:

Thank you!


#5

Also, is there any detailed documentation about the Include/Exclude pattern in Kibana? I still can't figure out what they are used for :blush:

