Extract (substring) and count(distinct) in Kibana


(Mirco Santori) #1

Hello!

I have lots of message fields structured as follows:

"Running tests for project: Project A - description"

and I need to extract the part after the colon (a substring) and run a count(distinct) on it.

Is this possible in Kibana, or should I use Lucene syntax via a curl/JSON request?

Thanks for your help!

Mirco


(Christian Dahlqvist) #2

In order to do that I believe you will need to extract the data into a separate field before indexing it into Elasticsearch.
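A minimal sketch of that pre-indexing step in Python, assuming each document is a dict with a "message" key. The `project` field name and the helper names are hypothetical, and the actual indexing call is left out:

```python
def extract_project(message: str) -> str:
    """Return the text after the first colon, stripped of surrounding whitespace."""
    _, sep, rest = message.partition(":")
    # If there is no colon, fall back to the whole message.
    return rest.strip() if sep else message.strip()


def enrich(doc: dict) -> dict:
    """Copy the document and add a hypothetical 'project' field for indexing."""
    enriched = dict(doc)
    enriched["project"] = extract_project(doc["message"])
    return enriched


docs = [
    {"message": "Running tests for project: Project A - description"},
    {"message": "Running tests for project: Project B - description"},
    {"message": "Running tests for project: Project A - description"},
]
enriched_docs = [enrich(d) for d in docs]

# Local equivalent of count(distinct) on the extracted field.
distinct_projects = {d["project"] for d in enriched_docs}
print(len(distinct_projects))  # 2
```

Once the extracted value is stored in its own field, a distinct count in Kibana only needs to look at that field.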


(Mirco Santori) #3

Hmm, good point, but I am not fully convinced it is the only way to achieve that. I would hope Elasticsearch has some built-in mechanism to manipulate strings.

Anyway, thanks for the hint.


(Dani Garcia) #4

Did you find any way to achieve that?

Thanks in advance.

Regards,


(Mirco Santori) #5

Hi Dani,
sorry for my late reply.
Yes, I am still working on it, and the only solution I have found is to dedicate a new field to the message of interest and set it to "not_analyzed" when loading the data into Elasticsearch. That lets you treat the whole message as a single string; otherwise Elasticsearch ends up splitting the message into separate words at whitespace.
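As a rough sketch, the mapping fragment for such a field could be built like this in Python. It assumes a legacy (pre-5.x) Elasticsearch where string fields accept "index": "not_analyzed" (newer versions use the "keyword" type instead), and the field name is only a placeholder:

```python
import json

# Placeholder field name; substitute your own.
FIELD = "your_custom_message_not_analyzed"

# Legacy (pre-5.x) string mapping: the value is stored as one untokenized term.
mapping = {
    "properties": {
        FIELD: {"type": "string", "index": "not_analyzed"}
    }
}

# Print the JSON that would go into the mapping definition.
print(json.dumps(mapping, indent=2))
```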

Then, once the import has succeeded, you can do the substring manipulation directly from Kibana (note the "inline json" field among the options). To do that you need to enable Groovy scripting in Elasticsearch and use the following syntax:

{ "script" : "def m = doc['your_custom_message_not_analyzed'].getValue(); m.substring(m.indexOf(':') + 2)" }
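Outside Kibana, the same idea can be sent as a search body with a cardinality aggregation, which gives an approximate distinct count. This is only a sketch, assuming a legacy Elasticsearch with Groovy scripting enabled; the aggregation name `distinct_projects` is hypothetical, and the script takes everything after the first ": " in the message:

```python
import json

# Placeholder field name; substitute your own not_analyzed field.
FIELD = "your_custom_message_not_analyzed"

# Groovy expression: everything after the first ": " in the stored message.
script = (
    "def m = doc['%s'].getValue(); "
    "m.substring(m.indexOf(':') + 2)" % FIELD
)

body = {
    "size": 0,  # no hits needed, only the aggregation result
    "aggs": {
        "distinct_projects": {"cardinality": {"script": script}}
    },
}

# Print the JSON body that would be posted to the _search endpoint.
print(json.dumps(body, indent=2))
```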

Let me know if you have questions.
Hope it helps!

Mirco


(Dani Garcia) #6

Thanks for your reply,

I did it with a Python script, creating a new index every day. I will test your solution for future tasks.
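A per-day index name can be derived from the date. A minimal sketch, assuming a hypothetical "tests" prefix and a Logstash-style date suffix (the naming actually used is not given here):

```python
from datetime import date


def daily_index_name(prefix: str, day: date) -> str:
    """Build an index name such as 'tests-2016.01.31' for the given day."""
    return f"{prefix}-{day:%Y.%m.%d}"


print(daily_index_name("tests", date(2016, 1, 31)))  # tests-2016.01.31
```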

Regards,

