Aggregate similar logs in Kibana


(Chelovek Chelovechnii) #1

Suppose I have several services. They are all configured to send their logs to ES.

In complex systems it is quite usual that the same messages (sometimes even errors) are logged every day. I want to create a dashboard where only abnormal messages are shown. They would indicate that something unusual is happening.

The first thing that comes to mind is to set up a filter and filter out all the "regular" messages. In some systems this would be as easy as filtering out INFO messages, but, as I said, it can be hard in complex systems. Moreover, some enterprise services do not even include a log level in their logs!
The second thought is to filter messages by text. But that would be a huge filter, and it would be hard to create and maintain.
Another idea is to perform an ES aggregation at write time (or lazily) to find similar messages. But I'm not quite sure how to do that, and it sounds slow.

So, what is the common approach to showing only abnormal messages? Any ideas would be appreciated.


#2

Maybe you could use a grok filter in Logstash to capture the NORMAL messages and do nothing to them. Then all the ones that are not normal could be tagged appropriately (using a field for that), so in Kibana you would just have to filter by that field.
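A minimal sketch of that idea (the patterns here are invented — in practice you would list one grok pattern per known "normal" message shape):

```
filter {
  grok {
    # Hypothetical patterns covering the known "normal" messages
    match => {
      "message" => [
        "^opening session id %{DATA} with params %{GREEDYDATA}",
        "^closing session id %{DATA} with params %{GREEDYDATA}"
      ]
    }
    add_tag        => ["normal"]    # set when some pattern matches
    tag_on_failure => ["abnormal"]  # set when no pattern matches
  }
}
```

Everything tagged `abnormal` could then be shown on a Kibana dashboard with a single filter on the `tags` field.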


(Chelovek Chelovechnii) #3

It seems that it would be hard to create and maintain such a filter.


#4

I don't think there is a built-in feature that detects patterns and deviations in strings coming from different sources and in assorted formats.
You probably won't be able to avoid writing and maintaining something to sort that out.


#5

If your "normal" strings are really regular, you could try to sort them out by size, I guess.


(Chelovek Chelovechnii) #6

You mean that errors are usually longer?

What if there is a single format for the logs? Would that help?


#7

I mean the errors could be different. If all of your normal messages have the same size X, you could look for the ones whose size is not X. Usually in log files, when there is an error, the stack trace is logged along with it, making error messages bigger. I can't really help you more without knowing your use case; I'm just trying to give you some ideas.
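As a rough sketch of the size idea (plain Python; the messages are assumed to have been fetched from ES already, and the threshold is illustrative):

```python
# Flag log messages whose length deviates strongly from the mean length.
# A sketch only: thresholds and the data source are assumptions.
from statistics import mean, stdev

def flag_outliers(messages, z_threshold=3.0):
    """Return messages whose length deviates from the mean length
    by more than z_threshold standard deviations."""
    lengths = [len(m) for m in messages]
    mu = mean(lengths)
    sigma = stdev(lengths) or 1.0  # guard against all-equal sizes
    return [m for m in messages if abs(len(m) - mu) > z_threshold * sigma]
```

This would catch long stack traces among short routine lines, but not an abnormal message that happens to have a normal length.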

If you have a single format for the logs, why is it so difficult to have a filter in Logstash for that?


(Chelovek Chelovechnii) #8

Filtering by size indeed seems reasonable.

An abstract example of the logs:

opening session id xxxx with params bla bla bla //normal
performing request from to with bla bla bla //normal
sending response from to with bla bla bla //normal
closing session id xxxx with params bla bla bla //normal

opening session id xxxx with params bla bla bla //normal
performing request from to with bla bla bla //normal
sending response from to with bla bla bla //normal
closing session id xxxx with params bla bla bla //normal

flushing some buffers by request //!!!strange!!!

opening session id xxxx with params bla bla bla //normal
performing request from to with bla bla bla //normal
sending response 500 from page with well known bug //normal
closing session id xxxx with params bla bla bla //normal

opening session id xxxx with params bla bla bla //normal
performing request from to with bla bla bla //normal
sending response 500 from unexpected page //!!!strange!!!
closing session id xxxx with params bla bla bla //normal

Filtering can be done, yes, but one would need to maintain a large set of filters or even write a separate app to decide what is normal and what is not. It is dull work. It would be nice to have some piece of logic that just decides "oh, I've seen something similar before" or "unusual message, haven't seen anything similar in weeks".
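That "seen something similar before" check can be sketched as template counting: collapse the variable parts of each message into placeholders, then count how often each resulting template occurs. Everything below is illustrative (the normalization rules and function names are made up, and the messages are assumed to be already pulled from ES):

```python
# Group messages by a normalized "template" and flag the rare ones.
import re
from collections import Counter

def template(msg):
    """Replace long hex ids and numbers with placeholders so that
    messages differing only in values share one template."""
    msg = re.sub(r"\b[0-9a-f]{8,}\b", "<id>", msg)
    msg = re.sub(r"\b\d+\b", "<num>", msg)
    return msg

def rare_messages(messages, min_count=2):
    """Return messages whose template occurs fewer than min_count times."""
    counts = Counter(template(m) for m in messages)
    return [m for m in messages if counts[template(m)] < min_count]
```

With the abstract logs above, the repeated "opening session id xxxx ..." lines would collapse into one frequent template, while "flushing some buffers by request" would stay a one-off and be flagged.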


#9

Other than size, you could try having all the messages as analysed strings in the index and then do some kind of aggregation to identify which ones have words differing from the average. I don't even know if this is possible, but it's another idea.
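One way to sketch that, assuming the log text is indexed as an analysed field (called `message` here, and `logs` as the index name — both assumptions), is Elasticsearch's `significant_text` aggregation, which surfaces words unusually frequent in the query results compared to the index as a whole. Note it can be expensive on large indices:

```json
GET logs/_search
{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-1d/d" } } },
  "aggs": {
    "unusual_words": {
      "significant_text": {
        "field": "message",
        "filter_duplicate_text": true
      }
    }
  }
}
```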

