Looking for ideas to preprocess logs

Hello guys,

In my log file, multiple threads log to the same file, which produces a badly ordered file that looks like this:

2018-12-13T11:46:13.654+0000 Regulatory [INFO] Transaction etc...
2018-12-13T13:13:22.449+0000 Regulatory [INFO] Transaction etc...
2018-12-13T12:07:41.644+0000 Regulatory [INFO] Transaction etc....
2018-12-13T11:41:44.846+0000 Regulatory [INFO] Transaction etc....   

Is there any way to sort it by timestamp? I am not a Kafka expert, but I'm wondering if it could be the right tool to achieve this.

Does anyone have an idea how to make this work? My final goal is to start processing this data with Logstash in the right order:

2018-12-13T11:41:44.846+0000 Regulatory [INFO] Transaction ...........
2018-12-13T11:46:13.654+0000 Regulatory [INFO] Transaction ...........
2018-12-13T12:07:41.644+0000 Regulatory [INFO] Transaction ...........
2018-12-13T13:13:22.449+0000 Regulatory [INFO] Transaction ...........

Thank you in advance.

Instead of pre-ordering the data, use the timestamp from the event itself and set that as the timestamp to index in Elasticsearch. Then it is nicely sorted in Kibana when you view the data.

You can use a filter like this (the date pattern does not match your timestamp, it is just an example):

filter {
    grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:replace_timestamp}" }
    }
    date {
        match => ["replace_timestamp", "yyyy-MM-dd HH:mm:ss"]
        timezone => "UTC"
        target => "@timestamp"
    }
}
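For the sample lines shown earlier, the timestamp is ISO 8601, so (as a sketch, assuming the format stays consistent across all lines) the date filter can use the built-in ISO8601 pattern instead of a hand-written one:

```
filter {
    grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:replace_timestamp}" }
    }
    date {
        # ISO8601 is a built-in pattern that handles 2018-12-13T11:46:13.654+0000,
        # including the +0000 offset, so no explicit timezone setting is needed.
        match => ["replace_timestamp", "ISO8601"]
        target => "@timestamp"
    }
}
```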
Thank you @pjanzen. It's a good idea, but I'm using a multiline codec in my input:

  file {
    path => "/usr/share/logstash/test.log"
    start_position => "beginning"
    type => "log"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "(null)+"
      what => "previous"
    }
  }

So, I am looking for a way to pre-order data before the multiline codec.
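One simple option (a sketch; the file names are just examples from this thread): since the ISO 8601 timestamps at the start of each line sort correctly as plain strings, the file can be pre-sorted with the Unix `sort` command before Logstash reads it.

```shell
# Build a sample file like the one in this thread (names are hypothetical).
printf '%s\n' \
  '2018-12-13T11:46:13.654+0000 Regulatory [INFO] Transaction A' \
  '2018-12-13T13:13:22.449+0000 Regulatory [INFO] Transaction B' \
  '2018-12-13T12:07:41.644+0000 Regulatory [INFO] Transaction C' \
  '2018-12-13T11:41:44.846+0000 Regulatory [INFO] Transaction D' > test.log

# Lexicographic sort == chronological sort for ISO 8601 prefixes.
sort test.log > test.sorted.log
head -n 1 test.sorted.log
# → 2018-12-13T11:41:44.846+0000 Regulatory [INFO] Transaction D
```

Logstash's file input would then point at the sorted copy. Note this only works for a static file; it cannot reorder a file that is still being appended to.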

You're using the multiline codec to create one single event before processing it, right? Then I think it would still work. But I could be wrong, of course; I may be missing some information.

Ok, I will explain my use case. In my log file, I have two kinds of lines:

2018-12-13T11:41:44.846+0000 Regulatory [INFO] Transaction : VALIDATE,qf16ft787bif1xs1iuoqihwi9,00000000,100002506,13-12-2018,13-12-2018T11:41:42.447+0000,null,Payment Order,Date not a working day
2018-12-13T13:54:43.646+0000 Regulatory [INFO] Transaction : PROCESS,007069643021v2xs7x08f975bswzkiv5nona,0070696430.2,100002506,13-12-2018,13-12-2018T13:54:42.585+0000,13-12-2018T13:54:43.139+0000,Payment Order,None

As you can see, one line has a null value and the other has a date value. This date can be null, so I am using the multiline codec to merge the lines with null values into a single event.

My goal is to create single events from consecutive null values, which is why I am using the multiline codec. But because multiple threads log to the same file, I first need to pre-order the data by timestamp so that the consecutive-values rule holds.

I hope I was clear enough; my English is not great :smiley:

Thank you @pjanzen

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.