Logstash didn't split log as config say

I am using elkb to handle my access log. And some day, I found kibana missing a log.

Then, grep filebeat log, And I can find the missing log:

cat km-access-2017-03-01 |grep 17714317657896955-151

BTW 17714317657896955-151 is my request id

2017/03/01 10:19:20.096711 client.go:184: DBG  Publish: {
  "@timestamp": "2017-03-01T10:19:16.327Z",
  "beat": {
    "hostname": "kvm980156.jx.diditaxi.com",
    "name": "kvm980156.jx.diditaxi.com",
    "version": "5.0.0"
  },
  "input_type": "log",
  "message": "2017-03-01 18:19:11.699|10.94.104.169|17714317657896955-151|1|wangziyi|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|POST|/api/v1/answer/|com.didi.km.api.controller.api.v1.quest
ion.AnswerController#post[2 args]|{\"questionId\":[\"145\"],\"content\":[\"\u003cp\u003e123123123123123\u003c/p\u003e\"]}|200|220",
  "offset": 1723505,
  "source": "/home/km/didi-km-api/logs/km-access.2017-03-01.log",
  "type": "log"
}

And, I grep logstash log using grep 17714317657896955-151, I can find it too:

{
    "@timestamp" => 2017-03-01T10:19:16.327Z,
        "offset" => 1723505,
      "@version" => "1",
    "input_type" => "log",
          "beat" => {
        "hostname" => "kvm980156.jx.diditaxi.com",
            "name" => "kvm980156.jx.diditaxi.com",
         "version" => "5.0.0"
    },
          "host" => "kvm980156.jx.diditaxi.com",
        "source" => "/home/km/didi-km-api/logs/km-access.2017-03-01.log",
       "message" => "2017-03-01 18:19:11.699|10.94.104.169|17714317657896955-151|1|wangziyi|Mozilla/5.0 (Macintosh; Intel Mac OS X 10
_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|POST|/api/v1/answer/|com.didi.km.api.controller.api.v
1.question.AnswerController#post[2 args]|{\"questionId\":[\"145\"],\"content\":[\"<p>123123123123123</p>\"]}|200|220",
          "type" => "log",
          "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_grokparsefailure"
    ]
}

BUT there is some different between this log and others. This log didn't split as my config say, And others did.

next log under the missing log:

{
    "controllerMethod" => "com.didi.km.api.controller.api.v1.question.AnswerController#answersOrderByHot[2 args]",
              "offset" => 1723849,
              "method" => "GET",
          "input_type" => "log",
              "source" => "/home/km/didi-km-api/logs/km-access.2017-03-01.log",
             "message" => "2017-03-01 18:19:11.855|10.94.104.169|17714317657896955-152|1|wangziyi|Mozilla/5.0 (Macintosh; Intel Mac O
S X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|GET|/api/v1/answer/145|com.didi.km.api.controll
er.api.v1.question.AnswerController#answersOrderByHot[2 args]|{\"order\":[\"hot\"],\"pager\":[\"1,100\"]}|200|60",
                "type" => "log",
                  "ua" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87
 Safari/537.36",
                 "uri" => "/api/v1/answer/145",
                "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
                 "uid" => 1,
          "@timestamp" => 2017-03-01T10:19:11.855Z,
               "param" => "{\"order\":[\"hot\"],\"pager\":[\"1,100\"]}",
            "costTime" => 60,
           "requestID" => "17714317657896955-152",
             "host-ip" => "10.94.104.169",
            "@version" => "1",
                "beat" => {
        "hostname" => "kvm980156.jx.diditaxi.com",
            "name" => "kvm980156.jx.diditaxi.com",
         "version" => "5.0.0"
    },
                "host" => "kvm980156.jx.diditaxi.com",
                "time" => "2017-03-01 18:19:11.855",
            "username" => "wangziyi",
          "statusCode" => 200
}

And, this is my logstash config. Using grok to split log.

input {
    beats {
        port => "5043"
    }
}
filter {
    # TIME||HOST-IP||REQUEST-ID||UID||USERNAME||METHOD||URI||CONTROLLER-METHOD||PARAMS-MAP
    grok {
        match => {
            "message" => "%{TIMESTAMP_ISO8601:time}\|%{IP:host-ip}\|(?<requestID>\d+-\d+)\|%{INT:uid:int}\|%{WORD:username}\|(?<ua>(\
w|\/|\.|\s|\(|;|\)|,)+)\|%{WORD:method}\|(?<uri>(\w|\/)+)\|(?<controllerMethod>(\w|\d|\s|\.|#|\[|\])+)\|(?<param>(\w|{|}|\"|\:|\[|\]|
\,)+)\|%{NUMBER:statusCode:int}\|%{NUMBER:costTime:int}"
        }
    }

    date {
        match   =>  ["time", "yyyy-MM-dd HH:mm:ss.SSS"]
        target  =>  "@timestamp"
    }
}
output {
    stdout { codec => rubydebug }
    elasticsearch {
        hosts => [ "10.94.66.193:9200" ]
        index => "km-access-%{+YYYY.MM.dd}"
    }
}

Because this bug, I can't count some log with Kibana.

Here is my original log:

2017-03-01 18:19:11.699|10.94.104.169|17714317657896955-151|1|wangziyi|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36|POST|/api/v1/answer/|com.didi.km.api.controller.api.v1.question.AnswerController#post[2 args]|{"questionId":["145"],"content":["<p>123123123123123</p>"]}|200|220

In your first example the event contains the tag _grokparsefailure, indicating that the grok filter failed. To debug this, reduce your grok expression to the bare minimum: %{TIMESTAMP_ISO8601:time}

Does that work for the previously failing input line? If yes, try with another piece: %{TIMESTAMP_ISO8601:time}\|%{IP:host-ip}

What about now? Continue until you've found the culprit.

Thanks for your help

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.