Dissect filter is not dissecting properly


#1

Hi,

I am using ELK GA 6.0.0. I am trying out the dissect filter to split my log entries, but one field is not being dissected properly. I push messages into Kafka using Filebeat and read them with Logstash. A log entry in my file looks like this:

<Sep 26, 2017 7:56:38:265 PM> <c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090> <Bte_NsketefJerlsw_Vlfsirb_1_1> <CidhoegLsi5hs: KFSDbpdgBrvkdhsny> <ndygkcvsdwifg> <mht> <qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa> <run>
 <QWERTY: qwerty.qwerty.qwerty
	qwerty
	qwerty
	qwerty
>

I am using the following dissect filter:

dissect {
  mapping => {
	"message" => "<%{time}> <%{data1}> <%{data2}> <%{data3}> <%{data4}> <%{data5}> <%{data6}> <%{data7}>\n <%{data8}>"
  }
}

The Logstash stdout output looks like this:

{
	"data8": "run>\n <QWERTY: qwerty.qwerty.qwerty\n\tqwerty\n\tqwerty\n\tqwerty\n",
	"offset": 630,
	"data4": "ndygkcvsdwifg",
	"prospector": {
		"type": "log"
	},
	"data6": "qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa",
	"source": "/files/sample.txt",
	"data2": "Bte_NsketefJerlsw_Vlfsirb_1_1",
	"message": "<Sep 26, 2017 7:56:38:265 PM> <c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090> <Bte_NsketefJerlsw_Vlfsirb_1_1> <CidhoegLsi5hs: KFSDbpdgBrvkdhsny> <ndygkcvsdwifg> <mht> <qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa> <run>\n <QWERTY: qwerty.qwerty.qwerty\n\tqwerty\n\tqwerty\n\tqwerty\n>",
	"data1": "c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090",
	"@timestamp": "2018-01-04T08:27:07.129Z",
	"beat": {
		"name": "mylinux",
		"hostname": "mylinux",
		"version": "6.0.0"
	},
	"@version": "1",
	"time": "Sep 26, 2017 7:56:38:265 PM",
	"data3": "CidhoegLsi5hs: KFSDbpdgBrvkdhsny",
	"data7": "",
	"data5": "mht"
}

In the output, data7 is empty. I am expecting run in data7. Everything works fine if I use a grok filter like the one below:

grok {
	match => { "message" => "<(?<time>%{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM))> <%{NOTSPACE:data1}> <%{NOTSPACE:data2}> <%{GREEDYDATA:data3}> <%{NOTSPACE:data4}> <%{NOTSPACE:data5}> <%{NOTSPACE:data6}> <%{NOTSPACE:data7}>\n <%{GREEDYDATA:data8}>" }
}

Why is this happening? How can I fix this?

Thanks.


(Guy Boertje) #2

I will check. I suspect it's because the \n in the dissection is being seen not as a single newline but as a backslash followed by an n.
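
To illustrate that suspicion outside Logstash, here is a minimal Python sketch. It is not the plugin's code; the shortened message and the variable names are made up for illustration. It only shows that a delimiter containing a backslash followed by n is a different string from one containing a real newline, so it can never be found in the message.

```python
# Illustration only: this is not how the dissect plugin is implemented.
# Shortened stand-in for the log entry, containing a real newline.
message = "<mht> <run>\n <QWERTY>"

literal_delim = ">\\n <"  # '>', backslash, 'n', space, '<' (5 characters)
newline_delim = ">\n <"   # '>', real newline, space, '<' (4 characters)

print(len(literal_delim), len(newline_delim))  # 5 4
print(message.find(literal_delim))             # -1: delimiter never occurs
print(message.find(newline_delim))             # 10: delimiter found after "run"
```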

Even if one adds the actual newline in the dissection, there is still a bug: in the regex pattern that builds the delimiters list, the dot does not match the newline.

e.g.

input {
  generator {
    message => "<Sep 26, 2017 7:56:38:265 PM> <c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090> <Bte_NsketefJerlsw_Vlfsirb_1_1> <CidhoegLsi5hs: KFSDbpdgBrvkdhsny> <ndygkcvsdwifg> <mht> <qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa> <run>
 <QWERTY: qwerty.qwerty.qwerty
  qwerty
  qwerty
  qwerty
>"
    count => 1
  }
}

filter {
  dissect {
    mapping => {
      message => '<%{time}> <%{data1}> <%{data2}> <%{data3}> <%{data4}> <%{data5}> <%{data6}> <%{data7}>
 <%{data8}>%{rest}'
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

gives...

{
         "data1" => "c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090",
         "data3" => "CidhoegLsi5hs: KFSDbpdgBrvkdhsny",
    "@timestamp" => 2018-01-04T09:18:16.525Z,
         "data7" => "run>\n",
         "data8" => "QWERTY: qwerty.qwerty.qwerty\n  qwerty\n  qwerty\n  qwerty\n",
         "data5" => "mht",
      "sequence" => 0,
         "data6" => "qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa",
       "message" => "<Sep 26, 2017 7:56:38:265 PM> <c60ea685-4f68-454d-b0b3-4b7279a19f1e-00000090> <Bte_NsketefJerlsw_Vlfsirb_1_1> <CidhoegLsi5hs: KFSDbpdgBrvkdhsny> <ndygkcvsdwifg> <mht> <qwe.rtyuio.aYre.qazxswedcvfrt.fvb.bo.MNBVzxcvLkjhgfdsa> <run>\n <QWERTY: qwerty.qwerty.qwerty\n  qwerty\n  qwerty\n  qwerty\n>",
          "rest" => "",
          "host" => "Elastics-MacBook-Pro.local",
      "@version" => "1",
          "time" => "Sep 26, 2017 7:56:38:265 PM",
         "data2" => "Bte_NsketefJerlsw_Vlfsirb_1_1",
         "data4" => "ndygkcvsdwifg"
}

Nearly, but not quite: data7 comes out as "run>\n" instead of "run".
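
The dot-versus-newline problem can be sketched in Python. The pattern below is a hypothetical stand-in, not the plugin's actual regex, and the mapping is shortened to the last two fields. Without DOTALL the delimiter in front of data8 is cut down to " <", which would be consistent with data7 swallowing ">\n" in the output above.

```python
import re

# Hypothetical pattern for illustration; the plugin's real regex is not shown here.
# Group 1 is the delimiter text before a field, group 2 is the field name.
FIELD = r"(.*?)%\{(\w+)\}"

# Shortened mapping containing a real newline between the two fields.
mapping = "<%{data7}>\n <%{data8}>"

# By default '.' does not match a newline, so the delimiter cannot span it.
without_dotall = re.findall(FIELD, mapping)
# With DOTALL '.' matches a newline and the full delimiter is captured.
with_dotall = re.findall(FIELD, mapping, flags=re.DOTALL)

print(without_dotall)  # [('<', 'data7'), (' <', 'data8')]
print(with_dotall)     # [('<', 'data7'), ('>\n <', 'data8')]
```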

