Grok filter throwing error for valid pattern

Hi,

I am using ELK GA 5.0.0 and reading from Kafka with Logstash. I have a message like the one below:

192.196.0.1\t-\t-\t[05/Feb/2018:10:39:35 +0000]\t\"GET /ghRme_Neldtjhqoxy/TlvehaNjshv \"\t404\t-\t1\t\"-\"\t\"-\"\t\"-\"\t\"-\"\t-\t-

When I try a grok pattern like the one below, it works:

filter {
	grok{
		match => { "message" => "%{IP:f1}\t%{NOTSPACE:f2}\t%{NOTSPACE:f3}\t\[%{GREEDYDATA:f4}\]\t\"%{NOTSPACE:f5} %{GREEDYDATA:rest}" }
	}
}

The rest of the message is stored in %{GREEDYDATA:rest}. To parse it further, I tried NOTSPACE as below:

filter {
	grok{
		match => { "message" => "%{IP:f1}\t%{NOTSPACE:f2}\t%{NOTSPACE:f3}\t\[%{GREEDYDATA:f4}\]\t\"%{NOTSPACE:f5} \/%{NOTSPACE:f6}\"\t%{GREEDYDATA:rest}" }
	}
}

This now produces a _grokparsefailure. Why is this happening, and how can I fix it?

Thanks in advance.

Pasting your pattern and the line into the Grok Constructor, I see that the IP matches successfully, but that's it. I think you may need to escape the backslashes in your pattern (by prefixing each with a backslash).

Hi @yaauie ,

Changing \ to \\ caused a grok parse failure. I tried the pattern below:

%{IP:f1}\t%{NOTSPACE:f2}\t%{NOTSPACE:f3}\t\[%{GREEDYDATA:f4}\]\t\"%{NOTSPACE:f5}\s*\/%{GREEDYDATA:f6}\t%{GREEDYDATA:rest}

and I am getting output like this:

{
	"rest": "-",
	"offset": 112,
	"input_type": "log",
	"source": "/data/logs/sample.txt",
	"f1": "192.196.0.1",
	"message": "192.196.0.1\t-\t-\t[05/Feb/2018:10:39:35 +0000]\t\"GET /ghRme_Neldtjhqoxy/TlvehaNjshv \"\t404\t-\t24\t\"-\"\t\"-\"\t\"-\"\t\"-\"\t-\t-",
	"type": "logfile",
	"f2": "-",
	"f3": "-",
	"f4": "05/Feb/2018:10:39:35 +0000",
	"f5": "GET",
	"f6": "/ghRme_Neldtjhqoxy/TlvehaNjshv \"\t404\t-\t24\t\"-\"\t\"-\"\t\"-\"\t\"-\"\t-",
	"@timestamp": "2018-03-25T12:29:45.112Z",
	"beat": {
		"hostname": "mybox",
		"name": "mybox",
		"version": "5.0.0"
	},
	"@version": "1",
	"fields": {
		"logtype": "samplelogs"
	}
} 

f6 is supposed to capture data up to the next ", but it captures everything except the last -. Why is this happening?

GREEDYDATA may just be too greedy to be used here.
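
As a rough sketch of a less greedy alternative (untested, and assuming the tabs and quotes reach grok the same way they do in the first pattern that worked), f6 can be captured with a negated character class so that it stops at the next double quote. Note that the sample line has a space before the closing quote ("...TlvehaNjshv "), which NOTSPACE followed directly by \" cannot match. The field names are the ones already used in this thread.

```
filter {
	grok {
		# (?<f6>[^\"]*) matches everything up to, but not including, the next
		# double quote, so it also tolerates the trailing space before it.
		match => { "message" => "%{IP:f1}\t%{NOTSPACE:f2}\t%{NOTSPACE:f3}\t\[%{GREEDYDATA:f4}\]\t\"%{NOTSPACE:f5} (?<f6>[^\"]*)\"\t%{GREEDYDATA:rest}" }
	}
}
```

With this pattern f6 would still include the trailing space from the request string.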

Are all of the fields you want to capture separated by the backslash-t sequence, or is it possible to have a backslash-t inside a captured field? Do you always have the same number of fields?

You may be better off using the dissect filter, which cares a lot more about what separates the fields than the exact patterns of the fields themselves.
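
A minimal dissect sketch for the sample line, under a few assumptions: the dissect filter plugin is available (it may need to be installed separately on 5.0.0), every line has the same field layout, and the separators are real tab characters. Dissect treats delimiters literally, so the \t sequences below only stand for a tab if the config is loaded with escape-sequence processing enabled (the config.support_escapes setting, on Logstash versions that have it); otherwise, replace each \t with an actual tab character. The field names are the ones already used in this thread.

```
filter {
	dissect {
		mapping => {
			# Single quotes so the double quotes around the request need no escaping.
			# The final field, rest, receives everything after the last delimiter.
			"message" => '%{f1}\t%{f2}\t%{f3}\t[%{f4}]\t"%{f5} %{f6}"\t%{rest}'
		}
	}
}
```

The remaining tab-separated values in rest could be split out the same way by adding more %{field} entries, and a mutate filter with strip could trim the trailing space from f6.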

OK, the dissect filter looks like a better candidate. Let me see.
