Filebeat error using ingest pipeline: Provided Grok expressions do not match field value


(jmd) #1

Hi there,

I'm having an issue with Filebeat when sending to Elasticsearch.

Firstly, apologies please let me know if this should be in the Elasticsearch section rather than Filebeat.

I've setup an ingest pipline:

{
"varnish_logs": {
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"""%{IP:clientip} %{USER:ident} %{USER:auth} [%{HTTPDATE:timestamp}] "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:http_version}" %{NUMBER:response} %{NUMBER:bytes} %{GREEDYDATA:referrer} %{QS:agent} %{NUMBER:response_time}"""
]
}
},
{
"date": {
"field": "timestamp",
"timezone": "UTC",
"target_field": "@timestamp",
"formats": [
"dd/MMM/yyyy:HH:mm:ss Z"
]
}
},
{
"remove": {
"field": "timestamp"
}
}
]
}
}

and it works when I use the simulate API, and also when I test the grok using http://grokconstructor.appspot.com/

However, when I start filebeat, for some of the log entries, I get the message:
2017-05-16T17:15:30+01:00 INFO Bulk item insert failed (i=0, status=500): {"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [ww.domain.com/de-DE/home/2ed5w48kfai8g96c782hulc2z" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0" 7342]","caused_by":{"type":"illegal_argument_exception","reason":"java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [ww.domain.com/de-DE/home/2ed5w48kfai8g96c782hulc2z" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0" 7342]","caused_by":{"type":"illegal_argument_exception","reason":"Provided Grok expressions do not match field value: [ww.domain.com/de-DE/home/2ed5w48kfai8g96c782hulc2z" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0" 7342]"}},"header":{"processor_type":"grok"}}
2017-05-16T17:15:30+01:00 INFO Error publishing events (retrying): temporary bulk send failure

My Elasticsearch log contains many lines like:

[2017-05-16T16:04:12,945][DEBUG][o.e.a.b.TransportBulkAction] [thw-prf-sqp-elk] failed to execute pipeline [varnish_logs] for document [http-2017.05.16/log/null]
org.elasticsearch.ElasticsearchException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [sungsmooth-agent/1.1" 21706]
at org.elasticsearch.ingest.CompoundProcessor.newCompoundProcessorException(CompoundProcessor.java:156) ~[elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:107) ~[elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.ingest.Pipeline.execute(Pipeline.java:58) ~[elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:166) ~[elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:41) ~[elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.ingest.PipelineExecutionService$2.doRun(PipelineExecutionService.java:88) [elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.3.1.jar:5.3.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.3.1.jar:5.3.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [sungsmooth-agent/1.1" 21706]
... 11 more
Caused by: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [sungsmooth-agent/1.1" 21706]
at org.elasticsearch.ingest.common.GrokProcessor.execute(GrokProcessor.java:67) ~[?:?]
at org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100) ~[elasticsearch-5.3.1.jar:5.3.1]
... 9 more

What I have done to debug, is for example with the above message:
grep -i "2ed5w48kfai8g96c782hulc2z" varnishncsa.log > /tmp/log.txt
to get all the entries I would think it could be complaining about, and pasting the whole lot into the matcher at http://grokconstructor.appspot.com/ and every message matches, so I am very confused.

I'm using Elasticsearch and filebeat 5.3.1, and very happy to upgrade to 5.4, I just didn't see any changes about this in the release notes.

If anyone has any ideas, I would be very grateful!


(ruflin) #2

By simulate API I assume you mean this one here? https://www.elastic.co/guide/en/elasticsearch/reference/master/simulate-pipeline-api.html

So you used the exact line(s) that were failing with the simulate API and it worked, but when sending it from FB it doesn't? Just to make sure we talk about the same thing.


(jmd) #3

Yes, thanks for getting back to me.
This is now working properly. I slept on it and tried again this morning. It turns out I needed to have \[{HTTPDATE:timestamp}\] rather than [%{HTTPDATE:timestamp}]. However I'm still confused as to why it worked with the simulate API but not in filebeat.
Never mind, thanks again!


(ruflin) #4

Glad to hear it works now. Not sure about the difference but sounds like escaping issue which are different. Did you send the data through the same tool for both things?


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.