Is there a definitive list of characters that need to be escaped in grok patterns in Logstash?
I'm trying to bring up Logstash with this grok filter:
grok {
"message" => [
"%{IPORHOST:client_ip}\,%{DATA}\,%{DATA}\,\[%{DATA:date_timestamp} %{ISO8601_TIMEZONE:timezone}\]\,(?:%{WORD:http_verb} %{NOTSPACE:uri_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:rawrequest})\,%{NUMBER:response_code}\,%{NUMBER:bytes}\,-\,%{DATA:agent}\,%{NUMBER:response_time}\,%{NUMBER:wait_time}",
"%{IPORHOST:client_ip} %{DATA} %{DATA} \[%{DATA:date_timestamp} %{ISO8601_TIMEZONE:timezone}\] \"%{WORD:http_verb} %{URIPATH:uri_path}(%{URIPARAM:uri_params}|) %{DATA}\" %{NUMBER:response_code} (%{NUMBER:bytes}|%{DATA})"
]
}
And it's choking on the first pattern...
[2019-03-13T14:53:07,169][ERROR][logstash.agent ] fetched an invalid config {:config=>"input {\n beats {\n port => 5551\n }\n}\n\nfilter {\n grok {\n \"message\" => [\n \"%{IPORHOST:client_ip}\\,%{DATA}\\,%{DATA}\\,\\[%{DATA:date_timestamp} %{ISO8601_TIMEZONE:timezone}\\]\\,(?:%{WORD:http_verb} %{NOTSPACE:uri_path}(?: HTTP/%{NUMBER:http_version})?|%{DATA:rawrequest})\\,%{NUMBER:response_code}\\,%{NUMBER:bytes}\\,-\\,%{DATA:agent}\\,%{NUMBER:response_time}\\,%{NUMBER:wait_time}\",\n \"%{IPORHOST:client_ip} %{DATA} %{DATA} \\[%{DATA:date_timestamp} %{ISO8601_TIMEZONE:timezone}\\] \\\"%{WORD:http_verb} %{URIPATH:uri_path}(%{URIPARAM:uri_params}|) %{DATA}\\\" %{NUMBER:response_code} (%{NUMBER:bytes}|%{DATA})\"\n ]\n }\n\n date {\n match => [ \"date_timestamp\" , \"dd/MMM/yyyy:HH:mm:ss\" ]\n target => \"date_timestamp\"\n }\n\n if [bytes] == \"-\" {\n mutate {\n replace => {\"bytes\" => \"0\"}\n }\n }\n\n mutate {\n convert => {\n response_code => \"integer\"\n bytes => \"integer\"\n }\n }\n}\n\noutput{\n elasticsearch {\n hosts => [\"${ELASTIC_SEARCH_URL:localhost:9200}\"]\n index => \"avro-tomcat-access-%{+YYYY.MM.dd}\"\n }\n}\n\n\n", :reason=>"Something is wrong with your configuration.", :backtrace=>["/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/config/mixin.rb:125:in `config_init'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/filters/base.rb:128:in `initialize'", "/opt/elk/logstash-5.1.2/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.3.0/lib/logstash/filters/grok.rb:237:in `initialize'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/filter_delegator.rb:20:in `initialize'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/pipeline.rb:456:in `plugin'", "(eval):12:in `initialize'", "org/jruby/RubyKernel.java:1079:in `eval'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/pipeline.rb:93:in `initialize'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/agent.rb:237:in `create_pipeline'", 
"/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/agent.rb:94:in `register_pipeline'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/runner.rb:259:in `execute'", "/opt/elk/logstash-5.1.2/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:67:in `run'", "/opt/elk/logstash-5.1.2/logstash-core/lib/logstash/runner.rb:178:in `run'", "/opt/elk/logstash-5.1.2/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:132:in `run'", "/opt/elk/logstash-5.1.2/lib/bootstrap/environment.rb:71:in `(root)'"]}
Which isn't telling me much. I'm wondering whether some special characters need to be escaped. Is there a list of all the characters that must be escaped?
This pattern works fine when I use it in the Grok Debugger ( https://grokdebug.herokuapp.com/ ). And yes, these are slightly modified Tomcat logs that are comma-separated instead of in the usual format.
Any suggestions?
Thanks!