Grok error while loading

stack-security
(Vinit Pathri) #1

I am running Logstash against a log file with entries like the sample below.

Sample entry:
72.14.199.105 - - [30/Apr/2019:06:21:26 +0000] "GET /catalog/view/javascript/font-awesome/css/font-awesome.min.css HTTP/1.1" 200 4748 "https://www.orderhealth.in/drmorepen/drmorepenbg03" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)"

I am using this grok filter:

grok {
match => ["%{IP:Clientip} %{USER:user} %{USER:auth} [%{HTTPDATE:apache_timestamp}] "%{WORD:method} /%{NOTSPACE:request_page} HTTP/%{NUMBER:http_version}" %{WORD:response} %{NUMBER:bytes} "%{URI:page}" "(?[\w/\d.\s(;)(,-]+) +(?[+\w:/.)]+)" ]
}

When I run Logstash, I get the error below.

[ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, input, filter, output at line 10, column 1 (byte 230) after ", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:41:in `compile_imperative'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:49:in `compile_graph'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:11:in `block in compile_sources'", "org/jruby/RubyArray.java:2577:in `map'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:10:in `compile_sources'", "org/logstash/execution/AbstractPipelineExt.java:151:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:47:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:23:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in `block in converge_state'"]}

"grok" is at line 10 of my config file.
Please help me understand where I am going wrong.

Thanks in advance

#2

Hi,

One issue I see is that you do not tell the grok filter which field to match the pattern against. From the documentation:

   filter {
     grok {
       match => {
         "message" => "Duration: %{NUMBER:duration}"
       }
     }
   }

The issue you are having is something else, though.

Could you post your Logstash config? It sounds like the overall structure that Logstash expects is missing. Do you use more than one config file? Line 10 is line 10 of the concatenated config file (if you use several files).

(Vinit Pathri) #3

Here is the conf file:

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
file {
path => "/usr/share/logstash/logs-data/orderhealth.in-ssl_log-Apr-2019"
start_position => "beginning"
}
}

grok {
match => ["%{IP:Clientip} %{USER:user} %{USER:auth} [%{HTTPDATE:apache_timestamp}] "%{WORD:method} /%{NOTSPACE:request_page} HTTP/%{NUMBER:http_version}" %{WORD:response} %{NUMBER:bytes} "%{URI:page}" "(?[\w/\d.\s(;)(,-]+) +(?[+\w:/.)]+)" ]
}

output {
elasticsearch {
hosts => ["localhost"]
index => "logs-ssl"
}
}

And yes, I am using 4 config files, but each one is for a different index.

#4

As far as I know, all those files will be combined into one on Logstash startup.

Except for the issue with how the grok pattern is defined, that particular config file looks OK.

I would test each config file individually with ./logstash -f first_config_file.conf to see if they are all OK as far as syntax goes. Once they are OK individually, you can try to start Logstash with all of them.

(Vinit Pathri) #5

In that case it should show an error when I run Logstash with the other config files, but it works fine with them.

Let me check this point, although I tried it earlier with no luck.

(Vinit Pathri) #6

So now my config file is as below:

input {
file {
path => "/usr/share/logstash/logs-data/orderhealth.in-ssl_log-Apr-2019"
start_position => "beginning"
}
}

filter {
grok {
match => { "message" => "%{IP:Clientip} %{USER:user} %{USER:auth} [%{HTTPDATE:apache_timestamp}] "%{WORD:method} /%{NOTSPACE:request_page} HTTP/%{NUMBER:http_version}" %{WORD:response} %{NUMBER:bytes} "%{URI:page}" "(?[\w/\d.\s(;)(,-]+) +(?[+\w:/.)]+)"
}
}
}

output {
elasticsearch {
hosts => ["localhost"]
index => "logs-ssl"
}
}

And now the error is:

Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, => at line 11, column 4 (byte 429) after filter {\n\tgrok {\n \t\tmatch => { "message" => "%{IP:Clientip} %{USER:user} %{USER:auth} \[%{HTTPDATE:apache_timestamp}\] \"%{WORD:method} /%{NOTSPACE:request_page} HTTP/%{NUMBER:http_version}\" %{WORD:response} %{NUMBER:bytes} \"%{URI:page}" "(?[\w/\d.\s(;)(,-]+) +(?[+\w:/.)]+)" \n\t\t\t", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:41:in `compile_imperative'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:49:in `compile_graph'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:11:in `block in compile_sources'", "org/jruby/RubyArray.java:2577:in `map'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:10:in `compile_sources'", "org/logstash/execution/AbstractPipelineExt.java:151:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:47:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:23:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in `block in converge_state'"]}

#7

I did a quick test with your grok pattern, and it looks like that is the root of the problem.

The whole pattern is quoted with double quotes, so any double quotes inside the grok pattern have to be escaped with a backslash. The literal [ and ] around the timestamp need escaping too, because they are regex metacharacters.

Did you test your grok pattern in any way?

This seems to work for me:

%{IP:Clientip} %{USER:user} %{USER:auth} \[%{HTTPDATE:apache_timestamp}\] \"%{WORD:method} %{UNIXPATH:request_page} HTTP/%{NUMBER:http_version}\" %{WORD:response} %{NUMBER:bytes} \"%{URI:page}\" \"%{GREEDYDATA:field_name}\"$

(Vinit Pathri) #8

Yes, I only used the debugger.
Let me check again :slight_smile:

(Charlie) #9

Add the ^ and $ anchors to your grok pattern.
Without them you are making the match more expensive.

Also use:
break_on_match => true

(Vinit Pathri) #10

Meanwhile, please help me with one more thing.

I want to break the part below into 2 fields:

"Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)"

I want http://www.google.com/mobile/adsbot.html in one field and the rest in another.
Actually, this is what I was trying to achieve with my grok pattern (and I was already doubtful whether I was doing it correctly :smiley: ).
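
The split described above can be prototyped outside Logstash before it is turned into a grok pattern. The following is only a rough sketch in Python, not a grok filter: the names UA_SPLIT and split_user_agent are made up for illustration, the field names browserdetails and referby are placeholders, and it assumes the URL (when present) appears at the end of the user agent as "+http...)". Python's re module is not the regex engine grok uses (for example, a named group is written (?<name>...) in grok rather than (?P<name>...)), so the expression would need adapting.

```python
import re

# Hypothetical pattern: everything before an optional trailing "+<url>)"
# goes into browserdetails; the URL itself (without the leading "+" and
# the closing ")") goes into referby.
UA_SPLIT = re.compile(
    r'^(?P<browserdetails>.*?)\s*'
    r'(?:\+(?P<referby>https?://[^\s)"]+)\)?)?$'
)


def split_user_agent(ua):
    """Return (browserdetails, referby); referby is None when no URL is found."""
    m = UA_SPLIT.match(ua)
    if m is None:
        # For example, a value containing embedded newlines: leave it unsplit.
        return ua, None
    return m.group("browserdetails"), m.group("referby")


if __name__ == "__main__":
    sample = (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) "
        "AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 "
        "Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; "
        "+http://www.google.com/mobile/adsbot.html)"
    )
    details, referby = split_user_agent(sample)
    print(details)
    print(referby)
```

A user agent without a "+http..." part comes back unchanged in the first slot, with None in the second.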

#11

Just some test results (not including the additional request above).

I tested with this config:

# cat ls-test1.conf
input { stdin { codec => "json" } }

filter {

  grok {
    match => {
      "message" => "%{IP:Clientip} %{USER:user} %{USER:auth} \[%{HTTPDATE:apache_timestamp}\] \"%{WORD:method} %{UNIXPATH:request_page} HTTP/%{NUMBER:http_version}\" %{WORD:response} %{NUMBER:bytes} \"%{URI:page}\" \"%{GREEDYDATA:field_name}\"$"
    }
  }
}
output {
  stdout { codec => rubydebug }
}

And got this result when putting your sample data through Logstash:

...
[INFO ] 2019-05-17 09:01:22.670 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9601}
{"field1":"hello","message":"72.14.199.105 - - [30/Apr/2019:06:21:26 +0000] \"GET /catalog/view/javascript/font-awesome/css/font-awesome.min.css HTTP/1.1\" 200 4748 \"https://www.orderhealth.in/drmorepen/drmorepenbg03\" \"Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)\""}
{
              "field1" => "hello",
            "response" => "200",
    "apache_timestamp" => "30/Apr/2019:06:21:26 +0000",
              "method" => "GET",
                "user" => "-",
            "Clientip" => "72.14.199.105",
        "request_page" => "/catalog/view/javascript/font-awesome/css/font-awesome.min.css",
             "message" => "72.14.199.105 - - [30/Apr/2019:06:21:26 +0000] \"GET /catalog/view/javascript/font-awesome/css/font-awesome.min.css HTTP/1.1\" 200 4748 \"https://www.orderhealth.in/drmorepen/drmorepenbg03\" \"Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)\"",
                "host" => "foo.bar.net",
            "@version" => "1",
                "page" => "https://www.orderhealth.in/drmorepen/drmorepenbg03",
          "@timestamp" => 2019-05-17T09:01:35.610Z,
                "auth" => "-",
          "field_name" => "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)",
               "bytes" => "4748",
        "http_version" => "1.1"
}

(Vinit Pathri) #12

What are these signs (^ and $) for in grok?
What are the disadvantages of leaving out "$"?

(Vinit Pathri) #13

@A_B & @pastechecker: I have tried the regex below and it gives the required output. Is it correct performance-wise?

%{IP:Clientip} %{USER:user} %{USER:auth} [%{HTTPDATE:apache_timestamp}] "%{WORD:method} %{UNIXPATH:request_page} HTTP/%{NUMBER:http_version}" %{WORD:response} %{NUMBER:bytes} "%{URI:page}" "%{GREEDYDATA:browserdetails}+%{URI:referby}

#14

You should definitely read "Do you grok Grok?" on the official Elastic blog.

^ anchors your pattern to the beginning of the line. Suppose we have a line like

2016-09-19T18:19:00 DEBUG this is an example log message

and we try to match it with

%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log_level} %{GREEDYDATA:message}

That is OK, it will match. But suppose we have a different line such as

Hello, world!

That is not going to match. But to determine that, grok has to check whether the string "Hello, world!" matches a TIMESTAMP_ISO8601, then whether "ello, world!" matches, then whether "llo, world!" matches, and so on. If we had started with the pattern

^%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log_level} %{GREEDYDATA:message}

then it would only have to test "Hello, world!" and not any of the sub-strings. Lots of folks assume grok patterns are implicitly anchored, but they are not. Our original pattern would match both of these lines:

Hello, I found this in a log file: 2016-09-19T18:19:00 DEBUG this is an example log message

2016-09-19T18:19:00 DEBUG this is an example log message

$ works the same way for the end of the line. It rarely has anywhere near as much performance impact as ^.
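
The matching behaviour above can be checked with a small script. This is a minimal sketch in Python's re module rather than in grok itself. The regex in BODY is a simplified stand-in for the TIMESTAMP_ISO8601, LOGLEVEL and GREEDYDATA patterns, not their real definitions, and all the names in it are invented for the illustration. It only shows which lines match, not how much work the engine does.

```python
import re

# Simplified stand-ins for %{TIMESTAMP_ISO8601}, %{LOGLEVEL} and %{GREEDYDATA}
# (assumptions for illustration; the real grok definitions are more general).
BODY = (
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) "
    r"(?P<log_level>[A-Z]+) "
    r"(?P<message>.*)"
)

UNANCHORED = re.compile(BODY)
ANCHORED = re.compile("^" + BODY)

CLEAN = "2016-09-19T18:19:00 DEBUG this is an example log message"
EMBEDDED = "Hello, I found this in a log file: " + CLEAN
NO_MATCH = "Hello, world!"


def matches(pattern, line):
    """True if the pattern matches anywhere in the line (search semantics)."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for name, line in [("clean", CLEAN), ("embedded", EMBEDDED), ("no match", NO_MATCH)]:
        print(name, "unanchored:", matches(UNANCHORED, line),
              "anchored:", matches(ANCHORED, line))
```

The unanchored pattern accepts both the clean line and the one with the leading "Hello, I found this..." text, while the anchored one accepts only the clean line.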