A few questions about grok

EDIT
My formatting got mangled, so all of my config changes are now on Pastebin in a hopefully easier-to-read format: https://pastebin.com/DJ0FawPp
End Edit

I'm trying to use Logstash to parse Scrapy logs. My Logstash config is as follows:
(see the first config in the Pastebin link)

This works great and outputs (among other lines) this: (See pastebin)

Now I also want to extract the response code, the URL and the JSON entry contained in the new 'message' field above, so I changed my conf filter to the second one in the Pastebin

and add a file in /vagrant/patterns that contains this:
CODE [0-9]{3}
URL http[s]?://.*
JITEM {.*}

I have checked the above in the Grok Debugger, and it yields the expected results, but when I run it in Logstash the output does not contain the new response code, url, or jitem fields. Is there something wrong with the way I'm doing things? Is there something I have to do to get the new fields to show up?

Thanks

Do not try to match a newline with \n; use a literal newline in the grok pattern instead.

match => { "message" => "<%{CODE} %{URL}>
%{JITEM}" }

Thanks, but when I try that, the event gets tagged with "_grokparsefailure".

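You haven't named the fields to be captured. Write each pattern as %{PATTERN:field} (for example %{CODE:code}) so that grok stores the match in a field on the event.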

input { generator { count => 1 lines => [ '' ] } }
filter {
    mutate { add_field => { "someField" => "Scraped from <200 https://example.com/new_posts>
{'link': 'https://example.com//3g67o/post/hjg78g78t',
 'image': 'https://cdnthumb2.example.com/liuh89/ios.jpg'}" } }

    grok {
        pattern_definitions => {
            "CODE" => "[0-9]{3}"
            "URL" => "http[s]?://.*"
            "JITEM" => "{.*}"
        }
        match => { "someField" => "<%{CODE:code} %{URL:url}>
%{JITEM:jitem}" }
    }
}

results in

      "code" => "200",
       "url" => "https://example.com/new_posts",
     "jitem" => "{'link': 'https://example.com//3g67o/post/hjg78g78t',\n 'image': 'https://cdnthumb2.example.com/liuh89/ios.jpg'}"
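
If you would rather keep the custom patterns in the file under /vagrant/patterns than inline them with pattern_definitions, the same named captures should work with the grok filter's patterns_dir option. This is only a minimal sketch: it assumes the pattern file contains the CODE, URL and JITEM lines from the question and that the text to parse is in the "message" field.

```
filter {
    grok {
        # Directory containing the file with the custom CODE, URL and JITEM
        # definitions (assumed here to be /vagrant/patterns, as in the question).
        patterns_dir => ["/vagrant/patterns"]

        # %{PATTERN:field} names each capture. By default grok only stores
        # named captures, so a bare %{CODE} matches but adds no field.
        # The line break inside the pattern is a literal newline, not \n.
        match => { "message" => "<%{CODE:code} %{URL:url}>
%{JITEM:jitem}" }
    }
}
```

If the pattern does not match, the event is tagged with _grokparsefailure, so that tag is a quick way to tell a non-matching pattern apart from a matching pattern with unnamed captures.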

That was it! Thank you my friend!
