Grok Parsing an Internal JSON Part (IMAP Plugin)

I am familiar with grok and regexes. I have the following Logstash conf file:

It uses the IMAP input plugin to read emails that are sent to a mailbox. As you can see, it tries to extract (grok) a specific part of the email so that the JSON part can be parsed.

The config:

input {
  imap {
    host => "imap.gmail.com"
    user => "user@pork.com"
    password => "pass"
    port => 993
    secure => true
    fetch_count => 100
    check_interval => 10
  }
}

# Grokking the message
filter {
  grok {
    #match => {"message" => "Full Response\:\\n%{GREEDYDATA:json}\}"}
    match => {"message" => "Full Response: %{GREEDYDATA:json}\}"}
    #match => {"message" => "(?<json>Full Response:\\n(.|\r|\n)*)"}
    break_on_match => false
  }

  json {
    source => "json"
    add_tag => "Parsed"
  }
}

output {
  file {
    path => "/tmp/emailtmp.log"
  }
  stdout {
    codec => rubydebug
  }
}

For some reason I keep receiving _grokparsefailure, and the JSON is, of course, not parsed. I only need the JSON part (after "Full Response").

I have tried various approaches. Any ideas?

Thanks!

The email body is available here:

http://pastebin.com/wtmJztB5

grok doesn't like line breaks: by default, "." (and therefore GREEDYDATA, which is just .*) does not match a newline.

So before the grok filter, I advise you to replace all line breaks with a space.
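To illustrate the point outside Logstash, here is a minimal sketch in Python. It rests on several assumptions: Python's re module is not the regex engine grok uses, although "." also refuses to match a newline by default there; the sample message is hypothetical (the real email body is only in the pastebin); and the braces are placed inside the capture group so that the captured text is a complete JSON object.

```python
import json
import re

# Hypothetical email body; the real one is only available via the pastebin link.
message = 'Alert summary\nFull Response:\n{"md5": "abc",\n "internalID": 1}\nRegards'

# Rough stand-in for a grok pattern: GREEDYDATA is ".*", and "." does not
# match a newline by default. The braces sit inside the group so the
# captured text is a complete JSON object.
pattern = re.compile(r"Full Response: (?P<json>\{.*\})")

# No match on the raw message: a newline, not a space, follows the colon.
before = pattern.search(message)

# Replace every run of line breaks with a single space, then match again.
flattened = re.sub(r"[\r\n]+", " ", message)
after = pattern.search(flattened)

parsed = json.loads(after.group("json"))
print(before)   # None
print(parsed)   # {'md5': 'abc', 'internalID': 1}
```

In Logstash itself, the equivalent flattening step would normally be a mutate filter with gsub placed before the grok filter.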

Hi,
Do you mean using gsub to replace \r\n with a space?

OK, I managed it with this:

filter {
  mutate {
    gsub => [
      "message", "[\\?#-]", "",
      "message", "\n", " "
    ]
  }

  grok { match => {"message" => "(?<json_raw>\{.+?\}})"} }
}

Now I have the following JSON in a field called json_raw.

The JSON looks escaped, for example:

{"md5":"809e29c9765542645c882dddad35","internalID":3821800,"submitDate":"20160T02:40:48.0000700","versionID":"20160317_3820030","batch":"api_c14ca113752f98ce","url":" http://www.porkalolza.co.il/עי1עמק_%D}

Any idea how to parse this? Using the json filter results in a _jsonparsefailure.

The json filter is the right filter for the job.

Looking at your JSON, it seems that a " is missing at the end, just before the }.

If that's not it, can you post the full error message that the json filter raises?

That's not the full JSON. The full JSON looks like this:

http://pastebin.com/53He3BPn

The problem is that your JSON is invalid.
Some of your values are HTML that uses " as the attribute delimiter, like here:
,\"location\":\"<iframe id=\"a61d045b\" name=\"a61d045b\"

These HTML attribute delimiters (") conflict with the JSON string delimiters (").

So, for this to work, you must transform each HTML value into a valid JSON value, for example by replacing " with ' inside the HTML.
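As a rough illustration of that replacement, here is a Python sketch rather than a Logstash filter. The helper name quote_html_attributes is hypothetical, and the sample string is a shortened, made-up variant of the fragment quoted above (the closing </iframe> tag is an assumption). It also assumes the quotes are not backslash-escaped in the actual field and that "<" and ">" only occur as part of HTML tags.

```python
import json
import re


def quote_html_attributes(raw: str) -> str:
    """Swap " for ' inside anything that looks like an HTML tag (<...>)."""
    return re.sub(r"<[^<>]*>", lambda m: m.group(0).replace('"', "'"), raw)


# Shortened, hypothetical sample: HTML attribute quotes break the JSON string.
raw = ('{"internalID": 3821800, '
       '"location": "<iframe id="a61d045b" name="a61d045b"></iframe>"}')

try:
    json.loads(raw)
    parsed_before = True
except json.JSONDecodeError:
    parsed_before = False   # the raw text is not valid JSON

fixed = quote_html_attributes(raw)
parsed = json.loads(fixed)  # parses once the attribute quotes are swapped

print(parsed_before)        # False
print(parsed["location"])   # <iframe id='a61d045b' name='a61d045b'></iframe>
```

The same idea could be applied inside Logstash before the json filter runs, for instance with a mutate gsub or a ruby filter, but the exact pattern would need testing against the real messages.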