Mixed log with json field


(Or Cohen) #1

Hi,
I have a log with mixed fields: the first two fields are outside the JSON, and the third one is JSON.

Example:

tag : 02:00:00 --> {"v":"2.59","tagid":"2edfe14540b51a047dc8f66c219c7804","cb":"1665253253","ds":"loading","h":"250","o":"o64|o21|p18","ch":"o64|o21|p18|o48|o56|h","status":"5","l":"http://m.xxxxx.com","ts":"t:4|np:27|s:529|o64:3314|o21:3731|p18:4492|p:4492","uid":"ef95517ece868042a2e267ce78076342","kmnjid":"Vb1m2cAoJI4AABVxWMAAAAAj&563","kmnjts":"1438477848.468","geo":"NA|US","ip":"100.100.100.100","browser":"Chrome|43.0.2357.93|Android|M|Mobile","agent":"Mozilla/5.0 (Linux; Android 4.4.2; LG-D805 Build/KOT49I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.93 Mobile Safari/537.36"}

Is there an easy way to parse this log with Logstash?

Update:
I want Logstash to ignore the first two fields and parse just the JSON field, if that is possible.
I managed to do this with a bash script that deletes those fields, but it takes too much time.
By the way, one log = 15 M records every hour.


(Ed) #2

You would have to split the field first:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-split

Then you could use the json filter on your data:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html
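
To make the split-then-parse idea concrete outside Logstash, here is a minimal Python sketch. It is not Logstash code; the function name `split_then_parse` and the shortened sample line are assumptions made for illustration. It splits only on the first `-->`, so a `-->` that happens to appear inside the JSON text cannot produce extra pieces.

```python
import json


def split_then_parse(line, sep="-->"):
    """Split a line on the first separator and decode the right-hand side as JSON.

    Hypothetical helper for illustration; it only mirrors the idea of
    "split the field first, then run a JSON parser on the JSON part".
    """
    # partition() splits on the first occurrence only.
    prefix, found, payload = line.partition(sep)
    if not found:
        raise ValueError("separator not found in line")
    return prefix.strip(), json.loads(payload)


# Shortened version of the sample line from post #1.
prefix, doc = split_then_parse('tag : 02:00:00 --> {"ds":"loading","h":"250"}')
```

After the split, only the second piece is JSON, so that piece (not the whole message) is what has to be handed to the JSON parser.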


(Or Cohen) #3

Thank you eperry for the reply.
I'm new to Logstash and I can't figure out how to split the log.
Can you please help me with this?

I added this to the configuration, but it is not working:

filter {
  mutate {
    split => { "message" => "-->" }
  }
}

I believe the field name is message, am I right?


(Ed) #4

Looks right. I would run your data through it and see.


(Or Cohen) #5

Thanks a lot.


(Magnus Bäck) #6

Instead of splitting the message I'd use grok.

filter {
  grok {
    match => ["message", "tag: %{INT}:%{INT}:%{INT} --> %{GREEDYDATA:message}"]
    overwrite => ["message"]
  }
  json {
    source => "message"
    remove_field => ["message"]
  }
}
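
For readers who want to check the pattern logic outside Logstash, the same two steps (strip the prefix, then decode the JSON) can be sketched in Python. This is only an illustration under assumptions: the regex, the `parse_line` name, and the shortened sample are not from the thread. Note that the sample line in post #1 reads `tag : 02:00:00` with a space before the colon, while the grok pattern above has `tag:`; if the real logs contain that space, the pattern may need to allow for it, which the regex below does with `\s*`.

```python
import json
import re

# Hypothetical regex mirroring the grok idea: skip "tag", the time and "-->",
# and capture everything after it. \s* tolerates both "tag:" and "tag :".
PREFIX = re.compile(r"^tag\s*:\s*\d+:\d+:\d+\s*-->\s*(?P<payload>.*)$")


def parse_line(line):
    """Return the decoded JSON object, or None if the line does not match
    the prefix or the remainder is not valid JSON."""
    match = PREFIX.match(line.strip())
    if match is None:
        return None
    try:
        return json.loads(match.group("payload"))
    except json.JSONDecodeError:
        return None


# Shortened version of the sample line from post #1.
sample = 'tag : 02:00:00 --> {"v":"2.59","status":"5","geo":"NA|US"}'
event = parse_line(sample)
```

Returning `None` for lines that do not match plays a similar role to the failure tags Logstash adds to events it cannot parse, so bad lines can be inspected instead of silently dropped.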

(Or Cohen) #7

Magnus,
Thank you very, very much.
You're a lifesaver.


(Muhammad Nuzaihan) #8

Hi Magnus,

I also have mixed JSON data, with the timestamp first, which I handled with the following code:

```
filter {
    grok {
        match => [ "message", "%{DATESTAMP} %{GREEDYDATA:message}" ]
        overwrite => ["message"]
    }
    json {
        source => "message"
        remove_field => ["message"]
    }
}
```

where `%{DATESTAMP}` is the timestamp at the beginning of the message.

The result is:
```"message": "{\\\"id\\\":\\\"55d09c10f2aee2000133673a\\\",\\\"fingerprintjs_id\\\":\\\"\\\",\\\"email\\\":\\\"\\\",\\\"app_key\\\":\\\"ZSp-0vi8_0yRDY66bW--dg\\\",\\\"referrer\\\":\\\"http://www.domain.com/\\\",\\\"client_timestamp\\\":\\\"2015-08-16T14:20:04.709Z\\\",\\\"page_view_client_key\\\":\\\"b9d235da2011439734804706\\\",\\\"user_agent\\\":\\\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36\\\",\\\"keywords\\\":\\\"\\\",\\\"description\\\":\\\"Police, military and search teams would check the area in Oksibil district where there had been reports of the crash.\\\",\\\"title\\\":\\\"Missing Indonesian plane with 54 crashed, report residents - Khaleej Times\\\",\\\"uid\\\":\\\"55d09c10f2aee2000133673b\\\",\\\"session_id\\\":\\\"55d09c10f2aee2000133673c\\\",\\\"url\\\":\\\"http://www.domain.com/international/rest-of-asia/missing-indonesian-plane-with-54-crashed-report-residents\\\",\\\"ip\\\":\\\"83.110.196.21\\\",\\\"received_at\\\":\\\"2015-08-16T14:20:00.802Z\\\"}\\n\",\"stream\":\"stdout\",\"time\":\"2015-08-16T14:20:00.802862057Z\"}",```

but it still shows:
`[0] "_jsonparsefailure"`

I think it is not normal to have so many escape `\\\` characters in the JSON? (Note: I am not using the json codec, only the json filter.)
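
One possible reading of the stacked escapes, offered only as an assumption since the thread does not confirm it: the JSON payload was stored as a string inside another JSON document (the trailing `stream` and `time` keys in the output hint at a wrapper), so each layer of encoding adds another level of backslashes. The Python sketch below reproduces that situation with made-up data; the field name `log` is hypothetical.

```python
import json

# Inner document, encoded once.
inner = json.dumps({"id": "55d09c10", "ip": "83.110.196.21"})

# Hypothetical wrapper that stores the inner document as a plain string
# under an assumed key "log"; encoding the wrapper escapes the inner quotes.
outer = json.dumps({"log": inner + "\n", "stream": "stdout"})

# First pass decodes the wrapper; the payload is still just a string.
wrapper = json.loads(outer)

# Second pass decodes the embedded string into a real object.
payload = json.loads(wrapper["log"])
```

If the data really is wrapped like this, a single JSON decode yields a string rather than the expected fields, and a second decode on that string is needed.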

(Magnus Bäck) #9

@muhammadn – please start a new topic for your new and unrelated question.

