Mixed log with json field

(Or Cohen) #1

I have a log with mixed fields: the first two fields are outside the JSON, and the third one is a JSON object.


tag : 02:00:00 --> {"v":"2.59","tagid":"2edfe14540b51a047dc8f66c219c7804","cb":"1665253253","ds":"loading","h":"250","o":"o64|o21|p18","ch":"o64|o21|p18|o48|o56|h","status":"5","l":"http://m.xxxxx.com","ts":"t:4|np:27|s:529|o64:3314|o21:3731|p18:4492|p:4492","uid":"ef95517ece868042a2e267ce78076342","kmnjid":"Vb1m2cAoJI4AABVxWMAAAAAj&563","kmnjts":"1438477848.468","geo":"NA|US","ip":"","browser":"Chrome|43.0.2357.93|Android|M|Mobile","agent":"Mozilla/5.0 (Linux; Android 4.4.2; LG-D805 Build/KOT49I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.93 Mobile Safari/537.36"}

Is there an easy way to parse this log with Logstash?

I want Logstash to ignore the first two fields and parse just the JSON field, if that is possible.
I managed to do this with a bash script that deletes those fields, but it takes too much time.
One log = 15 M records every hour.

(Ed) #2

You would have to split the field first.

Then you could use the json filter on your data.

(Or Cohen) #3

Thank you, eperry, for the reply.
I'm new to Logstash and I can't figure out how to split the log.
Can you please help me with this?

I added this to the configuration, but it is not working:

filter {
  mutate {
    split => { "message" => "-->" }
  }
}

I believe the field name is message. Am I right?

(Ed) #4

Looks right. I would run your data through it and see.
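
One way to run sample data through the filter, as suggested here, is a throwaway pipeline that reads from stdin and prints each event. This is only a sketch: the file name split-test.conf is made up, and it assumes the line arrives in the message field (the default for the stdin input). After the split, message becomes an array, so the json filter is pointed at its second element; that field reference is an assumption about the split result, not something confirmed in this thread.

```
# split-test.conf (hypothetical file name)
input {
  stdin { }
}
filter {
  mutate {
    # Turns "message" into an array: [ "tag : 02:00:00 ", " {...json...}" ]
    split => { "message" => "-->" }
  }
  json {
    # Assumption: the JSON text is the second element of the split array.
    source => "[message][1]"
  }
}
output {
  # Print every event in full so the parsed fields are visible.
  stdout { codec => rubydebug }
}
```

Start it with `bin/logstash -f split-test.conf` and paste a log line. Note that a plain split would also cut the JSON apart if any value happened to contain "-->".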

(Or Cohen) #5

Thanks a lot.

(Magnus Bäck) #6

Instead of splitting the message I'd use grok.

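filter {
  grok {
    # Drop the "tag : HH:MM:SS --> " prefix and keep only the JSON in "message".
    # " ?" tolerates the space before the colon seen in the sample line.
    match => ["message", "tag ?: %{INT}:%{INT}:%{INT} --> %{GREEDYDATA:message}"]
    overwrite => ["message"]
  }
  json {
    # Parse the JSON; "message" is removed only if parsing succeeds.
    source => "message"
    remove_field => ["message"]
  }
}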
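
If some lines do not match the grok pattern, grok tags them with _grokparsefailure by default, and the json filter would then try to parse the untouched line. A hedged variation that skips the JSON step for those events might look like the following. The conditional is an addition for illustration and not part of the reply above, and the optional space before the colon in the pattern is an assumption based on the sample line in the first post.

```
filter {
  grok {
    match => ["message", "tag ?: %{INT}:%{INT}:%{INT} --> %{GREEDYDATA:message}"]
    overwrite => ["message"]
  }
  # Only attempt JSON parsing when grok matched.
  if "_grokparsefailure" not in [tags] {
    json {
      source => "message"
      remove_field => ["message"]
    }
  }
}
```

Events that fail the grok step keep their original message and the failure tag, so they can be inspected or routed separately.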

(Or Cohen) #7

Thank you very very much
you're a lifesaver

(Muhammad Nuzaihan) #8

Hi Magnus,

I also have mixed JSON data with the timestamp first, which I handled with the following code:

        grok {
          match => [ "message", "%{DATESTAMP} %{GREEDYDATA:message}" ]
          overwrite => ["message"]
        }
        json {
          source => "message"
          remove_field => ["message"]
        }

where `%{DATESTAMP}` is the timestamp at the beginning of the message.

The result is:
```"message": "{\\\"id\\\":\\\"55d09c10f2aee2000133673a\\\",\\\"fingerprintjs_id\\\":\\\"\\\",\\\"email\\\":\\\"\\\",\\\"app_key\\\":\\\"ZSp-0vi8_0yRDY66bW--dg\\\",\\\"referrer\\\":\\\"http://www.domain.com/\\\",\\\"client_timestamp\\\":\\\"2015-08-16T14:20:04.709Z\\\",\\\"page_view_client_key\\\":\\\"b9d235da2011439734804706\\\",\\\"user_agent\\\":\\\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36\\\",\\\"keywords\\\":\\\"\\\",\\\"description\\\":\\\"Police, military and search teams would check the area in Oksibil district where there had been reports of the crash.\\\",\\\"title\\\":\\\"Missing Indonesian plane with 54 crashed, report residents - Khaleej Times\\\",\\\"uid\\\":\\\"55d09c10f2aee2000133673b\\\",\\\"session_id\\\":\\\"55d09c10f2aee2000133673c\\\",\\\"url\\\":\\\"http://www.domain.com/international/rest-of-asia/missing-indonesian-plane-with-54-crashed-report-residents\\\",\\\"ip\\\":\\\"\\\",\\\"received_at\\\":\\\"2015-08-16T14:20:00.802Z\\\"}\\n\",\"stream\":\"stdout\",\"time\":\"2015-08-16T14:20:00.802862057Z\"}",```

but it still shows:
`[0] "_jsonparsefailure"`

I don't think it is normal to have so many escape `\\\` characters in JSON, is it? (Note: I am not using the json codec, only the json filter.)

(Magnus Bäck) #9

@muhammadn – please start a new topic for your new and unrelated question.
