Hi,
I have a log with mixed fields: the first two fields are outside the JSON, and the third one is JSON.
Example:
tag : 02:00:00 --> {"v":"2.59","tagid":"2edfe14540b51a047dc8f66c219c7804","cb":"1665253253","ds":"loading","h":"250","o":"o64|o21|p18","ch":"o64|o21|p18|o48|o56|h","status":"5","l":"http://m.xxxxx.com","ts":"t:4|np:27|s:529|o64:3314|o21:3731|p18:4492|p:4492","uid":"ef95517ece868042a2e267ce78076342","kmnjid":"Vb1m2cAoJI4AABVxWMAAAAAj&563","kmnjts":"1438477848.468","geo":"NA|US","ip":"100.100.100.100","browser":"Chrome|43.0.2357.93|Android|M|Mobile","agent":"Mozilla/5.0 (Linux; Android 4.4.2; LG-D805 Build/KOT49I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.93 Mobile Safari/537.36"}
Is there any easy way to parse this log with Logstash?
Update:
I want Logstash to ignore the first 2 fields and parse just the JSON field, if possible.
I managed to do this with a bash script that deletes those fields, but it's taking too much time.
BTW, one log = 15M records every hour.
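Something along these lines is roughly what I have in mind for that log (untested sketch; `prefix` and `logtime` are just placeholder field names, and it assumes the prefix always ends with ` --> ` before the JSON):
```
filter {
  grok {
    # assumption: every line looks like "tag : 02:00:00 --> {...json...}"
    match => [ "message", "%{DATA:prefix} : %{TIME:logtime} --> %{GREEDYDATA:message}" ]
    overwrite => ["message"]
  }
  json {
    # parse the remaining JSON string into top-level fields
    source => "message"
    remove_field => ["message"]
  }
}
```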
I also have mixed JSON data with the timestamp first, which I handle with the following filter:
```
filter {
  grok {
    match => [ "message", "%{DATESTAMP} %{GREEDYDATA:message}" ]
    overwrite => ["message"]
  }
  json {
    source => "message"
    remove_field => ["message"]
  }
}
```
where `%{DATESTAMP}` is the timestamp at the beginning of the message.
The result is:
```"message": "{\\\"id\\\":\\\"55d09c10f2aee2000133673a\\\",\\\"fingerprintjs_id\\\":\\\"\\\",\\\"email\\\":\\\"\\\",\\\"app_key\\\":\\\"ZSp-0vi8_0yRDY66bW--dg\\\",\\\"referrer\\\":\\\"http://www.domain.com/\\\",\\\"client_timestamp\\\":\\\"2015-08-16T14:20:04.709Z\\\",\\\"page_view_client_key\\\":\\\"b9d235da2011439734804706\\\",\\\"user_agent\\\":\\\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36\\\",\\\"keywords\\\":\\\"\\\",\\\"description\\\":\\\"Police, military and search teams would check the area in Oksibil district where there had been reports of the crash.\\\",\\\"title\\\":\\\"Missing Indonesian plane with 54 crashed, report residents - Khaleej Times\\\",\\\"uid\\\":\\\"55d09c10f2aee2000133673b\\\",\\\"session_id\\\":\\\"55d09c10f2aee2000133673c\\\",\\\"url\\\":\\\"http://www.domain.com/international/rest-of-asia/missing-indonesian-plane-with-54-crashed-report-residents\\\",\\\"ip\\\":\\\"83.110.196.21\\\",\\\"received_at\\\":\\\"2015-08-16T14:20:00.802Z\\\"}\\n\",\"stream\":\"stdout\",\"time\":\"2015-08-16T14:20:00.802862057Z\"}",```
but it still shows:
`[0] "_jsonparsefailure"`
I suspect it is not normal to have so many escaped `\\\` characters in the JSON? (Note: I am using the json filter only, not the json codec.)
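If the escaping really is the cause, would a mutate/gsub step before the json filter help? This is just a guess on my side, not something I have tested:
```
filter {
  mutate {
    # guess: turn the escaped quotes back into plain quotes before parsing
    gsub => [ "message", '\\"', '"' ]
  }
  json {
    source => "message"
  }
}
```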