Here is one line of my Fluentd log data:
2017-04-23T16:20:31+08:00 cv.product.access.mobile {"race":"album","video_id":43633036,"ip":"117.177.78.48","cdn":"cdn-web-qn.colorv.cn","act":"update","ad_type":"AdExchange","agent":"Mozilla/5.0 (Linux; Android 5.1.1; vivo X6SPlus D Build/LMY47V; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.49 Mobile MQQBrowser/6.2 TBS/043128 Safari/537.36 MicroMessenger/6.5.7.1041 NetType/WIFI Language/zh_CN","author_zone":0,"author_registered_at":"2015-08-02 13:06:05","post_id":0,"published_at":"","referer":"","reference_id":0,"sessid":"7e620e610447414bafe5091470f8b0b2","duration":144,"author_is_priest":0,"download_type":"myapp","author_udid":"d850a4a042d6382","status_404":"","author_version":"and-3.6.13-gdt","mold_id":10006,"url":"http://video.colorv.cn/play/43633036?from=timeline&isappinstalled=0&from=share","author_os":"and","page_kind":"mini","request_id":"ff6925b4c86b4d34be534a6609edfa2d","referrer_id":"","author_id":3934438,"play_time":60,"method":"GET","published":0}
I want to make a 2-step split.
The first step is to split it into 3 parts:
timestamp: 2017-04-23T16:20:31+08:00
log_type: cv.product.access.mobile
log_content: {the big json}
The second step is to split log_content into separate fields, one for each key in the JSON.
I have tried the following split for the first step, but it didn't work.
The split filter doesn't just split a string; it splits one event into multiple events, which is not what you want here. Use the mutate filter's split option or a grok filter to split the string, then apply a json filter to the field that holds the JSON data.
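As a rough sketch of the grok-plus-json approach, something like the following might work. It assumes the raw line arrives in the `message` field and that the three parts are separated by whitespace (`\s+` matches both spaces and tabs). The field names `timestamp`, `log_type`, and `log_content` are taken from the question. I have not run this against your data, so treat it as a starting point.

```
filter {
  # Step 1: split the line into timestamp, log type, and the JSON payload.
  grok {
    match => {
      "message" => "^%{TIMESTAMP_ISO8601:timestamp}\s+%{NOTSPACE:log_type}\s+%{GREEDYDATA:log_content}$"
    }
  }

  # Step 2: parse the JSON payload; each key becomes a field on the event.
  json {
    source => "log_content"
  }
}
```

With no `target` set, the json filter puts the parsed keys at the top level of the event. If the grok match fails, the event is tagged `_grokparsefailure`, which is a quick way to check whether the pattern fits your lines.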
The problem is probably that Logstash doesn't handle escape sequences like \t consistently. The easiest workaround is likely to use a grok filter for the parsing.