This includes 3 timestamp fields (two @timestamp fields and one actual timestamp inside the log message). When I ship the events directly from Filebeat to Elasticsearch, it looks like the following image in Kibana:
I need the events to be indexed like the picture above when I put Kafka and Logstash between Filebeat and Elasticsearch. In other words, I don't want the actual log message to be wrapped inside another message by Logstash, and no extra fields should be added to the event. (I tried to remove Logstash's @timestamp field, but it throws an exception and exits.)
Also, I couldn't extract the timestamp of the log message using "yyyy-mm-dd hh:mm:ss". Is there a pattern that matches it?
I have tried many things, including the grok, date, and mutate filters, but couldn't solve the problem.
It looks like the JSON documents coming from Filebeat are not parsed, so you may need to add a json codec (or possibly a json_lines codec) to the Kafka input.
If you want @timestamp to correspond to the timestamp in the log message, you will also need to extract it, e.g. with a grok filter, and then process the extracted timestamp field with a date filter.
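As a rough sketch, assuming the log line starts with a timestamp like 2019-01-31 12:34:56 and using a hypothetical helper field named log_timestamp, the filter section could look something like this (adjust the pattern and format to your actual log format):

```
filter {
  # Assumes the log line begins with "yyyy-MM-dd HH:mm:ss"
  grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601:log_timestamp}" }
  }
  # Set @timestamp from the extracted value, then drop the helper field
  date {
    match => ["log_timestamp", "yyyy-MM-dd HH:mm:ss"]
    remove_field => ["log_timestamp"]
  }
}
```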
I added codec => {json {}} to the Kafka input, but it seems to be wrong. What is the correct syntax?
Given the timestamp format of my actual log message, which grok pattern (time format/template) should I use to extract it? Could you please write a config snippet? Thanks.
You have specified a codec for the stdout output, so the codec for the Kafka input should follow the same pattern. If you show the output from stdout with the rubydebug codec, it will be a lot easier to see exactly what the events look like. Please do not post screenshots of text or Kibana for this.
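For illustration, the codec is set as a plain option, not as a nested block; the broker address and topic name below are assumptions:

```
input {
  kafka {
    bootstrap_servers => "localhost:9092"   # assumed broker address
    topics => ["filebeat"]                  # assumed topic name
    codec => json                           # not codec => {json {}}
  }
}

output {
  # Print events in a readable form for debugging
  stdout {
    codec => rubydebug
  }
}
```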