Hi all, I'm looking to use log4j2 JSONLayout to generate log entries to be read by Filebeat. Everything works so far, but the timestamp on the log entries in the Kibana UI is set to when Filebeat read the logs, not when the log entry occurred. The log entry timestamp key is named timeMillis and it does not appear that this can be changed.
I'm trying to figure out the most efficient way to convert the timeMillis key to @timestamp, which Kibana needs in order to use the timestamp value. It seems that Logstash could do this, but if I understand correctly, Filebeat is a lot more efficient. Another possibility might be some kind of JSON transform. I even tried some configurations for GELF, but I couldn't get the reader side working (some complaint about UDP packet size).
Filebeat is very efficient and has significantly lower overhead than Logstash, but part of this is because it has less functionality and flexibility than Logstash with its plugin ecosystem. You should be able to do this in Logstash, but if you want to keep the architecture simple and send data directly from Filebeat to Elasticsearch, you can probably also perform the transformation through an Elasticsearch ingest pipeline - at least as long as you are on Elasticsearch 5.x.
{"timeMillis":1496816039789,"thread":"main-SendThread()","level":"DEBUG","loggerName":"org.apache.zookeeper.ClientCnxn","message":"Got ping response for sessionid: 0x15c71b4b03508ca after 0ms","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":37,"threadPriority":5}
So I just created the following:
PUT _ingest/pipeline/rename-log4j-timestamp
{
  "description": "rename the timestamp field in a log4j2 row",
  "processors": [
    {
      "convert": {
        "field": "timeMillis",
        "type": "string"
      }
    },
    {
      "date": {
        "field": "timeMillis",
        "formats": ["UNIX_MS"]
      }
    }
  ]
}
This uses two processors: convert turns the numeric timeMillis value into a string, and date parses that string into @timestamp. My previously working filebeat.yml contains this:
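The relevant part looks roughly like this (a sketch, not the full file; the paths and hosts are placeholders for my actual values):

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/app/*.json        # placeholder path to the JSONLayout log files
    json.keys_under_root: true     # lift the JSON fields (timeMillis, level, ...) to the top level
    json.add_error_key: true       # flag events that fail JSON decoding

output.elasticsearch:
  hosts: ["elasticsearch:9200"]    # placeholder host
  pipeline: rename-log4j-timestamp # run each event through the ingest pipeline
```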
I am able to see the old logs. They are parsed correctly for Kibana; they just don't have the timestamps from the log entries, but rather the times at which Filebeat read them.
Thanks! That's a log entry generated when the pipeline entry is commented out in the filebeat.yml configuration. The logs flow in, but the timeMillis and @timestamp differ. The goal is that the @timestamp uses the log entry time instead of the time that Filebeat picked up the record from the file. But of course that time entry is a long representing milliseconds since epoch, hence the pipeline.
When I re-enable the pipeline and restart Filebeat, the entries in Kibana stop.
Since the pipeline works in the dev console simulator, there must be something I am doing wrong with the pipeline itself.
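For reference, the simulator call I used looks roughly like this (using a trimmed-down version of the sample event from above):

```
POST _ingest/pipeline/rename-log4j-timestamp/_simulate
{
  "docs": [
    {
      "_source": {
        "timeMillis": 1496816039789,
        "level": "DEBUG",
        "message": "Got ping response for sessionid: 0x15c71b4b03508ca after 0ms"
      }
    }
  ]
}
```

In the simulator output, the processed document comes back with an @timestamp field parsed from timeMillis, which is what I expect to happen on ingest.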
Do I have to send the data to a different node to have the ingest pipeline take effect? I tried it to the coordinator node as well as the ingest node and same results.
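For reference, this is roughly how I checked which nodes have the ingest role (i in the node.role column):

```
GET _cat/nodes?v&h=name,node.role
```

As far as I understand, the request does not have to hit an ingest node directly; Elasticsearch should forward it to one, as long as at least one node in the cluster has node.ingest: true (which is the default).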
I am not sure I understand. In the example you provided, the timeMillis field appears to correspond to the @timestamp field, as it is in UTC (at least according to currentmillis.com). What timestamp are you expecting for the example event?
That is very odd! I can't imagine I had a copy paste error or that the logger picked up the log at exactly the same millisecond, but you are right.
To confirm, I just went through them again, picked a random entry off the list, and updated the sample above. I also tested again that when the pipeline is enabled the log entries stop, and when the pipeline is disabled the log entries resume (but with the incorrect time values). Lastly, I confirmed that when I disabled the pipeline, the entries that swarmed in were also incorrect (I wanted to make sure there wasn't a backlog of correct log entries that was for some reason suddenly flowing in).
As you may have noticed, these nodes are docker/mesos containers, which are a little difficult to log in to, but I guess that's the next step, to see if there are logs there that indicate the problem.