I have the following case. I have set up Filebeat to send logs straight to Elasticsearch, since there is no need for significant log parsing. I am receiving logs from different services, and the date format in the message field is different for each service. I'm using version 5.6.
I have set up a pipeline with grok and date processors like this:
PUT _ingest/pipeline/test_pipeline
{
  "description": "timestamp test",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{SYSLOGTIMESTAMP:systime}", "%{EXIM_DATE:eximdate}"]
      },
      "date": {
        "field": "systime",
        "formats": ["MMM dd HH:mm:ss"],
        "ignore_failure": true
      },
      "date": {
        "field": "eximdate",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "ignore_failure": true
      },
      "remove": {
        "field": ["systime", "eximdate"],
        "ignore_failure": true
      }
    }
  ]
}
I have checked the patterns individually and they both work. The issue is that when I simulate the pipeline with a message of each type, the @timestamp field is updated only for the second pattern.
Here are the messages I am using to simulate the pipeline:
- First pattern
POST _ingest/pipeline/test_pipeline/_simulate
{
  "docs": [
    {
      "_index": "index",
      "_source": {
        "message": "Sep 24 12:17:01 ubuntu CRON[28771]: pam_unix(cron:session): session closed for user root"
      }
    }
  ]
}
- Second pattern
POST _ingest/pipeline/test_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2018-09-24 10:26:17 status unpacked apache2-utils:amd64 2.4.18-2ubuntu3.9"
      }
    }
  ]
}
Here are the outputs for both types of messages:
- First pattern simulation output
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "message": "Sep 24 12:17:01 ubuntu CRON[28771]: pam_unix(cron:session): session closed for user root"
        },
        "_ingest": {
          "timestamp": "2018-09-25T12:41:42.843Z"
        }
      }
    }
  ]
}
- Second pattern simulation output
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_type": "_type",
        "_id": "_id",
        "_source": {
          "@timestamp": "2018-09-24T10:26:17.000Z",
          "exim_month": "09",
          "exim_day": "24",
          "exim_time": "10:26:17",
          "exim_year": "2018",
          "message": "2018-09-24 10:26:17 status unpacked apache2-utils:amd64 2.4.18-2ubuntu3.9"
        },
        "_ingest": {
          "timestamp": "2018-09-25T12:42:46.070Z"
        }
      }
    }
  ]
}
You can clearly see that the @timestamp field shows up only for the second kind of pattern. Yet each pattern works perfectly fine when it is the only pattern and there is only one date processor in the pipeline.
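One thing that may be worth checking: in the pipeline above, all four processors sit inside a single JSON object, so the key "date" appears twice in that object. Many JSON parsers keep only the last value for a duplicated key, which would leave just the eximdate date processor and would match the output shown. The following is a minimal sketch of the request body for PUT _ingest/pipeline/test_pipeline with each processor in its own object. It assumes the duplicate key is the cause, which is not confirmed here; the patterns, formats, and field names are unchanged from the pipeline above, and only the "description" text is altered to mark it as a sketch.

```json
{
  "description": "timestamp test (sketch: one processor per object, assumes duplicate 'date' keys were the problem)",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{SYSLOGTIMESTAMP:systime}", "%{EXIM_DATE:eximdate}"]
      }
    },
    {
      "date": {
        "field": "systime",
        "formats": ["MMM dd HH:mm:ss"],
        "ignore_failure": true
      }
    },
    {
      "date": {
        "field": "eximdate",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": ["systime", "eximdate"],
        "ignore_failure": true
      }
    }
  ]
}
```

Running the same two _simulate requests against this version would show whether @timestamp is then set for the SYSLOGTIMESTAMP message as well.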