Extract timestamp from the logline

I am trying to index log files into Elasticsearch. Every log entry ends up in a field named message, and the @timestamp field shows the time the entry was indexed rather than the timestamp from the log entry itself.

I created an ingest pipeline with a grok processor to define the pattern of the log entry. I have tried several patterns but am unable to get this working, mainly because I am new to grok.

All I want is to extract the timestamp from the log message; everything else can be ignored, wildcarded, or stored in a single field such as message.

Any help would be appreciated.

Log sample
2019-08-05 00:04:06 error [error.js]: No authorization token was found
2019-08-05 00:04:06 info [index.js]: Request: HTTP GET /
2019-08-05 00:04:06 error [error.js]: No authorization token was found

Ingest pipeline with grok & date processor
{
  "description": "Extracting date from log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{yyyy-mm-dd HH:mm:ss:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"]
      },
      "date": {
        "field": "logtime",
        "target_field": "@timestamp",
        "formats": ["yyyy-mm-dd HH:mm:ss"]
      }
    }
  ]
}

Justin ...

I ran into something similar trying to get JSON output from TShark into Elasticsearch ... for the life of me I couldn't get the timestamp to pull from the JSON log and translate properly ...

I found this article extremely helpful ...

https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#_typeless_apis_in_7_0

Here's what I did step by step to make the timestamp show correctly ...

  1. From the Kibana console I created the Elasticsearch index first ...
  2. From the console I then added a document to the new index using the "PUT" command and tagging the end with ?pipeline=<pipeline_name>
  3. Tested the display of the timestamp using Kibana ...
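
The steps above, as console commands, looked roughly like this for me (index name, pipeline name, and sample document are just placeholders, not the original poster's values):

```json
PUT my-index
{
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" }
    }
  }
}

PUT my-index/_doc/1?pipeline=my_pipeline
{
  "message": "2019-08-05 00:04:06 error [error.js]: No authorization token was found"
}
```

With the index and its date mapping in place first, the pipeline-processed timestamp displayed correctly for me in Kibana.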

I'm by no means an Elasticsearch/Kibana guru, but I wonder if what you're missing is creating the index first and properly defining the "timestamp" field in the index mapping prior to ingesting ... I ran into all sorts of issues until I created the index first ...

Some possible food for thought ... cheers!

Hi @michaelberg

Thanks for taking the time to respond. Yes, I did try creating an index before ingesting any data into it; the mapping is created and so on, but the entire log line is still indexed into a field named message.

That is when I realized that Filebeat cannot really process or parse the data, and I do not want to use Logstash for such a simple task. So I created an ingest pipeline with grok and date processors to extract just the timestamp and leave the rest of the log message in the message field.

I am using the _simulate pipeline API, which is a great feature for testing how the data is getting ingested and for investigating and fixing any issues.

POST _ingest/pipeline/redate/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /"
      }
    }
  ]
}

I get the error below:

"type": "exception",
"reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /]",

This clearly indicates that the grok pattern I am trying to use is not correct.
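
For anyone hitting the same wall: grok tokens take the form %{SYNTAX:field}, where SYNTAX must be the name of a defined pattern, so a literal date layout like yyyy-mm-dd HH:mm:ss is not valid in that position. If no built-in pattern fits, a custom one can be supplied through the grok processor's pattern_definitions option. A sketch (the LOGDATE name here is made up for illustration):

```json
"grok": {
  "field": "message",
  "patterns": ["%{LOGDATE:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"],
  "pattern_definitions": {
    "LOGDATE": "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}"
  }
}
```

For a standard layout like this one, though, a built-in pattern such as TIMESTAMP_ISO8601 should match without any custom definition.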

Just curious ... have you tried using the Grok Debugger in Kibana to test the grok pattern against a sample line from the log?

Yes I did, and I finally found a grok pattern that actually works!

PUT _ingest/pipeline/redate
{
  "description": "Extracting date from log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"]
      },
      "date": {
        "field": "logtime",
        "target_field": "@timestamp",
        "formats": ["yyyy-MM-dd HH:mm:ss", "ISO8601"]
      }
    }
  ]
}

Now I am able to simulate different log messages and get the desired result, but Filebeat does not seem to recognize the ingest pipeline. I am getting this error:

ERROR pipeline/output.go:121 Failed to publish events: temporary bulk send failure

In the Filebeat elasticsearch output configuration, I have the following:

output.elasticsearch:
    hosts: ["host"]
    index: "index-name-%{+yyyy.MM}"
    pipeline: "redate"

If I remove the pipeline setting from the config, the logs get indexed; if the pipeline entry is added to the Filebeat config, I get those errors. Any thoughts?

I made the change below and the log messages are now getting indexed. Although I do not understand how, I would appreciate it if someone could shed some light on it.

I had the pipeline: "pipelinename" setting in the Elasticsearch output section of the Filebeat config file. I moved that line to the filebeat.inputs section, right under the file path setting, like so:

filebeat.inputs:
- type: log
  paths:
    - D:\home\site\wwwroot\logs*.log
  pipeline: "redate"

And the log messages are getting indexed now.
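
Putting the two pieces together, the relevant parts of the working filebeat.yml end up looking roughly like this (the hosts value is the placeholder from the thread, not a real address):

```yaml
filebeat.inputs:
- type: log
  paths:
    - D:\home\site\wwwroot\logs*.log
  # per-input pipeline setting; applied when events are sent to Elasticsearch
  pipeline: "redate"

output.elasticsearch:
  hosts: ["host"]
  index: "index-name-%{+yyyy.MM}"
```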

Try taking the quotes off the name of the pipeline in the config file and retry ...