Extract timestamp from the logline

I am trying to index log files into Elasticsearch. Every log entry ends up in a field named message, and the @timestamp field shows the time the entry was indexed rather than the timestamp from the log entry itself.

I created an ingest pipeline with a grok processor to define the pattern of the log entry. I have tried several patterns but am unable to get this working, mainly because I am new to grok.

All I want is to extract the timestamp from the log message; everything else can be ignored, wildcarded, or stored in a single field such as message.

Any help would be appreciated.

Log sample
2019-08-05 00:04:06 error [error.js]: No authorization token was found
2019-08-05 00:04:06 info [index.js]: Request: HTTP GET /
2019-08-05 00:04:06 error [error.js]: No authorization token was found

Ingest pipeline with grok & date processor
{
  "description": "Extracting date from log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{yyyy-mm-dd HH:mm:ss:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"]
      },
      "date": {
        "field": "logtime",
        "target_field": "@timestamp",
        "formats": ["yyyy-mm-dd HH:mm:ss"]
      }
    }
  ]
}

Justin ...

I ran into something similar trying to get JSON output from TShark into Elasticsearch ... for the life of me I couldn't get the timestamp to pull from the JSON log and translate properly ...

I found this article extremely helpful ...

https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#_typeless_apis_in_7_0

Here's what I did step by step to make the timestamp show correctly ...

  1. From the Kibana console I created the Elasticsearch index first ...
  2. From the console I then added a document to the new index using the "PUT" command and tagging the end with ?pipeline=<pipeline_name>
  3. Tested the display of the timestamp using Kibana ...
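
The steps above, as console commands, looked roughly like this for me (index name, pipeline name, and sample document are just placeholders, not the original poster's values):

```json
PUT my-index
{
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" }
    }
  }
}

PUT my-index/_doc/1?pipeline=my_pipeline
{
  "message": "2019-08-05 00:04:06 error [error.js]: No authorization token was found"
}
```

With the index and its date mapping in place first, the pipeline-processed timestamp displayed correctly for me in Kibana.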

I'm by no means an Elasticsearch/Kibana guru, but I wonder if what you're missing is creating the index first and properly defining the "timestamp" field in the index mapping prior to ingesting ... I ran into all sorts of issues until I created the index first ...

Some possible food for thought ... cheers!

Hi @michaelberg

Thanks for taking the time to respond. Yes, I did try creating an index before ingesting any data into it; the mapping is created and so on, but the entire log line is still indexed into a field named message.

That is when I realized that Filebeat cannot really process or parse the data, and I do not want to use Logstash for such a simple task. So I created an ingest pipeline with grok and date processors to extract just the timestamp and leave the rest of the log message in the message field.

I am using the _simulate pipeline API, which is a great feature for testing how the data is getting ingested and for investigating and fixing any issues.

POST _ingest/pipeline/redate/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /"
      }
    }
  ]
}

I get the error below:

"type": "exception",
"reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /]",

This clearly indicates that the grok pattern I am trying to use is not correct.
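
For anyone hitting the same wall: grok tokens take the form %{SYNTAX:field}, where SYNTAX must be the name of a defined pattern, so a literal date layout like yyyy-mm-dd HH:mm:ss is not valid in that position. If no built-in pattern fits, a custom one can be supplied through the grok processor's pattern_definitions option. A sketch (the LOGDATE name here is made up for illustration):

```json
"grok": {
  "field": "message",
  "patterns": ["%{LOGDATE:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"],
  "pattern_definitions": {
    "LOGDATE": "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}"
  }
}
```

For a standard layout like this one, though, a built-in pattern such as TIMESTAMP_ISO8601 should match without any custom definition.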

Just curious ... have you tried using the Grok Debugger in Kibana to test the grok pattern against a sample line from the log?

Yes I did, and I finally found a grok pattern that actually works!

PUT _ingest/pipeline/redate
{
  "description": "Extracting date from log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:logtime} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"]
      },
      "date": {
        "field": "logtime",
        "target_field": "@timestamp",
        "formats": ["yyyy-MM-dd HH:mm:ss", "ISO8601"]
      }
    }
  ]
}

Now I am able to simulate different log messages and get the desired result, but Filebeat does not seem to recognize the ingest pipeline. I am getting this error:

ERROR pipeline/output.go:121 Failed to publish events: temporary bulk send failure

In the Filebeat elasticsearch output configuration, I have the following:

output.elasticsearch:
    hosts: ["host"]
    index: "index-name-%{+yyyy.MM}"
    pipeline: "redate"

If I remove the pipeline setting from the config, the logs get indexed; if the pipeline entry is added to the Filebeat config, I get those errors. Any thoughts?

I made the change below and the log messages are now getting indexed. Although I do not understand how, I would appreciate it if someone could shed some light on it.

I had the pipeline: "pipelinename" setting in the Elasticsearch output section of the Filebeat config file. I moved that line to the filebeat.inputs section, right under the file path setting, like so:

filebeat.inputs:
- type: log
  paths:
    - D:\home\site\wwwroot\logs*.log
  pipeline: "redate"

And the log messages are getting indexed now.
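
Putting the two pieces together, the relevant parts of the working filebeat.yml end up looking roughly like this (the hosts value is the placeholder from the thread, not a real address):

```yaml
filebeat.inputs:
- type: log
  paths:
    - D:\home\site\wwwroot\logs*.log
  # per-input pipeline setting; applied when events are sent to Elasticsearch
  pipeline: "redate"

output.elasticsearch:
  hosts: ["host"]
  index: "index-name-%{+yyyy.MM}"
```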

Try taking the quotes off the name of the pipeline in the config file and retry ...