I am trying to index log files into Elasticsearch. All the log entries are being indexed into a field named message, and the @timestamp field shows the time the entry was indexed rather than the timestamp from the log entry.
I created an ingest pipeline with a grok processor to define the pattern of the log entries. I have tried several patterns and am unable to get this working, particularly because I am new to grok.
All I want is the ability to extract the timestamp from the log message; everything else can be ignored, wildcarded, or stored in a single field like message. Any help would be appreciated.
Log sample:
2019-08-05 00:04:06 error [error.js]: No authorization token was found
2019-08-05 00:04:06 info [index.js]: Request: HTTP GET /
2019-08-05 00:04:06 error [error.js]: No authorization token was found
I ran into something similar trying to get JSON output from TShark into Elasticsearch ... for the life of me I couldn't get the timestamp to pull from the JSON log and translate properly ...
Here's what I did step by step to make the timestamp show correctly ...
1. From the Kibana console, I created the Elasticsearch index first ...
2. From the console, I then added a document to the new index using the "PUT" command, tagging the end with ?pipeline=<pipeline_name> ...
3. Tested the display of the timestamp using Kibana ...
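Roughly what those steps looked like from the console ... the index name, mapping, and document here are just illustrative, adjust them to your data ...

PUT tshark-logs
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" }
    }
  }
}

PUT tshark-logs/_doc/1?pipeline=<pipeline_name>
{
  "timestamp": "2019-08-05T00:04:06"
}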
I'm by no means an Elasticsearch/Kibana guru, but I wonder if what you're missing is creating the index first and properly defining the "timestamp" field in the index prior to ingesting ... I ran into all sorts of issues until I created the index first ...
Thanks for taking the time to respond. Yes, I did try creating an index before ingesting any data into it; the mapping is created and so on, but the entire log line is indexed into a field named message.
That is when I realized that Filebeat cannot really process or parse the data on its own. I do not want to use Logstash for such a simple task, so I created an ingest pipeline with grok and date processors to extract just the timestamp and leave the rest of the log message in the message field.
I am using the pipeline _simulate API, which is a great feature for testing how the data will be ingested and for investigating and fixing any issues.
POST _ingest/pipeline/redate/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /"
      }
    }
  ]
}
I get the error below:
"type": "exception",
"reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [2019-08-04 12:02:39 info [index.js]: Request: HTTP GET /]",
This clearly indicates that the grok pattern I am trying to use is not correct.
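After more iteration, a pipeline along these lines does match the sample (a sketch; log_timestamp is just my name for the captured field, and the remove processor is optional):

PUT _ingest/pipeline/redate
{
  "description": "copy the timestamp from the log line into @timestamp",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:log_timestamp} %{GREEDYDATA:message}"]
      }
    },
    {
      "date": {
        "field": "log_timestamp",
        "formats": ["yyyy-MM-dd HH:mm:ss"]
      }
    },
    {
      "remove": {
        "field": "log_timestamp"
      }
    }
  ]
}

The date processor writes to @timestamp by default, and the grok processor captures the remainder of the line back into message.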
Now I am able to simulate different log messages and get the desired result, but Filebeat is unable to recognize the ingest pipeline. I am getting this error:
ERROR pipeline/output.go:121 Failed to publish events: temporary bulk send failure
In the Filebeat elasticsearch output configuration, I have the pipeline setting alongside the hosts.
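A sketch of that section (the host value is a placeholder; the pipeline name is my ingest pipeline):

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "pipelinename"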
If I remove the pipeline from the config, the logs get indexed; if the pipeline entry is added to the Filebeat config, I get those errors. Any thoughts?
I made the change below and the log messages are now getting indexed. Although I do not understand why this works, I would appreciate it if someone could shed some light on it.
I had the pipeline: "pipelinename" setting in the Elasticsearch output section of the Filebeat config file. I moved that line to the filebeat.inputs section, right under the file path setting, like so:
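A sketch of the input section after the move (the log path is a placeholder):

filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/*.log
  pipeline: "pipelinename"

As far as I can tell, the pipeline option is accepted both at the output level and per input, so I am still not sure why only the input-level setting worked for me.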