Need help with grok and the date filter for parsing logs

I am trying to parse logs from a CSV file.

Issue 1: I am using the conf file mentioned below.

input {
  file {
    path => "/tmp/test1.csv"
    type => "core2"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ";"
    columns => ["Time","Source","Status","Severity","Location","ConfigItem","Alert","Message1","Message2"]
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "stock"
    workers => 1
  }
  stdout {}
}

Let me know, is this correct? If yes, how do I get the date handled? I tried using the date filter but was unable to get it to work. Please find the sample data below:

Time=Monday, May 01, 2017 1:48 AM

What format should I use for this kind of date?
Thanks in advance

Use a date filter to parse the timestamp. A pattern like "EEEE, MMMM dd, yyyy h:mm a" should work for that sample.
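For example, something along these lines in your filter block (just a sketch, assuming the field from your csv filter is named Time):

date {
  match => [ "Time", "EEEE, MMMM dd, yyyy h:mm a" ]
}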

Hi Magnus,

Thanks for replying!

I am able to parse the logs with the help of the grok filter, but the parsed fields are not displayed correctly in Kibana.
For example, when I parse "NODEDOWN ALERT", the visualization shows one bar for "NODEDOWN" and another bar for "ALERT". How can I sort this out?

Regards,
Gaurav Singh

This is because the field in Elasticsearch is analyzed. You need to adjust the index template used so that the field in question (or maybe all string fields?) is a non-analyzed string field (aka a keyword field as of ES 5.0).

I got something like "Indexes imported from 2.x do not support keyword. Instead they will attempt to downgrade keyword into string." It seems like I need to move to Elasticsearch 5.x.

Is there any option to change the string to a keyword, as I don't want to move to Elasticsearch 5.x? (Currently using 2.x.)

Make sure the string field is set as not_analyzed. That's equivalent to 5.x's keyword type.

And how can I do that? I tried to find it but had no luck. Is there a specific option I need to look at for this?

See https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html#_index_2 for an example of how to set a string field as not_analyzed.
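On 2.x this is done in the mapping when the index (or an index template) is created; an existing field can't be switched from analyzed to not_analyzed in place, so you would have to reindex. A rough sketch, using the stock index, core2 type, and alert field from your configuration:

curl -XPUT 'http://localhost:9200/stock' -d '{
  "mappings": {
    "core2": {
      "properties": {
        "alert": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'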

I am unable to write a grok filter for the data below, as the values are not consistent:

Time=Monday, May 01, 2017 3:12 AM;Source=NPM;Status=New;Severity=Critical;Location=SYD-TO;ConfigItem=elgar-mt.iii.com aws;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 3:12 AM;Source=NPM;Status=New;Severity=Critical;Location=SYD-TO;ConfigItem=asia-spacewalk.iii.com;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 3:12 AM;Source=NPM;Status=New;Severity=Critical;Location=SIN-TO;ConfigItem=sinsierra-mt.iii.com;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 3:14 AM;Source=NPM;Status=New;Severity=Warning;Location=JEM-PD;ConfigItem=bamboo;Alert=Diskspace;Message1=/;Message2=90 %
Time=Monday, May 01, 2017 4:09 AM;Source=NPM;Status=New;Severity=Critical;Location=SYRDC-POL;ConfigItem=scottsdaledc1;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 4:13 AM;Source=NPM;Status=New;Severity=Critical;Location=SYRDC-POL;ConfigItem=sbcldc1;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 4:13 AM;Source=NPM;Status=New;Severity=Critical;Location=SYRDC-POL;ConfigItem=chandlerdc1;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 4:19 AM;Source=NPM;Status=New;Severity=Critical;Location=SYRDC-POL;ConfigItem=tempedc1;Alert=Reboot;Message1=LastBootTimeChanged
Time=Monday, May 01, 2017 4:22 AM;Source=NPM;Status=New;Severity=Critical;Location=JEM-PD;ConfigItem=bamboo;Alert=Diskspace;Message1=/;Message2=100 %

As you can see, the Message1 and Message2 values are not consistent.

I am using this grok filter:

Time=%{GREEDYDATA:time};Source=%{DATA:source};Status=%{DATA:status};Severity=%{DATA:severity};Location=%{DATA:location};ConfigItem=%{DATA:configitem};Alert=%{DATA:alert};Message1=%{DATA:message1}(;)?{Message2=%{INT:message2}}?

Kindly suggest.

For this type of data, consider using the kv filter with ; as field_split instead of the grok filter. Grok is a very powerful filter, but not necessarily the ideal tool for all types of data.

Is this correct? Kindly suggest whether it will work or not.

kv {
  field_split => ";"
  value_split => "="
}

Looks OK to me, but try it to find out for sure.

Able to parse the logs, thanks!
Now, could you please let me know how to set the correct timestamp? It is taking the default timestamp, not the timestamp from the logs. Do I need to use the date filter for that, and if so, how?

Yes, use the date filter. Please read the documentation and ask if you have any specific problems.

I am using the filter below, but the timestamp issue is still the same:

kv {
  field_split => ";"
  value_split => "="
}
date {
  match => [ "Time", "EEEE, MMMM dd, yyyy HH:mm a", "EEEE, MMMM dd,yyyy H:mm a" ]
  target => "@timestamp"
}

Given the data shared above, is this the right approach?

Please show an example event as produced by Logstash. Copy/paste from the JSON tab of Kibana's Discover panel or use a stdout { codec => rubydebug } output.

Observations: the sample data uses a 12-hour clock with an AM/PM marker ("3:12 AM"), but HH and H in your patterns are the 24-hour hour of day. With an AM/PM marker you want h:mm a, and the second pattern in the match array is then redundant.
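Something like this should be closer (a sketch based on the sample lines above; verify against your actual events):

date {
  match => [ "Time", "EEEE, MMMM dd, yyyy h:mm a" ]
  target => "@timestamp"
}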

Thank you very much, everything is perfect now!
I really appreciate the help!

Well, now I am getting a timezone difference, even though the time itself is correct.
When testing with the Elasticsearch command curl -X GET http://localhost:9200/index/_search?pretty it shows the correct timestamp, matching the logs, but when visualizing the data in Kibana the timezone is changed.
Is there any setting I need to work on? (By the way, I am running all of this in Docker, just FYI.)

ES stores timestamps in UTC and Kibana adjusts them for the browser's timezone. What's the timezone of the timestamps in the logs? Please give an example input string and how it's stored in ES (use the JSON tab in Kibana's Discover panel so we can see the raw JSON document).
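If the timestamps in the logs are in a local timezone rather than UTC, the date filter's timezone option tells Logstash how to interpret them before they are stored, roughly like this (the zone below is only a placeholder; use the one your logs are actually written in):

date {
  match => [ "Time", "EEEE, MMMM dd, yyyy h:mm a" ]
  timezone => "Asia/Kolkata"  # placeholder; replace with the logs' actual timezone
}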