Logstash filter not working as expected


(raghu) #1

I am new to ELK and trying to learn it.

I have given the following filter file to Logstash to get log details from the application server.
These are the paths I am taking the logs from: /opt/tomcat7/apache-tomcat-7.0.77/logs/catalina.out and /var/log/syslog.

In Kibana I can see the timestamp as the time when Logstash pulled the logs from those files, but I want to replace it with the actual timestamp of the log. Please suggest the changes that need to be made to my filters.

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      # add_field => [ "received_at", "%{@timestamp}" ]
      # add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_access" {
    grok {
      match => { "message" => ["%{COMBINEDAPACHELOG}", "%{IPORHOST:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\" %{NOTSPACE:response} (?:%{NOTSPACE:bytes})"] }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_error" {
    grok {
      match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_errormessage}:)?( \[client %{IPORHOST:client}:%{POSINT:clientport}\])? %{DATA:errorcode}: %{GREEDYDATA:message}" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_sslrequest" {
    grok {
      match => { "message" => "\[%{HTTPDATE:timestamp}\] %{IPORHOST:client} %{NOTSPACE:protocol} %{NOTSPACE:cipher} \"(%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\"" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
}


(Magnus Bäck) #2

The patterns in your date filters are wrong, except maybe in the syslog case. Inspect an example timestamp field of each log type and try to adjust the date filter so the pattern matches the data. Right now there are some very obvious mismatches.
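To give one hypothetical illustration (untested sketch, reusing the timestamp field name from your config): an HTTPDATE-style timestamp such as 10/May/2017:09:27:59 +0000 does not match "MMM d HH:mm:ss" at all; it would need something along these lines:

```conf
# Sketch only: HTTPDATE timestamps (e.g. "10/May/2017:09:27:59 +0000")
# need a matching Joda pattern, not "MMM d HH:mm:ss":
date {
  match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  target => "@timestamp"
}
```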

The Logstash documentation contains examples of both syslog and HTTP log parsing.


(raghu) #3

@magnusbaeck

Thanks for replying.

Even in the syslog case I see the same difference. Can you share the link to the Logstash docs?


(Magnus Bäck) #4

Please show the contents of the timestamp field. Please use copy/paste, no screenshots. Use the JSON tab in Kibana.

See https://www.elastic.co/guide/en/logstash/current/config-examples.html.


(raghu) #5

@magnusbaeck

May 10th 2017, 10:26:01.167 ---- this is the timestamp I am able to see in Kibana.

May 10, 2017 4:56:00 AM org.apache.catalina.startup.Catalina start ---- here is the actual time when the log was generated, along with the message.


(raghu) #6

@magnusbaeck

{
  "_index": "filebeat-2017.05.10",
  "_type": "apache_sslrequest",
  "_id": "AVvwt_eVag6JaBoW8duC",
  "_score": null,
  "_source": {
    "message": "May 10, 2017 4:56:00 AM org.apache.catalina.startup.Catalina start",
    "@version": "1",
    "@timestamp": "2017-05-10T04:56:01.167Z",
    "source": "/opt/tomcat7/apache-tomcat-7.0.77/logs/catalina.out",
    "count": 1,
    "fields": null,
    "offset": 230233,
    "type": "apache_sslrequest",
    "input_type": "log",
    "beat": {
      "hostname": "deploymnetvm",
      "name": "deploymnetvm"
    },
    "host": "deploymnetvm",
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure"
    ]
  },
  "fields": {
    "@timestamp": [
      1494392161167
    ]
  },
  "highlight": {
    "message": [
      "@kibana-highlighted-field@May@/kibana-highlighted-field@ 10, 2017 4:56:00 AM org.apache.catalina.startup.Catalina start"
    ]
  },
  "sort": [
    1494392161167
  ]
}


(Magnus Bäck) #7

Where's the timestamp field that you're trying to parse with your date filter?


(raghu) #8

@magnusbaeck

I found the above JSON from one of the logs on the Kibana Discover page.

Do you mean the Kibana logs on the ELK server?

I am very new to this. Please help me understand where I can find the required timestamp details to share with you.


(Magnus Bäck) #9

The idea is that you use the grok filter to extract pieces of a field into fields of their own. That part seems to be working. The resulting fields can then be processed by other filters, like the date filter. If you look at your configuration, you'll note that you've configured your date filter(s) to parse a field named timestamp, but when looking at an actual event there is no timestamp field (or any other field that contains only the timestamp to parse).
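As a rough sketch (field names are illustrative), the hand-off works like this: grok names the field it captures, and the date filter must reference that exact name:

```conf
grok {
  # Extracts the leading timestamp into a field named "syslog_timestamp"
  match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{GREEDYDATA:rest}" }
}
date {
  # Must refer to the same field name that grok created above
  match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
  target => "@timestamp"
}
```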


(raghu) #10

@magnusbaeck

Here is an actual sample log that was generated in the syslog file.

May 10 09:27:59 deploymnetvm ansible-copy: Invoked with src=/home/devopsvm/.ansible/tmp/ansible-tmp-1494408474.42-15265948874711/source directory_mode=None force=True remote_src=None unsafe_writes=None selevel=None seuser=None setype=None group=None content=NOT_LOGGING_PARAMETER dest=/opt/tomcat7/apache-tomcat-7.0.77/webapps/ serole=None original_basename=tudu-dwr.war delimiter=None mode=None regexp=None owner=None follow=False validate=None attributes=None backup=False

If this is not what you are looking for, please let me know where I can look and what changes, if any, I need to make to get the actual log time into the timestamp field on the Kibana web page.

Does this timestamp field need to be invoked anywhere to get the correct time of the log event?

I don't know what I am missing here.


(Magnus Bäck) #11

You already have a grok filter to parse syslog events, but it's extracting the timestamp to the syslog_timestamp field. As I said earlier that's not the field you've configured the date filter to parse.
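In other words, a minimal fix for the syslog case is to point the date filter at the field your grok filter actually creates:

```conf
date {
  # "syslog_timestamp" is the field produced by your syslog grok filter
  match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
  target => "@timestamp"
}
```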


(raghu) #12

@magnusbaeck

Got it. So in the filter I am changing it to timestamp. Please help and make the changes you feel are needed to get the log timestamp.

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      # add_field => [ "received_at", "%{@timestamp}" ]
      # add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_access" {
    grok {
      match => { "message" => ["%{COMBINEDAPACHELOG}", "%{IPORHOST:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\" %{NOTSPACE:response} (?:%{NOTSPACE:bytes})"] }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_error" {
    grok {
      match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\]( \(%{POSINT:proxy_errorcode}\)%{DATA:proxy_errormessage}:)?( \[client %{IPORHOST:client}:%{POSINT:clientport}\])? %{DATA:errorcode}: %{GREEDYDATA:message}" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
  if [type] == "apache_sslrequest" {
    grok {
      match => { "message" => "\[%{HTTPDATE:timestamp}\] %{IPORHOST:client} %{NOTSPACE:protocol} %{NOTSPACE:cipher} \"(%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\"" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
}


(Magnus Bäck) #13

I don't have time to fix your configuration for you but I can provide pointers.

For each message type, look at the event in Kibana and make sure that a) you are extracting a timestamp field, and b) the pattern in the corresponding date filter matches that field's contents. It should be quite obvious that they currently don't match.
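For instance, the Catalina line you pasted earlier ("May 10, 2017 4:56:00 AM ...") has no grok pattern in your config at all (hence the _grokparsefailure tag on that event), so as an untested sketch you would need something like:

```conf
grok {
  # Sketch only: capture the leading Tomcat/Catalina timestamp into its own field
  match => { "message" => "(?<timestamp>%{MONTH} %{MONTHDAY}, %{YEAR} %{TIME} (?:AM|PM)) %{GREEDYDATA:log_message}" }
}
date {
  # Joda-style pattern matching "May 10, 2017 4:56:00 AM"
  match => [ "timestamp", "MMM d, yyyy h:mm:ss a", "MMM dd, yyyy h:mm:ss a" ]
  target => "@timestamp"
}
```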


(system) #14

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.