New fields not being created and grok help


#1

I'm very new to all this ELK stuff and I'm having a lot of trouble getting Logstash to parse my file. I followed a tutorial and managed to get Filebeat to send my logs to Logstash, and I can see them in Kibana. Now I'm trying to use grok to split out a bunch of items into their own fields, but nothing seems to be working.

The lines in my logs have trailing spaces after the words and then a tab character just before the next entry (and sometimes a given field may be empty). I've also got a few header lines at the top of the file which I need to exclude completely, but I have no clue how to do that (unless grok will handle it, although the very first line in the file may still be a match because it's a date and time).

Below is a sample line from the log along with the relevant .conf files for logstash. I can't figure out what I'm doing wrong. Can anyone help?

Time           Category       Severity       Entry type                                  Local time                UTC time                  Machine                 Application type        User                    Entity                                                Entity type                  Entity guid                               Details
3:14:34 PM     Security       Information    Entity created                              5/28/2015 3:14:34 PM      5/28/2015 7:14:34 PM      MYMACHINE               MyApp                                           1.1.1.1 - myentity                                    myentitytype                 {00000000-0000-0000-0007-0050F9100E7F}    My detials

in the "mypatterns" file
TIMESTAMP_12HOUR %{TIME} (AM|PM)
DATETIME_12HOUR %{DATE_US} %{TIME} (AM|PM)
MULTIWORD .[^\t]+

input {
  beats {
    port => 5046
    type => "mylog"
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

filter {
  if [type] == "mylog" {
    grok {
      patterns_dir => "/etc/logstash/mypatterns"
      match => { "message" => "%{TIMESTAMP_12HOUR:logtime} %{WORD:category} %{WORD:severity} %{MULTIWORD:type} %{DATETIME_12HOUR:localtime} %{DATETIME_12HOUR:utctime} %{HOSTNAME:hostname} %{MULTIWORD:apptype} %{USER:user} %{MULTIWORD:entity} %{MULTIWORD:entitytype} %{MULTIWORD:guid} %{GREEDYDATA:details}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{hostname}" ]
    }
    date {
      match => [ "logtime", "h:mm:ss a" ]
    }
    date {
      match => [ "localtime", "M/d/yyyy h:mm:ss a" ]
    }
    date {
      match => [ "utctime", "M/d/yyyy h:mm:ss a" ]
    }
  }
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}


(Ralph Lo) #2

Hi tweetybird,

I also ran into the same problem when I started with Logstash. Your match looks fine so far, except for using literal blanks (" ") between the predefined patterns.
If you use "\s" instead of blanks (" "), this should work. Remember that everything in the message field must match your patterns exactly.

So, for example, if your message field looks like this:
Time: 3:14:34 PM Category:Security Severity:Information Entry type:Entity created
and your patterns file looks like the one you've mentioned, the match setting in your filter should look like:

match => { "message" => "Time:\s%{TIMESTAMP_12HOUR:localtime}\sCategory:%{WORD:category}\sSeverity:%{WORD:severity}\sEntry\stype:%{GREEDYDATA:entry_type}" }

So your matching pattern starts with the word Time; since you want to filter that word out, you just write "Time:". The "\s" is a replacement for the whitespace/blank character. Then you save the time in the "localtime" field. This is followed by a whitespace again ("\s"). Your pattern continues with the word Category, so you type "Category:" and save the value in the "category" field; the same goes for "Severity:" and the "severity" field. Again a whitespace character ("\s"), followed by the "Entry type" pattern... and so on.

Hope this helps!

Cheers,
Ralph


#3

Hi Ralph,

Thanks for your help. Using your suggestions and the online grok debugger tools, I've managed to match part of my log line with the following pattern:

%{TIMESTAMP_12HOUR:logtime}\s\t%{WORD:category}\t%{WORD:severity}\t%{MULTIWORD:type}\s\t%{DATETIME_12HOUR:localtime}\s\t%{DATETIME_12HOUR:utctime}\s\t%{HOSTNAME:hostname}

However, the log entries after that aren't being picked up because there's a lot of whitespace (spaces and tabs) between the hostname and the next field with real data. Is there a trick to pick up/skip all that whitespace?

I also noticed my MULTIWORD:type may not be correct. It picks up the Start Logging text, but it also grabs all the spaces after it, and I can't figure out the correct regex to get just the two words and skip all the whitespace after them. Do you know the correct syntax for that?

Thanks


(Ralph Lo) #4

Hi tweetybird,

However, the log entries after that aren't being picked up because there's a lot of whitespace (spaces and tabs) between the hostname and the next field with real data. Is there a trick to pick up/skip all that whitespace?

The only solution that comes to my mind at the moment is the mutate filter. I've never used it before, but I've read that you can replace characters with it, so maybe it's a good approach to get rid of all the whitespace.
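Something along these lines might work — I haven't tested it, and it assumes the "|" character never shows up in your data:

filter {
  mutate {
    # Collapse each run of trailing spaces plus tab(s) between columns
    # into a single "|" delimiter, so grok can split on a literal pipe
    # instead of on variable-width whitespace.
    gsub => [ "message", " *\t[\t ]*", "|" ]
  }
}

The mutate filter would have to come before your grok filter in the config, and the grok pattern would then separate the fields with \| instead of spaces or tabs.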


Check out the following link:
http://grokdebug.herokuapp.com/

You could include an example of the line with the whitespace and the tabs if you don't get any further.
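For your MULTIWORD pattern picking up the trailing spaces, maybe try a custom pattern that only matches words separated by single spaces, so it stops before any trailing whitespace (again untested):

# In the patterns file: one or more words separated by single spaces;
# the match must end on a non-space character, so trailing whitespace
# between columns is excluded from the capture.
MULTIWORD \S+(?: \S+)*

With the separators then matched as \s*\t (or as a single delimiter after the mutate above), the trailing spaces should no longer end up in the captured field. If a field can be empty, you may also need to make the whole capture optional in the match, e.g. (?:%{MULTIWORD:apptype})?.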

Cheers,
Ralph


#5

The good news is I think I figured out a grok pattern that works; the bad news is I don't see any new fields in Kibana.

In Kibana, under the Discover tab, I see the new log entries, but when I expand one, all the extra fields I defined in my grok pattern are not there. Is there any way I can debug this to see why I'm not getting all the extra fields?
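One thing I'm going to try is running the filter in isolation with a stdin input and the rubydebug output, to see whether grok is actually matching — a failed match should tag the event with _grokparsefailure instead of adding the new fields. Something like this (a sketch, assuming a local Logstash install, with my pattern shortened):

# test.conf - paste a sample log line into stdin and inspect the event
input {
  stdin { }
}

filter {
  grok {
    patterns_dir => "/etc/logstash/mypatterns"
    # shortened for the test; the full pattern goes here
    match => { "message" => "%{TIMESTAMP_12HOUR:logtime}\s\t%{WORD:category}\t%{WORD:severity}" }
  }
}

output {
  stdout { codec => rubydebug }
}

Then run it with bin/logstash -f test.conf and paste in a line from the log.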


#6

Edit: removed. Figured out the fields and duplication stuff. Think I'm OK now.

