_grokparsefailure when a field is not always in the message

Hi

I have a log that sometimes shows an IP address at the end of the line. I have successfully created a filter to isolate the IP address, but when the log line does not include it I get a _grokparsefailure. I need the IP address to be extracted when it does appear, so what is the most efficient way to stop the filter from failing when it is absent?

The filter succeeds on the following line:

INFO UserSession [Session.4502957:Default Site:user] user logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:48110

The filter fails on the following line:

INFO SSHAuthService [Session.4502956:Default Site:user] User user can retry authentication

.conf file

input {
    beats {
        port => "5044"
    }
}

filter {
    if [type] == "diagnostic" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:diagstamp}%{SPACE}%{CISCO_REASON:info}%{SYSLOG5424SD:session}%{SPACE}%{USERNAME:userid}%{SPACE}%{GREEDYDATA:message}%{IPV4:client}" }
        }
        date {
           match => ["diagstamp" , "yyyy-MM-dd HH:mm:ss,SSS"]
           target => ["@timestamp"]
        }
    }
}

output {
    if [type] == "diagnostic" {
        elasticsearch {
            hosts => ["10.xxx.xxx.xxx:9200"]
            manage_template => false
            index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
    }
}

Just make the IP address optional with e.g. (%{IPV4:client})?.
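For example, the end of the expression could look like this (a sketch based on your pattern; the literal space stays outside the optional group, which forces the greedy %{GREEDYDATA} to backtrack to the last space so the group can match the address when it is present):

%{GREEDYDATA:message} (%{IPV4:client})?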

Hi Magnus

That works when I run the pattern in the Grok Debugger, but when I add the filter to the .conf file it produces a duplicated entry at the end of the line, so I am guessing I have a syntax error somewhere in the filter.

Input

2017-09-15 06:31:16,728 INFO UserSession [Session.4502959:Default Site:user] user logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167

Output

2017-09-15 06:31:16,728 INFO UserSession [Session.4502959:Default Site:user] user logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167 , logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167

.conf file

input {
    beats {
        port => "5044"
    }
}

filter {
    if [type] == "diagnostic" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:diagstamp}%{SPACE}%{CISCO_REASON:info}%{SYSLOG5424SD:session}%{SPACE}%{USERNAME:userid}%{SPACE}%{GREEDYDATA:message} (%{IPV4:client}:%{POSINT:port})?" }
        }
        date {
           match => ["diagstamp" , "yyyy-MM-dd HH:mm:ss,SSS"]
           target => ["@timestamp"]
        }
    }
}

output {
    if [type] == "diagnostic" {
        elasticsearch {
            hosts => ["10.xxx.xxx.xxx:9200"]
            manage_template => false
            index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
    }
}

it produces a duplicated entry at the end of the line

I don't understand. Show what you get, don't describe it.

What do you mean by the input and output? The lines are identical and both look like inputs.

If you look at the output you will see that the end of the line is duplicated, which I showed in the previous reply. I have included the JSON document from Elasticsearch in case that helps. The line of text in the source log is not like that.


2017-09-15 06:31:16,728 INFO UserSession [Session.4502959:Default Site:user] user logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167 , logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167

{
  "_index": "filebeat-2017.09.14",
  "_type": "diagnostic",
  "_id": "AV6Yo-KRTNMqag3Z2ImC",
  "_version": 1,
  "_score": null,
  "_source": {
    "offset": 836,
    "session": "[Session.4502959:Default Site:user]",
    "input_type": "log",
    "source": "C:\\Test\\diagnostic.log",
    "message": [
      "2017-09-15 06:31:16,728 INFO UserSession [Session.4502959:Default Site:user] user logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167 ",
      "logged into Default Site with protocol SFTP on 10.xxx.xxx.xxx:22 from 10.xxx.xxx.xxx:52167"
    ],
    "type": "diagnostic",
    "userid": "user",
    "tags": [
      "Aust_Melb",
      "beats_input_codec_plain_applied"
    ],
    "@timestamp": "2017-09-14T20:31:16.728Z",
    "@version": "1",
    "beat": {
      "hostname": "FTP1",
      "name": "FTP1",
      "version": "5.5.2"
    },
    "host": "FTP1",
    "diagstamp": "2017-09-15 06:31:16,728",
    "fields": {
      "test": "test",
      "hosts": [
        "localhost:9200"
      ]
    },
    "info": "INFO UserSession "
  },
  "fields": {
    "@timestamp": [
      1505421076728
    ]
  },
  "sort": [
    1505421076728
  ]
}

It looks like your grok pattern is writing to the existing message field. Does the behaviour change if you use the overwrite parameter?
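Something like this, perhaps (a sketch of just the grok block; the overwrite option tells grok to replace the listed fields instead of adding a second value to them, which is what turns message into an array):

grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:diagstamp}%{SPACE}%{CISCO_REASON:info}%{SYSLOG5424SD:session}%{SPACE}%{USERNAME:userid}%{SPACE}%{GREEDYDATA:message} (%{IPV4:client}:%{POSINT:port})?" }
    overwrite => ["message"]
}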

I will give that a go and let you know. The strange thing is that the filter works in the Grok Debugger.

It looks like the pattern is capturing the correct part; I am just not sure whether it is appending to the existing message field instead of overwriting it.
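If that is the case, another option might be to capture into a differently named field so there is no collision with the incoming message field at all, for example (a sketch; msg_text is just an illustrative name):

grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:diagstamp}%{SPACE}%{CISCO_REASON:info}%{SYSLOG5424SD:session}%{SPACE}%{USERNAME:userid}%{SPACE}%{GREEDYDATA:msg_text} (%{IPV4:client}:%{POSINT:port})?" }
}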
