Date filter does not assign value to target field


(Stefan Amshey) #1

Hi - I'm having some trouble using a date filter in Logstash. I am able to parse the date elements from the original log message into a custom field, but the date filter does not assign the value to the @timestamp field, and I don't understand why not. I'm not getting _dateparsefailure tags in the output either, so I'm befuddled.

Here's my logstash configuration:

input {
  file {
    path => "/home/netaffx/wl-proxy.*.log"
    type => "wl-proxy-log"
    start_position => "beginning"
  }
}

filter {
  if [type] == "wl-proxy-log" {
    grok {
      match => {
        "message" => [
          "%{DATA}\s+\[%{WORD:method}\s+%{URIPATH:uri_path}%{URIPARAM:uri_params}?\s+%{URIPROTO:uri_proto}/%{BASE10NUM:proto_version}\]\s+%{GREEDYDATA}\n?",
          "(%{DAY}\s+)?(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+(?<log_message>ap_unescape_url return success. Using tempQueryStr =\s+{GREEDYDATA:uri_params})\n?",
          "(%{DAY}\s+)?(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+%{GREEDYDATA:log_message}\n?",
          "(%{DAY}\s+)?(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+%{DATA}\s+\[%{URIPATH}%{URIPARAM:uri_params}\]\n?"
        ]
      }
      break_on_match => true
    }
    date {
      match => [ "syslog_datetime[0]", "MMM dd HH:mm:ss yyyy", "MMM  d HH:mm:ss yyyy" ]
      locale => "en-US"
      timezone => "America/Los_Angeles"
      target => "@timestamp"
    }
    kv {
      source => "uri_params"
      field_split => "&?"
    }
  }
}

output {
  if [type] == "wl-proxy-log" {
    elasticsearch {
      ssl => false
      index => "wl-proxy"
      manage_template => false
    }
  }
}

Here is the JSON that results from that:

{
  "_index": "wl-proxy",
  "_type": "wl-proxy-log",
  "_id": "gL7He2ABNapjGXAZJiUn",
  "_version": 1,
  "_score": null,
  "_source": {
    "path": "/home/netaffx/wl-proxy.ws8-prod.colo.log",
    "@timestamp": "2017-12-22T01:12:16.713Z",
    "syslog_datetime": "Sep 29 16:23:40 2016",
    "@version": "1",
    "host": "na6.affymetrix.com",
    "session_id": "6249147519142051",
    "log_message": "request [/publications/full_list.affx?year=2006&result_page=23] processed sucessfully..................",
    "message": "Thu Sep 29 16:23:40 2016 <6249147519142051> request [/publications/full_list.affx?year=2006&result_page=23] processed sucessfully..................",
    "type": "wl-proxy-log"
  },
  "fields": {
    "@timestamp": [
      "2017-12-22T01:12:16.713Z"
    ]
  },
  "sort": [
    1513905136713
  ]
}

I think the issue may be related to the parsed field being a string while the field I am trying to assign it to is a datetime, but I haven't been able to establish that, and the documentation seems to indicate that if the date filter parses the field correctly, it will output a value compatible with a date field.
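
In case it helps, here's a stripped-down pipeline I've been meaning to try so I can test the grok and date parsing in isolation, using a stdin input and a stdout output with the rubydebug codec (the pattern, date formats, and options are copied from the config above):

input {
  stdin { }    # paste a sample log line and press enter
}

filter {
  grok {
    match => {
      "message" => "(%{DAY}\s+)?(?<syslog_datetime>%{SYSLOGTIMESTAMP}\s+%{YEAR})\s+<%{INT:session_id}>\s+%{GREEDYDATA:log_message}\n?"
    }
  }
  date {
    match => [ "syslog_datetime[0]", "MMM dd HH:mm:ss yyyy", "MMM  d HH:mm:ss yyyy" ]
    locale => "en-US"
    timezone => "America/Los_Angeles"
    target => "@timestamp"
  }
}

output {
  stdout { codec => rubydebug }    # prints every field of the event, including @timestamp
}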

Can anyone help with this? Thanks in advance!


(Magnus Bäck) #2
 match => [ "syslog_datetime[0]", "MMM dd HH:mm:ss yyyy", "MMM  d HH:mm:ss yyyy" ]

Why syslog_datetime[0]? The syslog_datetime field isn't an array.
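
If it's a plain string, naming the field directly should be all that's needed, something along these lines (keeping the formats and options from your config):

date {
  match => [ "syslog_datetime", "MMM dd HH:mm:ss yyyy", "MMM  d HH:mm:ss yyyy" ]
  locale => "en-US"
  timezone => "America/Los_Angeles"
  target => "@timestamp"
}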


(Stefan Amshey) #3

I used the grok debugger to test the pattern, and it returned the structure below, which shows syslog_datetime as an array within an array. In practice, if I use just syslog_datetime, something strange happens and the resulting document does not contain a syslog_datetime field at all, even though I did nothing to change the grok filter! I don't pretend to understand why this is the case.

Here's the document shown by the grok debugger:

{
  "DAY": [
    [
      "Thu"
    ]
  ],
  "syslog_datetime": [
    [
      "Sep 29 15:14:22 2016"
    ]
  ],
  "SYSLOGTIMESTAMP": [
    [
      "Sep 29 15:14:22"
    ]
  ],
  "MONTH": [
    [
      "Sep"
    ]
  ],
  "MONTHDAY": [
    [
      "29"
    ]
  ],
  "TIME": [
    [
      "15:14:22"
    ]
  ],
  "HOUR": [
    [
      "15"
    ]
  ],
  "MINUTE": [
    [
      "14"
    ]
  ],
  "SECOND": [
    [
      "22"
    ]
  ],
  "YEAR": [
    [
      "2016"
    ]
  ],
  "session_id": [
    [
      "624514751872623"
    ]
  ],
  "log_message": [
    [
      "Remote Host 192.168.4.48 Remote Port 10101"
    ]
  ]
}

(Magnus Bäck) #4

I can't speak for the grok debugger, but your sample event clearly shows that syslog_datetime isn't an array but a string.

if I use just syslog_datetime something strange happens and the resulting document does not contain a syslog_datetime field at all, even though I did nothing to change the grok filter!

Please show the input, output, and configuration of when that happens.
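
For the output, a stdout output with the rubydebug codec is usually the easiest way to show exactly what an event looks like, e.g.:

output {
  stdout { codec => rubydebug }    # dumps the full event so we can see which fields are present
}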


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.