Parsing different dates from XML & logs into one date field

Hi all,
I have different types of timestamps coming into my Logstash input from different log files, both XML and plain-text logs.
My Filebeat configuration looks like this:

filebeat.inputs:
- type: log
  paths:
    - /home/Dev/logs/server_error.log
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

- type: log
  paths:
    - /home/Dev/logs/icn.log
  multiline.pattern: '^\[[0-9]{1}/[0-9]{2}/[0-9]{2}'
  multiline.negate: true
  multiline.match: after

- type: log
  paths:
    - /home/Dev/logs/cpe.log
  multiline.pattern: '^\[[0-9]{1}/[0-9]{2}/[0-9]{2}'
  multiline.negate: true
  multiline.match: after

- type: log
  paths:
    - /home/Dev/logs/default_XML.log
  multiline.pattern: '^<record>'
  multiline.negate: true
  multiline.match: after

output.logstash:
    hosts: ["localhost:5044"]

My Logstash configuration looks like this:

input { beats { port => 5044 } }

filter {

 if [fields][log_type] == "server_error" {
  grok {
   match => [ "message", 
             "%{TIMESTAMP_ISO8601:logtime} %{DATA:thread} %{DATA:sub} [ ]* %{DATA:category} \- %{LOGLEVEL:sev} %{GREEDYDATA:message}" ]
  overwrite => [ "message" ]
  }
  mutate {
    replace => [ "type", "server_error_log" ]
    }
 }

 if [fields][log_type] == "icn" {
  grok {
   match => [ "message", 
             "%{DATESTAMP:logtime} %{DATA} %{DATA:thread} %{DATA:source} [ ]* %{DATA:sev} %{DATA:module} %{DATA:log-level} %{DATA} \[ \] %{DATA:java-method} %{GREEDYDATA:message}" ]
   overwrite => [ "message" ]
  }
  mutate {
    replace => [ "type", "icn_log" ]
    }
 }

 if [fields][log_type] == "cpe" {
  grok {
   match => [ "message",
             "%{DATESTAMP:logtime} %{DATA} %{DATA:thread} %{DATA:java-class} %{DATA:sev} %{DATA:java-package} %{DATA:java-method} %{GREEDYDATA:message}" ]

   overwrite => [ "message" ]
  }
  mutate {
    replace => [ "type", "cpe_log" ]
   }
 }

 if [fields][log_type] == "default_xml" {
  grok {
   match => [ "message", 
              "%{GREEDYDATA:raw_xml}" ]
  }
  xml {
   source => "raw_xml"
   #target => "xmldata"
   store_xml => "false"
   xpath => ["/record/date/text()","logtime"]
   xpath => ["/record/millis/text()","millis"]
   xpath => ["/record/sequence/text()","sequence"]
   xpath => ["/record/logger/text()","logger"]
   xpath => ["/record/level/text()","level"]
   xpath => ["/record/class/text()","class"]
   xpath => ["/record/method/text()","method"]
   xpath => ["/record/thread/text()","thread"]
   xpath => ["/record/message/text()","msg"]
  }
 }

 date {
   match => [ "logtime", "yyyy-MM-dd'T'HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss", "M/dd/yy HH:mm:ss:SSS" ]
   target => "Timestamp"
 }
 mutate { remove_field => ["logtime"] }
}

output {   
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "index_test"
     }
}

I have four different log files, one of which is an XML file. I parse everything with grok in Logstash and store the values in individual fields (for later visualization in Kibana), and I also parse the timestamp of each log file with the date filter. That all works fine.
But I have a problem with the date from the XML file: the date filter cannot parse it, even though it is stored in the same field (logtime).
Can someone help?

How could we possibly help out without knowing the contents of the logtime field?

Hi @magnusbaeck,
Sure, here are my logs:

the first log -> server_error:

2018-04-17T15:19:20.313 FC772FA2 CBR FNRCE0000I - INFO binary_recognizer_....

the second log -> icn & cpe have the same timestamp format, like below:

[6/13/18 5:29:50:575 CEST] 0000006d SystemOut O CIWEB Perf : com.ibm.ecm.configuration....

the third log -> default_XML

<record>
    <date>2018-05-31T09:06:28</date>
    <millis>1527750388992</millis>
    <sequence>825425</sequence>
    <logger>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</logger>
    <level>INFO</level>
    <class>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</class>
    <method>addDocument</method>
    <thread>908</thread>
    <message>Last document notification received for collection /</message>

With my configuration, the logtime field contains the dates from all log types except default_XML. I also want to parse that date into the logtime field along with the other dates, so that in Kibana I can use one time filter for all log types.

Okay, so in the last example you posted logtime should contain "2018-05-31T09:06:28" which should be parseable by the date filter you have. Please remove the mutate { remove_field => ["logtime"] } filter and show the raw event produced by Logstash when it's fed an XML document. Copy/paste the text from the JSON tab in Kibana or use a stdout { codec => rubydebug } output.
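For anyone following along, a minimal sketch of the debugging output suggested above. It is meant as a temporary addition next to the existing elasticsearch output, not a replacement for it:

```
output {
  # Temporary: print every event in full to the Logstash console,
  # so field types (string vs. array) and tags are visible.
  stdout { codec => rubydebug }
}
```

Logstash merges multiple output blocks, so this can sit alongside the elasticsearch output while testing and be removed afterwards.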

I removed the mutate filter and I get this:

{
       "raw_xml" => "<record>\n    <date>2018-05-31T09:06:27</date>\n    <millis>1527750387452</millis>\n    <sequence>825423</sequence>\n    <logger>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</logger>\n    <level>INFO</level>\n    <class>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</class>\n    <method>startOutputingMessages</method>\n    <thread>892</thread>\n    <message>Starting subscription for collection /opt/filenet/collections/ADR_Folder_20170628111748_310CC8C62293451FB1A1BAB3998CA2E2 for client 3244</message>\n</record>\n",
        "logger" => [
        [0] "com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti"
    ],
      "@version" => "1",
    "prospector" => {
        "type" => "log"
    },
        "thread" => [
        [0] "892"
    ],
         "input" => {
        "type" => "log"
    },
    "@timestamp" => 2018-07-10T15:12:32.263Z,
      "sequence" => [
        [0] "825423"
    ],
        "source" => "/home/badr/Dev/logs/mini_default0.log",
          "host" => {
        "name" => "badr-VirtualBox"
    },
        "method" => [
        [0] "startOutputingMessages"
    ],
        "millis" => [
        [0] "1527750387452"
    ],
        "fields" => {
             "log_type" => "default0",
          "environment" => "test",
        "document_type" => "default0"
    },
          "beat" => {
        "hostname" => "badr-VirtualBox",
         "version" => "6.3.0",
            "name" => "badr-VirtualBox"
    },
        "offset" => 1008,
           "msg" => [
        [0] "Starting subscription for collection /opt/filenet/collections/ADR_Folder_20170628111748_310CC8C62293451FB1A1BAB3998CA2E2 for client 3244"
    ],
         "class" => [
        [0] "com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti"
    ],
       "message" => "<record>\n    <date>2018-05-31T09:06:27</date>\n    <millis>1527750387452</millis>\n    <sequence>825423</sequence>\n    <logger>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</logger>\n    <level>INFO</level>\n    <class>com.ibm.es.nuvo.inyo.ingest.DocIngestHandlerMulti</class>\n    <method>startOutputingMessages</method>\n    <thread>892</thread>\n    <message>Starting subscription for collection /opt/filenet/collections/ADR_Folder_20170628111748_310CC8C62293451FB1A1BAB3998CA2E2 for client 3244</message>\n</record>\n",
       "logtime" => [
        [0] "2018-05-31T09:06:27"
    ],
         "level" => [
        [0] "INFO"
    ],
          "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_dateparsefailure"
    ]
}

So the date (2018-05-31T09:06:28) from the XML file is already extracted into logtime, but that does not help me if it cannot be parsed into eventlog, where the dates from all the other logs end up. eventlog is the time filter field I chose when creating the index pattern in Kibana, and there I only find the dates from the other logs (p8, icn, cpe). I need to know how to solve this issue; it is important for me so that I can go on with the visualizations in Kibana.
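The `_dateparsefailure` tag in the event above is the tag the date filter adds by default when none of its patterns match the source field. As a rough sketch, assuming the goal is only to inspect the events that fail, the debugging output can be limited to those events:

```
output {
  # Print only events the date filter could not parse.
  if "_dateparsefailure" in [tags] {
    stdout { codec => rubydebug }
  }
}
```

This keeps the console quiet for the log types whose timestamps already parse correctly.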

When you use xpath you always get an array, usually containing a single object. You can add this to the section of the config that processes XML:

if [logtime] { mutate { replace => { "logtime" => "%{[logtime][0]}" } } }
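To show where that line fits, here is a trimmed sketch of the XML branch from the configuration posted earlier in the thread. It assumes the grok that fills raw_xml stays as posted, and it keeps "Timestamp" as the date target from that configuration; the remaining xpath entries are left out for brevity:

```
filter {
  if [fields][log_type] == "default_xml" {
    # Assumes raw_xml was filled by the grok shown in the original config.
    xml {
      source    => "raw_xml"
      store_xml => false
      xpath     => [ "/record/date/text()", "logtime" ]
    }
    # xpath results are arrays, so logtime is ["2018-05-31T09:06:27"] here.
    # Replace it with its first element so the date filter gets a plain string.
    if [logtime] {
      mutate { replace => { "logtime" => "%{[logtime][0]}" } }
    }
  }

  # Shared by all log types; runs after logtime has been flattened.
  date {
    match  => [ "logtime", "yyyy-MM-dd'T'HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss", "M/dd/yy HH:mm:ss:SSS" ]
    target => "Timestamp"
  }
}
```

The `if [logtime]` guard matters: without it, events where the xpath found nothing would end up with the literal text `%{[logtime][0]}` in logtime.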

@Badger thank you again, it works fine now. I get all dates in one field (eventlog).
