Filebeat - multiline: Ingest XML files without a line feed at end of file


(Christopher von Anhalt) #1

I want to ingest XML files into the ELK Stack, with one event per XML file. These files end without a line feed, so Filebeat's multiline option never forwards the last line of the XML to Logstash. As a result, Logstash's XML filter cannot parse the XML correctly.

I'm using Filebeat 5.2.1.

My XML files look like this (I marked each line feed as LF to make them visible):

LF
<taskReport>LF
LF
  <toplevelinfo Error="0" Warning="0"/>LF
LF
</taskReport>

My filebeat.yml looks like this:

filebeat.prospectors:
- 
  paths:
    - C:\*.xml
  input_type: log
  document_type: xml
  multiline:
    pattern: '^<taskReport>'
    negate: true
    match: after

output.logstash:
  hosts: ["LS:5044"]

Is there an option in Filebeat to include this last line in the event?

If I manually add a line feed at the end of the XML, Logstash parses it perfectly, but that is not an option for me.

Thanks in advance,
Chris


(Imma) #2

I think in your case you can try close_eof. Do I understand correctly that each XML event is in its own file?

https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html


(Christopher von Anhalt) #3

Yes, each file should be handled as one event.

The close_eof option seems to have no effect on my problem.
The Logstash error is still: <REXML::ParseException: No close tag for /taskReport>

New filebeat.yml:

filebeat.prospectors:
- 
  paths:
    - C:\*.xml
  input_type: log
  document_type: xml
  multiline:
    pattern: '^<taskReport>'
    negate: true
    match: after
  close_eof: true

(Imma) #4

Try match: before


(Christopher von Anhalt) #5

I tried every possible combination of negate and match, also with:

pattern: '^</taskReport>'

Sadly, none of them fixes the problem.


(Chuck Boyer) #6

I'm having similar issues with ELMAH error logs. My tag is not sent when I run Filebeat with the publish option, and I'm seeing missing messages in my Logstash log file. When I manually add a CRLF to the end of the file, it works as I expect.

- input_type: log
  paths:
    - C:\Logs\RESAP\API*.xml
  document_type: ELMAHLog
  multiline.pattern: ''
  multiline.negate: true
  multiline.match: before
  ignore_older: 5m
  close_eof: true

(Christopher von Anhalt) #7

The resolution for me was to write a script that appends a line feed to the end of each XML file.
Not pretty, but it works.
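
For anyone landing here later, a minimal sketch of what such a script could look like in Python. This is not the script actually used in this thread; the function names `ensure_trailing_newline` and `fix_directory` are made up for illustration, and it assumes the files use an ASCII-compatible encoding such as UTF-8, where a line feed is the single byte 0x0A.

```python
import os
from pathlib import Path
from typing import List


def ensure_trailing_newline(path: Path) -> bool:
    """Append a line feed to the file if its last byte is not one.

    Returns True if the file was modified, False otherwise.
    Assumes an ASCII-compatible encoding (LF is the single byte 0x0A).
    """
    if path.stat().st_size == 0:
        return False  # empty file: there is no last line to terminate
    with path.open("rb+") as f:
        f.seek(-1, os.SEEK_END)       # jump to the last byte
        if f.read(1) == b"\n":
            return False              # already ends with a line feed
        f.seek(0, os.SEEK_END)        # reposition explicitly before writing
        f.write(b"\n")
        return True


def fix_directory(directory: Path, pattern: str = "*.xml") -> List[Path]:
    """Fix every file matching `pattern` in `directory`.

    Returns the list of files that were modified, in sorted order.
    """
    return [p for p in sorted(directory.glob(pattern)) if ensure_trailing_newline(p)]
```

Calling `fix_directory(Path("C:/"))` would cover the `C:\*.xml` path from the config above. Running it repeatedly should be safe, since files that already end with a line feed are left untouched.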


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.