Logstash multiline filter not merging XML after newline


#1

I want to parse XML files with Logstash.
Example file:

<?xml version="1.0" encoding="UTF-8"?>
<ns1:Alert_E01 xmlns:ns1="urn:contoso.com:elasticSearch:alert"><Alerts><Alert><PI_SID>SPI</PI_SID></Alert></Alerts></ns1:Alert_E01>

There is a newline (\n) after <?xml version="1.0" encoding="UTF-8"?>.

To get the whole XML document into one field, I have to use the multiline filter:
filter{ multiline { pattern => "\s$" negate => false what => "next" } }

But it doesn't work; I get only the first line in the message field:
message:"<?xml version="1.0" encoding="UTF-8"?>"

But when I create an XML file like this:

firstline
secondline
thirdline

and parse it, I get
message:"firstline secondline thirdline"
as expected.


(Magnus Bäck) #2

I'm not following the logic here. You want to join with the next line if the current line ends with a newline character?


#3

Yes, if a line ends with a newline, I want to join it with the next line.
The XML files contain newline characters, and to parse the XML I need the whole document in one field.
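
For readers following along, a minimal sketch of the parsing step described here, assuming the joined document has already ended up in the message field. It uses the Logstash xml filter; the target field name "alert" is a hypothetical choice, not something taken from this thread.

```
filter {
  xml {
    # Field that is assumed to hold the complete, joined XML document.
    source => "message"
    # Hypothetical field name for the parsed structure.
    target => "alert"
  }
}
```

This only helps once the multiline step delivers the complete document; applied to the first line alone, the filter would see nothing but the XML declaration.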


(Magnus Bäck) #4

Well, with the possible exception of the last line of the file all lines end with a newline character. When using \s like you do here I'm not sure it matches the line's newline character. I'd use ^ instead. All lines have a beginning.


#5

I tried it with

multiline { pattern=>"^.*"
            negate=>false
            what=> "next"
}

But the file is still split at the newline.


#6

I also tried it with

multiline { pattern=>"^<.*"
            negate=>false
            what=> "next"
}

and

multiline { pattern=>"^<.*"
            negate=>false
            what=> "previous"
}

and

multiline { pattern=>"^>.*"
            negate=>true
            what=> "next"
}

No line has ">" at the beginning, so if I am following the logic correctly, every line should be joined with the next line.
But the file is still split at the newline:
message:"<?xml version="1.0" encoding="UTF-8"?>"


(Magnus Bäck) #7

If you provide a complete and reproducible configuration example it'll be easier to help.


#8

input {
  file {
    path => "C:/XMLdata/*.xml"
    start_position => "beginning"
    sincedb_path => "C:/parsedfiles.sincedb"
  }
}

filter {
  multiline {
    pattern => "^>.*"
    negate => true
    what => "next"
  }
}

output {
  elasticsearch {
    index => "xml-index"
    hosts => "127.0.0.1:9200"
  }
}

Here it is! :slight_smile:


(Magnus Bäck) #9

I'm not able to reproduce what you describe. The problem I have is actually getting Logstash to emit anything, because if you always join with the next line Logstash won't know when to stop waiting for the next line. I don't have any more time to spend on this. Good luck.
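
For readers landing here, a minimal sketch of one possible way around the flush problem described above. It was not confirmed by anyone in this thread. The idea is to join lines at the input with the multiline codec, start a new event at each XML declaration, and let the codec's auto_flush_interval option emit the last pending event. It assumes that every file starts with an `<?xml` declaration and that the installed multiline codec version supports auto_flush_interval; the 2-second value is an arbitrary choice.

```
input {
  file {
    path => "C:/XMLdata/*.xml"
    start_position => "beginning"
    sincedb_path => "C:/parsedfiles.sincedb"
    codec => multiline {
      # Assumption: each document begins with an XML declaration.
      pattern => "^<\?xml"
      # Lines that do NOT match the pattern...
      negate => true
      # ...are appended to the previous line.
      what => "previous"
      # Emit the pending event after 2 seconds without new lines
      # (arbitrary value), so the last document is not held back forever.
      auto_flush_interval => 2
    }
  }
}
```

Joining with the previous line, keyed on a line that marks the start of a document, gives Logstash a clear boundary between events, which an unconditional what => "next" does not.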


#10

Okay, thank you very much for your time :slight_smile:


(Jlogan) #11

Hello, did you solve this?


#12

No, unfortunately not. I decided to do it without Logstash and stored the
data directly via the Elasticsearch API.


(Jlogan) #13

How did you achieve this? Could you please show me?


#14

I used SAP as the interface. If you have SAP in your company, your SAP
colleagues will help you with your data connection :slight_smile:


(Jlogan) #15

OK, thanks :blush:

