Logstash XML

Hi

I am trying to parse an XML file with Logstash and I am having some issues.

This is my XML example:

<books>
 <book>
		<name>HarryPotter</name>
		<id>785</id>
		<numero>7</numero>
	</book>
<book>
		<name>game of thrones</name>
		<id>441</id>
		<numero>3</numero>
	</book>
</books>

My .conf:

    input {
        ..................
        ......
            codec => multiline {
                pattern => '^<book'
                negate => true
                what => "previous"
            }
        }
    }

        xpath => [
            "/book/id/text()", "id",
            "/book/name/integer", "name",
            "/book/numero/integer", "numero",
        ]
    }

The first issue is that I am getting an error:
exception=>#<REXML::ParseException: Missing end tag for '' (got "books")

The second issue is that I cannot change the field types: every value arrives in Elasticsearch as a string. How can I change that?

Thanks

You claim to be matching a pattern that never matches. That will not flush an event unless you enable auto_flush_interval. Since you are getting an event I infer that you are not using the configuration you claim to be using.

Once you get that part working you will need to update the xpath expressions

"/books/book/id/text()", "id",

for example.
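
For reference, here is a minimal sketch of how the corrected expressions might sit inside the xml filter. It assumes the whole books document arrives in the message field of a single event. The store_xml setting and the mutate convert step are additions for illustration, not part of the original configuration. The convert step is one possible way to address the question about types, since xpath results are stored as strings.

```
filter {
    xml {
        source => "message"
        # Assumption: only the xpath results are wanted, not the full parsed document
        store_xml => false
        xpath => [
            "/books/book/id/text()", "id",
            "/books/book/name/text()", "name",
            "/books/book/numero/text()", "numero"
        ]
    }
    # xpath stores each result as an array of strings;
    # convert is applied to every member of an array field
    mutate {
        convert => {
            "id" => "integer"
            "numero" => "integer"
        }
    }
}
```

With more than one book in the document, each destination field would hold one array member per book.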

I just fixed my post.

codec => multiline {
    pattern => '^<book'
    negate => true
    what => "previous"
}

So?

That will result in an event with this message field:

<books>\n <book>\n        <name>HarryPotter</name>\n        <id>785</id>\n        <numero>7</numero>\n    </book>

That is not valid XML. You could use mutate+gsub to fix it.

Since you did not enable auto_flush_interval the second book element will never get flushed.
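
As a sketch, this is roughly what enabling that option on the codec could look like. The value of 2 seconds is an arbitrary assumption, chosen only for illustration.

```
codec => multiline {
    pattern => '^<book'
    negate => true
    what => "previous"
    # Flush a pending event when no further line arrives within
    # this many seconds (the value 2 is an assumption)
    auto_flush_interval => 2
}
```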

I did enable auto_flush_interval and I still have the same problem. This is what I am getting in my log:

Error parsing xml with XmlSimple {:source=>"message", :value=>"\n\t\tHarryPotter\n\t\t785\n\t\t7\n\t\t\n", :exception=>#<REXML::ParseException: Missing end tag for '' (got "books")

I am using an HTTP poller input. How can I fix it with mutate+gsub and remove the line breaks?

You do not need to remove the line breaks. If you want to, the post I linked to shows how to do it. You can remove the books tag using

mutate { gsub => [ "message", "<books>", "" ] }
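
With the multiline pattern above, the second event would presumably end with the closing books tag, so a variant that strips both wrapper tags may be useful. The following is only a sketch under that assumption. The regular expression and the store_xml setting are illustrative additions, and the xpath uses /book because the wrapper element is gone by the time the xml filter runs.

```
filter {
    # Remove both the opening and the closing wrapper tag
    mutate { gsub => [ "message", "</?books>", "" ] }
    xml {
        source => "message"
        # Assumption: only the xpath results are wanted
        store_xml => false
        xpath => [ "/book/id/text()", "id" ]
    }
}
```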

Thanks, I fixed it.

Another question: I am using Logstash 7.2.0 in my test environment and it works, but another environment runs Logstash 6.0.1 and it does not work there. Is that normal?

I would expect the same configuration to work. What does not work?

Logstash blocks completely, I do not get any logs, and resetting the log configuration does not help. It is not a problem, though: I am sending the data to Elasticsearch from another environment.

I have this:

status online 282673s

I want to split "online 282673s". I tried split => { "status" => " " } but that does not work. How can I split on a space?

This is what I did, and it does not work:

gsub => ["status", " ", ":"]
split => { "status" => ":" }
add_field => {
    "firstPart" => "%{[status][0]}"
    "secondPart" => "%{[status][1]}"
}

This is what I am getting:

secondPart: %{[status][1]}
firstPart: online:5039s

I cannot explain why that would not work. With this configuration

filter {
    mutate { add_field => { "status" => "online 282673s" } }
    mutate {
        split => { "status" => " " }
        add_field => {
            "firstPart" => "%{[status][0]}"
            "secondPart" => "%{[status][1]}"
        }
    }
}

or this configuration

input { generator { count => 1 lines => [ '' ] } }
filter {
    mutate { add_field => { "status" => "online 282673s" } }
    mutate {
        gsub => ["status", " ", ":"]
        split => { "status" => ":" }
        add_field => {
            "firstPart" => "%{[status][0]}"
            "secondPart" => "%{[status][1]}"
        }
    }
}

I get

 "firstPart" => "online",
"secondPart" => "282673s",
    "status" => [
    [0] "online",
    [1] "282673s"
],

That did not work with my field, and I do not know why. So this is what I did, and it works:

mutate { add_field => { "statusbis" => "%{[status]}" } }
mutate {
    split => { "statusbis" => " " }
    add_field => {
        "firstPart" => "%{[statusbis][0]}"
        "secondPart" => "%{[statusbis][1]}"
    }
}

And it works. Thanks for your help.

All of that would be consistent with the original status field being an array with a single member.

    "status" => [
    [0] "online:282673s"
],

That would explain why the original mutates did nothing and why it started working when you copied status to another field (which is not an array).
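
If status really is a single-member array, a slightly more explicit variant of that workaround is to reference the first member directly. This is a sketch under that assumption, and the field name statusText is hypothetical.

```
filter {
    # Copy the first (and only) array member into a plain string field
    mutate { add_field => { "statusText" => "%{[status][0]}" } }
    mutate {
        # split only acts on string fields, which statusText now is
        split => { "statusText" => " " }
        add_field => {
            "firstPart" => "%{[statusText][0]}"
            "secondPart" => "%{[statusText][1]}"
        }
    }
}
```

The copy and the split are kept in separate mutate blocks so that the new field exists before the split runs.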
