Logstash xml filter help

Hello Team, I hope everyone is doing well and staying safe at home. I have been trying to parse an XML file with Logstash using grok, but without any luck. I can parse a text file, but XML is giving me a very hard time. I also read about the Logstash xml filter, but had no luck with that either. Can someone please help me understand how to filter this data in Logstash so that it reaches Kibana?

I want to capture the total value from the third line of the file (the <assembly> element), plus each test's name, Domain and Test Type values.

XML file:

<?xml version="1.0" encoding="utf-8"?>
<assemblies timestamp="05/13/2020 01:02:14">
  <assembly name="C:\tfssvc\SMS.Test\bin\Debug\SMS.Test.DLL" environment="64-bit .NET 4.0.30319.42000 [collection-per-class, parallel (4 threads)]" test-framework="xUnit.net 2.4.1.0" run-date="2020-05-13" run-time="01:01:22" config-file="C:\tfssvc\SMS.Test\bin\Debug\SMS.Test.dll.config" total="3" passed="3" failed="0" skipped="0" time="50.891" errors="0">
    <errors />
    <collection total="2" passed="2" failed="0" skipped="0" name="Test collection for SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading" time="50.593">
      <test name="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_NoValue_FalseReturned" type="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading" method="Load_NoValue_FalseReturned" time="49.5732281" result="Pass">
        <traits>
          <trait name="04 - SMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
      <test name="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_TrueValue_TrueReturned" type="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading" method="Load_TrueValue_TrueReturned" time="1.0195929" result="Pass">
        <traits>
          <trait name="04 - SMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
    </collection>
    <collection total="1" passed="1" failed="0" skipped="0" name="Test collection for SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Saving" time="49.596">
      <test name="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Saving.Save_FalseValue_FalseValueSaved" type="SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Saving" method="Save_FalseValue_FalseValueSaved" time="49.5958384" result="Pass">
        <traits>
          <trait name="04 - SMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
    </collection>
  </assembly>
</assemblies>

Logstash config (I looked on Stack Overflow and in many other places, but could not work this out):

input {
 file {
  path => "C:\\Users\\sgorasia\\unittestresult\\TestResults.xml"
  start_position => "assemblies"
  codec => multiline
  {
   pattern => "^<\?xmldata .*\>"
   negate => true
   what => "previous"
  }
 }
}

filter {
  xml {
   store_xml => false
   source => "message"
   xpath =>
   [
    "/assemblies/assembly/id/text()", "id",
    "/xmldata/head1/date/text()", "date",
    "/xmldata/head1/key1/text()", "key1"
   ]
}
 
date {
    match => [ "date" , "dd-MM-yyyy HH:mm:ss" ]
    timezone => "Europe/Amsterdam"
}
 
}


output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logstashxml"

  }
}

To collect all the test names and types (and keep them associated) I would use a ruby filter to iterate over the collections and tests. Something like this:

    xml {
        source => "message"
        target => "[@metadata][theXML]"
        remove_field => [ "message" ]
     }
    ruby {
        code => '
            a = event.get("[@metadata][theXML][assembly]")
            if a
                tests = []
                a.each { |x|
                    x["collection"].each { |y|
                        y["test"].each { |z|
                            tests << { "name" => z["name"], "type" => z["type"] }
                        }
                    }
                }
                event.set("tests", tests)
            end
        '
    }
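
The original question also asked for the total and for the Domain and Test Type traits. The same traversal can be tried outside Logstash as plain Ruby. This is only a sketch: the hash below is a hand-written stand-in for what the xml filter is assumed to store (attributes as string keys, child elements as arrays), and the helper names sample_test, sample_parsed, trait_value and collect_tests are made up for illustration.

```ruby
# Builds one <test> entry in the assumed parsed shape (hypothetical helper).
def sample_test(method_name, type)
  {
    "name" => "#{type}.#{method_name}", "type" => type,
    "method" => method_name, "result" => "Pass",
    "traits" => [{ "trait" => [
      { "name" => "04 - SMS > Email", "value" => "Unsubscribe" },
      { "name" => "Domain", "value" => "Email" },
      { "name" => "Test Type", "value" => "Unit" }
    ] }]
  }
end

# Hand-written stand-in for the parsed XML; the shape is an assumption.
def sample_parsed
  loading = "SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Loading"
  saving  = "SMS.Test.Unit_Tests.SMS_Business.Email_Test.Unsubscribe_Test.When_Saving"
  {
    "assembly" => [{
      "total" => "3", "passed" => "3", "failed" => "0",
      "collection" => [
        { "test" => [sample_test("Load_NoValue_FalseReturned", loading),
                     sample_test("Load_TrueValue_TrueReturned", loading)] },
        { "test" => [sample_test("Save_FalseValue_FalseValueSaved", saving)] }
      ]
    }]
  }
end

# Finds one trait value by its name attribute; returns nil when it is absent.
def trait_value(test, trait_name)
  traits = (test["traits"] || []).flat_map { |t| t["trait"] || [] }
  found = traits.find { |t| t["name"] == trait_name }
  found && found["value"]
end

# Returns [sum of the assembly totals, array of per-test hashes].
def collect_tests(parsed)
  total = 0
  tests = []
  (parsed["assembly"] || []).each do |assembly|
    total += assembly["total"].to_i
    (assembly["collection"] || []).each do |collection|
      (collection["test"] || []).each do |test|
        tests << {
          "name" => test["name"],
          "domain" => trait_value(test, "Domain"),
          "test_type" => trait_value(test, "Test Type")
        }
      end
    end
  end
  [total, tests]
end

total, tests = collect_tests(sample_parsed)
puts total
tests.each { |t| puts t.inspect }
```

Inside a ruby filter the hash would come from event.get and the results would go back with event.set, as in the snippet above. The `|| []` guards are there so that an assembly without collections, or a test without traits, does not raise an error.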

Thanks, Badger, for your immediate help on this. I have never tried Ruby before. With your example I will try to see if I can achieve what I need. Thanks again.

Hi Badger, sorry to bother you again. I tried the sample code above, but it did not work as intended: I got an error in the Logstash log file. Also, there is only a single entry in Kibana, which contains the raw XML up to the point where the first <test was encountered.

What I am looking for is a column called testname that contains the test name and, similarly, a column called total pass whose value is 3 (or whatever the result is), and so on.
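
To make that target shape concrete, here is a rough sketch in plain Ruby that produces one flat row per test, with the assembly-level counts copied onto each row. The field names "testname", "total" and "passed", the method rows_for and the SAMPLE_PARSED constant are illustrative choices, and the sample hash is a trimmed, hand-written stand-in for the parsed XML rather than real filter output.

```ruby
# One flat row per test, with the assembly-level counts copied onto each row.
def rows_for(parsed)
  (parsed["assembly"] || []).flat_map do |assembly|
    (assembly["collection"] || []).flat_map do |collection|
      (collection["test"] || []).map do |test|
        {
          "testname" => test["method"],
          "result"   => test["result"],
          "total"    => assembly["total"].to_i,
          "passed"   => assembly["passed"].to_i
        }
      end
    end
  end
end

# Trimmed, hand-written stand-in for the parsed XML (assumed shape).
SAMPLE_PARSED = {
  "assembly" => [{
    "total" => "3", "passed" => "3",
    "collection" => [
      { "test" => [{ "method" => "Load_NoValue_FalseReturned", "result" => "Pass" },
                   { "method" => "Load_TrueValue_TrueReturned", "result" => "Pass" }] },
      { "test" => [{ "method" => "Save_FalseValue_FalseValueSaved", "result" => "Pass" }] }
    ]
  }]
}.freeze

rows_for(SAMPLE_PARSED).each { |row| puts row.inspect }
```

In Logstash itself, turning such an array into separate documents would normally be a job for the split filter; that step is not shown here.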

Ruby is totally new to me, and I am trying hard to understand the syntax and how to iterate over the XML. One single working example would make my learning faster.

Logstash error: all I could understand from the error is that it says there is no close tag for assemblies, but I checked the XML and it looks correct. I am not sure whether I interpreted the error correctly.

[2020-05-13T13:27:04,338][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-05-13T13:27:04,353][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.4.0"}
[2020-05-13T13:27:06,444][INFO ][org.reflections.Reflections] Reflections took 42 ms to scan 1 urls, producing 20 keys and 40 values 
[2020-05-13T13:27:08,263][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-05-13T13:27:08,423][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-05-13T13:27:08,466][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-05-13T13:27:08,469][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-05-13T13:27:08,486][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]}
[2020-05-13T13:27:08,534][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-05-13T13:27:08,608][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-05-13T13:27:10,452][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been create for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-05-13T13:27:10,454][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, :thread=>"#<Thread:0x692d51c0 run>"}
[2020-05-13T13:27:11,100][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"C:/Users/sgorasia/logstash7.4/logstash-7.4.0/data/plugins/inputs/file/.sincedb_98af0331a566b5047b360c29425537e8", :path=>["C:/Users/sgorasia/unittestresult/TestResults.xml"]}
[2020-05-13T13:27:11,128][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-05-13T13:27:11,184][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2020-05-13T13:27:11,185][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-05-13T13:27:11,661][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-05-13T14:27:12,257][WARN ][logstash.filters.xml     ][main] Error parsing xml with XmlSimple {:source=>"message", :value=>"<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<assemblies timestamp=\"05/13/2020 01:02:14\">\r\n  <assembly name=\"C:\\tfssvc\\Cornerstone\\Dev\\Cornerstone\\DevPatch\\CornerstoneApp\\LMS.Test\\bin\\Debug\\LMS.Test.DLL\" environment=\"64-bit .NET 4.0.30319.42000 [collection-per-class, parallel (4 threads)]\" test-framework=\"xUnit.net 2.4.1.0\" run-date=\"2020-05-13\" run-time=\"01:01:22\" config-file=\"C:\\tfssvc\\Cornerstone\\Dev\\Cornerstone\\DevPatch\\CornerstoneApp\\LMS.Test\\bin\\Debug\\LMS.Test.dll.config\" total=\"3\" passed=\"3\" failed=\"0\" skipped=\"0\" time=\"50.891\" errors=\"0\">\r\n    <errors />\r\n    <collection total=\"2\" passed=\"2\" failed=\"0\" skipped=\"0\" name=\"Test collection for LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading\" time=\"50.593\">\r\n      <test name=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_NoValue_FalseReturned\" type=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading\" method=\"Load_NoValue_FalseReturned\" time=\"49.5732281\" result=\"Pass\">\r\n        <traits>\r\n          <trait name=\"04 - LMS &gt; Email\" value=\"Unsubscribe\" />\r\n          <trait name=\"Domain\" value=\"Email\" />\r\n          <trait name=\"Test Type\" value=\"Unit\" />\r\n        </traits>\r\n      </test>\r\n      <test name=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_TrueValue_TrueReturned\" type=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading\" method=\"Load_TrueValue_TrueReturned\" time=\"1.0195929\" result=\"Pass\">\r\n        <traits>\r\n          <trait name=\"04 - LMS &gt; Email\" value=\"Unsubscribe\" />\r\n          <trait name=\"Domain\" value=\"Email\" />\r\n          <trait name=\"Test Type\" value=\"Unit\" />\r\n        </traits>\r\n      </test>\r\n  
  </collection>\r\n    <collection total=\"1\" passed=\"1\" failed=\"0\" skipped=\"0\" name=\"Test collection for LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving\" time=\"49.596\">\r\n      <test name=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving.Save_FalseValue_FalseValueSaved\" type=\"LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving\" method=\"Save_FalseValue_FalseValueSaved\" time=\"49.5958384\" result=\"Pass\">\r\n        <traits>\r\n          <trait name=\"04 - LMS &gt; Email\" value=\"Unsubscribe\" />\r\n          <trait name=\"Domain\" value=\"Email\" />\r\n          <trait name=\"Test Type\" value=\"Unit\" />\r\n        </traits>\r\n      </test>\r\n    </collection>\r\n  </assembly>\r", :exception=>#<REXML::ParseException: No close tag for /assemblies
Line: 30
Position: 2455
Last 80 unconsumed characters:
>, :backtrace=>["uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rexml/parsers/treeparser.rb:28:in `parse'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rexml/document.rb:288:in `build'", "uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rexml/document.rb:45:in `initialize'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/xml-simple-1.1.5/lib/xmlsimple.rb:971:in `parse'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/xml-simple-1.1.5/lib/xmlsimple.rb:164:in `xml_in'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/xml-simple-1.1.5/lib/xmlsimple.rb:203:in `xml_in'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-xml-4.0.7/lib/logstash/filters/xml.rb:185:in `filter'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/logstash-core/lib/logstash/filters/base.rb:143:in `do_filter'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/logstash-core/lib/logstash/filters/base.rb:162:in `block in multi_filter'", "org/jruby/RubyArray.java:1800:in `each'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/logstash-core/lib/logstash/filters/base.rb:159:in `multi_filter'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:115:in `multi_filter'", "C:/Users/sgorasia/logstash7.4/logstash-7.4.0/logstash-core/lib/logstash/java_pipeline.rb:243:in `block in start_workers'"]}

Single entry in Kibana:

<?xml version="1.0" encoding="utf-8"?>
<assemblies timestamp="05/13/2020 01:02:14">
  <assembly name="C:\tfssvc\Cornerstone\Dev\Cornerstone\DevPatch\CornerstoneApp\LMS.Test\bin\Debug\LMS.Test.DLL" environment="64-bit .NET 4.0.30319.42000 [collection-per-class, parallel (4 threads)]" test-framework="xUnit.net 2.4.1.0" run-date="2020-05-13" run-time="01:01:22" config-file="C:\tfssvc\Cornerstone\Dev\Cornerstone\DevPatch\CornerstoneApp\LMS.Test\bin\Debug\LMS.Test.dll.config" total="3" passed="3" failed="0" skipped="0" time="50.891" errors="0">
    <errors />
    <collection total="2" passed="2" failed="0" skipped="0" name="Test collection for LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading" time="50.593">
      <test

As it says, there is no closing tag for <assemblies>, so it is not valid XML.
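
The difference between the file and the event can be checked mechanically. The sketch below is plain Ruby with a deliberately naive regex-based tag matcher (not a real XML parser); TRUNCATED_EVENT is a shortened, hand-written stand-in for the logged :value, which likewise stops at "</assembly>\r", and unclosed_elements is a made-up helper name.

```ruby
# Shortened, hand-written stand-in for the :value in the warning above.
# Like the logged event, it stops at "</assembly>\r".
TRUNCATED_EVENT = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n" \
                  "<assemblies timestamp=\"05/13/2020 01:02:14\">\r\n" \
                  "  <assembly total=\"3\" passed=\"3\">\r\n" \
                  "    <errors />\r\n" \
                  "  </assembly>\r"

# Returns the names of elements that were opened but never closed.
# Naive: assumes no literal ">" inside attribute values and no comments or CDATA.
def unclosed_elements(text)
  stack = []
  text.scan(%r{<(/?)([A-Za-z][\w.-]*)[^>]*?(/?)>}) do |closing, name, self_closing|
    if closing == "/"
      stack.pop if stack.last == name   # closing tag matches the innermost open element
    elsif self_closing.empty?
      stack.push(name)                  # ordinary opening tag
    end
  end
  stack
end

puts unclosed_elements(TRUNCATED_EVENT).inspect
puts unclosed_elements(TRUNCATED_EVENT + "\n</assemblies>").inspect
```

The first call reports assemblies as still open, which matches the REXML message; once the final line is appended, nothing is left open. So the question to chase is why the last line of the file never made it into the event.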

Hmm, OK Badger, but I tried several online XML validators and they all reported the document as valid.
Here's the link I used for validation:
https://www.liquid-technologies.com/online-xml-validator

I will still look into what format Logstash needs for the XML to be valid.

Just for your reference, below is the XML file that the xUnit runner generates after it finishes running the tests from a particular repo.

XML file:

<?xml version="1.0" encoding="utf-8"?>
<assemblies timestamp="05/13/2020 01:02:14">
  <assembly name="C:\tfssvc\Cornerstone\Dev\Cornerstone\DevPatch\CornerstoneApp\LMS.Test\bin\Debug\LMS.Test.DLL" environment="64-bit .NET 4.0.30319.42000 [collection-per-class, parallel (4 threads)]" test-framework="xUnit.net 2.4.1.0" run-date="2020-05-13" run-time="01:01:22" config-file="C:\tfssvc\Cornerstone\Dev\Cornerstone\DevPatch\CornerstoneApp\LMS.Test\bin\Debug\LMS.Test.dll.config" total="3" passed="3" failed="0" skipped="0" time="50.891" errors="0">
    <errors />
    <collection total="2" passed="2" failed="0" skipped="0" name="Test collection for LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading" time="50.593">
      <test name="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_NoValue_FalseReturned" type="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading" method="Load_NoValue_FalseReturned" time="49.5732281" result="Pass">
        <traits>
          <trait name="04 - LMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
      <test name="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading.Load_TrueValue_TrueReturned" type="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Loading" method="Load_TrueValue_TrueReturned" time="1.0195929" result="Pass">
        <traits>
          <trait name="04 - LMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
    </collection>
    <collection total="1" passed="1" failed="0" skipped="0" name="Test collection for LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving" time="49.596">
      <test name="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving.Save_FalseValue_FalseValueSaved" type="LMS.Test.Unit_Tests.LMS_Business.Email_Test.Unsubscribe_Test.When_Saving" method="Save_FalseValue_FalseValueSaved" time="49.5958384" result="Pass">
        <traits>
          <trait name="04 - LMS &gt; Email" value="Unsubscribe" />
          <trait name="Domain" value="Email" />
          <trait name="Test Type" value="Unit" />
        </traits>
      </test>
    </collection>
  </assembly>
</assemblies>

The file appears to be valid XML, but the event you posted is not. And the configuration you posted would result in syntax errors and logstash refusing to start.

You are not doing what you say you are doing, which makes it difficult to help fix what you are doing.

Sorry for creating confusion.

To be clear, below is my logstash.config file. The filter part is the one you provided. As I mentioned earlier, I am new to Ruby, so once I get this working I will explore further and expand the filter part.

I appreciate your time and effort.

input {
 file {
  path => "C:/Users/sgorasia/unittestresult/TestResults.xml"
  start_position => beginning
  codec => multiline
  {
   pattern => "^<\?xmldata .*\>"
   negate => true
   what => "previous"
  }
 }
}

filter {
    xml {
        source => "message"
        target => "[@metadata][theXML]"
        remove_field => [ "message" ]
     }
    ruby {
        code => '
            a = event.get("[@metadata][theXML][assembly]")
            if a
                tests = []
                a.each { |x|
                    x["collection"].each { |y|
                        y["test"].each { |z|
                            tests << { "name" => z["name"], "type" => z["type"] }
                        }
                    }
                }
                event.set("tests", tests)
            end
        '
    }
 
}


output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logstashxml"

  }
}
