Not able to remove tag from xml

I am trying to load XML through Logstash. There is an unwanted tag that needs to be removed from the XML during parsing. I used remove_tag, but the tag is still present when the event is indexed into Elasticsearch.

XML file:

<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="Report" time="212.715" tests="7" failures="1" errors="3">
   <testsuite name="Report" tests="7" failures="1" errors="3" time="212.715" skipped="0" timestamp="2023-02-23 16:45:10" id="Test Suites/Report">
      <testcase name="Test Cases/01" time="34.627" classname="Test Cases/01" status="FAILED">
        <system-err><![CDATA[2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC01 FAILED.
Reason:
com.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'
]]></system-err>
      </testcase>
      <testcase name="Test Cases/02" time="34.627" classname="Test Cases/02" status="PASSED">
       
      </testcase>
      <system-err><![CDATA[2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC01 FAILED.
Reason:
com.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'
]]></system-err>
   </testsuite>
</testsuites>

Logstash conf file:

input {
  file {
    path => "C/log-sample.xml"
    start_position => "beginning"
    codec => multiline {
      pattern => "<testsuites>"
      negate => true
      what => "previous"
      max_lines => 50000
    }
    sincedb_path => "NULL"
  }
}

filter {
  xml {
    source => "message"
    target => "theXML"
    force_array => false
    remove_field => ["message"]
  }
  mutate { remove_tag => ["[testsuites][testsuite][testcase][system-err]"] }
  ruby {
    code => '
      event.get("theXML").each { |k, v|
        event.set(k, v)
      }
      event.remove("theXML")
    '
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "xml_testing"
  }
  stdout {
    codec => rubydebug
  }
}

This is the output I got. I still see the system-err tag:

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "xml_testing",
        "_type": "_doc",
        "_id": "v7GTnYYBhrMoCAz3NLRv",
        "_score": 1,
        "_source": {
          "tags": [
            "multiline"
          ],
          "name": "Report",
          "time": "212.715",
          "tests": "7",
          "host": "hdc3-l-JNRPMW2",
          "path": "C:/Users/navya.krishna.voggu/Downloads/sample/sample.xml",
          "failures": "1",
          "@timestamp": "2023-03-01T14:28:33.127Z",
          "@version": "1",
          "errors": "3",
          "testsuite": {
            "skipped": "0",
            "failures": "1",
            "timestamp": "2023-02-23 16:45:10",
            "testcase": [
              {
                "time": "34.627",
                "classname": "Test Cases/01",
                "status": "FAILED",
                "system-err": "2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC01 FAILED.\nReason:\ncom.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'\n",
                "name": "Test Cases/01"
              },
              {
                "name": "Test Cases/02",
                "time": "34.627",
                "classname": "Test Cases/02",
                "status": "PASSED",
                "content": "\n       \n      "
              }
            ],
            "name": "Report",
            "id": "Test Suites/Report",
            "system-err": "2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC01 FAILED.\nReason:\ncom.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'\n",
            "tests": "7",
            "time": "212.715",
            "errors": "3"
          }
        }
      }
    ]
  }
}

remove_tag only removes entries from the tags field. What you want is to remove a field.

For that, you need to use remove_field.

mutate {
    remove_field => ["field-name"]
}
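
To make the distinction concrete, here is a rough plain-Ruby sketch (not Logstash code) of what the two options do conceptually. The event is modelled as an ordinary Hash, and the helper names remove_tag and remove_field are only illustrative stand-ins for the mutate options, not real Logstash functions.

```ruby
# A plain Hash standing in for a Logstash event (assumption for illustration).
event = {
  "tags" => ["multiline"],
  "system-err" => "some error text"
}

# Conceptually, remove_tag deletes matching strings from the "tags" array only.
def remove_tag(event, names)
  event["tags"] = event.fetch("tags", []) - names
  event
end

# Conceptually, remove_field deletes the named fields themselves.
def remove_field(event, names)
  names.each { |name| event.delete(name) }
  event
end

# A field reference passed to remove_tag matches no tag, so nothing changes.
remove_tag(event, ["[testsuites][testsuite][testcase][system-err]"])
field_survives_remove_tag = event.key?("system-err")

# remove_field with the right name actually drops the field.
remove_field(event, ["system-err"])
field_survives_remove_field = event.key?("system-err")
```

In this sketch the field survives remove_tag because no tag has that name, and it is gone only after remove_field.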

I tried remove_field too, but it did not help. I can still see the system-err field.

mutate {
    remove_field => ["system-err"]
}

You need the full field name, which appears to be "[testsuite][testcase][0][system-err]"

No luck with this either. I used the filter below and can still see system-err:

filter {
  xml {
    source => "message"
    target => "theXML"
    force_array => false
    remove_field => ["message"]
  }
  mutate { remove_field => ["[testsuite][testcase][0][system-err]"] }
  ruby {
    code => '
      event.get("theXML").each { |k, v|
        event.set(k, v)
      }
      event.remove("theXML")
    '
  }
}

What does your event look like if you use output { stdout { codec => rubydebug } }?

You are using the target option in the xml filter, so you won't have your field in the root of the document.

Try to use [theXML][testsuite][testcase][0][system-err].
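
As a rough illustration of why the prefix matters, the plain-Ruby sketch below models the event as a nested Hash at the point where the mutate filter runs, that is, after the xml filter but before the ruby filter copies the fields to the root. The Hash contents are a trimmed stand-in for the real event, not actual Logstash output.

```ruby
# Stand-in for the event right after the xml filter with target => "theXML".
event = {
  "theXML" => {
    "name" => "Report",
    "testsuite" => {
      "testcase" => [
        { "name" => "Test Cases/01", "system-err" => "Unable to scroll to object" },
        { "name" => "Test Cases/02" }
      ]
    }
  }
}

# Path without the target prefix: nothing is there yet, so nothing gets removed.
at_root = event.dig("testsuite", "testcase", 0, "system-err")

# Path including the target prefix: this is where the value actually sits.
under_target = event.dig("theXML", "testsuite", "testcase", 0, "system-err")

# Removing it at the correct path, analogous to
# remove_field => ["[theXML][testsuite][testcase][0][system-err]"]
event["theXML"]["testsuite"]["testcase"][0].delete("system-err")
still_present = event["theXML"]["testsuite"]["testcase"][0].key?("system-err")
```

The lookup without the prefix finds nothing, which matches the earlier attempts silently doing nothing.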


Thanks, that worked!
If there are multiple testcase elements, each with its own system-err field as in the XML below, how can we remove all fields named system-err at once?

<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="Report" time="212.715" tests="7" failures="1" errors="3">
   <testsuite name="Report" tests="7" failures="1" errors="3" time="212.715" skipped="0" timestamp="2023-02-23 16:45:10" id="Test Suites/Report">
      <testcase name="Test Cases/01" time="34.627" classname="Test Cases/01" status="FAILED">
        <system-err><![CDATA[2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC01 FAILED.
Reason:
com.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'
]]></system-err>
      </testcase>
      <testcase name="Test Cases/02" time="34.627" classname="Test Cases/02" status="FAILED">
       <system-err><![CDATA[2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC02 FAILED.
Reason:
com.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'
]]></system-err>
      </testcase>
      <testcase name="Test Cases/03" time="34.627" classname="Test Cases/03" status="FAILED">
       <system-err><![CDATA[2023-02-23 16:45:10 - [TEST_SUITE][FAILED] - Report: Test Cases/TC03 FAILED.
Reason:
com.kms.katalon.core.exception.StepFailedException: Unable to scroll to object 'Object Repository/ScrollToEle'
]]></system-err>
      </testcase>
   </testsuite>
</testsuites>

You will need to do this with Ruby code.

I tried this Ruby code, but it does not remove system-err:

ruby {
  code => '
    event.get("theXML").each_with_index { |b,index|
      event.remove("[theXML][testsuite][testcase][#{index}][system-err]")
    }
  '
}

    xml { source => "message" target => "theXML" force_array => false remove_field => ["message"] }
    ruby { code => 'event.get("theXML").each_with_index { |b,index| event.remove("[theXML][testsuite][testcase][#{index}][system-err]") }' }
    ruby { code => 'event.remove("theXML").each { |k, v| event.set(k, v) }' }

works just fine for me.

When I use this code, it works for only 6 indices: system-err is removed up to the 6th entry (testcase[0] to testcase[5]), but from the 7th entry (testcase[6]) onward the system-err field is still present. Is there a maximum limit on indices?

No, there is not.

But I see it is only removing the field for the first few indices. Any idea why this is happening?

Hi @Badger,

Can you suggest a solution for this?
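
One likely explanation, judging from the event shown earlier in this thread: event.get("theXML") returns a hash with six top-level keys (name, time, tests, failures, errors, testsuite), so each_with_index over it yields only the indices 0 to 5, no matter how many testcase entries exist. The sketch below is plain Ruby with a Hash standing in for the event, not Logstash code. It iterates over the testcase list itself instead. The helper name strip_system_err is hypothetical, and dropping the suite-level system-err as well is an assumption about what is wanted.

```ruby
# Plain-Ruby stand-in for the parsed event; in Logstash the same data would
# live under the [theXML] field. The helper name is hypothetical.
def strip_system_err(the_xml)
  suite = the_xml["testsuite"]
  return the_xml unless suite.is_a?(Hash)

  suite.delete("system-err") # suite-level <system-err>, if present

  cases = suite["testcase"]
  # With force_array => false a single <testcase> parses to a Hash, not an Array.
  cases = [cases] if cases.is_a?(Hash)
  Array(cases).each { |tc| tc.delete("system-err") if tc.is_a?(Hash) }
  the_xml
end

the_xml = {
  "name" => "Report", "time" => "212.715", "tests" => "7",
  "failures" => "1", "errors" => "3",
  "testsuite" => {
    "name" => "Report",
    "system-err" => "suite-level error",
    "testcase" => (1..8).map { |i|
      { "name" => format("Test Cases/%02d", i), "system-err" => "error #{i}" }
    }
  }
}

# The loop from the thread only ever sees the six top-level keys.
indices_seen = the_xml.each_with_index.map { |_pair, index| index }

cleaned = strip_system_err(the_xml)
remaining = cleaned["testsuite"]["testcase"].count { |tc| tc.key?("system-err") }
```

Inside a ruby filter the same idea would be to read the list with event.get("[theXML][testsuite][testcase]"), delete the key from each entry, and write the list back with event.set. Treat that as an untested outline rather than a drop-in filter.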
