Syntax error for XPath

Hi All,
I'm trying to get the xml filter's xpath option working, but Logstash crashes at a certain part of the config file.
The error points to a syntax problem:

	Jul 28 17:09:30 QPS-HPI-TESTELK1 logstash[730767]: [2020-07-28T17:09:30,986][FATAL][logstash.runner          ] An unexpected error occurred! {:error=>java.lang.IllegalStateException: org.jruby.exceptions.RaiseException: (SyntaxError) /root/request/UpdateRequest/kernel/a:item[@item='nexus version']/text(), :backtrace=>["org.logstash.execution.WorkerLoop.run(org/logstash/execution/WorkerLoop.java:1>
	Jul 28 17:09:30 QPS-HPI-TESTELK1 logstash[730767]: [2020-07-28T17:09:30,999][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

My config file contains this:

	xml {
		source => "test"
		store_xml => "false"
		remove_namespaces => "false"
		target => "test"
		xpath => [
			"/root/request/ClientTimeStamp/text()", "client_timestamp",
			"/root/request/AuthToken/text()", "authtoken",
			"/root/request/UpdateRequest/kernel/version/text()", "kernel_version",
			"/root/request/UpdateRequest/kernel/a:item[@item='nexus version']/text()", "kernel_nexus_version"
		]
	}

I'm not sure what the 'proper' XPath syntax should be. The expression works in an online XPath tester, just not in Logstash.

Currently running Logstash 7.8.0.

Can you please post a sample of your data payload too?

Also, I presume the xml config you provided above is inside the filter section? What do your inputs look like? Do they use a multiline codec? (It would be good to post your input and filter config.)

Thanks for replying @kelk

I'm currently shipping a test log file across with Filebeat. Ingestion works fine, as I have tested it with other filters.
Here is the config used for the problematic log entry:

    input {
        beats {
            type => "beats"
            port => "5044"
        }
    }

    filter {
        if ([fields][log_type] == "apilog") {
            grok {
                match => {
                    "message" => [
                        "^%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{TIME:time} \[%{BASE10NUM:message_number}\] %{WORD:logtype}  \- %{IP:ip}\:%{POSINT:port} \- %{GREEDYDATA:action} \(%{GREEDYDATA:test}\)"
                    ]
                }
            }
            xml {
                source => "test"
                store_xml => "false"
                remove_namespaces => "true"
                target => "test"
                xpath => [
                    "/root/request/ClientTimeStamp/text()", "client_timestamp",
                    "/root/request/AuthToken/text()", "authtoken",
                    "/root/request/UpdateRequest/kernel/version/text()", "kernel_version",
                    "/root/request/UpdateRequest/kernel/a:item[@item='nexus version']/text()", "kernel_nexus_version"
                ]
            }
        }
    }

    output {
        elasticsearch {
            hosts => ["http://192.168.0.2:9200"]
            index => "test"
        }
    }

The payload is:

    2020-07-01 10:39:46,046 [72] INFO  - 12.345.678.99:12345 - Terminal Management Request (<root type="object">
  <request type="object">
    <ClientTimeStamp type="string">20200728</ClientTimeStamp>
    <AuthToken type="string">XXX123XXX456XXX789</AuthToken>
    <UpdateRequest type="object">
      <kernel type="object">
        <name type="string">OS</name>
        <version type="string">1.01</version>
        <a:item xmlns:a="item" item="nexus version" type="string">0.12</a:item>
        <a:item xmlns:a="item" item="active bank" type="string">0</a:item>
        <a:item xmlns:a="item" item="active bank flag" type="string">0</a:item>
        <development type="string">0</development>
        <serial type="string">12345678</serial>
        <a:item xmlns:a="item" item="model name" type="string">XXX</a:item>
        <a:item xmlns:a="item" item="display columns" type="string">200</a:item>
        <a:item xmlns:a="item" item="display rows" type="string">200</a:item>
        <a:item xmlns:a="item" item="msr tracks" type="array">
        </a:item>
      </kernel>
    </UpdateRequest>
  </request>
</root>)

And yes, you are correct, multiline is used in Filebeat.

It's a namespace problem. With remove_namespaces set to "true", the a: prefix is stripped from the element names, so you can query your field without it:

"/root/request/UpdateRequest/kernel/item[@item='nexus version']/text()", "kernel_nexus_version"
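
If you would rather keep the namespaces, another option might be to declare the prefix through the xml filter's namespaces setting instead of stripping it. This is only a minimal sketch, assuming the a prefix maps to the URI "item" as declared in the posted payload; it has not been run against this exact log:

```
xml {
    source => "test"
    store_xml => "false"
    # Keep namespaces in the parsed document ...
    remove_namespaces => "false"
    # ... and tell the filter what the "a" prefix refers to
    # (assumption: URI "item", taken from xmlns:a="item" in the payload).
    namespaces => {
        "a" => "item"
    }
    target => "test"
    xpath => [
        "/root/request/UpdateRequest/kernel/a:item[@item='nexus version']/text()", "kernel_nexus_version"
    ]
}
```

The trade-off is that every prefix used in an XPath expression has to be listed there, whereas remove_namespaces lets you drop the prefixes entirely.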

I managed to reproduce and fix it. The sample data and the working config are below:

# mixed_xml_log_sample.xml
2020-07-01 10:39:46,046 [72] INFO  - 192.168.1.2:12345 - Terminal Management Request (<root type="object">
<request type="object">
  <ClientTimeStamp type="string">20200728</ClientTimeStamp>
  <AuthToken type="string">XXX123XXX456XXX789</AuthToken>
  <UpdateRequest type="object">
    <kernel type="object">
      <name type="string">OS</name>
      <version type="string">1.01</version>
      <a:item xmlns:a="item" item="nexus version" type="string">0.12</a:item>
      <a:item xmlns:a="item" item="active bank" type="string">0</a:item>
      <a:item xmlns:a="item" item="active bank flag" type="string">0</a:item>
      <development type="string">0</development>
      <serial type="string">12345678</serial>
      <a:item xmlns:a="item" item="model name" type="string">XXX</a:item>
      <a:item xmlns:a="item" item="display columns" type="string">200</a:item>
      <a:item xmlns:a="item" item="display rows" type="string">200</a:item>
      <a:item xmlns:a="item" item="msr tracks" type="array"></a:item>
    </kernel>
  </UpdateRequest>
</request>
</root>)

input {
    file {
        path => "/tmp/data/mixed_xml_log_sample.xml"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
        type => "xml"
        codec => multiline {
            pattern => "^%{TIMESTAMP_ISO8601}"
            negate => "true"
            what => "previous"
        }
    }
}

# xpath namespace problem
filter {
      grok {
        match => {
          "message" => "^(?m)%{TIMESTAMP_ISO8601:timestamp} \[%{BASE10NUM:message_number}\] %{WORD:logtype}\s*\-\s*%{IP:ip}\:%{NUMBER:port} \- %{GREEDYDATA:action} \(%{GREEDYDATA:test}\)"
        }
      }
      xml {
        source => "test"
        store_xml => "false"
        remove_namespaces => "true"
        target => "test"
        xpath => [
          "/root/request/ClientTimeStamp/text()", "client_timestamp",
          "/root/request/AuthToken/text()", "authtoken",
          "/root/request/UpdateRequest/kernel/version/text()", "kernel_version",
          "/root/request/UpdateRequest/kernel/item[@item='nexus version']/text()", "kernel_nexus_version"
        ]
      }
}

output {
    stdout {
        codec => rubydebug
    }
}

Beautiful, works as expected now. Thanks @Jenni and @kelk for the help. Much appreciated.