Unable to load an XML document into Elasticsearch

Hi Team, I am trying to load a whole XML document into Elasticsearch using Logstash. I don't see any errors in the Logstash console, but the document does not appear in Elasticsearch either. My config file is below.

input {
  file {
    path => "/Users/userid/POC/ELK/data/data.xml"
    start_position => "beginning"
    sincedb_path => "nul"
    type => "xml"
    codec => multiline {
      pattern => "<Document"
      negate => "true"
      what => "previous"
      auto_flush_interval => 2
    }
  }
}

filter {
    xml {
      source => "message"
      remove_namespaces => true
      target => "doc"
  }
}
output {
  stdout {
  }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "document"
  }
}

Sample XML document

<?xml version="1.0" encoding="UTF-8"?>
<Document>
    <recordTarget>
        <role>
            .....
            ...... have multiple internal tags....
        </role>
    </recordTarget>
</Document>

You do not have a field called [ClinicalDocument]. The file input will create a field called [message].
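As a sketch, the filter would read from the field the file input actually creates (the target name "doc" here is just illustrative):

```
filter {
  xml {
    # The file input places each (multiline-assembled) event in [message]
    source => "message"
    remove_namespaces => true
    target => "doc"
  }
}
```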

If you do not want the in-memory sincedb persisted across restarts then use sincedb_path => "NUL" on Windows and sincedb_path => "/dev/null" on UNIX.
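For example, on a UNIX-like system (your /Users path suggests macOS) the input might look like this (paths illustrative):

```
input {
  file {
    path => "/Users/userid/POC/ELK/data/data.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # on Windows use "NUL" instead
  }
}
```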

Hi @Badger, thanks for your response. I have updated the config with the correct tag. I also tried using "message" as the source, but I still get the same result: there is no error message, yet the document does not show up in Elasticsearch.

Hi @Badger, I was able to see the XML in Kibana after running the Logstash command with sudo. However, the XML parsing is failing at different stages: a single XML file produced 7 events, tagged with multiline, multiline_codec_max_lines_reached, and _xmlparsefailure. I would like to create a single event for the whole XML content. Below is my current config file.

input {
    file {
        path => "/Users/user/POC/ELK/data/data.xml"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => multiline {
            pattern => "<Document"
            negate => "true"
            what => "previous"
            auto_flush_interval => 1
            max_lines => 2000
        }
    }
}
filter {
    xml {
      source => "message"
      target => "xml_content"
    }
}
output {
    stdout { codec => rubydebug }
    elasticsearch{
      hosts => ["localhost:9200"]
      index => "document"
    }

}

If the multiline codec stops accumulating lines before it reaches the next <Document element, the accumulated text will not be valid XML and the xml filter will not be able to parse it.
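The multiline_codec_max_lines_reached tag suggests the codec is hitting its limits and flushing a partial document. If the whole file has to become a single event, the codec's limits must be large enough to hold it; something like (values illustrative, sized to exceed your largest document):

```
codec => multiline {
  pattern => "<Document"
  negate => "true"
  what => "previous"
  auto_flush_interval => 2
  max_lines => 20000      # default is 500 lines
  max_bytes => "20 MiB"   # default is 10 MiB
}
```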