Logstash doesn't work with XML file

XML file

    <?xml version="1.0" encoding="UTF-8"?>
    <Documents xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://localhost/request">
      <Document>
        <Status>Зарегистрировано</Status>
        <Number>000000001</Number>
        <Tip>Внутренняя</Tip>
      </Document>
      <Document>
        <Status>Исполнено</Status>
        <Number>000000002</Number>
        <Tip>Внешняя</Tip>
      </Document>
      <Document>
        <Status>Зарегистрировано</Status>
        <Number>000000003</Number>
        <Tip>Внешняя</Tip>
      </Document>
      <Document>
        <Status>Исполнено</Status>
        <Number>000000004</Number>
        <Tip>Внутренняя</Tip>
      </Document>
    </Documents>

Logstash config

    input {
      file {
        path => "D:/ElasticMain/1C/servicesdeskxml.xml"
        start_position => beginning
        type => "document"
        codec => multiline {
          pattern => "^<\?Document .*\>"
          negate => true
          what => "previous"
        }
      }
    }
    filter {
      xml {
        source => "document"
        xpath => [
          "/documents/document/status/text()", "document_status",
          "/documents/document/number/text()", "document_number",
          "/documents/document/tip/text()", "document_tip"
        ]
        store_xml => true
        target => "doc"
      }
    }
    output {
      stdout { codec => rubydebug }
      elasticsearch {
        index => "logstash-xml"
        hosts => ["localhost:9200"]
        document_type => "document"
      }
    }

Logstash log

[2019-08-28T10:10:24,995][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-08-28T10:10:25,339][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

That pattern does not match anything in your file, so every line in the file is combined into a single event. However, the event only gets flushed to the pipeline when a line that matches the pattern occurs, and that never happens.

If you want to consume the file as a single event, you can just add 'auto_flush_interval => 1' to the codec options. You could then use a split filter to divide the event into individual documents.
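A sketch of that first approach, reusing the path and pattern from the config above. The field path [doc][Document] assumes the xml filter parses the repeated <Document> elements into an array under the target; verify the actual layout with the rubydebug output before relying on it:

    input {
      file {
        path => "D:/ElasticMain/1C/servicesdeskxml.xml"
        start_position => beginning
        codec => multiline {
          pattern => "^<\?Document .*\>"
          negate => true
          what => "previous"
          auto_flush_interval => 1   # flush the single buffered event after 1s of inactivity
        }
      }
    }
    filter {
      xml {
        source => "message"          # the multiline codec puts the combined file content here
        store_xml => true
        target => "doc"
      }
      split {
        field => "[doc][Document]"   # assumed location of the parsed <Document> array
      }
    }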

If you want to consume each document as an event, change the pattern option to "pattern => '^'" and add a mutate filter to remove </Documents> from the last document in the file.
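For that second approach, the trailing </Documents> could be stripped with a mutate gsub before the xml filter runs; the exact gsub expression below is my assumption, not part of the reply above:

    filter {
      mutate {
        # drop the closing </Documents> that ends up attached to the last document
        gsub => [ "message", "</Documents>", "" ]
      }
      xml {
        source => "message"
        store_xml => true
        target => "doc"
      }
    }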
