Data sent to Elasticsearch only after SIGTERM

Hello,

I'm using Logstash to parse an XML file, split it, do some basic extraction, and send the results to Elasticsearch. I also use Kibana. Everything runs in Kubernetes.

Logstash's version is 6.2.4.
Elasticsearch's version is 6.2.4.
Kibana's version is 6.2.4.

My input XML file has between 100,000 and 300,000 lines.

My config file is:

input {
  file {
    path => "/usr/share/logstash/files/test7.xml"
    start_position => beginning
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^<\?Report .*\>"
      negate => true
      what => "previous"
      max_lines => 300000
    }
  }
}
filter {
  xml {
    store_xml => false
    source => "message"
    xpath => ["/Report/Takes/Take", "take"]
  }
  mutate {
    remove_field => [ "message" ]
  }
  split {
    field => "[take]"
  }
  xml {
    source => "take"
    store_xml => false
    xpath => ["/Take/CarId/text()","carId"]
    xpath => ["/Take/ModelId/text()","modelId"]
    xpath => ["/Take/ColorDetails/Mode/text()","mode"]
    xpath => ["/Take/ColorDetails/Polarisation/text()","polarisation"]
  }
}
output {
  elasticsearch {
    index => "logstash-test-xml"
    hosts => ["es-svc:25000"]
    document_type => "xmlfiles"
  }
  stdout { codec => rubydebug }
}

My XML file (simplified for the forum to show only 2 examples of the "data" I use) looks like this:

<?xml version="1.0" standalone="yes"?>
<Report>
<ReportingTime>2018-08-07T08:15:37</ReportingTime>
<ValidityStart>2018-08-07T19:00:00</ValidityStart>
<ValidityStop>2018-09-02T22:00:00</ValidityStop>
<Takes>
  <Take>
    <CarId>S1A</CarId>
    <ModelId>164761</ModelId>
    <ColorDetails>
      <InstrumentId>Tec instrument</InstrumentId>
      <Mode>AA</Mode>
      <Swath>BB</Swath>
      <Polarisation>DV</Polarisation>
    </ColorDetails>
  </Take>
  <Take>
    <CarId>S1A</CarId>
    <ModelId>164762</ModelId>
    <ColorDetails>
      <InstrumentId>Tec instrument</InstrumentId>
      <Mode>AB</Mode>
      <Swath>DC</Swath>
      <Polarisation>DH</Polarisation>
    </ColorDetails>
  </Take>
</Takes>
</Report>

Below is the end of the debug-level log, from the point where the big XML file is almost completely parsed. I then waited 4 minutes, during which no data appeared in Elasticsearch, and killed Logstash. Only after the kill was my data sent to Elasticsearch:
https://pastebin.com/eViSL81t

(I had to post it on pastebin because of the size limit on the forum.)

What is wrong with my configuration?

Thanks.

OK, I have found the issue with my configuration.

The pattern was wrong. I changed it from:

pattern => "^<\?Report .*\>"

to:

pattern => "<\Report>"

Indeed, as my XML file contains only one "Report" element, Logstash never found a line that started a new event, so it kept buffering and wrote nothing to the output until I killed it (a proper shutdown ends the workers, which is when the data was finally sent).
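
For anyone hitting the same symptom: with negate => true and what => "previous", the multiline codec only emits an event when a later line matches the pattern and starts the next event, so a file holding a single document can stay buffered indefinitely. As a hedged sketch (not the configuration tested in this thread), the codec's auto_flush_interval option emits the buffered event once no new line has arrived for the given number of seconds. The pattern on the XML declaration and the 5 second interval below are illustrative assumptions:

```
input {
  file {
    path => "/usr/share/logstash/files/test7.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      # Assumption: every document starts with an XML declaration line.
      pattern => "^<\?xml "
      negate => true
      what => "previous"
      max_lines => 300000
      # Assumption: 5 seconds is an illustrative value. The buffered event
      # is emitted when no new line has been read for this long.
      auto_flush_interval => 5
    }
  }
}
```

With this in place the whole file should arrive as one event shortly after the last line is read, without needing a second document or a shutdown to trigger the flush.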

Solved.
