Logstash loops on ingestion

I am facing a problem with Logstash on a single node. I need to ingest a ~30 MB XML file with the following format:

<ROOT>
    <field1>...</field1>
    <row>
        <fieldA>...</fieldA>
        <fieldB>...</fieldB>
    </row>
    <row>
        <fieldA>...</fieldA>
        <fieldB>...</fieldB>
    </row>
       :
</ROOT>

There are no problems if I shrink the file a bit: I tried with 8 MB and Logstash works correctly. However, if I try to ingest the whole file I start getting these log lines from Elasticsearch:

[2018-07-03T17:53:51,870][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][125] overhead, spent [262ms] collecting in the last [1s]
[2018-07-03T17:53:54,576][INFO ][o.e.c.m.MetaDataIndexTemplateService] [RurP6jQ] adding template [logstash] for index patterns [track*, planned*]
[2018-07-03T17:54:32,252][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][165] overhead, spent [266ms] collecting in the last [1s]
[2018-07-03T17:56:08,744][INFO ][o.e.c.r.a.AllocationService] [RurP6jQ] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[track][0]] ...]).
[2018-07-03T17:58:06,301][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][378] overhead, spent [294ms] collecting in the last [1s]
[2018-07-03T17:59:22,795][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][454] overhead, spent [279ms] collecting in the last [1s]

While Logstash outputs:

[WARN ] 2018-07-03 17:56:01.570 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://localhost:9200/, :error_message=>"Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[ERROR] 2018-07-03 17:56:01.606 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[WARN ] 2018-07-03 17:56:04.169 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}
[ERROR] 2018-07-03 17:56:04.177 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
[INFO ] 2018-07-03 17:56:05.030 [Ruby-0-Thread-4: :1] elasticsearch - Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[WARN ] 2018-07-03 17:56:05.064 [Ruby-0-Thread-4: :1] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"}

and so on.

If I don't kill Logstash it keeps ingesting data. The strange thing is that my file only has about 30k rows, yet the ingestion doesn't stop (I even found 400k+ indexed rows when I let it run for a while).

Has anyone ever experienced this?

Not I. How are you ingesting the file? What does your Logstash configuration look like, and if you are ingesting with Filebeat, what does the Filebeat configuration look like?

I am not using Filebeat. The Logstash configuration file looks like this:

input { stdin { } }

filter {
  xml {
    source    => "message"
    target    => "xmldata"
    store_xml => "false"
    xpath     => ["/ROOT/field1/text()", "field1"]
    xpath     => ["/ROOT//row", "row"]
  }

  mutate {
    remove_field => [ "message", "inxml", "xmldata" ]
  }

  split {
    field => "[row]"
  }

  xml {
    source    => "row"
    store_xml => "false"
    xpath     => ["/row/fieldA/text()", "fieldA"]
    xpath     => ["/row/fieldB/text()", "fieldB"]
  }

  mutate {
    replace => {
      "field1" => "%{[field1][0]}"
      "fieldA" => "%{[fieldA][0]}"
      "fieldB" => "%{[fieldB][0]}"
    }
  }

  mutate {
    remove_field => [ "ROOT", "row", "@timestamp", "@version", "path", "host" ]
  }

  mutate {
    convert => { "fieldA" => "integer" }
  }
}

output {
  elasticsearch {
    hosts              => "localhost"
    index              => "my_index"
    document_type      => "_doc"
    template           => "my_template.json"
    manage_template    => "true"
    template_overwrite => "true"
    codec              => json
  }
  stdout { }
}

Logstash is called with:
cat my_data.xml | sudo -Hu logstash /usr/share/logstash/bin/logstash -f my_index.config

I do not understand what you mean when you say it worked with an 8 MB file. Did you look at the documents that ended up in Elasticsearch? Each line of the file is indexed separately, which means none of the XML parsing can work. You will need to use a multiline codec on the input. Is the entire 30 MB a single XML entity? That might be a problem.
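For reference, this is a minimal, untested sketch of what a multiline stdin input could look like, assuming the whole document should become a single event. The pattern and the limits are assumptions you would need to adjust to your file:

input {
  stdin {
    codec => multiline {
      # Assumption: only the opening <ROOT> line starts a new event; every
      # other line is appended to the previous one, so the whole document
      # is buffered into a single event.
      pattern => "^<ROOT>"
      negate  => true
      what    => "previous"
      # The defaults (500 lines / 10 MiB) are too small for a 30 MB file.
      max_lines => 200000
      max_bytes => "50 MiB"
      # Flush the buffered event if no new lines arrive for a few seconds,
      # since stdin has no natural end-of-event marker.
      auto_flush_interval => 5
    }
  }
}

With something like that in place, the xml filter would at least see the full document instead of one line at a time.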

[2018-07-03T17:58:06,301][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][378] overhead, spent [294ms] collecting in the last [1s]
[2018-07-03T17:59:22,795][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][454] overhead, spent [279ms] collecting in the last [1s]

This is a non-issue. Spending 279 milliseconds on GC over a 76-second span is nothing.

Yes, I tried using the multiline codec, but I was having some issues with it, so I used a workaround. I forgot to mention that the file given to the Logstash stdin is first "flattened" with:

cat [original xml file] | sed -zE 's/[[:space:]]+(<)/\1/g' > [flattened xml file]

So everything is now on a single line. And yes, the file is a single 30 MB XML entity.

What I mean by shrinking the file is this: there are around 30k <row>'s, and I removed some of them (i.e. I kept only the 1st through 8000th <row> to see whether the problem was the file size).

I don't really understand this. It happens only if I ingest the whole 30 MB file; smaller XMLs work fine.

Bump
