Logstash loops on ingestion

band · July 3, 2018, 4:10pm

I am facing a problem with Logstash on a single node. I need to ingest a ~30MB xml file with the following format

<ROOT>
    <field1>...</field1>
    <row>
        <fieldA>...</fieldA>
        <fieldB>...</fieldB>
    </row>
    <row>
        <fieldA>...</fieldA>
        <fieldB>...</fieldB>
    </row>
       :
</ROOT>

There are no problems if I shrink the file a bit, I tried with 8MB and logstash works correctly. However if I try to ingest the whole file I start getting these errors from elasticsearch:

[2018-07-03T17:53:51,870][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][125] overhead, spent [262ms] collecting in the last [1s]
[2018-07-03T17:53:54,576][INFO ][o.e.c.m.MetaDataIndexTemplateService] [RurP6jQ] adding template [logstash] for index patterns [track*, planned*]
[2018-07-03T17:54:32,252][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][165] overhead, spent [266ms] collecting in the last [1s]
[2018-07-03T17:56:08,744][INFO ][o.e.c.r.a.AllocationService] [RurP6jQ] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[track][0]] ...]).
[2018-07-03T17:58:06,301][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][378] overhead, spent [294ms] collecting in the last [1s]
[2018-07-03T17:59:22,795][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][454] overhead, spent [279ms] collecting in the last [1s]

While logstash outputs:

[WARN ] 2018-07-03 17:56:01.570 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://localhost:9200/, :error_message=>"Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[ERROR] 2018-07-03 17:56:01.606 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://localhost:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[WARN ] 2018-07-03 17:56:04.169 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}
[ERROR] 2018-07-03 17:56:04.177 [Ruby-0-Thread-7@[main]>worker2: :1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
[INFO ] 2018-07-03 17:56:05.030 [Ruby-0-Thread-4: :1] elasticsearch - Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[WARN ] 2018-07-03 17:56:05.064 [Ruby-0-Thread-4: :1] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"}

and so on.

If I don't kill logstash it keeps ingesting data, but the strange thing is that I only have about 30k rows in my file but the ingestion doesn't stop ( I even found 400k+ indexed rows when I tried to let them run).

Has anyone ever experienced this?

Badger · July 3, 2018, 6:35pm

Not I. How are you ingesting the file? What does your logstash configuration look like, and if you are ingesting with filebeat, what does the filebeat configuration look like.

band · July 3, 2018, 7:01pm

I am not using Filebeat. The Logstash configuration file looks like this:

input { stdin { } }

filter {
xml {
source => "message"
target => "xmldata"
store_xml => "false"
xpath => ["/ROOT/field1/text()","field1"]
xpath => ["/ROOT//row","row"]
}
mutate {
remove_field => [ "message", "inxml", "xmldata" ]
}
split {
field => "[row]"
}

xml {
source => "row"
store_xml => "false"
xpath => ["/row/fieldA/text()","fieldA"]
xpath => ["/row/fieldB/text()","fieldB"]
}

mutate {
replace => {
"field1" => "%{[field1][0]}"
"fieldA" => "%{[fieldA][0]}"
"fieldB" => "%{[fieldB][0]}"
}
}
mutate {
remove_field => [ "ROOT", "row",'@timestamp', '@version', 'path', 'host']
}
mutate { convert => {"fieldA" => "integer"} }

}
output {
elasticsearch {
hosts => "localhost"
index => "my_index"
document_type => "_doc"
template => "my_template.json"
manage_template => "true"
template_overwrite => "true"
codec => json
}
stdout { }
}

Logstash is called with:
cat my_data.xml | sudo -Hu logstash /usr/share/logstash/bin/logstash -f my_index.config

Badger · July 3, 2018, 7:11pm

I do not understand what you mean when you say it worked with an 8 MB file. Did you look at the documents that ended up in elasticsearch? Each line of the file is indexed separately. Which means none of the xml parsing can work. You will need to use a multiline codec on the input. Is the entire 30 MB a single XML entity? That might be a problem.

[2018-07-03T17:58:06,301][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][378] overhead, spent [294ms] collecting in the last [1s]
[2018-07-03T17:59:22,795][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][454] overhead, spent [279ms] collecting in the last [1s]

This is a non-issue. A 279 milliseconds spent in GC in 76 seconds is nothing.

band · July 3, 2018, 7:31pm

Yes, I tries using the multiline codec but I was having some issues with it so I used a workaround. I forgot to mention that the file given to the logstash stdin is first "flattened" with

cat [original xml file] | sed -zE 's/[[:space:]]+(<)/\1/g' > [flattened xml file]

So everything now is on a single line of file. And yes, the file is a single 30 MB XML entity

What I mean by shrinking the file is this: there are around 30k <row>'s. I removed some of them (i.e. I kept only the row>'s from 1st to 8k-th to see if the problem was the file size)

Badger:

[2018-07-03T17:59:22,795][INFO ][o.e.m.j.JvmGcMonitorService] [RurP6jQ] [gc][454] overhead, spent [279ms] collecting in the last [1s]
This is a non-issue. A 279 milliseconds spent in GC in 76 seconds is nothing.

I don't really understand this. This happens only if I ingest the whole 30 MB file. Smaller XMLs work fine

band · July 11, 2018, 3:14pm

Bump

system · August 8, 2018, 3:14pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Ingestion performance issues - where to start? Elasticsearch	6	655	September 18, 2020
Logstash missing XML records Logstash	1	343	October 7, 2020
How to gain ingestion rate Elasticsearch	15	5908	July 5, 2017
Tuning to handle extreme initial ingestion conditions (with logstash) Elasticsearch	4	993	July 27, 2019
Large XML crashes logstash with OOM Logstash	5	2716	July 6, 2017

Logstash loops on ingestion

Related topics