Logstash -> Elastic Timeouts

I have 2 Logstash nodes with about 30 different pipelines sending data to a 6-node Elasticsearch cluster. For about half of my pipelines I am seeing a ton of errors like these in the /var/log/logstash/pipeline_mypipeline.log files:

[WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [https://node1.mynetwork.com:9200/_bulk][Manticore::SocketTimeout] Read timed out

and

[ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request but Elasticsearch appears to be unreachable or down {:message=>"Elasticsearch Unreachable: [https://node1.mynetwork.com:9200/_bulk][Manticore::SocketTimeout] Read timed out"

My data is ingesting fine. I suppose that once a node times out, Logstash moves on to another one, but I would like to deal with these messages properly. Some are labeled "WARN" and others "ERROR". I see the elasticsearch output has a timeout parameter. Bumping that up to 2 minutes instead of 1 might do the trick, but that seems more like avoiding the problem than handling it.
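For reference, the timeout change I am considering would look roughly like the sketch below. The `timeout` option of the elasticsearch output is given in seconds and defaults to 60; the single host is only there to keep the example short, and the rest of my output settings would stay as they are.

```
output {
    elasticsearch {
        # One host shown for brevity; the real config lists every node.
        hosts => ["https://node1.mynetwork.com:9200"]
        # Network/request timeout in seconds (plugin default is 60).
        timeout => 120
    }
}
```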

I have a monitoring cluster set up. My cluster health is green, every node has plenty of free disk space, and CPU is around 20%. On the Logstash nodes, CPU is low (5%) and JVM heap averages around 6.1GB out of 7.9GB.

What should I look at to address timeout warnings / errors like these?

Hello,
Could you please paste the elasticsearch output section from your Logstash conf file?

output {
    elasticsearch {
        hosts => ["https://node1.mynetwork.com:9200", "https://node2.mynetwork.com:9200", ...]
        manage_template => false
        index => "%{[@metadata][target_index]}"
        pipeline => "xdr-all"
        user => "${xdr_user}"
        password => "${xdr_password}"
        document_id => "%{log.fingerprint}"
        ssl => true
        cacert => '/etc/certificates/logstash/my-cert.pem'
        data_stream => "false"
    }
}

Hello @6igwig

The log shows that the host below is unreachable. Could you check whether the node is up?

https://node1.mynetwork.com:9200

The node is up

Maybe the node is up, but what about the firewall?
