Lots of 429s, can't connect

A little over a week ago, I updated from 7.0 to 7.1.1. I didn't notice until late last week, but going back through the logs I can see a ton of 429 errors that started around the same time. Now Logstash mostly logs that it can't connect, and almost nothing is making it from Logstash to Elasticsearch. Sample messages:

[2019-06-10T15:48:59,203][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://10.226.1.93:9200/"}
[2019-06-10T15:48:59,207][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://10.226.1.96:9200/"}
[2019-06-10T15:49:03,504][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://10.226.1.99:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://10.226.1.99:9200/, :error_message=>"Elasticsearch Unreachable: [http://10.226.1.99:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[2019-06-10T15:49:03,504][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://10.226.1.99:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>8}
[2019-06-10T15:49:04,212][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://10.226.1.99:9200/"}
[2019-06-10T15:49:08,257][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://10.226.1.95:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://10.226.1.95:9200/, :error_message=>"Elasticsearch Unreachable: [http://10.226.1.95:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[2019-06-10T15:49:08,258][ERROR][logstash.outputs.elasticsearch] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://10.226.1.95:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>16}
[2019-06-10T15:49:08,269][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://10.226.1.96:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://10.226.1.96:9200/, :error_message=>"Elasticsearch Unreachable: [http://10.226.1.96:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
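
For what it's worth, 429s from the elasticsearch output usually mean the cluster is reachable but is rejecting bulk requests because its write thread-pool queues are full, rather than being truly down. Below is a minimal sketch for checking that, assuming one of the node IPs from the excerpt above, the default port 9200, and no authentication (adjust for your setup):

# Minimal sketch: check write thread-pool rejections per node.
# Assumes a reachable node IP from the logs, port 9200, no auth.
import urllib.request

ES = "http://10.226.1.93:9200"  # any reachable node

# _cat/thread_pool shows active/queue/rejected counts per node; a climbing
# "rejected" column on the write pool is what shows up as 429s in Logstash.
url = ES + "/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected"
with urllib.request.urlopen(url, timeout=10) as resp:
    print(resp.read().decode())

If the rejected counts keep climbing while Logstash reports 429s, the problem is queue saturation on the Elasticsearch side rather than networking.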

Any assistance would be appreciated!

What do your Elasticsearch logs show?

Lots more interesting stuff, apparently.

[2019-06-11T00:11:33,123][WARN ][o.e.x.m.e.l.LocalExporter] [FL-JAX-SECELKELS1] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: RemoteTransportException[[FL-JAX-SECELKELS6][10.226.1.99:9300][indices:data/write/bulk[s]]]; nested: RemoteTransportException[[FL-JAX-SECELKELS5][10.226.1.92:9300][indices:data/write/bulk[s]]]; nested: RemoteTransportException[[FL-JAX-SECELKELS5][10.226.1.92:9300][indices:data/write/bulk[s][p]]]; nested: EsRejectedExecutionException[rejected execution of processing of [2425608][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[.monitoring-es-7-2019.06.11][0]] containing [index {[.monitoring-es-7-2019.06.11][_doc][ZuS7RGsBU6B_O6Kvx0a0], source[{"cluster_uuid":"8Le75XanSxuXyvR2IvzoYw","timestamp":"2019-06-11T04:11:25.398Z","interval_ms":10000,"type":"node_stats","source_node":{"uuid":"IKx-4rGjR2qCf-IPXbVvWw","host":"10.226.1.94","transport_address":"10.226.1.94:9300","ip":"10.226.1.94","name":"FL-JAX-SECELKELS1","timestamp":"2019-06-11T04:11:25.397Z"},"node_stats":{"node_id":"IKx-4rGjR2qCf-IPXbVvWw","node_master":false,"mlockall":false,"indices":{"docs":{"count":7000135709},"store":{"size_in_bytes":4558644814452},"indexing":{"index_total":41772,"index_time_in_millis":27982,"throttle_time_in_millis":0},"search":{"query_total":10184,"query_time_in_millis":4068},"query_cache":{"memory_size_in_bytes":2160,"hit_count":182,"miss_count":145,"evictions":0},"fielddata":{"memory_size_in_bytes":11064,"evictions":0},"segments":{"count":655,"memory_in_bytes":8021958571,"terms_memory_in_bytes":5668982758,"stored_fields_memory_in_bytes":2160722024,"term_vectors_memory_in_bytes":0,"norms_memory_in_bytes":0,"points_memory_in_bytes":189215633,"doc_values_memory_in_bytes":3038156,"index_writer_memory_in_bytes":0,"version_map_memory_in_bytes":0,"fixed_bit_set_memory_in_bytes":91472},"request_cache":{"memory_size_in_bytes":39439,"evictions":0,"hit_count":1883,"miss_count":547}},"os":{"cpu":{"load_average":{"1m":0.0,"5m":0.01,"15m":0.05}},"cgroup":{"cpuacct":{"control_group":"/","usage_nanos":49194604672373},"cpu":{"control_group":"/","cfs_period_micros":100000,"cfs_quota_micros":-1,"stat":{"number_of_elapsed_periods":0,"number_of_times_throttled":0,"time_throttled_nanos":0}},"memory":{"control_group":"/","limit_in_bytes":"9223372036854771712","usage_in_bytes":"64726368256"}}},"process":{"open_file_descriptors":7656,"max_file_descriptors":65535,"cpu":{"percent":0}},"jvm":{"mem":{"heap_used_in_bytes":9970663680,"heap_used_percent":37,"heap_max_in_bytes":26773815296},"gc":{"collectors":{"young":{"collection_count":1161,"collection_time_in_millis":55812},"old":{"collection_count":2,"collection_time_in_millis":115}}}},"thread_pool":{"generic":{"threads":12,"queue":0,"rejected":0},"get":{"threads":0,"queue":0,"rejected":0},"management":{"threads":5,"queue":0,"rejected":0},"search":{"threads":13,"queue":0,"rejected":0},"watcher":{"threads":0,"queue":0,"rejected":0},"write":{"threads":8,"queue":0,"rejected":0}},"fs":{"total":{"total_in_bytes":10994540609536,"free_in_bytes":6289274310656,"available_in_bytes":6289274310656},"io_stats":{"total":{"operations":749123,"read_operations":148205,"write_operations":600918,"read_kilobytes":49172880,"write_kilobytes":367900076}}}}}]}], target allocation id: ZBiDS4RhRKymNm4kKGsEGQ, primary term: 1 on EsThreadPoolExecutor[name = FL-JAX-SECELKELS5/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@119b39bf[Running, pool size = 8, active threads = 8, queued tasks = 200, completed tasks = 148727]]];
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$throwExportException$2(LocalBulk.java:125) ~[x-pack-monitoring-7.1.1.jar:7.1.1]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:?]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.throwExportException(LocalBulk.java:126) [x-pack-monitoring-7.1.1.jar:7.1.1]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$0(LocalBulk.java:108) [x-pack-monitoring-7.1.1.jar:7.1.1]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkRequestModifier.lambda$wrapActionListenerIfNeeded$0(TransportBulkAction.java:659) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.finishHim(TransportBulkAction.java:464) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.bulk.TransportBulkAction$BulkOperation$1.onFailure(TransportBulkAction.java:459) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:74) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.finishAsFailed(TransportReplicationAction.java:937) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$1.handleException(TransportReplicationAction.java:895) [elasticsearch-7.1.1.jar:7.1.1]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1124) [elasticsearch-7.1.1.jar:7.1.1]

Any tips on uploading a .txt file so I can share a full entry? The entry above is abbreviated.

If that's not enough, let me know. 🙂
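
The EsRejectedExecutionException in the trace above is the write thread pool on FL-JAX-SECELKELS5 turning away a .monitoring-es-* bulk request: queue capacity = 200 with 200 tasks already queued, i.e. the queue is full. One way to take the monitoring traffic out of the picture while troubleshooting is the dynamic xpack.monitoring.collection.enabled cluster setting; a minimal sketch, again assuming port 9200 and no authentication:

# Minimal sketch: temporarily disable x-pack monitoring collection
# cluster-wide so the rejected .monitoring-* bulk traffic stops.
# Assumes a reachable node on port 9200 and no auth.
import json
import urllib.request

ES = "http://10.226.1.93:9200"  # any reachable node

body = json.dumps({
    "persistent": {"xpack.monitoring.collection.enabled": False}
}).encode()

req = urllib.request.Request(
    ES + "/_cluster/settings",
    data=body,
    method="PUT",
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.read().decode())

Being a persistent cluster setting, this survives node restarts and can be flipped back to true once the rejections are under control.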

Based on other things I've found online, I commented out the xpack.monitoring options in the various yml files and restarted Logstash and one of the ES nodes in my cluster. The tremendous flow of errors in the logs has stopped, but I'm still not seeing data in Kibana under Discover for my Logstash filter. (Worth noting: the Filebeat instances I have running do seem to be shipping directly to ES just fine throughout.)
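
To narrow down where the Logstash events are stopping, it can help to compare the pipeline's event counters with the doc counts of the target indices. A minimal sketch, assuming the Logstash API on its default port 9600 on the Logstash host and daily logstash-* indices (adjust hostnames, port, and index pattern to your setup):

# Minimal sketch: is Logstash emitting events, and are they landing in ES?
# Assumes Logstash API on localhost:9600 and logstash-* index names.
import urllib.request

def get(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode()

# Per-pipeline event counts; if "out" is not increasing between calls,
# events are stuck (or being dropped) inside Logstash rather than in ES.
print(get("http://localhost:9600/_node/stats/pipelines?pretty"))

# Doc counts for the Logstash indices; if today's index is missing or not
# growing, the problem is on the output side rather than in Kibana.
print(get("http://10.226.1.93:9200/_cat/indices/logstash-*?v&s=index"))

If the pipeline's out counter grows but the index doesn't, the output is still retrying or dropping; if the index is growing, the gap is more likely the Kibana index pattern or the selected time range.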
