I have a cluster with 2 nodes: a master node and a data node. Both were fine before, but lately it has become a mess: ingesting CSV data from Filebeat is really slow, and the Kibana UI is really slow to open.
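For context, the pipeline is Filebeat (on Windows) -> Logstash -> Elasticsearch, and the output wiring looks roughly like this (paraphrased, so the exact options may differ; the host and port values are the ones that appear in the logs below):

# filebeat.yml output section (illustrative)
output.logstash:
  hosts: ["10.64.2.246:5044"]

# logstash pipeline (illustrative)
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]
  }
}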
This is what Logstash shows:
[WARN ] 2020-07-21 21:02:13.680 [[main]>worker1] elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://127.0.0.1:9200/, :error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[WARN ] 2020-07-21 21:02:15.124 [[main]>worker0] elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://127.0.0.1:9200/, :error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[ERROR] 2020-07-21 21:02:16.281 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[ERROR] 2020-07-21 21:02:16.281 [[main]>worker0] elasticsearch - Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>2}
[ERROR] 2020-07-21 21:02:25.082 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
[ERROR] 2020-07-21 21:02:29.656 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>8}
[ERROR] 2020-07-21 21:02:31.053 [[main]>worker0] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
[ERROR] 2020-07-21 21:02:36.502 [[main]>worker0] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>8}
[ERROR] 2020-07-21 21:02:39.192 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>16}
[WARN ] 2020-07-21 21:02:49.121 [Ruby-0-Thread-5: :1] elasticsearch - Restored connection to ES instance {:url=>"http://127.0.0.1:9200/"}
[ERROR] 2020-07-21 21:02:49.723 [[main]>worker0] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>16}
[WARN ] 2020-07-21 21:04:06.500 [[main]>worker1] elasticsearch - Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://127.0.0.1:9200/, :error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[ERROR] 2020-07-21 21:04:06.707 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://127.0.0.1:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>32}
[ERROR] 2020-07-21 21:04:27.520 [[main]>worker0] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>32}
[ERROR] 2020-07-21 21:04:54.871 [[main]>worker1] elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>64}
[WARN ] 2020-07-21 21:04:55.353 [Ruby-0-Thread-5: :1] elasticsearch - Restored connection to ES instance {:url=>"http://127.0.0.1:9200/"}
And this is what Filebeat shows:
2020-07-21T20:23:50.051+0700 DEBUG [input] input/input.go:152 Run input
2020-07-21T20:23:50.051+0700 DEBUG [input] log/input.go:191 Start next scan
2020-07-21T20:23:50.055+0700 DEBUG [input] log/input.go:421 Check file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:23:50.056+0700 DEBUG [input] log/input.go:511 Update existing file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv, offset: 179504
2020-07-21T20:23:50.056+0700 DEBUG [input] log/input.go:563 Harvester for file is still running: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:23:50.056+0700 DEBUG [input] log/input.go:212 input states cleaned up. Before: 1, After: 1, Pending: 0
2020-07-21T20:24:00.057+0700 DEBUG [input] input/input.go:152 Run input
2020-07-21T20:24:00.057+0700 DEBUG [input] log/input.go:191 Start next scan
2020-07-21T20:24:00.059+0700 DEBUG [input] log/input.go:421 Check file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:24:00.059+0700 DEBUG [input] log/input.go:511 Update existing file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv, offset: 179504
2020-07-21T20:24:00.059+0700 DEBUG [input] log/input.go:563 Harvester for file is still running: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:24:00.059+0700 DEBUG [input] log/input.go:212 input states cleaned up. Before: 1, After: 1, Pending: 0
2020-07-21T20:24:03.482+0700 DEBUG [transport] transport/client.go:205 handle error: read tcp 10.64.233.87:57431->10.64.2.246:5044: i/o timeout
2020-07-21T20:24:03.482+0700 ERROR [logstash] logstash/async.go:279 Failed to publish events caused by: read tcp 10.64.233.87:57431->10.64.2.246:5044: i/o timeout
2020-07-21T20:24:03.490+0700 DEBUG [transport] transport/client.go:118 closing
2020-07-21T20:24:03.491+0700 ERROR [logstash] logstash/async.go:279 Failed to publish events caused by: read tcp 10.64.233.87:57431->10.64.2.246:5044: i/o timeout
2020-07-21T20:24:03.491+0700 DEBUG [logstash] logstash/async.go:171 1249 events out of 1249 events sent to logstash host 10.64.2.246:5044. Continue sending
2020-07-21T20:24:03.491+0700 INFO [publisher] pipeline/retry.go:173 retryer: send wait signal to consumer
2020-07-21T20:24:03.491+0700 ERROR [logstash] logstash/async.go:279 Failed to publish events caused by: read tcp 10.64.233.87:57431->10.64.2.246:5044: i/o timeout
2020-07-21T20:24:03.492+0700 INFO [publisher] pipeline/retry.go:175 done
2020-07-21T20:24:03.560+0700 DEBUG [logstash] logstash/async.go:171 1512 events out of 1512 events sent to logstash host 10.64.2.246:5044. Continue sending
2020-07-21T20:24:03.560+0700 DEBUG [logstash] logstash/async.go:127 close connection
2020-07-21T20:24:03.562+0700 ERROR [logstash] logstash/async.go:279 Failed to publish events caused by: client is not connected
2020-07-21T20:24:03.562+0700 DEBUG [logstash] logstash/async.go:127 close connection
2020-07-21T20:24:04.942+0700 ERROR [publisher_pipeline_output] pipeline/output.go:127 Failed to publish events: client is not connected
2020-07-21T20:24:04.942+0700 INFO [publisher_pipeline_output] pipeline/output.go:101 Connecting to backoff(async(tcp://10.64.2.246:5044))
2020-07-21T20:24:04.943+0700 DEBUG [logstash] logstash/async.go:119 connect
2020-07-21T20:24:04.954+0700 INFO [publisher_pipeline_output] pipeline/output.go:111 Connection to backoff(async(tcp://10.64.2.246:5044)) established
2020-07-21T20:24:04.961+0700 DEBUG [logstash] logstash/async.go:171 55 events out of 55 events sent to logstash host 10.64.2.246:5044. Continue sending
2020-07-21T20:24:04.961+0700 INFO [publisher] pipeline/retry.go:196 retryer: send unwait-signal to consumer
2020-07-21T20:24:04.961+0700 INFO [publisher] pipeline/retry.go:198 done
2020-07-21T20:24:04.981+0700 DEBUG [logstash] logstash/async.go:171 1249 events out of 1249 events sent to logstash host 10.64.2.246:5044. Continue sending
2020-07-21T20:24:10.061+0700 DEBUG [input] input/input.go:152 Run input
2020-07-21T20:24:10.061+0700 DEBUG [input] log/input.go:191 Start next scan
2020-07-21T20:24:10.063+0700 DEBUG [input] log/input.go:421 Check file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:24:10.063+0700 DEBUG [input] log/input.go:511 Update existing file for harvesting: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv, offset: 179504
2020-07-21T20:24:10.063+0700 DEBUG [input] log/input.go:563 Harvester for file is still running: C:\Program Files\Filebeat\call-log-june28\Call Record 12 - 18 July 2020-CBN PJ.csv
2020-07-21T20:24:10.063+0700 DEBUG [input] log/input.go:212 input states cleaned up. Before: 1, After: 1, Pending: 0
2020-07-21T20:24:12.843+0700 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":24328,"time":{"ms":32}},"total":{"ticks":46171,"time":{"ms":94},"value":46171},"user":{"ticks":21843,"time":{"ms":62}}},"handles":{"open":255},"info":{"ephemeral_id":"5532913f-11e4-4cb6-a995-57ac5380eb3c","uptime":{"ms":14881709}},"memstats":{"gc_next":23980544,"memory_alloc":16322784,"memory_total":1003854240,"rss":3039232},"runtime":{"goroutines":29}},"filebeat":{"harvester":{"open_files":1,"running":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":232,"batches":4,"failed":4096,"total":4328},"read":{"errors":1},"write":{"bytes":98868}},"pipeline":{"clients":1,"events":{"active":4117,"retry":5608}}},"registrar":{"states":{"current":2}}}}}
The master node's Elasticsearch sometimes dies, which makes me wonder whether this is really a hardware problem, a memory problem, or a software problem. The data node's Elasticsearch never dies whenever I check it after Logstash stops printing to stdout. Even though it works again after a lot of waiting, I can't keep doing this every time. I need a solution. Any response or suggestion will be appreciated a lot. Thank you.
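In case it helps, this is roughly how I check whether Elasticsearch is still up on a node when Logstash goes quiet (the exact commands are illustrative, I just hit the HTTP port with the standard cluster APIs):

curl -s "http://127.0.0.1:9200/_cluster/health?pretty"
curl -s "http://127.0.0.1:9200/_cat/nodes?v"

When the Elasticsearch process on the master node has died, these simply fail to connect.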