I asked a question about this problem on the Stack Exchange network, but without much success.
Elasticsearch becomes unreachable every two hours (for example at 8:52 and 10:52) for several seconds to a few minutes. Then it comes back and behaves normally.
I have access to the Kibana dashboard. I am looking for hints on what could cause this, or where to investigate.
Have you taken a look at the Elasticsearch log files when this happens? Are they empty? Have you checked the Elasticsearch node stats to see whether your GC times increase during that time? Does this only affect Elasticsearch on those hosts? Are they running on the same VM?
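For whoever does have API access, here is a rough sketch of the GC check: take one node stats snapshot before an outage window and one after, then compare the accumulated GC time per collector. The `_nodes/stats/jvm` endpoint and the `collection_time_in_millis` field are part of the node stats API. The base URL and the helper names `fetch_jvm_stats` and `gc_time_delta` are only placeholders for illustration.

```python
import json
from urllib.request import urlopen


def fetch_jvm_stats(base_url, timeout=10):
    """Fetch JVM stats for all nodes.

    base_url is a placeholder, e.g. "http://localhost:9200" (assumption).
    """
    url = base_url.rstrip("/") + "/_nodes/stats/jvm"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def gc_time_delta(before, after):
    """Return {node_name: {collector: extra_gc_millis}} between two snapshots."""
    deltas = {}
    for node_id, node in after.get("nodes", {}).items():
        old = before.get("nodes", {}).get(node_id)
        if old is None:
            continue  # node was not present in the first snapshot
        new_gc = node["jvm"]["gc"]["collectors"]
        old_gc = old["jvm"]["gc"]["collectors"]
        deltas[node.get("name", node_id)] = {
            name: stats["collection_time_in_millis"]
            - old_gc[name]["collection_time_in_millis"]
            for name, stats in new_gc.items()
            if name in old_gc
        }
    return deltas
```

A large jump in the `old` collector on one node across the outage window would point towards long GC pauses; no jump at all would make GC an unlikely cause.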
What does unreachable mean? You have four nodes. Are all of them unreachable? Does unreachable mean you can open an HTTP connection and send your request but don't get a reply?
Side note: Elasticsearch 1.7 has been End-Of-Life since the beginning of this year and is missing a ton of bugfixes and features. I'd try to upgrade.
Thanks for your answer. Unfortunately I am a developer and I don't have access to the servers (neither the API nor Elasticsearch itself). The ops team does, but they don't have time for us right now. I only have access to the Kibana dashboard.
What I mean by "unreachable" is that when a query is made to Elasticsearch, we get two kinds of errors:
- connection reset by peer
- timeout after x time
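Since the two errors hint at different causes, it may help to count them separately on the client side. Below is a minimal sketch, assuming a Python client that uses `urllib` or plain sockets; `classify_error` and its labels are hypothetical names, not part of any Elasticsearch client.

```python
import socket
import urllib.error


def classify_error(exc):
    """Map a client-side exception to one of the two observed error kinds."""
    # urllib wraps low-level connection errors in URLError; unwrap them.
    reason = getattr(exc, "reason", None)
    if isinstance(exc, urllib.error.URLError) and isinstance(reason, BaseException):
        exc = reason
    if isinstance(exc, ConnectionResetError):
        return "connection_reset"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    return "other"
```

Logging the label together with a timestamp for every failed query gives something concrete to hand to the ops team.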
Migration is planned but it will take time as there is a big gap.
Without further details this is IMO impossible to answer.
It sounds like a GC pause, but those usually don't recur down to the minute. It might be that you have a cron job that sends a crazy query every two hours and gets your cluster stuck, but this is all just guesswork instead of working with logs, detailed information and monitoring data.
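One way to tell a scheduled job from GC pauses using only client-side data is to check how regular the outages are. A rough sketch, assuming you have the timestamps of failed queries from your own application logs: group failures that are close together into one incident, then look at the gaps between incident starts. The function names and the 10-minute grouping window are arbitrary choices for illustration.

```python
from datetime import timedelta


def incident_starts(failure_times, max_gap=timedelta(minutes=10)):
    """Collapse bursts of failures into incidents and return each incident's first timestamp."""
    starts = []
    previous = None
    for t in sorted(failure_times):
        # A failure starts a new incident if it is the first one
        # or comes more than max_gap after the previous failure.
        if previous is None or t - previous > max_gap:
            starts.append(t)
        previous = t
    return starts


def gaps_between(starts):
    """Return the time between consecutive incident starts."""
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

Gaps that stay within a few seconds of two hours would fit the cron job theory better than GC, which tends to drift with load.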