Suddenly Elastic Failure (Graphs Included)

devday · July 10, 2018, 11:54am

Hi,

We are experiencing a strange problem.
Elasticsearch normally runs fine, but every so often the cluster suddenly dies.
This can happen once a month, once a week, or twice in a day.
As far as I can tell, the indexing and query rates are not unusual when it happens.

I've attached the graphs.

You can see that the current indexing and search rates are not abnormal.
However, if you look at the "rate of opened http connections", this rocketed at 10:20.
The "search thread pool queue by size by node" also rocketed to > 1000 at this time.
This caused all nodes to go offline and ES to become unresponsive.
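
For anyone who wants to capture these two metrics outside a hosted dashboard, here is a minimal Python sketch. It assumes the cluster's HTTP endpoint is reachable and that the node stats API (`GET /_nodes/stats/thread_pool,http`) is not blocked by the provider; the base URL, the `queue_limit` default of 1000 (taken from the graph reading above), and the function names are placeholders for illustration, and any authentication the provider requires is left out.

```python
import json
import urllib.request


def fetch_node_stats(base_url):
    """Fetch thread pool and HTTP stats for all nodes.

    base_url is a placeholder such as "http://localhost:9200";
    add authentication if the provider requires it.
    """
    url = base_url.rstrip("/") + "/_nodes/stats/thread_pool,http"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def summarize(stats, queue_limit=1000):
    """Reduce a node stats response to one row per node.

    queue_limit is an assumed alert threshold, not an Elasticsearch setting.
    """
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        search = node.get("thread_pool", {}).get("search", {})
        http = node.get("http", {})
        queue = search.get("queue", 0)
        rows.append({
            "node": node.get("name", node_id),
            "search_queue": queue,
            "search_rejected": search.get("rejected", 0),
            "http_current_open": http.get("current_open", 0),
            "http_total_opened": http.get("total_opened", 0),
            "over_limit": queue > queue_limit,
        })
    return rows


def opened_per_second(earlier, later, interval_seconds):
    """Rate of newly opened HTTP connections between two rows for the same node."""
    delta = later["http_total_opened"] - earlier["http_total_opened"]
    return delta / interval_seconds
```

Polling `summarize(fetch_node_stats(...))` on a fixed interval and logging the rows would give a record of the search queue and connection counts leading up to the next incident, even if the provider's own logs stay out of reach.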

Has anyone seen this before, and do you know what might cause it?

Any help will be much appreciated,
Dev

devday · July 10, 2018, 11:56am

Hi,

Just to follow up: the graph ends at 11:00 because the node was unreachable.
However, we only need to look at the period around 10:20, when the issue happened and ES became unresponsive.

Dev

Christian_Dahlqvist · July 10, 2018, 12:02pm

Is there anything in the Elasticsearch logs around that time? Which version are you using?

devday · July 10, 2018, 12:04pm

Hi Christian,

Thank you for getting back to me.
We're on version 5.3.2.
I don't have access to the logs as we are using a hosted service provider.

Any initial thoughts?

Dev

Christian_Dahlqvist · July 10, 2018, 12:25pm

Without logs I have no idea what is going on.

devday · July 10, 2018, 1:20pm

Hi Christian,

I've requested the logs.

Dev

devday · July 12, 2018, 1:16pm

Hi Christian,

I am still trying to get the logs, but no luck so far.

Dev

Christian_Dahlqvist · July 12, 2018, 1:21pm

Have you looked at Elastic Cloud, which comes with monitoring and access to logs via the UI?

devday · July 12, 2018, 1:56pm

Hi Christian,

Yes, we do use Elastic Cloud.
For this particular client, though, we are using Qbox.

Regards,
Dev
