Elastic Cluster Went Down

Hi Team,

We have a cluster on cloud.elastic.co which went down due to JVM heap memory pressure (my assumption; a sketch for checking this is included after the configuration below).

Following is my cluster configuration:

  • 2 data nodes (8 GB each)
  • 1 master node (8 GB)
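For what it's worth, heap pressure can be checked via the cat nodes API. Below is a minimal sketch assuming the v7 @elastic/elasticsearch JavaScript client, with the cloud ID and credentials as placeholders rather than our real deployment details:

```ts
import { Client } from '@elastic/elasticsearch'

// Placeholder cloud ID and credentials, not our real deployment details.
const client = new Client({
  cloud: { id: '<deployment-name>:<base64-cloud-id>' },
  auth: { username: 'elastic', password: '<password>' },
})

async function checkHeap(): Promise<void> {
  // heap.percent consistently near 100 on a node indicates heap pressure.
  const { body } = await client.cat.nodes({
    h: 'name,heap.percent,heap.max,ram.percent',
    v: true,
  })
  console.log(body) // plain-text table, one row per node
}

checkHeap().catch(console.error)
```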

The cluster was not able to process the write requests sent from our application with a timeout of 30 seconds; all of the requests timed out. According to the cloud.elastic.co support team, they found our cluster in an unhealthy state and restarted the nodes, but nothing changed: the write requests were still timing out.

I am not sure what happened here. Has anyone ever faced such issues?

Thanks,

How many indices and shards do you have in the cluster? How many of these are you actively indexing into? What is the size of your bulk requests?

Hi @Christian_Dahlqvist,

We have only 1 index with 5 shards. All of the shards are actively written to, and the application is write-heavy. We do not use bulk requests; every document is written with an individual request.

OK, that does eliminate a common source of problems. However, indexing and updating a lot of single documents can be quite inefficient and cause a lot of disk I/O, as the translog will be synced for each request.
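If the application can batch writes, a single bulk request amortizes that per-request translog sync across the whole batch. Here is a minimal sketch assuming the v7 @elastic/elasticsearch JavaScript client; the endpoint, index name, and document shape are placeholders for illustration:

```ts
import { Client } from '@elastic/elasticsearch'

// Placeholder endpoint, not a real deployment.
const client = new Client({ node: 'https://localhost:9200' })

// Hypothetical documents standing in for the payloads the application
// currently sends as individual index requests.
const docs = [
  { user: 'alice', message: 'first event' },
  { user: 'bob', message: 'second event' },
]

async function bulkIndex(): Promise<void> {
  // Alternate action and document entries, as the bulk API expects.
  // One bulk request means one translog sync for the whole batch,
  // instead of one per document.
  const body = docs.flatMap((doc) => [{ index: { _index: 'my-index' } }, doc])
  const { body: response } = await client.bulk({ body })

  if (response.errors) {
    // Individual items can still fail inside an otherwise successful bulk call.
    const failed = response.items.filter((item: any) => item.index && item.index.error)
    console.error('bulk items failed:', failed.length)
  }
}

bulkIndex().catch(console.error)
```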

So would the same be applicable to bulk requests as well?

Also, I am not sure why the write requests were timing out while the read requests were still being served.

We tried writing a single document manually from Kibana, and that did work, but writes from our application did not. This is the confusing part: writes work from Kibana but not from the application. (We are using the Node.js Elasticsearch client.) A simplified sketch of our write path is below.
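Here is roughly how a single write from the application looks. This is a simplified sketch assuming the v7 @elastic/elasticsearch client, with the endpoint, credentials, index name, and document as placeholders rather than our actual code:

```ts
import { Client } from '@elastic/elasticsearch'

const client = new Client({
  node: 'https://<deployment-endpoint>:9243', // placeholder endpoint
  auth: { username: 'elastic', password: '<password>' }, // placeholder credentials
  requestTimeout: 30000, // the 30-second timeout mentioned above
})

async function writeOne(): Promise<void> {
  try {
    const { body } = await client.index({
      index: 'my-index', // placeholder index name
      body: { message: 'single test document', ts: new Date().toISOString() },
    })
    console.log('indexed:', body.result)
  } catch (err: any) {
    // Log the full error: a TimeoutError here, while reads keep working,
    // matches what we are seeing.
    console.error(err.name, err.meta && err.meta.statusCode, err.meta && err.meta.body)
  }
}

writeOne().catch(console.error)
```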

Thanks,
