I have 30+ clients writing data into a 60-data-node ES cluster, all running on physical machines. Around 15 TB of data is inserted into ES daily, with a retention of 2 days. Heavy indexing happens on 3 main indices, which are rolled over 4 times a day. Each rollover index has 100 shards and 1 replica. Translog is set to async…

Hi Christian. Yes, I saw that post. Another member of our team posted a different problem. This problem is serious because this exception slows bulk indexing down to a snail's pace. However, I think this problem is the root cause. As far as your suggestion from that post is concerned: Th…

I was suggesting indexing into a smaller number of shards for each index (typically one primary shard per data node) and increasing the average shard size. Also make sure you optimize the mappings, as this can save a lot of work and increase speed. This will mean that your indices instead roll over mor…
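
To make the shard-count suggestion concrete, here is a rough back-of-the-envelope sketch in Python using the figures quoted in this thread (60 data nodes, 3 main indices, 4 rollovers a day, 2 days of retention, 1 replica). It assumes the 15 TB/day is primary data spread evenly over the 3 main indices and uses decimal units; neither assumption is stated in the thread, and the function name `shard_layout` is purely illustrative.

```python
def shard_layout(daily_tb, indices, rollovers_per_day, primaries,
                 replicas, retention_days, data_nodes):
    """Estimate shard counts and average primary shard size.

    Assumption (not from the thread): the daily volume is primary data,
    split evenly across all indices and all rollovers, with 1 TB = 1000 GB.
    """
    # Rollover indices that exist at the same time within the retention window
    live_indices = indices * rollovers_per_day * retention_days
    # Every primary shard has `replicas` additional copies
    total_shards = live_indices * primaries * (1 + replicas)
    # Primary shards created per day share the daily volume
    primaries_per_day = indices * rollovers_per_day * primaries
    avg_primary_gb = daily_tb * 1000 / primaries_per_day
    return {
        "total_shards": total_shards,
        "shards_per_node": total_shards / data_nodes,
        "avg_primary_gb": avg_primary_gb,
    }


# Setup described in the first post: 100 primaries per rollover index
current = shard_layout(15, 3, 4, 100, 1, 2, 60)
# Suggested setup: one primary shard per data node
suggested = shard_layout(15, 3, 4, 60, 1, 2, 60)

print("100 primaries:", current)
print(" 60 primaries:", suggested)
```

Under these assumptions, 100 primaries per index works out to about 4800 shard copies (80 per node) at roughly 12.5 GB per primary, while 60 primaries gives 2880 shard copies (48 per node) at roughly 20.8 GB per primary. Real on-disk sizes depend on mappings and compression, so treat these as orders of magnitude only.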

Hi Christian. We will go with 60 shards per index from tomorrow, but decreasing the shard count will lead to lower indexing throughput. Mappings are optimized as per the link. We don't have text fields; the majority of our fields are keywords. Since we need aggregations, doc values are enabled. When you s…

Hello @jiouser After reading both posts, yours and the one from the other member of your team, I think your issue is a GC issue. You have a lot of data, and if most of it is mapped as keyword, it would increase the terms space in memory. Could you check the following API: GET _nodes…

Hello @Juanma @Christian_Dahlqvist I checked heap on all data nodes with _cat/nodes. With a 32 GB heap, most of them are between 60-80%, with a few between 45-50%. RAM usage is between 95-100%. I understand Lucene uses all available memory as FS cache. CPU is almost completely idle, peaking at 10%.

Lucene stores some data off heap, but the file system cache is primarily used when querying and does not necessarily benefit indexing much unless you are updating documents. If you have nodes at or around 75% heap usage, you are likely to have problems with GC, so I would recommend adding additional …
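
As a small illustration of the 75% heap threshold mentioned above, the following Python sketch flags nodes at or above that level from the JSON output of `GET _cat/nodes?format=json&h=name,heap.percent,ram.percent,cpu`. The node names and values in `sample` are made up for illustration and are not taken from the cluster in this thread; `nodes_over_heap` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json

# Heap usage level discussed in the thread as a likely source of GC trouble
HEAP_WARN_PERCENT = 75


def nodes_over_heap(cat_nodes_json, threshold=HEAP_WARN_PERCENT):
    """Return the sorted names of nodes whose heap usage is >= threshold.

    Expects the JSON text returned by _cat/nodes with format=json,
    where numeric columns arrive as strings.
    """
    rows = json.loads(cat_nodes_json)
    return sorted(
        row["name"] for row in rows
        if int(row["heap.percent"]) >= threshold
    )


# Hypothetical sample response (illustrative values only)
sample = json.dumps([
    {"name": "data-01", "heap.percent": "78", "ram.percent": "99", "cpu": "6"},
    {"name": "data-02", "heap.percent": "48", "ram.percent": "97", "cpu": "4"},
    {"name": "data-03", "heap.percent": "75", "ram.percent": "100", "cpu": "10"},
])

print(nodes_over_heap(sample))
```

A single reading only shows where each node sits in its sawtooth cycle at that moment, so it is more informative to sample this repeatedly or to look at the heap graphs in monitoring.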

Hi @Christian_Dahlqvist We are in the process of adding additional data nodes. We did see a sawtooth pattern on the data nodes, but we still constantly see GC lines being printed in the ES logs. I also observed that this exception only starts appearing when I enable sniffing in the client. If my client conne…

Failed to get local cluster state Critical issue

Christian_Dahlqvist (Christian Dahlqvist) June 30, 2019, 6:48pm 2

This seems to be the same issue as this thread. Did you try any of the suggestions made, e.g. installing monitoring?
