RetryOnPrimaryException in ES node

dipathak · October 12, 2016, 1:29pm

Hi, we are running ES 2.0 on a 16-node cluster with 2 indexes. One index has 5 shards with a replication factor (RF) of 2, and it throws RetryOnPrimaryException when we try to index a document. The error occurs on only one of the nodes, and writes time out (timeout value = 1 min). The ES process is not crashing on any of the nodes, though.
We create the index only once and never delete it. The other index has no issues.
Cluster health is green and all the shards are in the STARTED state. What does this error mean? Should I increase the timeout value?
Please let me know what information I can look into or provide here to debug this.

Thanks.

[2016-10-12 04:22:07,867][INFO ][rest.suppressed ] objindex/obj/BFJlY1U Params: {index=objindex, id=BFJlY1U, type=obj, timeout=60s}
[objindex][[objindex][4]] RetryOnPrimaryException[Dynamics mappings are not available on the node that holds the primary yet]
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1069)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-10-12 04:22:08,331][INFO ][rest.suppressed ] objindex/obj/BFNOd0M Params: {index=objindex, id=BFNOd0M, type=obj, timeout=60s}
[objindex][[objindex][4]] RetryOnPrimaryException[Dynamics mappings are not available on the node that holds the primary yet]
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1069)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

ywelsch · October 12, 2016, 2:15pm

It sounds like there is a problem updating the cluster state on the node holding that shard. Can you share the output of the following command, run on both the master node and the node throwing the exception:

curl -XGET 'http://localhost:9200/_cluster/state?local&pretty'

Probably the easiest way to fix this is to just restart the node.
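
If the two outputs are saved to files, a small script can point out where they diverge. The following is only a minimal sketch, not an official tool: it assumes the cluster state JSON has a top-level "version" field and keeps index mappings under metadata -> indices -> <index> -> mappings, and the function name compare_states and the file names in the usage note are hypothetical.

```python
import json


def load_state(path):
    """Load a cluster state JSON dump saved from the curl command."""
    with open(path) as f:
        return json.load(f)


def compare_states(master_state, node_state, index_name):
    """Return a list of readable differences between two cluster state dumps.

    Assumes the layout: state["version"] and
    state["metadata"]["indices"][index_name]["mappings"].
    """
    diffs = []

    # A node that is behind on cluster state updates reports a lower version.
    master_version = master_state.get("version")
    node_version = node_state.get("version")
    if master_version != node_version:
        diffs.append(
            "cluster state version: master=%s node=%s"
            % (master_version, node_version)
        )

    def mappings(state):
        indices = state.get("metadata", {}).get("indices", {})
        return indices.get(index_name, {}).get("mappings", {})

    master_mappings = mappings(master_state)
    node_mappings = mappings(node_state)

    # Report every mapping type whose definition differs or is missing.
    for type_name in sorted(set(master_mappings) | set(node_mappings)):
        if master_mappings.get(type_name) != node_mappings.get(type_name):
            diffs.append("mapping for type '%s' differs" % type_name)

    return diffs
```

For example, after redirecting the two outputs to master_state.json and node_state.json (hypothetical names), calling compare_states(load_state("master_state.json"), load_state("node_state.json"), "objindex") returns an empty list when the version and the mappings match, and one entry per difference otherwise.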

dipathak · October 12, 2016, 5:54pm

Hi Yannick, thanks for the response. The output is around 20k characters, and the upload feature allows only image files. Can you please tell me how to provide the output of this command here? Thanks!

ywelsch · October 12, 2016, 8:33pm

Please use http://pastebin.com or a similar (free) service. If the cluster state contains private / confidential information, you can also make the paste exposure "unlisted" and send me the link in a private message here.
