What is the timeout for writing to a replica shard? If writing to a replica is taking a long time, can I specify a timeout value for the write on the replica shard?
Valid write consistency values: one, quorum, and all.
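(Note that the `consistency` parameter listed above was removed in Elasticsearch 5.0 and replaced by `wait_for_active_shards`; on 6.1.0 that is the relevant setting. It is a pre-flight availability check, not a timeout on the replica write itself. The index name and document below are illustrative only — a sketch of the request shape, not a recommendation:)

```
PUT /sample_index/_doc/1?wait_for_active_shards=2&timeout=30s
{
  "field": "value"
}
```

Here `timeout` controls how long the request waits for the required shard copies to become available before failing, which is the closest thing to the timeout being asked about.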
Because of the exception below, the index is turning yellow and shard reallocation is happening.
[2019-02-20T13:10:11,644][WARN ][o.e.c.a.s.ShardStateAction] [SEIGWPP05ES03] [sample_index][3] received shard failed for shard id [[sample_index][3]], allocation id [OKXJ2CKNR8CKilqJZpVe4Q], primary term [1], message [failed to perform indices:data/write/bulk[s] on replica [sample_index][3], node[IiIb5vQNSj6GAxyv2LbWBQ], [R], s[STARTED], a[id=OKXJ2CKNR8CKilqJZpVe4Q]], failure [NodeNotConnectedException[[samplenode][172.18.72.120:9300] Node not connected]]
No, each write goes to the primary and both replicas before acknowledgement, so that's 3 shard copies in total.
Elasticsearch guarantees to write every document to every in-sync shard copy before responding. This means that if it can't write to a shard copy it must mark that copy as out-of-sync, which will mean it becomes unassigned and therefore that the index health reports as yellow.
There is no way to disregard this check. It's very important.
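(To see exactly why a copy became unassigned after an event like this, the cluster allocation explain API is useful. A minimal request for the replica of shard 3 of the index in question might look like the following — the index name and shard number are taken from the log line above:)

```
GET _cluster/allocation/explain
{
  "index": "sample_index",
  "shard": 3,
  "primary": false
}
```

The response includes the last allocation failure and the reason each node can or cannot host the shard.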
There will be more information in the logs telling you why. If you need help interpreting the logs then please share more information here - stack traces and other messages from around the same time are all important.
The guide to which you link was written about the 2.x series and is rather out of date. The reference manual has fresher information.
Thanks @DavidTurner for the valuable information.
Below is the stack trace; I see NodeNotConnectedException errors.
I don't see any exceptions on the disconnected node. Should I increase the node timeout value?
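(For reference, on 6.x with Zen discovery the relevant fault-detection and transport settings live in `elasticsearch.yml`. The values shown below are the defaults, included only as a sketch of which knobs exist — raising them can mask, but rarely fixes, an underlying network problem:)

```yaml
# elasticsearch.yml — illustrative; these are the 6.x defaults
discovery.zen.fd.ping_interval: 1s     # how often nodes ping each other
discovery.zen.fd.ping_timeout: 30s     # how long to wait for a ping response
discovery.zen.fd.ping_retries: 3       # failed pings before a node is considered gone
transport.tcp.connect_timeout: 30s     # timeout for establishing transport connections
```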
> [2019-03-26T02:16:55,992][WARN ][o.e.c.a.s.ShardStateAction] [sample_node_1] [sample_index][3] received shard failed for shard id [[sample_index][3]], allocation id [-HYXaiLbTRSUyFyRhTO6Dw], primary term [3], message [failed to perform indices:data/write/bulk[s] on replica [sample_index][3], node[IiIb5vQNSj6GAxyv2LbWBQ], [R], s[STARTED], a[id=-HYXaiLbTRSUyFyRhTO6Dw]], failure [NodeNotConnectedException[[sample_node_2][172.18.72.120:9300] Node not connected]]
org.elasticsearch.transport.NodeNotConnectedException: [sample_node_2][172.18.72.120:9300] Node not connected
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:692) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.transport.TcpTransport.getConnection(TcpTransport.java:122) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:525) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:501) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.action.support.replication.TransportReplicationAction.sendReplicaRequest(TransportReplicationAction.java:1188) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicasProxy.performOn(TransportReplicationAction.java:1152) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplica(ReplicationOperation.java:171) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplicas(ReplicationOperation.java:155) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:122) ~[elasticsearch-6.1.0.jar:6.1.0]
By default Elasticsearch does log something when a node is disconnected, and continues to log failure messages every few minutes if it can't reconnect. You're not sharing very much in the way of logs so it's not very easy to help here. Can you share the last few minutes of logs leading up to this NodeNotConnectedException? Perhaps use https://gist.github.com.
I don't understand. What timeout are you asking about?