Assume a shard has 1 primary and 1 replica. A bulk request goes to the primary, finishes the shard bulk write there, and is then forwarded to the replica node. If the replica node is very slow, for example stuck for several minutes due to CPU or memory pressure while its network pings still succeed (so it is not removed from the cluster), then the bulk operation cannot complete until the slow replica is done.
In a cluster of 100+ nodes where every node holds a shard of the same index, a single slow node can therefore slow down bulk operations for the whole cluster.
Could we add a timeout mechanism for the replica bulk request? For example, if a replica does not respond within something like 30s, mark that shard copy as failed instead of blocking the primary shard's bulk operation forever.
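To make the idea concrete, here is a minimal, self-contained Java sketch of the proposed behaviour. It is not Elasticsearch source code; ReplicaWrite, acknowledgment(), and markShardCopyFailed() are hypothetical names used only for illustration. The point is simply to bound the wait for the replica acknowledgment and fail the copy on timeout rather than blocking indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Conceptual sketch only, not Elasticsearch internals: bound the wait for the
// replica acknowledgment and fail the shard copy on timeout instead of
// blocking the primary's bulk response indefinitely.
public class ReplicaTimeoutSketch {

    /** Hypothetical handle for an in-flight replica bulk write. */
    interface ReplicaWrite {
        CompletableFuture<Void> acknowledgment();
        void markShardCopyFailed(String reason);
    }

    static void awaitReplica(ReplicaWrite write, long timeoutSeconds) {
        try {
            // Wait at most timeoutSeconds for the replica to acknowledge.
            write.acknowledgment().get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // Proposed behaviour: fail the slow replica copy so the primary
            // can return the bulk response instead of waiting for minutes.
            write.markShardCopyFailed("replica did not acknowledge within " + timeoutSeconds + "s");
        } catch (Exception e) {
            write.markShardCopyFailed("replica write failed: " + e);
        }
    }
}
```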
I have done some testing: I modified the code so that the replica write operation sleeps for 10 minutes (a rough sketch of that modification is shown after the response below), and the curl bulk request then took 10+ minutes:
/_bulk?pretty" -H 'Content-Type: application/json' -d'
> { "index" : { "_index" : "replica_test", "_id" : "1" } }
> { "field1" : "value1" }
> { "index" : { "_index" : "replica_test", "_id" : "2" } }
> { "field1" : "value2" }
> { "index" : { "_index" : "replica_test", "_id" : "3" } }
> { "field1" : "value3" }
> '
{
  "took" : 600051,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "replica_test",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 2,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 2,
        "status" : 201
      }
    },
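For reference, the modification I used to simulate the slow replica was essentially a long sleep injected into the replica-side write path. The sketch below only illustrates the idea; the class and method names are simplified placeholders, not the actual Elasticsearch methods I changed.

```java
import java.util.concurrent.TimeUnit;

// Illustration of the test modification only; class and method names are
// simplified placeholders, not real Elasticsearch code. A 10-minute sleep in
// the replica-side write path holds up the primary's bulk response, because
// the primary waits for the replica acknowledgment before responding.
class SlowReplicaSimulation {

    void performBulkOnReplica(Object bulkShardRequest) throws InterruptedException {
        // Simulate a replica stuck for 10 minutes (e.g. CPU or memory pressure).
        Thread.sleep(TimeUnit.MINUTES.toMillis(10));
        // ... the real replica-side write would run here ...
    }
}
```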
Related issue: Support timeout mechanism for replica bulk request (https://github.com/elastic/elasticsearch/issues/90981)