After killing 1 of the 3 nodes in the cluster (without kill -9), a bulk create via the transport client hangs for minutes!
The logs show:
[2017-12-13T10:52:58,703][INFO ][o.e.c.r.a.AllocationService] [node-3] Cluster health status changed from [GREEN] to [YELLOW] (reason: [{node-2}{y5SsksY3RYqoroDPrSvfdg}{u1cSLNqgRgusPfdhhY90Fw}{172.16.69.2}{172.16.69.2:9300} failed to ping, tried [3] times, each with maximum [1s] timeout]).
[2017-12-13T10:52:58,704][INFO ][o.e.c.s.ClusterService ] [node-3] removed {{node-2}{y5SsksY3RYqoroDPrSvfdg}{u1cSLNqgRgusPfdhhY90Fw}{172.16.69.2}{172.16.69.2:9300},}, reason: zen-disco-node-failed({node-2}{y5SsksY3RYqoroDPrSvfdg}{u1cSLNqgRgusPfdhhY90Fw}{172.16.69.2}{172.16.69.2:9300}), reason(failed to ping, tried [3] times, each with maximum [1s] timeout)[{node-2}{y5SsksY3RYqoroDPrSvfdg}{u1cSLNqgRgusPfdhhY90Fw}{172.16.69.2}{172.16.69.2:9300} failed to ping, tried [3] times, each with maximum [1s] timeout]
[2017-12-13T10:52:58,827][INFO ][o.e.c.r.DelayedAllocationService] [node-3] scheduling reroute for delayed shards in [59.8s] (1 delayed shards)
[2017-12-13T10:52:58,836][WARN ][o.e.c.a.s.ShardStateAction] [node-3] [events_1513161010363][0] received shard failed for shard id [[events_1513161010363][0]], allocation id [KXtrTmKQSkmIuZ528faD7A], primary term [2], message [mark copy as stale]
(the same WARN line repeats six more times)
[2017-12-13T10:58:06,206][INFO ][o.e.c.s.ClusterService ] [node-3] added {{node-2}{y5SsksY3RYqoroDPrSvfdg}{wEPJXyNNRG2vAnj1Qb5oMA}{172.16.69.2}{172.16.69.2:9300},}, reason: zen-disco-node-join[{node-2}{y5SsksY3RYqoroDPrSvfdg}{wEPJXyNNRG2vAnj1Qb5oMA}{172.16.69.2}{172.16.69.2:9300}]
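A side note on the "scheduling reroute for delayed shards in [59.8s]" line: that countdown looks like the default index.unassigned.node_left.delayed_timeout of 1m. If it matters for the test, it can be changed per index; a minimal sketch with the same transport client (the "events_*" pattern is just an illustration matching the index name in the logs):

// Sketch: lower the delayed-allocation window for the events indices.
// "events_*" and the 30s value are illustrative, not from our setup.
client().admin().indices().prepareUpdateSettings("events_*")
        .setSettings(Settings.builder()
                .put("index.unassigned.node_left.delayed_timeout", "30s")
                .build())
        .get();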
Our configuration is:
discovery.zen.commit_timeout: 2s
discovery.zen.publish_timeout: 2s
discovery.zen.fd.ping_timeout: 1s
transport.tcp.connect_timeout: 5s
We are using Elasticsearch 5.4.1.
This is the code of the bulk request:
BulkRequestBuilder bulkRequestBuilder = client().prepareBulk();
for (Map.Entry<Long, String> eventJson : eventJsons.entrySet()) {
    // one create-only index request per event, keyed by the event id
    IndexRequestBuilder indexRequestBuilder = client().prepareIndex(
            EventsConstants.CURRENT_ALIAS, EventsConstants.BASE_TYPE, eventJson.getKey().toString());
    bulkRequestBuilder.add(indexRequestBuilder.setSource(eventJson.getValue(), XContentType.JSON).setCreate(true));
}
bulkRequestBuilder.setTimeout(TIMEOUT);
bulkRequestBuilder.execute().actionGet();
If there is no master, I would expect org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master] rather than a hang. What could be the problem?
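For completeness: as far as I can tell, setTimeout() only bounds shard-level waiting (e.g. for a primary to become active), while actionGet() with no argument blocks until a response arrives. A bounded variant of the call above would look like this (a sketch, assuming the same client() and bulkRequestBuilder; the 30s bound is arbitrary):

// Sketch: bound how long the calling thread blocks, independently of the
// request-level timeout set via setTimeout().
BulkResponse bulkResponse = bulkRequestBuilder
        .execute()
        .actionGet(TimeValue.timeValueSeconds(30)); // throws ElasticsearchTimeoutException on expiry
if (bulkResponse.hasFailures()) {
    // per-item failures (e.g. version conflicts from setCreate(true)) land here
    System.err.println(bulkResponse.buildFailureMessage());
}

Even with such a client-side bound, though, I would still expect the cluster block exception rather than having to rely on the caller timing out.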