Elasticsearch Net, "all shards failed" when one node is shut down, 7.11.1


I have a three-node cluster set up on version 7.11.1, with this configuration file:

bootstrap.memory_lock: false
cluster.initial_master_nodes:
  - STHLM-04
  - STHLM-05
  - STHLM-06
cluster.name: ELASTIC-TEST
discovery.seed_hosts:
  - STHLM-04
  - STHLM-05
  - STHLM-06
http.port: 9200
network.host: STHLM-04
node.max_local_storage_nodes: 1
node.name: STHLM-04
node.roles: [ master, data, ingest ]
path.data: D:\Elastic\ElasticSearch\Data
path.logs: D:\Elastic\ElasticSearch\Logs
transport.tcp.port: 9300
xpack.license.self_generated.type: basic
xpack.security.enabled: false

The other nodes' config is set up the same way, except for node.name and network.host.

Then I have a .NET project running Elasticsearch Net, and the connection is set up like this:

var nodeUrls = nodeUrl.Split(',');
var nodeUris = nodeUrls.Select(n => new Uri(n));

var nodes = nodeUris.Select(u => new Node(u));
var connectionPool = new StickyConnectionPool(nodes);

var settings = new ConnectionSettings(connectionPool)
    .BasicAuthentication(username, password);

var client = new CustomElasticClient(settings);

As long as all nodes are up and running it works perfectly, but if I shut down one of the Elasticsearch services I get the "all shards failed" error.

[2021-08-17T15:54:26,100][WARN ][r.suppressed             ] [STHLM-04] path: /webutv2_1/_search, params: {typed_keys=true, index=webutv2_1}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:601) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:332) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:636) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:415) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.lambda$performPhaseOnShard$0(AbstractSearchAsyncAction.java:240) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction$2.doRun(AbstractSearchAsyncAction.java:308) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.11.1.jar:7.11.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.elasticsearch.action.NoShardAvailableActionException
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:448) ~[elasticsearch-7.11.1.jar:7.11.1]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:397) [elasticsearch-7.11.1.jar:7.11.1]
	... 9 more

What am I missing? How do I get Elasticsearch to keep running without errors even if one node shuts down?
As it stands I cannot, for example, upgrade without downtime, which I guess is something you should be able to do.



Does your index have replicas?

Can you make the following request to check the index webutv2_1 ?

GET _cat/indices/webutv2_1?v&s=i 

If your index does not have replicas, you will get this error whenever a node holding a shard of that index goes down. To solve this you need replicas.
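
The same check can also be scripted. Below is a minimal Python sketch, assuming the cluster is reachable over plain HTTP without authentication (the base URL is an assumption taken from the posted config, and the helper names are made up for illustration). It asks _cat/indices for JSON output limited to the index, pri and rep columns, then lists the indices that have no replicas.

```python
import json
import urllib.request


def cat_indices_request(base_url, index):
    """Build a GET request for _cat/indices, asking for JSON with only three columns."""
    url = f"{base_url}/_cat/indices/{index}?format=json&h=index,pri,rep"
    return urllib.request.Request(url, method="GET")


def indices_without_replicas(rows):
    """Return the names of indices whose 'rep' column is 0.

    The _cat APIs return every value as a string, so convert before comparing.
    """
    return [row["index"] for row in rows if int(row["rep"]) == 0]


def fetch_rows(request):
    """Send the request and decode the JSON array of rows (needs a live cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Against a running cluster this would be used as indices_without_replicas(fetch_rows(cat_indices_request("http://STHLM-04:9200", "webutv2_1"))); any index it returns loses its only shard copy when the node holding it goes down.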

It says rep = 0 so I guess there are no replicas.
I'm trying to find out how to add replicas to an existing index but I can't find anything. Could you guide me the right way? Should it be set up in elasticsearch.yml?

One other thing, when I run GET _cat/indices I find webutv2_1 and webutv2_2?
When pushing data using Elasticsearch Net I have defined the index name to just webutv2.



It is controlled through index settings, which can be updated on an existing index. If you want to change the default for new indices, you need to use an index template.
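
As a rough illustration of both options, here is a minimal Python sketch that builds the two requests: one against the update index settings API (PUT /<index>/_settings) for an existing index, and one against the composable index template API (PUT /_index_template/<name>) for indices created later. The base URL, the template name and the index pattern are assumptions for illustration only, and the helper names are hypothetical.

```python
import json
import urllib.request


def _json_put(url, body):
    """Build a PUT request with a JSON body (nothing is sent yet)."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_replicas_request(base_url, index, replicas):
    """Change number_of_replicas on an existing index via its settings."""
    body = {"index": {"number_of_replicas": replicas}}
    return _json_put(f"{base_url}/{index}/_settings", body)


def replica_template_request(base_url, template_name, pattern, replicas):
    """Set the default number_of_replicas for new indices matching the pattern."""
    body = {
        "index_patterns": [pattern],
        "template": {"settings": {"number_of_replicas": replicas}},
    }
    return _json_put(f"{base_url}/_index_template/{template_name}", body)
```

To apply a change, pass the returned request to urllib.request.urlopen, for example urllib.request.urlopen(set_replicas_request("http://STHLM-04:9200", "webutv2_1", 1)). The setting is dynamic, so it does not go into elasticsearch.yml and no restart is needed.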

OK, and to get rid of the "shards" error, do I need to set it to at least one, or do I need to set it to two?

One should be sufficient. If you need additional resiliency you can however increase it.

Getting there!
After updating the replicas to 1 I can shut down STHLM-05 or STHLM-06 and everything works just fine.
However, if I shut down STHLM-04 and try to run:

GET /_cat/nodes

from Dev Tools I get:
{"statusCode":503,"error":"Service Unavailable","message":"License is not available."}

I guess there is some other configuration error somewhere. Otherwise, why must 04 always be running? They are all configured the same.



Forget this one... Kibana is installed on STHLM-04 and will of course not work properly if I shut down 04...
