Reducing latency when node down


We are developing an API server using Elasticsearch as a storage. The latency of that API is usually around 50ms.
The issue I am facing is that when I stop one Elasticsearch node, the latency goes up to seconds even for a short period time.

Do you have any idea why such high latency arises and are there anything I can do to reduce this latency?
I thought that since we have replica shards, clients or coordinating nodes can just access to another node in case of failure, so did not expect such latency.

Here is our cluster setup.
Hardware: (20CPU, 32GB memory, SSD, 10Gbps NW) x 3 nodes.
Elasticsearch: 6.2.3
index: 12GB / 3 * 2 shards (1 replica)

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.