I have a simple setup, with test data of 10 documents.
I have 3 nodes: 2 data nodes and 1 master-only node.
I have 5 shards with 1 replica.
I run a search query every second via a small simulator.
I then disable the network card on the node that contains only the replicas.
All of my search queries lag during the first 20 seconds after the card is disabled.
So the first call after the NIC goes down gets a reply after 19s.
The second call gets a reply after 18s.
The third call gets a reply after 17s.
I am using Elasticsearch 6.7.1. Can someone elaborate on the root cause of this?
How come losing 1 node makes my cluster hang for 20 seconds?
How have you configured /proc/sys/net/ipv4/tcp_retries2? It defaults to 15 which is far too many IMO, and there are others who recommend reducing it to 3 for high-availability situations.
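To give a feel for why the value matters, here is a rough Python sketch of the hypothetical timeout model described in the Linux kernel documentation, where the retransmission timeout doubles on each attempt up to a cap. The 200 ms starting value and 120 s cap are assumptions about typical kernel constants, and the function names are made up for illustration. Under this model the default of 15 works out to about 924.6 seconds and a value of 3 to about 3 seconds; the real timeout depends on the measured round-trip time, so treat these as ballpark figures only.

```python
# Hypothetical TCP retransmission timeout, following the model in the
# Linux kernel documentation. Constants below are assumed typical values.
TCP_RTO_MIN = 0.2    # seconds, assumed initial retransmission timeout
TCP_RTO_MAX = 120.0  # seconds, assumed cap on the backed-off timeout


def hypothetical_timeout(retries: int) -> float:
    """Sum the waits for the original send plus `retries` retransmissions."""
    total = 0.0
    rto = TCP_RTO_MIN
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, TCP_RTO_MAX)  # exponential backoff with a cap
    return total


def read_tcp_retries2(path: str = "/proc/sys/net/ipv4/tcp_retries2"):
    """Return the configured value, or None if the file is not available."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("configured tcp_retries2:", read_tcp_retries2())
    for n in (3, 5, 15):
        print(f"tcp_retries2={n}: about {hypothetical_timeout(n):.1f}s")
```

Running it on each node shows the currently configured value next to the modelled timeouts, which makes it easier to compare against the delay you actually observe.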
There's also an issue in older Elasticsearch versions (fixed in #39629, released in 7.2.0) that could slow down cluster state updates in your situation. I don't know that this will affect this experiment, unless you're disabling the NIC on the master, but I recommend upgrading to a later version.
Normally a search will be distributed across the whole cluster, so I would expect it to try and search some of the shards on node B. If your OS is configured to retry transmission an unreasonable number of times before giving up then those remote searches could take a long time to fail.
I understand ES round-robins between the nodes, but I make a call every second, and all of the calls hang during this time.
Even if ES distributes my search, the local shard should reply, and I expect to get that reply back.
I ran this test with only one document in my index and the latest code, and the issue still occurs.
Please note that the test disables the NIC; if I kill the service instead, everything works perfectly without this hang.
I don't understand the logic in this design:
ES sends my search query to all nodes. Let's say I have 5 nodes, where one of them has crashed.
Now I am getting replies from 4 nodes, but instead of returning the results, the server will wait for the reply from node #5, which is down?
Right. The much more common case is that you don't have a failing node and there you want each search to use all the CPU/IO/etc. resources in the cluster, rather than restricting itself to a single node.
Elasticsearch will notice that the remote node is down as soon as the OS tells it the connection has dropped. The issue you're facing is that the OS is taking far longer than you would like to notice that the connection has dropped.
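As a rough illustration of the point above, here is a toy Python model (not Elasticsearch code) of a search that only answers once every targeted shard copy has either replied or been reported as failed. The 20-second detection time is taken from the numbers in the original post, the 10 ms healthy-shard latency is an assumption, and `search_latency` is a made-up name. It reproduces the 19s, 18s, 17s countdown: every search sent before the OS reports the dropped connection finishes at the same moment.

```python
# Toy model: a search answers only once every targeted shard copy has
# either replied or been reported as failed by the OS.
LOCAL_LATENCY = 0.01  # seconds, assumed time for a healthy shard to reply
DETECT_AT = 20.0      # seconds after the NIC went down, taken from the post


def search_latency(sent_at: float, hits_dead_node: bool) -> float:
    """Latency of one search sent `sent_at` seconds after the NIC went down."""
    healthy_done = sent_at + LOCAL_LATENCY
    if not hits_dead_node or sent_at >= DETECT_AT:
        # No request is stuck waiting on the unreachable node.
        return healthy_done - sent_at
    # The request to the dead node fails only when the connection is dropped.
    dead_done = DETECT_AT
    return max(healthy_done, dead_done) - sent_at


if __name__ == "__main__":
    for sent_at in (1.0, 2.0, 3.0, 21.0):
        latency = search_latency(sent_at, hits_dead_node=True)
        print(f"sent at {sent_at:>4}s -> reply after {latency:.2f}s")
```

Lowering the time the OS takes to report the failure corresponds to lowering `DETECT_AT` in this model, which shortens the stall for every in-flight search at once.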