Cross-cluster errors if remote cluster cannot be reached


(Alessandro Federici) #1

Hi,
I'm experimenting with the cross-cluster search feature and I have a question about error handling.

Imagine I have a multi-cluster setup like this:

  1. cluster_master: my app connects here, and this is where I set up the connections to the other clusters
  2. cluster_two: some data, say from customer A
  3. cluster_three: some data, say from customer B

I need to be able to query cluster_master regardless of the status of the connection to cluster_two and cluster_three. In other words, it's not acceptable in my situation for the whole search to break if cluster_three cannot be reached for 10 seconds, 10 minutes or 10 days. The search as a whole needs to keep working; I would just get less data, ideally with some indication that not all clusters were reachable.

What is the best way to accomplish this, besides adjusting the query dynamically after parsing the error text that I get back (e.g. "unable to communicate with remote cluster [cluster_XXX]")?

Thanks!


(Luca Cavanna) #2

Hi Alessandro,
the problem you are facing is that none of the nodes are available in one of the remote clusters.

When doing a cross-cluster search, one node per remote cluster is initially contacted to find out where the shards for the request are located. If this step cannot be completed, the whole search request fails. If at least one node in that cluster were available, the search would go ahead even though other nodes, potentially ones holding relevant shards, are down. That would only cause partial results, not a total failure.
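To tell a complete result from a partial one on the application side, the `_shards` section of the search response can be inspected. The following is only a minimal sketch: the function name `is_partial` is made up for illustration, and it assumes the response has already been parsed into a Python dict.

```python
def is_partial(search_response):
    """Return True if the search response reports failed shards or a timeout.

    Hypothetical helper: it only reads the `_shards` and `timed_out`
    fields of an already-parsed search response.
    """
    shards = search_response.get("_shards", {})
    failed = shards.get("failed", 0)
    timed_out = search_response.get("timed_out", False)
    return failed > 0 or bool(timed_out)
```

A caller could log a warning or flag the result set as incomplete whenever this returns True.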

I don't see a way to work around this other than checking upfront which remote clusters are online and leaving out of the query the ones that are completely down.
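One possible way to do that upfront check is to ask the local cluster for its remote cluster info (`GET _remote/info`) and build the index expression only from the aliases reported as connected. This is a rough sketch, not a tested recipe: the helper names, the `base_url` parameter and the index name `logs` are assumptions for illustration, and it relies on each entry in the response carrying a `connected` flag.

```python
import json
from urllib.request import urlopen


def fetch_remote_info(base_url):
    """Fetch the remote cluster info from the local cluster.

    `base_url` is an assumed address such as "http://localhost:9200".
    """
    with urlopen(f"{base_url}/_remote/info") as resp:
        return json.load(resp)


def connected_clusters(remote_info):
    """Return the sorted aliases of remote clusters reported as connected."""
    return sorted(
        alias for alias, info in remote_info.items() if info.get("connected")
    )


def build_index_expression(remote_info, index, include_local=True):
    """Build a comma-separated search target that skips disconnected clusters.

    Remote indices use the `alias:index` form; the local index is listed
    first when `include_local` is True.
    """
    targets = [index] if include_local else []
    targets += [f"{alias}:{index}" for alias in connected_clusters(remote_info)]
    return ",".join(targets)
```

With cluster_two connected and cluster_three down, `build_index_expression(info, "logs")` gives `logs,cluster_two:logs`, which can be used as the target of the search request. Note that a cluster can still drop out between the check and the search, so the application should be prepared to handle the failure and retry with a fresh check.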

Cheers
Luca

