Cross-cluster search hangs frequently

I've been trying to get cross-cluster search to work between two geographically remote clusters. It does work, but it frequently just stops: queries against _remote/info and against remote indices hang indefinitely. After some time (I don't know how long, since I usually go do something else), it starts working again.

Is there some sort of logging that I can look at to see what is happening?
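Until better logging is available, one way to find out how long each hang lasts is to poll _remote/info on a fixed interval with a client-side timeout and record when it stops and resumes responding. The sketch below is only an illustration under stated assumptions, not an official tool: the function names are made up for this example, and the 30-second interval and 10-second timeout are arbitrary defaults.

```python
import http.client
import time
import urllib.request


def probe(url, timeout_s=10.0):
    """Issue one GET and return (ok, elapsed_seconds).

    A timeout or any connection error counts as a failed probe.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()
            ok = 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        ok = False
    return ok, time.monotonic() - start


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    'end' is the timestamp of the first successful probe after a failure,
    or None if the endpoint was still failing at the last sample.
    """
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def watch(url, interval_s=30.0, timeout_s=10.0, max_probes=None):
    """Probe url repeatedly, print one line per probe, return outage windows.

    With max_probes=None this loops until interrupted.
    """
    samples = []
    count = 0
    while max_probes is None or count < max_probes:
        ok, elapsed = probe(url, timeout_s)
        now = time.time()
        samples.append((now, ok))
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
        print(f"{stamp} ok={ok} elapsed={elapsed:.2f}s", flush=True)
        count += 1
        time.sleep(interval_s)
    return outage_windows(samples)
```

For example, calling watch("http://localhost:9200/_remote/info") on the cluster A node you normally query would log each probe; localhost and port 9200 are assumptions here and should be replaced with the node's actual HTTP address. The timestamps of the failed probes can then be lined up against the Elasticsearch logs on both clusters.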

I'm running ES 5.6.2 in both clusters. Cluster A can reach Cluster B on both the HTTP port and the ES transport port. Cluster B can NOT reach Cluster A. However, only Cluster A is configured for cross-cluster search, so I believe that should not be an issue.
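Since the hangs are intermittent, it may be worth re-checking that reachability at the moment a hang occurs rather than only once. A rough sketch of a TCP-level check follows; the function names are hypothetical, and the host and port values in the usage note are placeholders, not values from this setup.

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def check_ports(host, ports, timeout_s=5.0):
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: can_connect(host, port, timeout_s) for port in ports}
```

Run from a node in cluster A, something like check_ports("cluster-b-node", [9200, 9300]) would report both ports, assuming the default HTTP (9200) and transport (9300) ports are in use. This only shows that a TCP connection can be opened at that moment; it does not exercise the Elasticsearch transport protocol itself.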

When it hangs, would you please take a stack dump on both clusters (using jstack) and share the output?

Which servers in each cluster should I run jstack on? Cluster A has almost 30 servers and cluster B has 3.

And do you want me to just paste them here?

Sorry for not being clear. The coordinating node in cluster A, and all nodes in cluster B (unless it's an outrageous number of nodes, in which case let us start with the gateway nodes in cluster B).

I have the 4 files but I can't seem to upload them here. The upload appears to be limited to image files. How do you want me to share them?

Also, just to make sure I did it correctly: I waited for _remote/info to hang. I then ran jstack -F pid against the node in cluster A that I had queried for _remote/info (I think that is what you mean by the coordinating node, since I don't have any special node settings). I then ran jstack -F pid on the three nodes in cluster B.
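The capture steps described above could be scripted so that each dump lands in a clearly named file. This is a minimal sketch, assuming jstack is on the PATH and the script runs as the same user as the Elasticsearch process; the function names, the label argument, and the file naming scheme are invented for illustration.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def dump_filename(label, pid, now=None):
    """Build a file name such as jstack-<label>-<pid>-<UTC timestamp>.txt."""
    now = now or datetime.now(timezone.utc)
    return f"jstack-{label}-{pid}-{now:%Y%m%dT%H%M%SZ}.txt"


def build_command(pid, force=True):
    """Return the jstack argument list; force=True adds -F as used above."""
    cmd = ["jstack"]
    if force:
        cmd.append("-F")
    cmd.append(str(pid))
    return cmd


def capture(label, pid, out_dir=".", force=True, timeout_s=120):
    """Run jstack against pid and write its output to a labelled file."""
    out_path = Path(out_dir) / dump_filename(label, pid)
    result = subprocess.run(
        build_command(pid, force),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    # Keep stderr too, since jstack reports attach failures there.
    out_path.write_text(result.stdout + result.stderr)
    return out_path
```

Running capture once on the queried node in cluster A and once on each of the three nodes in cluster B, with a different label each time, would produce the four files.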

Would you be willing to email them? My email address is my first name at elastic.co.

I've sent them to you by email.
