Cross Cluster Replication / Search between Self Managed and ECK

Okay, I need some help here. I am trying to set up Cross Cluster Search / Replication between our clusters but am just not getting there.

Quick overview: I have an on-prem cluster sitting in our datacenters and am trying to connect it to a new ECK cluster running in Azure. I have followed all the guides and set everything up as per the documentation, but the clusters still won't connect.

Below are all my configurations as well as the errors I am getting in my stack logs:

On-prem elasticsearch.yml (mostly the xpack settings):

# Enable security features
xpack.security.enabled: true

xpack.security.enrollment.enabled: true

# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12

# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
# Create a new cluster with the current node only
# Additional nodes can still join the cluster later
discovery.seed_hosts: ["xxxxxxxxxxxx"]

# Allow HTTP API connections from anywhere
# Connections are encrypted and require user authentication
http.host: 0.0.0.0

# Allow other nodes to join the cluster from anywhere
# Connections are encrypted and mutually authenticated
#transport.host: 0.0.0.0

#----------------------- END SECURITY AUTO CONFIGURATION -------------------------

# Cross Cluster XPACK #
xpack.security.remote_cluster_client.ssl.enabled: true
xpack.security.remote_cluster_client.ssl.verification_mode: certificate
xpack.security.remote_cluster_client.ssl.certificate_authorities: ["certs/remote-ca.crt"]

AKS ECK Deployment Config:

eck-elasticsearch:
  ##### Load Balancer Config Start #####
  transport:
    service:
      metadata:
        name: elastic-eck-internal-loadbalancer
        annotations: 
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      spec:
        type: LoadBalancer
      tls:
        certificate: 
          secretName: "elastic-pp-we-es-transport-certs-public"
  ##### Load Balancer Config End #####
  enabled: true
  version: 8.17.0
  fullnameOverride: elastic-pp-we
  nodeSets:
  - name: masters
    count: 1
    config:
      node.roles: ["master", "remote_cluster_client", "transform"]
      node.store.allow_mmap: false
      remoteClusterServer:
        enabled: true

Additionally, I have created a cross-cluster API key, added it to the keystore on the local cluster, and created the remote cluster in Kibana -> Remote Clusters.

This is the log file from my local cluster:

[2025-02-10T11:31:54,881][WARN ][o.e.t.TcpTransport       ] [XXX-Master-1] exception caught on transport layer [Netty4TcpChannel{localAddress=/x.x.x.x:36232, remoteAddress=/x.x.x.x:9300, profile=_remote_cluster}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: (certificate_required) Received fatal alert: certificate_required
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:500) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:1575) ~[?:?]
Caused by: javax.net.ssl.SSLHandshakeException: (certificate_required) Received fatal alert: certificate_required
        at sun.security.ssl.Alert.createSSLException(Alert.java:130) ~[?:?]
        at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:365) ~[?:?]
        at sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:287) ~[?:?]
        at sun.security.ssl.TransportContext.dispatch(TransportContext.java:204) ~[?:?]
        at sun.security.ssl.SSLTransport.decode(SSLTransport.java:172) ~[?:?]
        at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736) ~[?:?]
        at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691) ~[?:?]
        at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506) ~[?:?]
        at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482) ~[?:?]
        at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679) ~[?:?]
        at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:309) ~[?:?]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1473) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1366) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1415) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) ~[?:?]
        ... 16 more
[2025-02-10T11:31:54,880][WARN ][o.e.t.SniffConnectionStrategy] [XXX-Master-1] fetching nodes from external cluster [WE-ECK-Preprod] failed
org.elasticsearch.transport.ConnectTransportException: [][x.x.x.x:9300] general node connection failure
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.lambda$onResponse$2(TcpTransport.java:1133) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:64) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListener$2.onFailure(ActionListener.java:265) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.TransportHandshaker$HandshakeResponseHandler.handleLocalException(TransportHandshaker.java:264) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.TransportHandshaker.lambda$sendHandshake$0(TransportHandshaker.java:155) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:217) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:387) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:307) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:336) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:250) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.netty4.Netty4Utils.lambda$addListener$2(Netty4Utils.java:232) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) ~[?:?]
        at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) ~[?:?]
        at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1161) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:753) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:729) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:619) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1299) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:755) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:733) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:560) ~[?:?]
        at io.netty.handler.ssl.SslUtils.handleHandshakeFailure(SslUtils.java:497) ~[?:?]
        at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:2025) ~[?:?]
        at io.netty.handler.ssl.SslHandler.handleUnwrapThrowable(SslHandler.java:1404) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1371) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1415) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:1575) ~[?:?]
Caused by: org.elasticsearch.transport.TransportException: handshake failed because connection reset
        ... 46 more
[2025-02-10T11:31:54,883][WARN ][o.e.t.RemoteClusterService] [XXX-Master-1] failed to update remote cluster connection [WE-ECK-Preprod]
org.elasticsearch.transport.ConnectTransportException: [][x.x.x.x:9300] general node connection failure
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.lambda$onResponse$2(TcpTransport.java:1133) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:64) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListener$2.onFailure(ActionListener.java:265) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.TransportHandshaker$HandshakeResponseHandler.handleLocalException(TransportHandshaker.java:264) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.TransportHandshaker.lambda$sendHandshake$0(TransportHandshaker.java:155) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:217) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:387) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:307) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:336) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:250) ~[elasticsearch-8.17.1.jar:?]
        at org.elasticsearch.transport.netty4.Netty4Utils.lambda$addListener$2(Netty4Utils.java:232) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) ~[?:?]
        at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) ~[?:?]
        at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:1161) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:753) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:729) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:619) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1299) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:755) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:733) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:560) ~[?:?]
        at io.netty.handler.ssl.SslUtils.handleHandshakeFailure(SslUtils.java:497) ~[?:?]
        at io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:2025) ~[?:?]
        at io.netty.handler.ssl.SslHandler.handleUnwrapThrowable(SslHandler.java:1404) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1371) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1415) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:1575) ~[?:?]
Caused by: org.elasticsearch.transport.TransportException: handshake failed because connection reset
        ... 46 more

I have checked and double-checked the firewall, and I am sure traffic is allowed through to the remote cluster on the specified port. I have also confirmed from my local cluster that I can reach the remote IP and port.

I am new to CCS/CCR as you can tell, so any help would be appreciated.

Howdy!

It sounds like you're trying to use API key-based CCS but you've got Certificate-based CCS configured.

The error is for a connection on 9300 (transport), and the certificate_required message you're getting is coming from the server: it's saying you need to provide a client certificate, because communication on the transport port is certificate-based.

The port for certificate based CCS is 9300, uses the transport protocol, and requires providing a client certificate.

The port for API-key based CCS is 9443 by default.
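For reference, 9443 is the port of the remote cluster server interface, which has to be enabled on the remote (ECK) side. As far as I know, ECK exposes this as a top-level field of the Elasticsearch resource spec rather than as an Elasticsearch setting inside a nodeSet's config. The following is only a sketch of what that looks like on the custom resource. The resource name is assumed from the fullnameOverride in the question, and I have not verified how the eck-elasticsearch Helm chart maps its values onto this field.

```yaml
# Sketch of an ECK Elasticsearch resource, not a verified manifest.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-pp-we          # assumption: taken from fullnameOverride in the question
spec:
  version: 8.17.0
  remoteClusterServer:
    enabled: true              # belongs to spec, not to nodeSets[].config
  nodeSets:
  - name: masters
    count: 1
    config:
      node.roles: ["master", "remote_cluster_client", "transform"]
      node.store.allow_mmap: false
```

The load balancer in the question is defined on the transport service, so it is also worth checking that 9443 is actually reachable from the on-prem side.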

So all you may need to do is go into Kibana and point the remote cluster address at port 9443 instead of 9300.
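If you would rather see the change as settings, here is a rough sketch of the equivalent remote cluster configuration on the on-prem cluster. The alias WE-ECK-Preprod and the sniff mode are taken from your log; the address is the same placeholder you used. Kibana stores these as cluster settings under the same keys, so treat this as an illustration of the values rather than something to paste in as-is.

```yaml
# Sketch only: remote cluster settings on the local (on-prem) cluster.
# x.x.x.x stands for the address of the ECK cluster, as in the log above.
cluster.remote.WE-ECK-Preprod.mode: sniff
cluster.remote.WE-ECK-Preprod.seeds: ["x.x.x.x:9443"]   # 9443 (remote cluster server), not 9300 (transport)

# The cross-cluster API key is not set in this file. It lives in the
# Elasticsearch keystore under cluster.remote.WE-ECK-Preprod.credentials,
# which you say you have already added.
```

Because the connection goes through a load balancer, proxy mode (cluster.remote.WE-ECK-Preprod.mode: proxy together with proxy_address) may turn out to be a better fit than sniff mode, but that is a separate question from the port.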

I've filed an issue, "Document remote cluster setup with non-eck clusters" (Issue #8502 in elastic/cloud-on-k8s on GitHub), and if I have time I will propose a PR to update the documentation to better cover this scenario. The documentation currently only guides you through setting up certificate-based auth when connecting a remote cluster to an ECK cluster.