Cross Cluster Search horizontal scaling

We have cross-cluster search (CCS) running as a separate cluster with two nodes, and it is connected to multiple remote clusters.

We have an issue where all search queries are processed by a single node, and that node is becoming a bottleneck.

Could you please confirm whether cross-cluster search supports horizontal scaling to handle the search queries?

Could you also confirm whether we can delete documents using cross-cluster search?

Which node(s) are getting bottlenecked? How have you configured CCS?

As far as I know, CCS only supports searching, not updating, indexing or deleting.
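For illustration, a cross-cluster search addresses a remote index as cluster_alias:index. In the sketch below the index name my-index and the status field are placeholders, while the alias is one of the remotes from your config; as far as I can tell the delete by query API only accepts local index names, so deletes have to be sent to the cluster that owns the index.

    # Cross-cluster search against an index on one of the configured remotes
    GET /apne1-analytics:my-index/_search
    {
      "query": { "match_all": {} }
    }

    # Delete by query runs against the owning cluster directly;
    # remote prefixes such as apne1-analytics:my-index are not accepted here
    POST /my-index/_delete_by_query
    {
      "query": { "term": { "status": "obsolete" } }
    }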

Thank you for your quick response @Christian_Dahlqvist

The cross-cluster search node is getting bottlenecked. We have deployed the same config on both nodes:

       cluster.name: admin-csc
       node.name: use1-es-admin02
       node.attr.dc: us-east-1a
       cluster.routing.allocation.awareness.attributes: dc
       path.data:
       - /data/es
       path.logs: /var/log/elasticsearch
       bootstrap.memory_lock: true
       network.host: 192.168.0.30
       cluster.initial_master_nodes: ["use1-es-admin01", "use1-es-admin02"]
       action.destructive_requires_name: true
       cluster:
         remote:
           apne1-analytics:
             seeds:
             - apne1-coordinator-analytics01:9300
             - apne1-coordinator-analytics02:9300
             - apne1-coordinator-analytics03:9300
             skip_unavailable: true

           sqa-use-1-2:
             seeds:
             - sqa-use-1-2-es-hot-2:9300
             - sqa-use-1-2-es-warm-2:9300
             skip_unavailable: true

           prodm-use-es1-2-cluster:
             seeds:
             - prodm-use-es1-2-es-data-5:9300
             - prodm-use-es1-2-es-data-4:9300
             - prodm-use-es1-2-es-data-3:9300
             skip_unavailable: true

           prodm-ouse-es2-1-cluster:
             seeds:
             - prodm-ouse-es2-1-es-data-3:9300
             - prodm-ouse-es2-1-es-data-2:9300
             - prodm-ouse-es2-1-es-data-1:9300
             skip_unavailable: true

           sqa-ipr-use-1-1:
             seeds:
             - sqa-ipr-use-1-1-es-hot1:9300
             - sqa-ipr-use-1-1-es-warm1:9300
             skip_unavailable: true

           prodm-use-es1-1-cluster:
             seeds:
             - prodm-use-es1-1-es-data-2:9300
             - prodm-use-es1-1-es-data-1:9300
             - prodm-use-es1-1-es-data-3:9300
             skip_unavailable: true

           prod-ouse-es2-1-cluster:
             seeds:
             - prod-ouse-es2-1-es-hot-2:9300
             - prod-ouse-es2-1-es-hot-1:9300
             - prod-ouse-es2-1-es-hot-3:9300
             skip_unavailable: true

           prod-usw-es2-1-cluster:
             seeds:
             - prod-usw-es2-1-es-hot-2:9300
             - prod-usw-es2-1-es-hot-1:9300
             skip_unavailable: true

           prod-ouse-es2-4-cluster:
             seeds:
             - prod-ouse-es2-4-es-hot-2:9300
             - prod-ouse-es2-4-es-hot-3:9300
             skip_unavailable: true

           prod-ouse-es2-2-cluster:
             seeds:
             - prod-ouse-es2-2-es-hot-2:9300
             - prod-ouse-es2-2-es-hot-3:9300
             skip_unavailable: true

           prod-use-es1-8-cluster:
             seeds:
             - prod-use-es1-8-es-hot-1:9300
             - prod-use-es1-8-es-hot-2:9300
             skip_unavailable: true

           prod-beuw-es2-1-cluster:
             seeds:
             - prod-beuw-es2-1-es-hot-1:9300
             - prod-beuw-es2-1-es-hot-2:9300
             skip_unavailable: true

           prod-euw-es1-1-cluster:
             seeds:
             - prod-euw-es1-1-es-hot-1:9300
             - prod-euw-es1-1-es-hot-2:9300
             skip_unavailable: true

           prodm-euw-es1-1-cluster:
             seeds:
             - prodm-euw-es1-1-es-hot-2:9300
             - prodm-euw-es1-1-es-hot-1:9300
             skip_unavailable: true

           prodm-beuw-es2-1-cluster:
             seeds:
             - prodm-beuw-es2-1-es-hot-1:9300
             - prodm-beuw-es2-1-es-hot-2:9300
             skip_unavailable: true

           prodm-apse-es2-2-cluster:
             seeds:
             - prodm-apse-es2-2-es-hot-2:9300
             - prodm-apse-es2-2-es-hot-1:9300
             skip_unavailable: true

           prod-apse-es2-2-cluster:
             seeds:
             - prod-apse-es2-2-es-hot-1:9300
             - prod-apse-es2-2-es-hot-2:9300
             skip_unavailable: true

           prod-use-es1-7-cluster:
             seeds:
             - prod-use-es1-7-es-hot-1:9300
             - prod-use-es1-7-es-hot-3:9300
             skip_unavailable: true

           prodm-aps-es1-1-cluster:
             seeds:
             - prodm-aps-es1-1-es-hot-1:9300
             - prodm-aps-es1-1-es-hot-2:9300
             skip_unavailable: true

           prod-apne-es1-1-cluster:
             seeds:
             - prod-apne-es1-1-es-hot-1:9300
             - prod-apne-es1-1-es-hot-2:9300
             skip_unavailable: true

           prod-streams-euw1-1-cluster:
             seeds:
             - prod-streams-euw1-1-es-hot-2:9300
             - prod-streams-euw1-1-es-hot-1:9300
             skip_unavailable: true

           prod-streams-use2-1-cluster:
             seeds:
             - prod-streams-use2-1-es-hot-1:9300
             - prod-streams-use2-1-es-hot-2:9300
             skip_unavailable: true

           prod-streams-use1-3-cluster:
             seeds:
             - prod-streams-use1-3-es-hot-1:9300
             - prod-streams-use1-3-es-hot-2:9300
             skip_unavailable: true

       xpack.monitoring.collection.enabled: true

       xpack.security.enabled: true

       xpack.security.authc.realms:
         file.csc:
           order: 0

         native.native_seccloud:
           order: 1

       xpack.security.http.ssl.enabled: true
       xpack.security.http.ssl.key: "es.key"
       xpack.security.http.ssl.certificate: "es.crt"
       xpack.security.http.ssl.verification_mode: "certificate"
       xpack.security.http.ssl.client_authentication: "required"
       xpack.security.http.ssl.certificate_authorities: ["ca.crt" ]
       xpack.security.http.ssl.supported_protocols: TLSv1.2
       xpack.security.http.ssl.cipher_suites:
       - "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
       - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
       - "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
       - "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"

       #Transport TLS/SSL settings
       xpack.security.transport.ssl.enabled: true
       xpack.security.transport.ssl.key: "es.key"
       xpack.security.transport.ssl.certificate: "es.crt"
       xpack.security.transport.ssl.verification_mode: "certificate"
       xpack.security.transport.ssl.certificate_authorities: ["ca.crt"]
       xpack.security.transport.ssl.supported_protocols: TLSv1.2
       xpack.security.transport.ssl.cipher_suites:
       - "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
       - "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
       - "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
       - "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"

The CCS node needs to coordinate the requests and gather and process the results from all clusters, so querying that many clusters probably requires a fair amount of resources. Exactly how much I assume depends on the data, the query and how much data needs to be returned.

How much heap and CPU do the CCS nodes have? Are you sending requests to all CCS nodes? What seems to be the limiting resource: CPU, heap or maybe network capacity? What does the hot threads API show?
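A minimal way to see that, assuming the CCS nodes can be queried directly, is the hot threads API together with a quick _cat/nodes overview:

    # What each node is currently busy doing
    GET /_nodes/hot_threads

    # Per-node heap, CPU and load at a glance
    GET /_cat/nodes?v&h=name,heap.percent,cpu,load_1m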

What seems to be the limiting resource: CPU, heap or maybe network capacity?

Got a circuit breaker exception from one of the CCS nodes:
    FATAL [circuit_breaking_exception] [parent] Data too large, data for [<http_request>] would be [42736789752/39.8gb], which is larger than the limit of [40802189312/38gb], real usage: [42736789752/39.8gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=1285/1.2kb, in_flight_requests=0/0b, accounting=831566/812kb], with { bytes_wanted=42736789752 & bytes_limit=40802189312 & durability="PERMANENT" }
    :: {"path":"/.kibana","query":{},"statusCode":429,"response":"{"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [42736789752/39.8gb], which is larger than the limit of [40802189312/38gb], real usage: [42736789752/39.8gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=1285/1.2kb, in_flight_requests=0/0b, accounting=831566/812kb]","bytes_wanted":42736789752,"bytes_limit":40802189312,"durability":"PERMANENT"}],"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [42736789752/39.8gb], which is larger than the limit of [40802189312/38gb], real usage: [42736789752/39.8gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=1285/1.2kb, in_flight_requests=0/0b, accounting=831566/812kb]","bytes_wanted":42736789752,"bytes_limit":40802189312,"durability":"PERMANENT"},"status":429}"}
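For what it's worth, the 38gb limit in this error matches the default parent circuit breaker limit of 95% of a 40 GB heap, and the real usage of 39.8gb means the heap on that node was close to full when the request arrived. Per-node heap and breaker usage can be checked with the node stats API, for example:

    # Heap usage and circuit breaker state for every node
    GET /_nodes/stats/jvm,breaker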

Are you sending requests to all CCS nodes?

We use a CNAME for clients to connect to cross-cluster search.

What seems to be the limiting resource?

Heap: 40 GB
CPU: 16 cores
Memory: 64 GB

Could you also explain why all the resource consumption is happening on a single node, even though our cross-cluster search cluster has two nodes?


Are they configured the same way? Is traffic sent evenly to both of them?
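One way to check the distribution, assuming both CCS nodes can be queried directly, is to compare per-node HTTP connection statistics; if clients resolve the CNAME once and keep persistent connections open, all requests can end up on a single node even though two are configured:

    # Open and total HTTP connection counts per node; compare the two CCS nodes
    GET /_nodes/stats/http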

The configuration is the same, and we use a CNAME for the client connection to cross-cluster search.
Can cross-cluster search scale horizontally or vertically?
