Higher search rate when added final data node to cluster

We've got a 5 data node v.6.8.23 cluster in production, that seems to increase search rate and cpu usage when the fifth node (exrhel0212) has joined the cluster, running on four nodes cluster is much calmer.

Hints on how to investigate what's going on and why search rate and cpu userland usage increases much appreciated, TIA!

from Kibana cluster monitoring:

Network activity on current master node:

CPU metrics from current master node:

6.X is EOL, please upgrade ASAP :slight_smile:

Can you share the output from the _cluster/stats?pretty&human API?

Perhaps an obvious observation when you add another node shards get shuffled to even the shards out and that uses CPU and IO.... Is it that??

1 Like

True, it could be transitory.

Thanks for replying, nope reshuffling has settled nicely reasonably fast. The higher search rate continues as a new baseline.

Now Operational staff have had the node out of the cluster for several days => considering added it to cluster as a new bare/naked data node thus to expand cluster...

Know about 6.x EOL, v.7 is on the roadmap... :slight_smile:

{
  "_nodes" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "cluster_name" : "RMElastic_prod",
  "cluster_uuid" : "cB79gjmQRRG9Tw9RKbyZog",
  "timestamp" : 1651133179324,
  "status" : "green",
  "indices" : {
    "count" : 1809,
    "shards" : {
      "total" : 8490,
      "primaries" : 4245,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 10,
          "avg" : 4.693200663349917
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 2.3466003316749586
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 1053840751,
      "deleted" : 20366758
    },
    "store" : {
      "size" : "282.6gb",
      "size_in_bytes" : 303447749685
    },
    "fielddata" : {
      "memory_size" : "106.2mb",
      "memory_size_in_bytes" : 111398832,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "41.4mb",
      "memory_size_in_bytes" : 43444818,
      "total_count" : 1186286,
      "hit_count" : 1132270,
      "miss_count" : 54016,
      "cache_size" : 1996,
      "cache_count" : 8684,
      "evictions" : 6688
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 63096,
      "memory" : "1.1gb",
      "memory_in_bytes" : 1220579609,
      "terms_memory" : "910.2mb",
      "terms_memory_in_bytes" : 954433089,
      "stored_fields_memory" : "92.3mb",
      "stored_fields_memory_in_bytes" : 96797568,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "79.4mb",
      "norms_memory_in_bytes" : 83277888,
      "points_memory" : "32.2mb",
      "points_memory_in_bytes" : 33850344,
      "doc_values_memory" : "49.8mb",
      "doc_values_memory_in_bytes" : 52220720,
      "index_writer_memory" : "5.1mb",
      "index_writer_memory_in_bytes" : 5414386,
      "version_map_memory" : "1.1mb",
      "version_map_memory_in_bytes" : 1250847,
      "fixed_bit_set" : "258.8mb",
      "fixed_bit_set_memory_in_bytes" : 271377488,
      "max_unsafe_auto_id_timestamp" : 1651104004845,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 5,
      "data" : 4,
      "coordinating_only" : 0,
      "master" : 4,
      "ingest" : 5
    },
    "versions" : [
      "6.8.23"
    ],
    "os" : {
      "available_processors" : 18,
      "allocated_processors" : 18,
      "names" : [
        {
          "name" : "Linux",
          "count" : 5
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Red Hat Enterprise Linux",
          "count" : 5
        }
      ],
      "mem" : {
        "total" : "254.7gb",
        "total_in_bytes" : 273525952512,
        "free" : "2.5gb",
        "free_in_bytes" : 2708037632,
        "used" : "252.2gb",
        "used_in_bytes" : 270817914880,
        "free_percent" : 1,
        "used_percent" : 99
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 48
      },
      "open_file_descriptors" : {
        "min" : 322,
        "max" : 17637,
        "avg" : 12823
      }
    },
    "jvm" : {
      "max_uptime" : "1.9d",
      "max_uptime_in_millis" : 165952585,
      "versions" : [
        {
          "version" : "1.8.0_322",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "25.322-b06",
          "vm_vendor" : "Red Hat, Inc.",
          "count" : 5
        }
      ],
      "mem" : {
        "heap_used" : "52.1gb",
        "heap_used_in_bytes" : 55963970208,
        "heap_max" : "105.8gb",
        "heap_max_in_bytes" : 113659740160
      },
      "threads" : 568
    },
    "fs" : {
      "total" : "807.9gb",
      "total_in_bytes" : 867486879744,
      "free" : "490.9gb",
      "free_in_bytes" : 527122898944,
      "available" : "490.9gb",
      "available_in_bytes" : 527122898944
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 5
      },
      "http_types" : {
        "security4" : 5
      }
    }
  }
}

Your average shard size is 0.03GB. You're likely wasting a tonne of resources there which won't help.

We are aware of this, on the roadmap are plan to aggregate many smaller indicies to monthly/yearly, only this doesn't explain why 5 nodes are running with so much higher load/search rate than 4 nodes for the same client load.

So attempted to rejoin node 212 as a new clean node today just before 12:00CEST, only it seems that after reshuffling has settled specific this node is sending approx 1K packets to each of the other 4 data nodes approx the same as the new baseline search rate. So my Q is howto figure out what's going on in this nodes brain :slight_smile:

Kibana Monitoring showing baselines before and after expand cluster with 5th. data node 212:

Network statistics for each cluster node:

Master electable node 217:

Data node 215:

Data node 214:

Data node 213:

Data node 212:

Data node 211:

Attaching jstack dump of node 212: Jstack dump

Would it be possible dynamically to increase search logging to debug or TRACE through cluster settings, eg.

PUT /_cluster/settings
{
  "transient": {
    "logger.search": "TRACE"
  }
}

But guess one needs a LogConfig context predefined first, maybe like this:

appender.search_rolling.type = RollingFile
appender.search_rolling.name = search_rolling
appender.search_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_search.log
appender.search_rolling.layout.type = PatternLayout
appender.search_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %.-10000m%n
appender.search_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_search-%i.log.gz
appender.search_rolling.policies.type = Policies
appender.search_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.search_rolling.policies.size.size = 1GB
appender.search_rolling.strategy.type = DefaultRolloverStrategy
appender.search_rolling.strategy.max = 4

logger.search.name = org.elasticsearch.search
logger.search.level = warn
logger.search.appenderRef.search_rolling.ref = search_rolling
logger.search.additivity = false