Influence of Master Nodes and Large Cluster State on Search

Hey,

We are running an Elasticsearch cluster in production. The setup is:

  • 10 data nodes - 8 CPUs and 64 GB RAM, half of it heap
  • 3 master nodes - 4 CPUs and 16 GB RAM, half of it heap

The cluster holds a large number of indices, around 18,000 with 1 replica each (36,000 shards in total).

We have a problem where one in every few search requests comes back slow (investigating with the Profile API led us to conclude it's not the search itself, as the profiling metrics look fine, but something else in the infrastructure).
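
For reference, we profile the slow queries with something like the request below (the index name and query here are just placeholders, not our real ones):

# "profile": true returns a per-shard timing breakdown alongside the hits.
GET /my-index/_search
{
  "profile": true,
  "query": {
    "match": { "title": "example" }
  }
}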

Our suspicion is that the problem is the large number of indices in the cluster, or something similar related to the master nodes. After increasing the CPU and memory of the master nodes we got much better search performance in our app.

Looking at the Elastic docs, we found this:
Elasticsearch is a peer to peer based system, in which nodes communicate with one another directly. The high-throughput APIs (index, delete, search) do not normally interact with the master node. The responsibility of the master node is to maintain the global cluster state and reassign shards when nodes join or leave the cluster. Each time the cluster state is changed, the new state is published to all nodes in the cluster as described above.

It seems the master nodes are not involved in the high-throughput APIs, so how can our problem, and the improvement in search after beefing up the master nodes, be explained?
We would be happy to understand this, as it is not clear from the docs.

Thanks

Are you making sure you are not sending requests to the dedicated master nodes? Which version of Elasticsearch are you using?
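
A quick way to check is to list the node roles and confirm which nodes your clients are actually hitting (the column selection below is just one convenient view):

# Master-only nodes have "m" but no "d" under node.role;
# the elected master is marked with "*" in the master column.
GET _cat/nodes?v&h=name,node.role,master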

Might it be something else that keeps the master nodes busy, e.g. use of dynamic mappings that causes large numbers of cluster state updates?
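
If that turns out to be the case, one option is to lock the mappings down so that new fields are rejected instead of being added dynamically (each dynamic mapping update is a cluster state change the master has to publish). The index name below is only an illustration:

# Reject documents containing unmapped fields instead of adding them dynamically.
PUT /my-index/_mapping
{
  "dynamic": "strict"
}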

That's far too many shards for the number of nodes you have, roughly 3,600 per data node!

Yes, that's true, but I still want to understand how it is supposed to affect search latency, and the reason for the odd behaviour of getting a slow response once every couple of requests.

We also see this behaviour of a slow response once every couple of requests when sending the requests to non-master nodes.

We are using Version 7.8.0.

I understand the master nodes are busy, but I still don't get why that should affect search latency as long as the cluster is healthy.

What is the output from the _cluster/stats?pretty&human API?

{
"_nodes" : {
"total" : 16,
"successful" : 16,
"failed" : 0
},
"cluster_name" : "elastic-app-3",
"cluster_uuid" : "lHGYdcS3SZSommD90SjMsw",
"timestamp" : 1621333544411,
"status" : "green",
"indices" : {
"count" : 18675,
"shards" : {
"total" : 37350,
"primaries" : 18675,
"replication" : 1.0,
"index" : {
"shards" : {
"min" : 2,
"max" : 2,
"avg" : 2.0
},
"primaries" : {
"min" : 1,
"max" : 1,
"avg" : 1.0
},
"replication" : {
"min" : 1.0,
"max" : 1.0,
"avg" : 1.0
}
}
},
"docs" : {
"count" : 237942647,
"deleted" : 20796261
},
"store" : {
"size" : "3.3tb",
"size_in_bytes" : 3630614023017
},
"fielddata" : {
"memory_size" : "2.4mb",
"memory_size_in_bytes" : 2555664,
"evictions" : 0
},
"query_cache" : {
"memory_size" : "17.8mb",
"memory_size_in_bytes" : 18700027,
"total_count" : 2159081,
"hit_count" : 127302,
"miss_count" : 2031779,
"cache_size" : 7108,
"cache_count" : 9693,
"evictions" : 2585
},
"completion" : {
"size" : "0b",
"size_in_bytes" : 0
},
"segments" : {
"count" : 224177,
"memory" : "2.3gb",
"memory_in_bytes" : 2556240876,
"terms_memory" : "1.6gb",
"terms_memory_in_bytes" : 1778862952,
"stored_fields_memory" : "110.9mb",
"stored_fields_memory_in_bytes" : 116314216,
"term_vectors_memory" : "117mb",
"term_vectors_memory_in_bytes" : 122740800,
"norms_memory" : "300.4mb",
"norms_memory_in_bytes" : 315089664,
"points_memory" : "0b",
"points_memory_in_bytes" : 0,
"doc_values_memory" : "212.8mb",
"doc_values_memory_in_bytes" : 223233244,
"index_writer_memory" : "20.6gb",
"index_writer_memory_in_bytes" : 22121363846,
"version_map_memory" : "133.7mb",
"version_map_memory_in_bytes" : 140211008,
"fixed_bit_set" : "0b",
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"mappings" : {
"field_types" : [
{
"name" : "boolean",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "date",
"count" : 37350,
"index_count" : 18675
},
{
"name" : "flattened",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "integer",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "keyword",
"count" : 186750,
"index_count" : 18675
},
{
"name" : "object",
"count" : 74700,
"index_count" : 18675
},
{
"name" : "text",
"count" : 504225,
"index_count" : 18675
},
{
"name" : "token_count",
"count" : 18675,
"index_count" : 18675
}
]
},
"analysis" : {
"char_filter_types" : ,
"tokenizer_types" : [
{
"name" : "char_group",
"count" : 37350,
"index_count" : 18675
},
{
"name" : "edge_ngram",
"count" : 18675,
"index_count" : 18675
}
],
"filter_types" : [
{
"name" : "edge_ngram",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "length",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "shingle",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "stemmer",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "stop",
"count" : 18675,
"index_count" : 18675
}
],
"analyzer_types" : [
{
"name" : "custom",
"count" : 280125,
"index_count" : 18675
}
],
"built_in_char_filters" : ,
"built_in_tokenizers" : [
{
"name" : "standard",
"count" : 93375,
"index_count" : 18675
},
{
"name" : "whitespace",
"count" : 74700,
"index_count" : 18675
}
],
"built_in_filters" : [
{
"name" : "asciifolding",
"count" : 280125,
"index_count" : 18675
},
{
"name" : "lowercase",
"count" : 280125,
"index_count" : 18675
},
{
"name" : "remove_duplicates",
"count" : 37350,
"index_count" : 18675
},
{
"name" : "word_delimiter_graph",
"count" : 74700,
"index_count" : 18675
}
],
"built_in_analyzers" : [
{
"name" : "keyword",
"count" : 18675,
"index_count" : 18675
},
{
"name" : "standard",
"count" : 18675,
"index_count" : 18675
}
]
}
},
"nodes" : {
"count" : {
"total" : 16,
"coordinating_only" : 0,
"data" : 10,
"ingest" : 10,
"master" : 3,
"ml" : 16,
"remote_cluster_client" : 16,
"transform" : 10,
"voting_only" : 0
},
"versions" : [
"7.8.0"
],
"os" : {
"available_processors" : 116,
"allocated_processors" : 116,
"names" : [
{
"name" : "Linux",
"count" : 16
}
],
"pretty_names" : [
{
"pretty_name" : "CentOS Linux 7 (Core)",
"count" : 16
}
],
"mem" : {
"total" : "751gb",
"total_in_bytes" : 806380109824,
"free" : "88.3gb",
"free_in_bytes" : 94815694848,
"used" : "662.6gb",
"used_in_bytes" : 711564414976,
"free_percent" : 12,
"used_percent" : 88
}
},
"process" : {
"cpu" : {
"percent" : 341
},
"open_file_descriptors" : {
"min" : 630,
"max" : 29896,
"avg" : 18554
}
},
"jvm" : {
"max_uptime" : "68.7d",
"max_uptime_in_millis" : 5943619353,
"versions" : [
{
"version" : "14.0.1",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "14.0.1+7",
"vm_vendor" : "AdoptOpenJDK",
"bundled_jdk" : true,
"using_bundled_jdk" : true,
"count" : 16
}
],
"mem" : {
"heap_used" : "242.7gb",
"heap_used_in_bytes" : 260628786424,
"heap_max" : "408gb",
"heap_max_in_bytes" : 438086664192
},
"threads" : 2688
},
"fs" : {
"total" : "22.9tb",
"total_in_bytes" : 25194268372992,
"free" : "19.5tb",
"free_in_bytes" : 21524646838272,
"available" : "19.5tb",
"available_in_bytes" : 21524646838272
},
"plugins" : [
{
"name" : "repository-azure",
"version" : "7.8.0",
"elasticsearch_version" : "7.8.0",
"java_version" : "1.8",
"description" : "The Azure Repository plugin adds support for Azure storage repositories.",
"classname" : "org.elasticsearch.repositories.azure.AzureRepositoryPlugin",
"extended_plugins" : ,
"has_native_controller" : false
},
{
"name" : "prometheus-exporter",
"version" : "7.8.0.0",
"elasticsearch_version" : "7.8.0",
"java_version" : "1.8",
"description" : "Export Elasticsearch metrics to Prometheus",
"classname" : "org.elasticsearch.plugin.prometheus.PrometheusExporterPlugin",
"extended_plugins" : ,
"has_native_controller" : false
},
{
"name" : "repository-s3",
"version" : "7.8.0",
"elasticsearch_version" : "7.8.0",
"java_version" : "1.8",
"description" : "The S3 repository plugin adds S3 repositories",
"classname" : "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
"extended_plugins" : ,
"has_native_controller" : false
}
],
"network_types" : {
"transport_types" : {
"security4" : 16
},
"http_types" : {
"security4" : 16
}
},
"discovery_types" : {
"zen" : 16
},
"packaging_types" : [
{
"flavor" : "default",
"type" : "docker",
"count" : 16
}
],
"ingest" : {
"number_of_pipelines" : 2,
"processor_stats" : {
"gsub" : {
"count" : 0,
"failed" : 0,
"current" : 0,
"time" : "0s",
"time_in_millis" : 0
},
"script" : {
"count" : 0,
"failed" : 0,
"current" : 0,
"time" : "0s",
"time_in_millis" : 0
}
}
}
}
}

That's 3,735 shards per data node, with an average shard size well under 1 GB (3.3 TB across 37,350 shards works out to roughly 90 MB per shard).
You are wasting far too much heap managing those shards, which is very likely the bulk of the problem you are seeing.
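
As a sketch of where to start, check how small the shards really are and then consolidate groups of small indices into fewer, larger ones (the index names below are placeholders, assuming the small indices share a compatible mapping):

# List shards sorted by on-disk size.
GET _cat/shards?v&h=index,shard,prirep,store&s=store

# Copy a batch of small indices into one consolidated index.
POST _reindex
{
  "source": { "index": ["small-index-000001", "small-index-000002"] },
  "dest": { "index": "consolidated-index" }
}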

The only thing I can think of is that, with more resources, your master nodes can better handle the allocation tables for the sheer number of shards in your cluster. Frankly though, focusing on that is missing the forest for the trees; if you don't reduce the shard count you are just asking for continued problems.
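
If you want to confirm whether the masters are actually the bottleneck, it may help to watch the pending cluster state tasks and the elected master's hot threads while a slow search is happening:

# Cluster state updates queued on the elected master.
GET _cluster/pending_tasks

# CPU sampling of the elected master node.
GET _nodes/_master/hot_threads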
