Kibana status becomes RED

Hi

Kibana status becomes RED.

{"type":"log","@timestamp":"2020-11-06T19:59:34Z","tags":["status","plugin:elasticsearch@undefined","error"],"pid":10,"state":"red","message":"Status changed from green to red - Request Timeout after 30000ms","prevState":"green","prevMsg":"Ready"}

Whenever the Kibana status turns RED, the warnings below appear in the ES master pod log.
ES master log snippet:

{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T19:49:53.660Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Received response for a request that has timed out, sent [19655ms] ago, timed out [4606ms] ago, action [cluster:monitor/nodes/stats[n]], node [{xxx-elasticsearch-data-0}{_HfUz-16TjizENd4xV2_iw}{s3dX9Tu1QHWPDpes89SsWw}{10.244.2.154}{10.244.2.154:9300}], id [1497830]"}
{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T19:59:41.327Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Received response for a request that has timed out, sent [22303ms] ago, timed out [7219ms] ago, action [cluster:monitor/nodes/stats[n]], node [{xxx-elasticsearch-data-0}{_HfUz-16TjizENd4xV2_iw}{s3dX9Tu1QHWPDpes89SsWw}{10.244.2.154}{10.244.2.154:9300}], id [1499291]"}
{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T20:00:26.955Z","logger":"o.e.c.InternalClusterInfoService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Failed to update node information for ClusterInfoUpdateJob within 15s timeout"}
{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T20:00:41.971Z","logger":"o.e.c.InternalClusterInfoService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Failed to update shard information for ClusterInfoUpdateJob within 15s timeout"}
{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T20:00:42.503Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Received response for a request that has timed out, sent [30495ms] ago, timed out [15445ms] ago, action [cluster:monitor/nodes/stats[n]], node [{xxx-elasticsearch-data-0}{_HfUz-16TjizENd4xV2_iw}{s3dX9Tu1QHWPDpes89SsWw}{10.244.2.154}{10.244.2.154:9300}], id [1499424]"}
{"type":"log","host":"xxx-elasticsearch-master-0","level":"WARN","systemid":"4636c00bfc3849e0be179bc71cef17f8","system":"BELK","time": "2020-11-06T20:01:37.196Z","logger":"o.e.t.TransportService","timezone":"UTC","marker":"[xxx-elasticsearch-master-0] ","log":"Received response for a request that has timed out, sent [25121ms] ago, timed out [10070ms] ago, action [cluster:monitor/nodes/stats[n]], node [{xxx-elasticsearch-data-0}{_HfUz-16TjizENd4xV2_iw}{s3dX9Tu1QHWPDpes89SsWw}{10.244.2.154}{10.244.2.154:9300}], id [1499568]"}
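To compare these warnings side by side, here is a minimal sketch that pulls the timings out of each JSON log line. The function name `parse_timeout_line` and the output field names are made up for illustration; the regular expression only assumes the message wording shown in the snippet above.

```python
import json
import re

# Matches the durations, action and node name in a TransportService timeout warning.
TIMEOUT_RE = re.compile(
    r"sent \[(?P<sent_ms>\d+)ms\] ago, "
    r"timed out \[(?P<late_ms>\d+)ms\] ago, "
    r"action \[(?P<action>.+?)\], "
    r"node \[\{(?P<node>[^}]+)\}"
)


def parse_timeout_line(line):
    """Return timing details for one JSON log line, or None if it is not a timeout warning."""
    record = json.loads(line)
    match = TIMEOUT_RE.search(record.get("log", ""))
    if match is None:
        return None
    sent_ms = int(match.group("sent_ms"))
    late_ms = int(match.group("late_ms"))
    return {
        "time": record.get("time"),
        "node": match.group("node"),
        "action": match.group("action"),
        # How long the master waited before the response finally arrived.
        "round_trip_ms": sent_ms,
        # The request was sent sent_ms ago and timed out late_ms ago,
        # so the timeout that applied is roughly the difference.
        "timeout_ms": sent_ms - late_ms,
    }
```

For the four TransportService warnings above, the difference comes to roughly 15 s each time (for example 19655 - 4606 = 15049 ms), which lines up with the 15s timeout in the ClusterInfoUpdateJob messages, and every warning names the same node, xxx-elasticsearch-data-0.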

Could you help us find the exact reason why the Kibana status turns RED, and explain what "Failed to update shard information for ClusterInfoUpdateJob within 15s timeout" means in the ES master logs?

What version are you running?
What is the output from _cluster/stats from Elasticsearch?

_cluster/stats

{
  "_nodes" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "cluster_name" : "default-xxx",
  "cluster_uuid" : "8a5UUxJPSpKoTWWu68iNiA",
  "timestamp" : 1604980146610,
  "status" : "yellow",
  "indices" : {
    "count" : 90,
    "shards" : {
      "total" : 106,
      "primaries" : 106,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 5,
          "avg" : 1.1777777777777778
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 1.1777777777777778
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 10188400,
      "deleted" : 1455152
    },
    "store" : {
      "size_in_bytes" : 5674272181
    },
    "fielddata" : {
      "memory_size_in_bytes" : 5713520,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 12899426,
      "total_count" : 38368247,
      "hit_count" : 14000427,
      "miss_count" : 24367820,
      "cache_size" : 12850,
      "cache_count" : 174046,
      "evictions" : 161196
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 767,
      "memory_in_bytes" : 15474527,
      "terms_memory_in_bytes" : 12877402,
      "stored_fields_memory_in_bytes" : 1938440,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 269056,
      "points_memory_in_bytes" : 223695,
      "doc_values_memory_in_bytes" : 165934,
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 79128,
      "max_unsafe_auto_id_timestamp" : 1603812816219,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 3,
      "data" : 1,
      "coordinating_only" : 0,
      "master" : 1,
      "ingest" : 3
    },
    "versions" : [
      "7.0.1"
    ],
    "os" : {
      "available_processors" : 7,
      "allocated_processors" : 7,
      "names" : [
        {
          "name" : "Linux",
          "count" : 3
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 7 (Core)",
          "count" : 3
        }
      ],
      "mem" : {
        "total_in_bytes" : 139236974592,
        "free_in_bytes" : 8476033024,
        "used_in_bytes" : 130760941568,
        "free_percent" : 6,
        "used_percent" : 94
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 6
      },
      "open_file_descriptors" : {
        "min" : 265,
        "max" : 24360,
        "avg" : 8321
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 853063992,
      "versions" : [
        {
          "version" : "11.0.7",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "11.0.7+10-LTS",
          "vm_vendor" : "Oracle Corporation",
          "bundled_jdk" : true,
          "using_bundled_jdk" : false,
          "count" : 3
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 3803728472,
        "heap_max_in_bytes" : 9602662400
      },
      "threads" : 118
    },
    "fs" : {
      "total_in_bytes" : 1036857524224,
      "free_in_bytes" : 649065615360,
      "available_in_bytes" : 606769131520
    },
    "plugins" : [
      {
        "name" : "search-guard-7",
        "version" : "7.0.1-35.0.0",
        "elasticsearch_version" : "7.0.1",
        "java_version" : "1.8",
        "description" : "Provide access control related features for Elasticsearch 7",
        "classname" : "com.floragunn.searchguard.SearchGuardPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      },
      {
        "name" : "prometheus-exporter",
        "version" : "7.0.1.0",
        "elasticsearch_version" : "7.0.1",
        "java_version" : "1.8",
        "description" : "Export Elasticsearch metrics to Prometheus",
        "classname" : "org.elasticsearch.plugin.prometheus.PrometheusExporterPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      },
      {
        "name" : "ingest-attachment",
        "version" : "7.0.1",
        "elasticsearch_version" : "7.0.1",
        "java_version" : "1.8",
        "description" : "Ingest processor that uses Apache Tika to extract contents",
        "classname" : "org.elasticsearch.ingest.attachment.IngestAttachmentPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "com.floragunn.searchguard.ssl.http.netty.SearchGuardSSLNettyTransport" : 3
      },
      "http_types" : {
        "com.floragunn.searchguard.http.SearchGuardHttpServerTransport" : 3
      }
    },
    "discovery_types" : {
      "zen" : 3
    }
  }
}

Is this open distro?
Are all your nodes in the same location/datacentre/region?

It is not Open Distro.
It is a local minikube setup, so all nodes are in the same location.

@warkolm, please advise on this scenario.

I don't know, sorry. It looks like your nodes are losing their connection for some reason. Is there a firewall in use between them?

Also if you upgrade to 7.1, you get Security for free - https://www.elastic.co/blog/security-for-elasticsearch-is-now-free :slight_smile:

No firewall exists between the nodes.
What does "Failed to update shard information for ClusterInfoUpdateJob within 15s timeout" in the ES master logs mean?

We are using the open source distribution of ES.

It means the node(s) couldn't talk to each other for some reason.
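One way to narrow down which node is slow to answer is to time a stats request per node and flag any that take longer than the 15 s named in the log message. This is only a rough sketch: the function names, the `fetch` parameter, the base URL, and the use of plain unauthenticated HTTP are assumptions for illustration, and a cluster secured with Search Guard, as this one is, would need TLS and credentials added to the request.

```python
import time
import urllib.request


def http_nodes_stats(node, base_url="http://localhost:9200", timeout_s=60.0):
    """Fetch GET /_nodes/<node>/stats. base_url is an assumption; add TLS/auth as needed."""
    url = f"{base_url}/_nodes/{node}/stats"
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read()


def time_nodes_stats(nodes, fetch=http_nodes_stats, limit_s=15.0):
    """Time one stats request per node and flag those slower than limit_s."""
    results = {}
    for node in nodes:
        start = time.perf_counter()
        try:
            fetch(node)
            error = None
        except Exception as exc:  # record the failure instead of aborting the loop
            error = repr(exc)
        elapsed = time.perf_counter() - start
        results[node] = {"seconds": elapsed, "slow": elapsed > limit_s, "error": error}
    return results
```

Calling `time_nodes_stats(["xxx-elasticsearch-master-0", "xxx-elasticsearch-data-0"])` at the moment Kibana goes RED would show whether only the data node is slow, which would point at that node being overloaded or unreachable rather than at the cluster as a whole.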

Thanks a lot @warkolm
