Can't get cluster health

Hello everyone!

A couple of days ago, cluster monitoring suddenly stopped working. At some point it just stopped showing the latest data. The message shown is the standard one.

When I try to enable the built-in monitoring, nothing happens.

If I select the last 7 days as the time interval and try "Set up monitoring with Metricbeat", I can see the old data and a cluster error.

If I change the interval to the last hour, the cluster error doesn't show up, but the metrics still don't come in.

I set up Metricbeat, and everything seems to be working fine. At least, it starts and appears to be sending data.

The metrics indices exist.

The Elasticsearch cluster and Kibana are both version 7.9.1; Metricbeat is 7.10.

What could be causing this, and where should I look?

I tried:

PUT _settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": "false"
    }
  }
}

Nothing changed. By the way, logs are being written to their indices without any problems, so free disk space is not the issue.
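To confirm whether the read_only_allow_delete block is actually set on any index, the JSON returned by GET _all/_settings could be inspected with something like the sketch below. It assumes the usual nested (non-flat) response shape, where each index name maps to a "settings" object; the function name and the sample data are hypothetical and only illustrate the check.

```python
from typing import Any, Dict, List


def indices_with_read_only_block(settings: Dict[str, Any]) -> List[str]:
    """Return the names of indices whose read_only_allow_delete block is true.

    `settings` is assumed to be the parsed JSON body of GET _all/_settings.
    """
    blocked = []
    for index_name, body in settings.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # Setting values come back as strings, e.g. "true" or "false".
        value = str(blocks.get("read_only_allow_delete", "false")).lower()
        if value == "true":
            blocked.append(index_name)
    return sorted(blocked)


# Hypothetical sample response, for illustration only.
sample = {
    ".monitoring-es-7-2020.11.14": {
        "settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}
    },
    ".monitoring-kibana-7-2020.11.14": {
        "settings": {"index": {"number_of_shards": "1"}}
    },
}

print(indices_with_read_only_block(sample))
```

An empty list would suggest that the block is not what keeps monitoring data from being written.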

metricbeat.yml:

metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
  
setup.template.settings:
  index.number_of_shards: 1
  index.codec: best_compression

setup.kibana:
  # Kibana Host
  host: "https://10.199.5.107:5601/"

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://10.199.5.104:9200", "https://10.199.5.105:9200", "https://10.199.5.106:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  username: "elastic"
  password: "qTwQGayENAo1fbVpepzt"
  ssl.certificate_authorities: ["/etc/elasticsearch/certs/ca.crt"]

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

monitoring.enabled: true

elasticsearch-xpack.yml:

- module: elasticsearch
  metricsets:
    - ccr
    - cluster_stats
    - enrich
    - index
    - index_recovery
    - index_summary
    - ml_job
    - node_stats
    - shard
  period: 10s
  hosts: ["https://10.199.5.104:9200"]
  username: "elastic"
  password: "qTwQGayENAo1fbVpepzt"
  ssl.certificate_authorities: ["/etc/elasticsearch/certs/ca.crt"]
  xpack.enabled: true

Sorry, I forgot to mention one thing. A couple of months ago I upgraded the cluster from 7.1 to 7.9.1. Three days ago I noticed that the old index lifecycle policies no longer work, so I created a new ILM policy named [delete_old_indicies].

I tried

cat /var/log/elasticsearch/elk-cluster.mosgortrans.com.log | grep errors

and it returns nothing. Running

cat /var/log/elasticsearch/elk-cluster.mosgortrans.com.log | grep monit

shows this:

[2020-11-14T20:14:22,336][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-kibana-7-2020.11.13] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,361][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-logstash-7-2020.11.13] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,392][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-es-7-2020.11.14] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,424][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-es-7-2020.11.12] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,447][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-kibana-7-2020.11.14] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,461][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-es-7-2020.11.13] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,514][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-beats-7-2020.11.14] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:22,531][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-es-7-mb-2020.11.14] with policy [delete_old_indicies] is not recognized
[2020-11-14T20:14:25,193][INFO ][o.e.i.s.IndexShard       ] [elk-es-node-03] [.monitoring-logstash-7-2020.11.13][1] primary-replica resync completed with 0 operations
[2020-11-14T20:14:26,291][INFO ][o.e.i.s.IndexShard       ] [elk-es-node-03] [.monitoring-es-7-2020.11.11][0] primary-replica resync completed with 0 operations
[2020-11-14T20:14:26,299][INFO ][o.e.i.s.IndexShard       ] [elk-es-node-03] [.monitoring-es-7-2020.11.14][0] primary-replica resync completed with 0 operations
[2020-11-14T20:14:26,979][WARN ][o.e.a.b.TransportShardBulkAction] [elk-es-node-03] [[.monitoring-es-7-mb-2020.11.14][1]] failed to perform indices:data/write/bulk[s] on replica [.monitoring-es-7-mb-2020.11.14][1], node[IvxKqzsjRmCnOwJNqsLOQg], [R], s[STARTED], a[id=UnXFmesbR56lC-K-GBx66A]
[2020-11-14T20:14:27,386][WARN ][o.e.c.r.a.AllocationService] [elk-es-node-03] [.monitoring-es-7-mb-2020.11.14][0] marking unavailable shards as stale: [zE8Q1fIbQ2ag55TqVmgCTg]
[2020-11-14T20:14:27,386][WARN ][o.e.c.r.a.AllocationService] [elk-es-node-03] [.monitoring-es-7-mb-2020.11.14][1] marking unavailable shards as stale: [UnXFmesbR56lC-K-GBx66A]
[2020-11-14T20:14:27,730][WARN ][o.e.c.r.a.AllocationService] [elk-es-node-03] [.monitoring-kibana-7-2020.11.14][1] marking unavailable shards as stale: [Eb2VQq_jSYayqdSwtaNnJw]
[2020-11-14T20:14:29,801][WARN ][o.e.c.r.a.AllocationService] [elk-es-node-03] [.monitoring-beats-7-2020.11.14][1] marking unavailable shards as stale: [Dj8qZSUER2K9MT5hO6yibQ]
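To see at a glance which indices are affected by the "is not recognized" ILM warnings, the log lines could be grouped by policy. This is only a rough sketch: the regular expression is written against the message format visible in the lines above, and the function name is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches the IndexLifecycleRunner warning format seen in the log above.
ILM_WARNING = re.compile(
    r"current step \[(?P<step>[^\]]*)\] for index \[(?P<index>[^\]]+)\] "
    r"with policy \[(?P<policy>[^\]]+)\] is not recognized"
)


def unrecognized_steps(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each ILM policy name to the sorted indices it was reported for."""
    by_policy = defaultdict(set)
    for line in lines:
        match = ILM_WARNING.search(line)
        if match:
            by_policy[match.group("policy")].add(match.group("index"))
    return {policy: sorted(indices) for policy, indices in by_policy.items()}


# Three lines copied from the log output above.
log_lines = [
    "[2020-11-14T20:14:22,336][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-kibana-7-2020.11.13] with policy [delete_old_indicies] is not recognized",
    "[2020-11-14T20:14:22,392][WARN ][o.e.x.i.IndexLifecycleRunner] [elk-es-node-03] current step [null] for index [.monitoring-es-7-2020.11.14] with policy [delete_old_indicies] is not recognized",
    "[2020-11-14T20:14:25,193][INFO ][o.e.i.s.IndexShard       ] [elk-es-node-03] [.monitoring-logstash-7-2020.11.13][1] primary-replica resync completed with 0 operations",
]

for policy, indices in unrecognized_steps(log_lines).items():
    print(policy, indices)
```

In practice the lines would be read from the log file itself rather than from a hard-coded list.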
