Unable to fetch data from rollups collector

Hi all,

My Kibana logs are flooded with the following error messages:

{"type":"error","@timestamp":"2019-01-25T11:22:34Z","tags":["warning","stats-collection"],"pid":31913,"level":"error","error":{"message":"Request Timeout after 30000ms","name":"Error","stack":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:355:15\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:384:7)\n    at ontimeout (timers.js:498:11)\n    at tryOnTimeout (timers.js:323:5)\n    at Timer.listOnTimeout (timers.js:290:5)"},"message":"Request Timeout after 30000ms"}
{"type":"log","@timestamp":"2019-01-25T11:22:34Z","tags":["warning","stats-collection"],"pid":31913,"message":"Unable to fetch data from rollups collector"}

and this one:

{"type":"log","@timestamp":"2019-01-25T11:22:34Z","tags":["warning","stats-collection"],"pid":31913,"message":"Unable to fetch data from reporting collector"}
{"type":"error","@timestamp":"2019-01-25T11:22:34Z","tags":["warning","stats-collection"],"pid":31913,"level":"error","error":{"message":"Request Timeout after 30000ms","name":"Error","stack":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:355:15\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:384:7)\n    at ontimeout (timers.js:498:11)\n    at tryOnTimeout (timers.js:323:5)\n    at Timer.listOnTimeout (timers.js:290:5)"},"message":"Request Timeout after 30000ms"}

and this one:

{"type":"log","@timestamp":"2019-01-25T11:22:33Z","tags":["warning","stats-collection"],"pid":31913,"message":"Unable to fetch data from kql collector"}
{"type":"error","@timestamp":"2019-01-25T11:22:34Z","tags":["warning","stats-collection"],"pid":31913,"level":"error","error":{"message":"Request Timeout after 30000ms","name":"Error","stack":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:355:15\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:384:7)\n    at ontimeout (timers.js:498:11)\n    at tryOnTimeout (timers.js:323:5)\n    at Timer.listOnTimeout (timers.js:290:5)"},"message":"Request Timeout after 30000ms"}

I have no idea where to start troubleshooting.

Please advise.
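
As a first triage step, it can help to see which stats collectors are failing and how often. The following is a minimal sketch, assuming the Kibana log is written as one JSON object per line, as in the excerpts above. The function name and the regular expression are illustrative and not part of Kibana.

```python
import json
import re
from collections import Counter

# Matches the warning text seen in the log excerpts above.
COLLECTOR_RE = re.compile(r"Unable to fetch data from (\w+) collector")


def count_collector_failures(lines):
    """Count 'Unable to fetch data from X collector' warnings per collector."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        match = COLLECTOR_RE.search(str(entry.get("message", "")))
        if match:
            counts[match.group(1)] += 1
    return counts
```

To use it, pass an open log file (the path depends on your installation) to count_collector_failures() and print the resulting counts. If every collector fails at the same timestamps, that points at the Kibana to Elasticsearch connection in general rather than at one feature.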

Hi

Are there any errors in the ES logs too? How is the health of your cluster?
GET _cluster/health

Cheers
Rashmi

Hi Rashmi,

Thank you for your time.

GET _cluster/health

{
  "cluster_name" : "clog",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 27,
  "number_of_data_nodes" : 20,
  "active_primary_shards" : 295,
  "active_shards" : 595,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

During the time of the errors I do see some warnings on my data nodes, for example this:

[2019-01-25T13:02:28,491][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [tb-clog-esm1.tb.iss.local] send message failed [channel: NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:46006, remoteAddress=10.80.3.7/10.80.3.7:9300}]
javax.net.ssl.SSLException: handshake timed out
	at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source) ~[?:?]

But those are sporadic.

Regards,
Paul.
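
For reference, the health response posted above can be checked programmatically. This is only a sketch: the function name and the choice of fields to inspect are assumptions for illustration, and the dictionary simply repeats the values from the response above.

```python
def health_findings(health):
    """Return a list of noteworthy findings in a _cluster/health response."""
    findings = []
    if health.get("status") != "green":
        findings.append(f"status is {health.get('status')}")
    if health.get("timed_out"):
        findings.append("health request timed out")
    # Non-zero values here usually deserve a closer look.
    for key in ("unassigned_shards", "initializing_shards", "number_of_pending_tasks"):
        if health.get(key, 0) > 0:
            findings.append(f"{key} = {health[key]}")
    return findings


# Values copied from the GET _cluster/health response above.
health = {
    "cluster_name": "clog",
    "status": "green",
    "timed_out": False,
    "number_of_nodes": 27,
    "number_of_data_nodes": 20,
    "unassigned_shards": 0,
    "initializing_shards": 0,
    "number_of_pending_tasks": 0,
}

print(health_findings(health))  # prints []
```

An empty list here means the snapshot itself shows nothing wrong, so the timeouts are probably intermittent and will not show up in a single health call.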

Hi Paul
With the warning messages on your data nodes, I'm thinking there is a problem with the master nodes. Can you please let me know how many master nodes you have in your cluster, and what value you have set for discovery.zen.minimum_master_nodes in your elasticsearch.yml file? On election, the new master seems to be disconnected from the data nodes, hence the errors in the logs.

Cheers
Rashmi
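
For context on the minimum_master_nodes question: with Zen discovery, the usual guidance is to set discovery.zen.minimum_master_nodes to a quorum of the master-eligible nodes, that is (master_eligible_nodes / 2) + 1 with integer division. A small sketch of that arithmetic follows. The function name is made up for illustration, and the node counts in the example are hypothetical because the thread does not state how many master-eligible nodes the cluster has.

```python
def quorum(master_eligible_nodes):
    """Quorum of master-eligible nodes: (n / 2) + 1, using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Hypothetical counts, for illustration only.
for n in (3, 5, 7):
    print(n, "master-eligible nodes -> minimum_master_nodes =", quorum(n))
```

With 3 master-eligible nodes this gives 2. A value lower than the quorum leaves the cluster open to split brain, and a higher one can prevent a master from being elected after a node is lost.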

Hi Rashmi,

The value is 2. Here is my elasticsearch.yml file. I know I could configure discovery.zen.ping.unicast.hosts differently, but that is how we do things using Salt.

cluster.name: clog
node.name: tb-clog-esm2.tb.iss.local
path.data: /opt/elasticdb/data
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: ['10.80.3.9', '127.0.0.1']
discovery.zen.master_election.ignore_non_master_pings: true
discovery.zen.ping.unicast.hosts: ["10.80.3.10","10.80.3.11","10.80.3.12","10.80.3.13","10.80.3.14","10.80.3.15","10.80.3.16","10.80.3.17","10.80.3.18","10.80.3.19","10.80.3.20","10.80.3.21","10.80.3.22","10.80.3.23","10.80.3.24","10.80.3.25","10.80.3.26","10.80.3.27","10.80.3.28","10.80.3.29","10.80.3.30","10.80.3.4","10.80.3.5","10.80.3.6","10.80.3.7","10.80.3.8","10.80.3.9"]
discovery.zen.minimum_master_nodes: 2
gateway.recover_after_time: 10m
gateway.recover_after_nodes: 12
gateway.expected_data_nodes: 20
node.master: true
node.data: false
node.ingest: false
node.ml: false
xpack.http.ssl.verification_mode: certificate
xpack.watcher.index.rest.direct_access: 'true'
xpack.monitoring.enabled: 'true'
xpack.monitoring.collection.indices: '*'
# Reporting settings
xpack.notification.email.account:
  standard_account:
    profile: standard
    email_defaults:
      from: <our working email address>
    smtp:
      auth: false
      starttls.enable: false
      host: smtp.oss.local
      port: 25
# Transport encryption
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.key: /etc/elasticsearch/certs/tb-clog-esm2.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/tb-clog-esm2.crt
xpack.security.transport.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.crt" ]
#
# Http client encryption
xpack.security.http.ssl.enabled: false
xpack.security.http.ssl.key:  /etc/elasticsearch/certs/tb-clog-esm2.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/tb-clog-esm2.crt
xpack.security.http.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.crt" ]
#
# Security settings
xpack.security.enabled: true
xpack:
  security:
    authc:
      realms:
        native1:
          type: native
          order: 0
        ldap1:
          type: ldap
          order: 1
          url: "ldap://ldapc.oss.local:489"
          user_search:
            base_dn: "ou=people,dc=oss,dc=local"
            attribute: uid
          group_search:
            base_dn: "ou=groups,dc=oss,dc=local"
          files:
            role_mapping: "/etc/elasticsearch/role_mapping.yml"
          unmapped_groups_as_roles: false
#
http.cors.enabled: true
http.cors.allow-origin: "/.*/"
transport.tcp.compress: true
node.attr.box_type: no_data
