Cause of i/o timeout errors

Hi, I'm trying to figure out the root cause of this Filebeat issue.
Basically Filebeat 7.15.2 is connecting to Kafka. When Filebeat starts it sends events for approximately
5 to 10 minutes; then, at a certain point, the following happens:

69", "finished": false, "os_id": "302622-64769", "old_source": "/opt/couchbase/var/lib/couchbase/logs/goxdcr.log", "old_finished": true, "old_os_id": "302622-64769", "harvester_id": "2d52bcaa-46dd-4fbf-a7bb-8f60c9d8b521"}
2022-04-01T11:58:28.064Z        DEBUG   [kafka] kafka/client.go:371     finished kafka batch
2022-04-01T11:58:28.064Z        DEBUG   [kafka] kafka/client.go:385     Kafka publish failed with: dial tcp 999.999.9.99:9093: i/o timeout
2022-04-01T11:58:28.065Z        INFO    [publisher]     pipeline/retry.go:219   retryer: send unwait signal to consumer
2022-04-01T11:58:28.065Z        INFO    [publisher]     pipeline/retry.go:223     done
2022-04-01T11:58:28.065Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 4
2022-04-01T11:58:28.065Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 3
2022-04-01T11:58:28.065Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 2
...

2022-04-01T11:58:35.680Z        DEBUG   [kafka] kafka/client.go:371     finished kafka batch
2022-04-01T11:58:35.681Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 2
2022-04-01T11:58:35.681Z        DEBUG   [kafka] kafka/client.go:385     Kafka publish failed with: dial tcp 999.99.9.999:9093: i/o timeout
2022-04-01T11:58:35.681Z        INFO    [publisher]     pipeline/retry.go:213   retryer: send wait signal to consumer
2022-04-01T11:58:35.681Z        INFO    [publisher]     pipeline/retry.go:217     done
2022-04-01T11:58:35.681Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 9
2022-04-01T11:58:35.681Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 4
2022-04-01T11:58:35.681Z        INFO    [publisher]     pipeline/retry.go:219   retryer: send unwait signal to consumer
2022-04-01T11:58:35.681Z        INFO    [publisher]     pipeline/retry.go:223     done
2022-04-01T11:58:35.682Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 0
2022-04-01T11:58:35.682Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 11
2022-04-01T11:58:35.682Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 1
2022-04-01T11:58:35.682Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 7
2022-04-01T11:58:35.682Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 10
2022-04-01T11:58:35.683Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 3
2022-04-01T11:58:35.683Z        DEBUG   [kafka] kafka/client.go:177     got event.Meta["partition"] = 2
...

Filebeat config is as follows:

logging.level: debug
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/syslog

    - /opt/couchbase/logs/*.log
  exclude_files: ['\.gz$','\.1$','\.txt$']
  ignore_older: 24h
  fields_under_root: true
  fields:
    topic: linux-logs
  scan_frequency: 20s
  tail_files: true

output.kafka:
  enabled: true
  hosts: 
  topic: 
  required_acks: 1
  compression: snappy
 
  client_id: xxx
  partition.round_robin:
    reachable_only: false
  kerberos:
    enabled: true
    auth_type: xxx
    config_path: xxx
    realm: xxx
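One thing that may help while troubleshooting is loosening the kafka output's timeout settings, so a marginally slow broker doesn't immediately surface as a publish failure. The sketch below only shows the extra keys (they would be merged into the existing output.kafka section); the values are illustrative, not recommendations:

```yaml
output.kafka:
  # Seconds to wait for broker responses before timing out
  # (default 30s). Illustrative value for debugging only.
  timeout: 60
  # Maximum duration a broker may wait for the required ACKs
  # (default 10s). Illustrative value for debugging only.
  broker_timeout: 30s
```

If the errors disappear with larger timeouts, the brokers are reachable but slow; if they persist unchanged, the connection itself is likely being dropped or blocked.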

I guess my main question is: how can I determine whether the problem is Kafka or a networking issue?
I don't have access to Kafka or the ELK side, so are there config options I can use to help troubleshoot this?
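One way to separate the two without any access to the Kafka side is to time a bare TCP connect to the broker port from the Filebeat host: the dial error in the log happens below the Kafka protocol, so if a plain socket connect also times out, it's the network path; if it succeeds quickly while Filebeat still fails, the problem is higher up (Kafka, Kerberos auth, etc.). A minimal sketch, with a placeholder broker address:

```python
import socket
import time

# Placeholder address: substitute a real broker from output.kafka.hosts.
BROKER = ("kafka.example.com", 9093)

def dial(addr, timeout=5.0):
    """Time a bare TCP connect to addr (host, port).

    A timeout or refusal here points at the network path or a firewall;
    a fast success means TCP is fine and the stall is above TCP
    (Kafka itself, Kerberos, DNS, ...).
    """
    start = time.monotonic()
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start

if __name__ == "__main__":
    ok, elapsed = dial(BROKER)
    print(f"connect ok={ok} elapsed={elapsed:.2f}s")
```

Running this in a loop on the same host during the 5-to-10-minute window would show whether raw TCP connects start failing at the same moment the publish errors appear in the Filebeat log.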
