Filebeat not scraping log lines

Hello!

I am new to Filebeat and trying to scrape INFO log lines from log files and send them to Kafka.

filebeat.inputs:
- type: log
  enabled: true
  tags:
    - myservice
  paths:
    - /usr/share/services/myservice/*.log
  include_lines: ['^INFO']
# I also tried this processor:
  processors:
  - drop_event:
      when:
        not:
          contains:
            log.level: "INFO"

output.kafka:
  hosts: ["localhost:9092"]
  topic: 'applogs'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000

This is my log file:

22:07:05.897 [main] INFO  c.a.a.r.k.KafkaCopyApplication - No active profile set, falling back to 1 default profile: "default"
22:07:10.281 [main] INFO  c.a.aia.rda.kafkacopy.KafkaConsumer - TEST

But nothing is scraped, even though the log file is picked up:

filebeat  | {"log.level":"info","@timestamp":"2024-03-21T20:07:03.678Z","log.logger":"input.harvester","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/log.(*Harvester).Run","file.name":"log/harvester.go","file.line":311},"message":"Harvester started for paths: [/usr/share/services/myservice/*.log]","service.name":"filebeat","input_id":"00748f4b-c724-41bb-bc41-adbc87004920","source_file":"/usr/share/services/myservice/thelog1.log","state_id":"native::88544381-132","finished":false,"os_id":"88544381-132","harvester_id":"e0405f9e-7f84-4b08-ab81-959cb00c5312","ecs.version":"1.6.0"}
filebeat  | {"log.level":"info","@timestamp":"2024-03-21T20:07:23.603Z","log.logger":"monitoring","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/monitoring/report/log.(*reporter).logSnapshot","file.name":"log/log.go","file.line":187},"message":"Non-zero metrics in the last 30s","service.name":"filebeat","monitoring":{"metrics":{"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":36818944}}}},"cpu":{"system":{"ticks":90,"time":{"ms":10}},"total":{"ticks":210,"time":{"ms":20},"value":210},"user":{"ticks":120,"time":{"ms":10}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":10},"info":{"ephemeral_id":"351093e2-644d-490f-8bbc-1f1fcdde75d7","uptime":{"ms":210084},"version":"8.12.2"},"memstats":{"gc_next":36900512,"memory_alloc":18923368,"memory_total":63575088,"rss":102920192},"runtime":{"goroutines":31}},"filebeat":{"events":{"active":0,"added":1,"done":1},"harvester":{"open_files":1,"running":1,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":0,"filtered":1,"total":1}}},"registrar":{"states":{"current":3,"update":1},"writes":{"success":1,"total":1}},"system":{"load":{"1":0.09,"15":0.09,"5":0.11,"norm":{"1":0.0225,"15":0.0225,"5":0.0275}}}},"ecs.version":"1.6.0"}}

Please help!

If I enable debug logging, I just see:


filebeat  | {"log.level":"debug","@timestamp":"2024-03-21T20:49:13.041Z","log.logger":"input","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/log.(*Input).harvestExistingFile","file.name":"log/input.go","file.line":627},"message":"Harvester for file is still running: /usr/share/services/myservice/thelog1.log","service.name":"filebeat","input_id":"3cc3e3c7-8018-4f44-875b-b8682998d36e","source_file":"/usr/share/services/myservice/thelog1.log","state_id":"native::88544381-132","finished":false,"os_id":"88544381-132","old_source":"/usr/share/services/myservice/thelog1.log","old_finished":false,"old_os_id":"88544381-132","ecs.version":"1.6.0"}

This regex requires each line to start with the word INFO in order to be scraped, but your log lines start with timestamps.
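If you want to keep an anchored regex, it has to account for the leading timestamp. A minimal sketch matched against your sample lines (adjust the pattern to your real log layout):

```yaml
filebeat.inputs:
- type: log
  paths:
    - /usr/share/services/myservice/*.log
  # Match an HH:MM:SS.mmm timestamp, a [thread] segment, then INFO
  include_lines: ['^\d{2}:\d{2}:\d{2}\.\d{3} \[[^\]]+\] INFO']
```

This is stricter than a bare `INFO` pattern, which would also match lines that merely contain the word INFO somewhere in the message.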

Your drop_event filter will then drop any lines that are picked up, because you haven't parsed the log.level field out of your log lines yet.
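If you prefer the drop_event approach, you first need to extract the level from the raw line, for example with a dissect processor. A rough sketch, assuming your sample line format; the `parsed` prefix and key names here are illustrative, not required:

```yaml
processors:
  # Split the raw message into fields; note the two spaces after the
  # level in the sample lines ("INFO  c.a...") -- adjust the tokenizer
  # if your real format differs.
  - dissect:
      tokenizer: '%{time} [%{thread}] %{level}  %{msg}'
      field: "message"
      target_prefix: "parsed"
  # Keep only events whose extracted level is INFO
  - drop_event:
      when:
        not:
          equals:
            parsed.level: "INFO"
```

With the level parsed into its own field, the condition has something to match against, which is what the original `contains: log.level` filter was missing.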


Thanks!
I removed the filter and changed the include_lines setting to:

include_lines: ['INFO']

It picks up the lines now.

Now I have another issue.

Kafka runs in Docker (3 replicas) as well.

With

hosts: ["localhost:9092"]

it didn't connect.

When I changed it to

hosts: ["host.docker.internal:9092"]

it connected and created the topic 'applogs',

but it failed to send the data:


23T14:37:22.555Z","log.logger":"kafka","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/outputs/kafka.(*msgRef).dec","file.name":"kafka/client.go","file.line":406},"message":"Kafka publish failed with: dial tcp 127.0.0.1:9093: connect: connection refused","service.name":"filebeat","ecs.version":"1.6.0"}

It tries to send to port 9093.

Using localhost always means that you are trying to connect to a server on the same host, so if your Filebeat is not running on the same host/container as your Kafka, this is expected not to work.

I think this is related to your Kafka cluster configuration. Clients use the advertised.listeners setting to connect, so you need to check what advertised listeners are configured; they cannot be localhost for the same reasons mentioned before.
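As an illustration only (not your actual setup), a single-broker Docker configuration that advertises a host-reachable address could look like this; the image name, ports, and listener names are assumptions you would adapt to your cluster:

```yaml
# docker-compose sketch: the broker binds 0.0.0.0:9092 inside the
# container, but advertises host.docker.internal:9092 so that clients
# outside the Kafka network (like your Filebeat container) get back
# an address they can actually reach, instead of localhost/127.0.0.1.
services:
  kafka:
    image: bitnami/kafka:latest
    ports:
      - "9092:9092"
    environment:
      KAFKA_CFG_LISTENERS: "PLAINTEXT://0.0.0.0:9092"
      KAFKA_CFG_ADVERTISED_LISTENERS: "PLAINTEXT://host.docker.internal:9092"
```

The "dial tcp 127.0.0.1:9093" error fits this picture: the broker you first contact returns the advertised addresses of the cluster, and if those point at localhost (or an internal-only port like 9093), Filebeat tries to connect there and fails.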