Logstash 5.2 input kafka working but no output to elasticsearch

camilosantana · February 10, 2017, 10:42pm

Issue Summary:

filebeat outputs to kafka 0.10 topic
logstash indexer inputs from kafka 0.10
outputs to elasticsearch

Problem: document count on elasticsearch doesn't change

Note: I'm able to get filebeat=>logstash=>elasticsearch working ok. When I try to add the kafka broker and adjust input/output accordingly, the content consumed by logstash doesn't output to elasticsearch.

Versions

logstash version

# /usr/share/logstash/bin/logstash -V
logstash 5.2.0

installed from elastic.co repository on CentOS

# grep baseurl /etc/yum.repos.d/elastic-co.repo
baseurl=https://artifacts.elastic.co/packages/5.x/yum

# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)

Configurations

FileBeat agent

filebeat.yml

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/app1/app1.log
    - /var/log/app2/debug.log
    - /var/log/app2/system.log
    - /var/log/app2/app2.log
  document_type: clusterlogs
#================================ Outputs =====================================
output:
  logstash:
    enabled: false 
    hosts: ["logstash-indexer1-fqdn:5044","logstash-indexer2-fqdn:5044"]
    loadbalance: true

  kafka:
    enabled: true
    hosts: ["kafka3-fqdn:9092", "kafka1-fqdn:9092", "kafka2-fqdn:9092"]
    topic: '%{[type]}'
#================================ Logging =====================================
logging.level: warning
logging.to_files: true
logging.to_syslog: false
logging.files:
  path: /var/log
  name: filebeat.log
  keepfiles: 3
logging.selectors: ["*"]

logstash.yml

path.data: /var/lib/logstash
pipeline.workers: 6
path.config: /etc/logstash/conf.d
log.level: warn
path.logs: /var/log/logstash

files in /etc/logstash/conf.d

09_input_kafka_cluster.yml

input {
  kafka {
    bootstrap_servers => "kafka3-fqdn:9092,kafka1-fqdn:9092,kafka2-fqdn:9092"
    topics => ["clusterlogs"]
    client_id => "logstash-hostname"
    group_id => "logstash_indexer"
    add_field => {
      "log_origin" => "kafka"
    }
  }
}

50_filter_pending.yml

# this file intentionally left blank

80_output_elastic.yml

output {
  elasticsearch {
    hosts => "elastic-search-client-node:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Logging

No errors logged.
here is the /var/log/logstash/logstash-plain.log
(I changed log.level to 'debug')

camilosantana · February 11, 2017, 12:15am

Note that I can switch from the kafka output to the logstash output on filebeat and everything works as expected.

warkolm · February 13, 2017, 8:32am

Is the data in kafka? Is it pulling it from there?

camilosantana · February 13, 2017, 7:32pm

Is the data in kafka? Is it pulling it from there?

filebeat pushes to kafka. the topic 'clusterlogs' was created automatically.
the data is in kafka as shown via bin/kafka-console-consumer.sh
when i turn on debugging in logstash, i can see what's mentioned in the DEBUG log that i linked to. the DEBUG log, to me, plainly shows data being pulled. Example:

[2017-02-10T22:11:30,199][DEBUG][logstash.pipeline        ] filter received {"event"=>
{"@timestamp"=>2017-02-10T22:11:30.159Z, "log_origin"=>"kafka", "@version"=>"1", 
"message"=>"{\"@timestamp\":\"2017-02-10T22:11:16.991Z\",\"beat\":
{\"hostname\":\"applicationbhv6.\",\"name\":\"applicationbhv6.\",\"version\":\"5.2.0\"},
\"input_type\":\"log\",\"message\":\" WARN [ScheduledTasks:1] 2017-02-10 16:11:16,486
GCInspector.java (line 142) Heap is 0.981591859186788 full.  You may need to reduce memtable
and/or cache sizes.  app2 will now flush up to the two largest memtables to free up memory.
Adjust flush_largest_memtables_at threshold in app2.yaml if you don't want app2 to do this
automatically\",\"offset\":9778327,\"source\":\"/var/log/app2/system.log\",
\"type\":\"clusterlogs\"}"}}

the issue is that logstash doesn't output to elasticsearch.

warkolm · February 13, 2017, 10:20pm

Add a stdout and see if anything makes it through your pipeline then.
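
A stdout output along these lines is one way to run that check. This is a minimal sketch rather than an exact recipe from this thread: the `metadata => true` option of the rubydebug codec is added here on the assumption that the `@metadata` fields referenced by the elasticsearch output are worth inspecting too, since rubydebug hides them by default.

```conf
output {
  # Temporary debugging output: print every event that reaches the output stage.
  stdout {
    # metadata => true also prints the [@metadata] fields,
    # which rubydebug leaves out unless asked.
    codec => rubydebug { metadata => true }
  }
}
```

Dropped into its own file under /etc/logstash/conf.d (for example a hypothetical 81_output_stdout.yml), it runs alongside the existing elasticsearch output, because all files in that directory are merged into one pipeline. The events then appear wherever the service's standard output is captured.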

camilosantana · February 15, 2017, 3:29am

would there be a difference between stdout and the debug logging already posted?

here's the exposed link http://pastebin.com/ibmVj3GJ

reference:

Logging

No errors logged.
here is the /var/log/logstash/logstash-plain.log
(I changed log.level to 'debug')

camilosantana · February 16, 2017, 2:18am

if anyone is interested, there seems to be an issue when using the kafka plugin.

my logstash output to elastic is

output {
  elasticsearch {
    hosts => "elastic-search-client-node:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

which ended up creating the following index in elasticsearch.

(es index when using kafka input plugin)
%{[@metadata][beat]}-2017.02.15

when i use the filebeat input, i get

(es index when using filebeat input plugin)
filebeat-2017.02.15

i'm currently in the middle of setting document_types and index/topics manually to get everything to line up and will report back if this works.
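
The literal index name above matches how Logstash handles `%{...}` field references: when the referenced field does not exist on the event, the reference is left in the string unexpanded. One way to guard against that, sketched here under the assumption that a static `filebeat` prefix is an acceptable fallback, is to test for the field before using it. The hosts value is copied from the config above; the fallback branch is illustrative only.

```conf
output {
  if [@metadata][beat] {
    # The field exists (e.g. events from the beats input), so the reference expands.
    elasticsearch {
      hosts => "elastic-search-client-node:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    }
  } else {
    # The field is missing (e.g. events from the kafka input): use a static prefix (assumption).
    elasticsearch {
      hosts => "elastic-search-client-node:9200"
      manage_template => false
      index => "filebeat-%{+YYYY.MM.dd}"
    }
  }
}
```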

camilosantana · February 16, 2017, 9:18pm

solution

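in the event someone else runs into this: it looks like when one uses the kafka input plugin, the %{[@metadata][...]} references in the output are no longer expanded, presumably because the beats input sets those @metadata fields and the kafka input does not. to work around this i replaced them with static values. i template these files out from an ansible playbook, so i'm using ansible vars; otherwise you may have to set static values by hand and manage them accordingly.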

logstash config:

output_elastic (@metadata doesn't parse):

#    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
# ^^^ changed to ...
index => "filebeat-%{+YYYY.MM.dd}"

input kafka
the doc_and_topic_via_ansible var is set via ansible.

# make sure 'topics' field matches 'doctype' throughout pipeline
input {
  kafka {
    bootstrap_servers => "blablabla"
    topics => ["{{ doc_and_topic_via_ansible }}"]
    client_id => "{{ ansible_hostname }}"
    codec => json
    group_id => "logstash_indexer"
    add_field => {
      "broker" => "input_from_kafka"
    }
  }
}

this creates the elasticsearch index as expected and everything works, with one minor exception: events go into elasticsearch with _type set to the non-expanded value below:

_type: %{[@metadata][type]}

instead of the value set with document_type in the prospector section of filebeat.yml on the endpoint machines (clusterlogs in the config above).
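
A possible way to get the `@metadata` references resolving again, instead of hard-coding values in the output, is to populate those fields in a filter for events that arrive via kafka. This is an untested sketch under a few assumptions: the kafka input uses `codec => json` so that the filebeat `type` field is a top-level event field, the `broker` marker field from the input above is present, and the static value `filebeat` is what `[@metadata][beat]` should hold.

```conf
filter {
  # Only touch events that came through the kafka input (marked via add_field there).
  if [broker] == "input_from_kafka" {
    mutate {
      add_field => {
        # Static value (assumption): the beats input would normally supply this.
        "[@metadata][beat]" => "filebeat"
        # Copy the document_type that filebeat shipped in the top-level type field.
        "[@metadata][type]" => "%{type}"
      }
    }
  }
}
```

If an event has no `type` field, `[@metadata][type]` would end up holding the literal text `%{type}`, so this only helps when the JSON payload from filebeat has been decoded.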

