Missing Elasticsearch Output Running Logstash in Kubernetes


#1

Hello,

I've been running Logstash on virtual machines for a while and am trying to move it to Kubernetes to help with scaling and configuration management. When Logstash runs in Kubernetes, events bound for Elasticsearch never arrive if I set the timestamp from the syslog_timestamp field extracted by grok. If I remove the date filter the events are sent, but with it enabled they are not. The same config dropped onto a VM, with the same version of Elasticsearch, works.

I've run Logstash with debug enabled and can see that the date is parsed successfully, but nothing appears to be sent to Elasticsearch. Are there further steps I can take to troubleshoot? I've tried several 2.x versions of Logstash and different versions of the logstash-filter-date plugin, to no avail.

Version:

[root@logstash-syslog-df3r2 /]# /opt/logstash/bin/logstash --version
logstash 2.2.0

Config:

input {
  kafka {
    zk_connect       => 'zookeeper:2181'
    group_id         => 'logstash_consumer-linux_syslog'
    consumer_threads => 1
    decorate_events  => true
    topic_id         => 'linux_syslog'
  }
}
filter {
  if [type] == "linux_syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss"]
    }
  }
}
output {
  if [type] == "linux_syslog" {
    elasticsearch {
      index              => "linux-%{+YYYY.MM.dd}"
      hosts              => ["elasticsearch00", "elasticsearch01", "elasticsearch02"]
    }
  } else {
    elasticsearch {
      index              => "logstash-%{+YYYY.MM.dd}"
      hosts              => ["elasticsearch00", "elasticsearch01", "elasticsearch02"]
    }
  }
  stdout {
    codec => rubydebug { metadata => true }
  }
}

(Joe Lawson) #2

Is type ever set on the input? I see the topic but not a type setting.


#3

I'm setting type with the document_type field in Filebeat. Filebeat points to "forwarders" that place each event on the appropriate Kafka topic based on its type. I assume the event reaches the output stage, because the rubydebug output works even when the elasticsearch output doesn't.

Below is the rubydebug output when the date filter has replaced the timestamp but no output is sent to Elasticsearch:

{
             "message" => "Feb  3 10:25:01 rutschman-desktop CRON[21661]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
            "@version" => "1",
          "@timestamp" => "2016-02-03T10:25:01.000Z",
                "beat" => {
        "hostname" => "olaxpa-syslog10",
            "name" => "olaxpa-syslog10"
    },
               "count" => 1,
              "fields" => nil,
          "input_type" => "log",
              "offset" => 26369,
              "source" => "/srv/logs/rutschman-desktop/2016-02-03.log",
                "type" => "linux_syslog",
                "host" => "olaxpa-syslog10",
               "kafka" => {
              "msg_size" => 402,
                 "topic" => "linux_syslog",
        "consumer_group" => "logstash_consumer-linux_syslog",
             "partition" => 2,
                   "key" => nil
    },
    "syslog_timestamp" => "Feb  3 10:25:01",
     "syslog_hostname" => "rutschman-desktop",
      "syslog_program" => "CRON",
          "syslog_pid" => "21661",
      "syslog_message" => "(root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
         "received_at" => "2016-02-03T16:25:06.997Z",
       "received_from" => "olaxpa-syslog10",
           "@metadata" => {
        "logstash_host" => "logstash-syslog-4py8a"
    }
}

(Joe Lawson) #4

Yeah, it looks pretty good. Are you searching the linux-{date} indexes for the message? It could be that Kibana is just hitting logstash-{date}.


#5

Looks like the container was set to use UTC by default, so the date filter was treating my local syslog timestamps as UTC. Setting the timezone in the date filter to the appropriate tz worked! The events were being indexed all along, just with a timestamp 6 hours earlier than I was expecting.
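
For anyone hitting the same thing, the change amounts to adding the timezone option to the date filter. This is a minimal sketch rather than the exact config in use: the zone name below is an assumption inferred from the six-hour offset in February, so substitute whatever zone the hosts writing the syslog files are actually in.

```
filter {
  if [type] == "linux_syslog" {
    date {
      # Syslog timestamps carry no zone information, so the filter
      # needs to be told which zone to interpret them in.
      match    => [ "syslog_timestamp", "MMM  d HH:mm:ss" ]
      # Assumed zone (UTC-6 in February); not stated in this thread.
      timezone => "America/Chicago"
    }
  }
}
```

Under that assumption, a syslog_timestamp of "Feb  3 10:25:01" should be stored as 2016-02-03T16:25:01.000Z instead of 2016-02-03T10:25:01.000Z, which lines up with the received_at value in the rubydebug output above.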

Thanks for looking at it.


(Joe Lawson) #6

Great!

