Filebeat messages reaching logstash server, but then not ES


(Mark Hanford) #1

I've got an odd situation I'm not sure how to resolve.
I have an existing ELK setup, mostly using Filebeat to ship web logs.
I am now trying to ship some CSV-based data into ES, so I'm using the csv{} stuff in logstash.
My Filebeat is apparently working fine, because on my server's /var/log/logstash.stdout I see the messages coming in, but no index is ever being created and no data is coming in.

Here's my filebeat input:

filebeat:
  prospectors:

	-
	  type: log
	  document_type: csvinput
	  paths:
		- "E:/csv/log/test-*.csv"
	  fields:
		test: csvinput


  registry_file: "C:/ProgramData/filebeat/registry"

output:
  logstash:
	hosts: ["elastic2:5044"]

Here's my 50-csvinput.conf:

filter {
  if [type] == "csvinput" {
	csv {
	  columns => [ "field1", "field2", "field3" ]
	}
	
	date {
	  match => [ "timeLogged", "YYYY-MM-dd HH:mm:ssZ" ]
		timezone => "Etc/UTC"
	}
  }
}

output {
	if [type] == "csvinput" {
		elasticsearch {
			hosts => [ "elastic2:9200" ]
			index => "csvinput"
		}
		stdout { codec => rubydebug }
	}
}

Here's the output I see in /var/log/logstash.stdout:

{
	"message" => [
		[0] "d,2015-11-20 23:13:02+0000,field1data, field2data, field3data,"
	],
	"@version" => "1",
	"@timestamp" => "2015-11-20T23:13:02.000Z",
	"beat" => {
		"hostname" => "myhostname",
		"name" => "myhostname",
		"version" => "1.0.0-rc1"
	},
	"count" => 1,
	"fields" => {
		"test" => "csvinput"
	},
	"fileinfo" => {},
	"input_type" => "",
	"line" => 0,
	"offset" => 1218,
	"source" => "E:\\csv\\log\\test-2000.csv",
	"type" => "d",
	"host" => "myhostname",
	"timeLogged" => "2015-11-20 23:13:02+0000",
	"timeQueued" => "2015-11-20 18:09:38+0000",
	"field1" => "field1data",
	"field2" => "field2data",
	"field3" => "field3data"
}

My /var/log/logstash.log is being filled constantly with buffer-flushing entries, but nothing else:

{:timestamp=>"2015-12-09T09:49:03.650000+0000", :message=>"Flushing buffer at interval", :instance=>"#<LogStash::Outputs::ElasticSearch::Buffer:0x4e831771 @operations_mutex=#<Mutex:0x4abc9a79>, @max_size=500, @operations_lock=#<Java::JavaUtilConcurrentLocks::ReentrantLock:0xfbe1d4a>, @submit_proc=#<Proc:0x3efbd3f1@/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.1.4-java/lib/logstash/outputs/elasticsearch/common.rb:54>, @logger=#<Cabin::Channel:0x4b49c5bd @metrics=#<Cabin::Metrics:0x2dd71494 @metrics_lock=#<Mutex:0x1ad646b4>, @metrics={}, @channel=#<Cabin::Channel:0x4b49c5bd ...>>, @subscriber_lock=#<Mutex:0x2654b55f>, @level=:info, @subscribers={12450=>#<Cabin::Outputs::IO:0x36d33b43 @io=#<File:/var/log/logstash/logstash.log>, @lock=#<Mutex:0x1e9838cb>>}, @data={}>, @last_flush=2015-12-09 09:49:02 +0000, @flush_interval=1, @stopping=#<Concurrent::AtomicBoolean:0x103af905>, @buffer=[], @flush_thread=#<Thread:0x6f91e205 run>>", :interval=>1, :level=>:info}

My ES log has nothing in it after startup at all.

I just don't know where to look next. It's getting as far as logstash, then disappearing, I don't want to have to build a grok rule to handle the CSV if I don't have to.


(Magnus B├Ąck) #2

I would try upping the loglevel with --verbose or --debug to better understand what's happening in the interaction with ES. If that wouldn't give me any useful clues I'd Wireshark the traffic. Which version of Logstash?


(Mark Hanford) #3

Using --debug, I get this set of entries for each message. They have a lot of data attached, but none of them seem to be error info, just output.

{:timestamp=>"2015-12-09T10:39:18.060000+0000", :message=>"filter received",
{:timestamp=>"2015-12-09T10:39:18.063000+0000", :message=>"Running grok filter",
{:timestamp=>"2015-12-09T10:39:18.384000+0000", :message=>"Regexp match object",
{:timestamp=>"2015-12-09T10:39:18.386000+0000", :message=>"Event now: ",
{:timestamp=>"2015-12-09T10:39:18.394000+0000", :message=>"Date filter: received event",
{:timestamp=>"2015-12-09T10:39:18.395000+0000", :message=>"Date filter looking for field",
{:timestamp=>"2015-12-09T10:39:18.395000+0000", :message=>"output received",

Weirdly, whilst building a grok-based parser for these files, I noticed that I get the same mysterious missing data now that it's complete. It was fine while it was grokparsefailing half way through. Perhaps it's something in the data.

logstash 2.1.0
elasticsearch 2.1.0


(Mark Hanford) #4

Okay, I've tried to pare this down to a minimally-failing sample. Here's a line in my log file:

d,2015-11-20 23:13:02+0000,2015-11-20 18:10:12+0000,

This is my logstash config:

filter {
  if [type] == "powermta_acct" {
    grok {
      match => ["message", "%{NOTSPACE:type},%{TIMESTAMP_ISO8601:timeLogged},%{TIMESTAMP_ISO8601:timeQueued},"
    }

    date {
      match => [ "timeLogged", "YYYY-MM-dd HH:mm:ssZ" ]
        timezone => "Etc/UTC"
    }
  }
}

output {
    if [type] == "powermta_acct" {
        elasticsearch {
            hosts => [ "10.10.0.88:9200" ]
            index => "powermta-%{+YYYY.MM.dd}"
        }
        stdout { codec => rubydebug }
    }
}

This is the stdout log:

{
   "message" => "d,2015-11-20 23:13:02+0000,2015-11-20 18:10:12+0000,",
  "@version" => "1",
"@timestamp" => "2015-11-20T23:13:02.000Z",
      "beat" => {
    "hostname" => "mta01",
        "name" => "mta01",
     "version" => "1.0.0-rc1"
},
     "count" => 1,
    "fields" => {
    "test" => "powermtatest"
},
  "fileinfo" => {},
"input_type" => "",
      "line" => 0,
    "offset" => 15939,
    "source" => "E:\\log\\test-acct-2000.csv",
      "type" => [
     [0] "powermta_acct",
    [1] "d"
],
      "host" => "mta01",
"timeLogged" => "2015-11-20 23:13:02+0000",
"timeQueued" => "2015-11-20 18:10:12+0000"
}

And then the debug log is rather wordy, so I've put it on pastebin. Can't see anything scary in there though.

Nothing in the ES log, but the log level of that is probably set to Warning, not Debug or anything heavy like that.


(Mark Hanford) #5

Dammit it was a stupid mistake. Note that in my logstash match I call the first field I get from my logs (a single char that identifies the record type) simply type - of course that conflicts with the document type already present, and my subsequent output then doesn't match any more.


(system) #6