File input not working despite all efforts in both Windows and Linux

Right now I'm working with an instance of Ubuntu 14.04, but I've also tried with the SIFT Workstation build of 14.04, Windows 10, and Ubuntu 15. No matter what, and despite days of tinkering, I still cannot get logstash to load a CSV file into elasticsearch. Please help!!!! :slight_smile:

Here's my configuration file, logstash-l2t.conf, which is stored in /opt/logstash. I've changed /var/log ownership to my username, chopomatic.

```
input {
  file {
    path => ["/home/FeedMeLog2Timeline/*.csv"]
    type => "timeline"
    start_position => "beginning"
  }
}

filter {
  csv {
    separator => ","
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "log2timeline"
  }
  stdout { codec => rubydebug }
}
```

I can hit elasticsearch without issue in my browser at localhost:9200 or 127.0.0.1:9200.

I can access kibana without issue in my browser at localhost:5601. It's working, it shows the index I created (log2timeline), and it shows the one record in that index that I sent manually to elasticsearch (via Sense).
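If it helps, here's the kind of command-line check I can run against elasticsearch as well (just plain curl against the REST API; the index name matches my config):

```
# Confirm elasticsearch is responding
curl 'http://127.0.0.1:9200/?pretty'

# Count the documents in the log2timeline index
# (currently just the one record I added via Sense)
curl 'http://127.0.0.1:9200/log2timeline/_count?pretty'
```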

Nothing helps. When I go to /opt/logstash and run `bin/logstash agent -f logstash-l2t.conf --configtest`, it returns 'Configuration OK.' But when I run the same command (without --configtest), I get the output below and never anything more. It hangs at 'Logstash startup completed.' I've left it there as long as overnight with no change and no records added.

```
chopomatic@ubuntu:~$ cd /opt/logstash
chopomatic@ubuntu:/opt/logstash$ sudo bin/logstash agent -f logstash-l2t.conf --verbose
sudo: /var/lib/sudo owned by uid 1000, should be uid 0
[sudo] password for chopomatic:
Settings: Default filter workers: 2
Registering file input {:path=>["/home/FeedMeLog2Timeline/*.csv"], :level=>:info}
No sincedb_path set, generating one based on the file path {:sincedb_path=>"/home/chopomatic/.sincedb_bd0159e016db5402f5fc95c9198868c2", :path=>["/home/FeedMeLog2Timeline/*.csv"], :level=>:info}
Worker threads expected: 2, worker threads started: 2 {:level=>:info}
Using mapping template from {:path=>nil, :level=>:info}
Attempting to install template {:manage_template=>{"template"=>"logstash-*", "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "omit_norms"=>true}, "dynamic_templates"=>[{"message_field"=>{"match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}, "fields"=>{"raw"=>{"type"=>"string", "index"=>"not_analyzed", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"string", "index"=>"not_analyzed"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"float"}, "longitude"=>{"type"=>"float"}}}}}}}, :level=>:info}
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["127.0.0.1:9200"], :level=>:info}
Pipeline started {:level=>:info}
Logstash startup completed
```

ANY ideas? (I'll be out for a couple hours but will be checking this thread and answering any responses the moment I return.)

Thanks!

Chop

No ideas at all? Anyone? I switched back to a fresh build of Win10. Here's my .conf file:

```
input {
  file {
    path => ["C:/Users/chopo/Desktop/FeedMeTimelines/*.csv"]
    start_position => "beginning"
  }
}

filter {
  csv {
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "log2timeline"
  }
}
```
And while I still get the same number of records into elasticsearch (that is, none), I do get some slightly different output to share:

```
C:\Elk\logstash>bin\logstash -f logstash-l2t.conf --verbose --debug
io/console not supported; tty will not be manipulated
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["localhost:9200"], :level=>:info}
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["localhost:9200"], :level=>:info}
Settings: Default pipeline workers: 2
Registering file input {:path=>["C:/Users/chopo/Desktop/FeedMeTimelines/*.csv"], :level=>:info}
No sincedb_path set, generating one based on the file path {:sincedb_path=>"C:\Users\chopo/.sincedb_1b9eefba15050931477f321e2ef22ce7", :path=>["C:/Users/chopo/Desktop/FeedMeTimelines/*.csv"], :level=>:info}
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["localhost:9200"], :level=>:info}
New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["localhost:9200"], :level=>:info}
Starting pipeline {:id=>"base", :pipeline_workers=>2, :batch_size=>125, :batch_delay=>5, :max_inflight=>250, :level=>:info}
Pipeline started {:level=>:info}
Logstash startup completed
```

After "Logstash startup completed," nothing. No records. No more feedback. It will stay like that as long as I leave it. This persists across Win10, Ubuntu 14.04 scratch build, Ubuntu 14.04 SIFT Workstation build, all exactly the same thing.

Surely I can't be the only one?

Sounds like https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#_tracking_of_current_position_in_watched_files

Try checking for the sincedb file and/or specifying your own sincedb_path.
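Something along these lines, for example (the path is just an illustration; any location the Logstash user can write to should work):

```
input {
  file {
    path => ["/home/FeedMeLog2Timeline/*.csv"]
    start_position => "beginning"
    # Put the sincedb somewhere you can find it, so you can inspect or delete it
    # between runs; start_position only applies to files Logstash hasn't seen before.
    sincedb_path => "/home/chopomatic/sincedb-l2t"
  }
}
```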

Surely I can't be the only one?

Indeed not. In fact, I'm pretty sure this is the most frequently asked question around here, and I think you'll find many suggestions in the archives, including checking the ignore_older option of the file input in case your input files are older than 24 hours. If you increase the logging even more by starting Logstash with --debug, I think it'll log a message about files being ignored because they're too old.
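For example, something like this (the value is arbitrary; anything much larger than one day, expressed in seconds, should do):

```
input {
  file {
    path => ["/home/FeedMeLog2Timeline/*.csv"]
    start_position => "beginning"
    # ignore_older is in seconds; the default (86400, i.e. one day) skips older files
    ignore_older => 315360000
  }
}
```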

I've attached a truncated log of its output with --debug turned on: TRUNCATED DEBUG LOG

One thing I immediately noticed is that it's still using the default "ignore_older" value instead of the one in my .conf file. Here's my new .conf file:

```
input {
  file {
    path => ["/home/FeedMeLog2Timeline/*.csv"]
    ignore_older => "315360000"
    sincedb_path => "c:/elk/logstash/sincedb/l2t"
    start_position => "beginning"
  }
}

filter {
  csv {
  }
}

output {
  elasticsearch {
    index => "log2timeline"
    document_type => "timeline"
  }

  stdout {
    debug => true
    codec => "rubydebug"
  }
}
```

The first thing I did after this first run was open and modify the source file. To my shock, when I ran logstash again, it actually started writing to elasticsearch. (First time ever, HOORAY!) Apparently it was indeed the "ignore_older" issue, so now I need to figure out why it's using the default instead of the one I specified.

But I also see that the records from the source file are being ingested as plain, unparsed log lines. Instead of using the existing log2timeline index and mapping, it's storing everything in a single "message" field.

Thoughts on this?
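My guess is the csv filter needs the column names spelled out before the fields will line up with my mapping; something like this, where the column list below is just a placeholder rather than my actual log2timeline header row:

```
filter {
  csv {
    separator => ","
    # Placeholder names; the real list would match the CSV's header row exactly
    columns => ["date", "time", "timezone", "source", "sourcetype", "description"]
  }
}
```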

As it turns out, that one time logstash wrote records to elasticsearch, it was running against a config I knew nothing about. The one I was constantly tweaking as I tried to get it right was in the wrong location, nowhere near where it should have been.

I moved it to the proper location but haven't been able to repeat the writing of records. I do now see the data from my CSV scrolling by, then it looks like logstash gets to the point where it should be writing data, and it tries and tries and tries without success.

Unfortunately, I don't see anything in the output that gives me any clue what the issue is. I'm attaching a log here, in hopes that someone will be kind enough to take a look and see something I don't.

LATEST LOG

As it turns out, that one time logstash wrote records to elasticsearch, it was running against a config I knew nothing about. The one I was constantly tweaking as I tried to get it right was in the wrong location, nowhere near where it should have been.

Yeah, that's why it's useful to look at the debug-level logs where the exact configuration is listed. Sometimes you're surprised by what you see.

I moved it to the proper location but haven't been able to repeat the writing of records. I do now see the data from my CSV scrolling by, then it looks like logstash gets to the point where it should be writing data, and it tries and tries and tries without success.

I don't see any errors or retries from the elasticsearch output, so I suspect the data is being written to ES but you're looking in the wrong place. The next thing I'd try is sniffing the network traffic to see exactly what's going on.
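On Linux, something like this on the Logstash host would show the bulk requests going to port 9200 (assuming Elasticsearch is local and on the default port):

```
# Print the HTTP traffic between Logstash and Elasticsearch on the loopback interface
sudo tcpdump -i lo -A -s 0 'tcp port 9200'
```

Alternatively, `curl 'http://localhost:9200/_cat/indices?v'` lists every index with its document count, which would show where the documents actually landed.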