Logstash not reading from file input

I'm running this implementation of the ELK stack, which is pretty straightforward and easy to configure.

I can push TCP input through the stack using netcat like so:

nc localhost 5000 < /Users/me/path/to/logs/appOne.log
nc localhost 5000 < /Users/me/path/to/logs/appOneStackTrace.log
nc localhost 5000 < /Users/me/path/to/logs/appTwo.log
nc localhost 5000 < /Users/me/path/to/logs/appTwoStackTrace.log

But I cannot get Logstash to read the file paths I specify in the config:

input {
    tcp {
        port => 5000
    }

    file {
        path => [
            "/Users/me/path/to/logs/appOne.log",
            "/Users/me/path/to/logs/appOneStackTrace.log",
            "/Users/me/path/to/logs/appTwo.log",
            "/Users/me/path/to/logs/appTwoStackTrace.log"
        ]
        type => "log"
        start_position => "beginning"
    }
}

output {
    elasticsearch {
        hosts => "elasticsearch:9200"
    }
}

Here is the startup output from the stack for the Logstash inputs:

logstash_1       | [2019-01-28T17:44:33,206][INFO ][logstash.inputs.tcp      ] Starting tcp input listener {:address=>"0.0.0.0:5000", :ssl_enable=>"false"}
logstash_1       | [2019-01-28T17:44:34,037][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_a1605b28f1bc77daf785a8805c32f578", :path=>["/Users/me/path/to/logs/appOne.log", "/Users/me/path/to/logs/appOneStackTrace.log", "/Users/me/path/to/logs/appTwo.log", "/Users/me/path/to/logs/appTwoStackTrace.log"]}

There is no indication the pipeline has any issues starting.

I've also checked that the log files have been updated since the TCP input appeared, and they have. The last Logstash-specific log from the ELK stack comes from either startup or the TCP input.

Here is my entire Logstash start-up log output, in case that's helpful:

logstash_1       | Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
logstash_1       | [2019-01-29T13:32:19,391][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
logstash_1       | [2019-01-29T13:32:19,415][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.5.4"}
logstash_1       | [2019-01-29T13:32:23,989][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
logstash_1       | [2019-01-29T13:32:24,648][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
logstash_1       | [2019-01-29T13:32:24,908][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://elasticsearch:9200/"}
logstash_1       | [2019-01-29T13:32:25,046][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
logstash_1       | [2019-01-29T13:32:25,051][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
logstash_1       | [2019-01-29T13:32:25,108][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//elasticsearch:9200"]}
logstash_1       | [2019-01-29T13:32:25,229][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
logstash_1       | [2019-01-29T13:32:25,276][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
logstash_1       | [2019-01-29T13:32:25,327][INFO ][logstash.inputs.tcp      ] Starting tcp input listener {:address=>"0.0.0.0:5000", :ssl_enable=>"false"}
logstash_1       | [2019-01-29T13:32:25,924][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_143c07d174c46eeab78b902edb3b1289", :path=>["/Users/me/path/to/logs/appOne.log", "/Users/me/path/to/logs/appOneStackTrace.log", "/Users/me/path/to/logs/appTwo.log", "/Users/me/path/to/logs/appTwoStackTrace.log"]}
logstash_1       | [2019-01-29T13:32:25,976][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x4d1515ce run>"}
logstash_1       | [2019-01-29T13:32:26,088][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
logstash_1       | [2019-01-29T13:32:26,106][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
logstash_1       | [2019-01-29T13:32:26,432][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Your file input is using a default sincedb, so if you have run logstash before, it is now tailing the files, waiting for new data to be appended to them. If you do not see events when data is appended, then enable "--log.level debug" and see what filewatch has to say.

Thanks, @Badger! New data has been appended to the specified log files. Also, doesn't start_position => "beginning" stop the tailing?

Not if you have a sincedb. In that case the file input reads from the beginning of the file the first time logstash executes, but subsequent executions start at the point it has read up to, which is often the end of file.

Thanks, again @Badger. Should I be specifying a sincedb? I'm currently running locally on a Mac but will be deploying to a linux machine once I get things sorted.

No need. It generates one by default...

No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_a1605b28f1bc77daf785a8805c32f578"

Right, but what I meant was: should I be specifying one explicitly in a way that avoids the tailing effect you mentioned earlier? I've seen it specified as /dev/null, but when I implemented that, the system ignored the file input altogether (it didn't even show it being configured in the startup logs).

If you specify /dev/null then in tail mode the file input should be reading from the beginning of the file every time, leading to duplicate events if you restart logstash. Setting it to /dev/null (or NUL on Windows) is pretty much the only use I have ever had for sincedb_path. If I am using a sincedb then I don't care where it gets written to -- the default is just fine.
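For illustration, a tail-mode input that forgets its position on every restart would look roughly like this (a sketch only, reusing one of your paths):

input {
    file {
        # Sketch: with no persisted sincedb, every Logstash restart re-reads
        # these files from the top, so expect duplicate events after a restart.
        path => "/Users/me/path/to/logs/appOne.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"    # use "NUL" instead on Windows
    }
}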

I think I just made a connection from reading the docs again. By 'tail mode' you mean NOT specifying start_position => "beginning", which overrides the default setting. The default is to treat the files as streams and act when they are updated, right?

Pretty much, yes. However, that's not exactly what I meant by tail mode. For a very long time people have wanted to consume files: not log files that are being extended, which is what the file input is good at, but, for example, a bunch of files that contain json or xml. The file input could do it, but it wasn't very good at it. So recently a new mode called 'read' was added to the file input, which, given a path, reads each file from beginning to end and then deletes it. That's read mode. The default is tail mode.
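As a rough sketch (hypothetical path, just to show the shape), a read-mode input looks something like this:

input {
    file {
        # Hypothetical read-mode example: consume each matching file once,
        # from beginning to end, then delete it when finished.
        path => "/tmp/import/*.json"
        mode => "read"                      # the default mode is "tail"
        file_completed_action => "delete"   # what to do with a file once it has been fully read
    }
}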

Thanks for sticking with me here. I've removed the start_position setting since I want the default behavior, and still no dice. I'm also not specifying the sincedb_path and am going with the default it creates. I think my config is pretty rudimentary, so I'm still not sure what's going on. I am having trouble setting the log level for Logstash, though. I need to add a command: --log.level debug line to the logstash section of docker-compose.yml, but that requires another step I'm unclear on, specifically addressing the entrypoint.
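For context, here is roughly what I think the override needs to look like in docker-compose.yml; the command/entrypoint handling of the Logstash image is exactly the part I'm unsure about, so treat this as a guess:

services:
  logstash:
    # ...existing build/image, ports, and volume entries unchanged...
    # Guess: override the container command so Logstash starts with debug logging.
    command: ["logstash", "--log.level", "debug"]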

I would start a new question about how to add --log.level debug to a docker image, then come back to this thread once you have debug output.

I actually figured that part out just now, and when I hit the app endpoints that update the logs, I do not see any activity from filewatch. Here is what I see after the Logstash instance starts up.

Filewatch starts fine:

logstash_1       | [2019-01-29T16:13:10,652][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections

Then this is all I see after:

logstash_1       | [2019-01-29T16:10:27,731][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
logstash_1       | [2019-01-29T16:10:27,731][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
logstash_1       | [2019-01-29T16:10:29,966][DEBUG][logstash.pipeline        ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0x11f2015d sleep>"}

This repeats periodically (as the periodic poller name indicates), but filewatch isn't picking anything up.

Here is what I would have expected. I configure a file input reading two files, then I append "Foo" to one of them. I run "egrep 'file|received' | grep -v configpathloader" on the output to get:

[...]
[2019-01-29T11:27:53,702][DEBUG][logstash.inputs.file     ] config LogStash::Inputs::File/@file_sort_direction = "asc"
[2019-01-29T11:27:54,173][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/tmp/logstash-data/29396/plugins/inputs/file/.sincedb_1042830aa90348ed174dc186faffbb0a", :path=>["/tmp/foo/a", "/tmp/foo/b"]}
[2019-01-29T11:27:54,316][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-01-29T11:28:08,038][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2019-01-29T11:28:08,060][DEBUG][logstash.inputs.file     ] Received line {:path=>"/tmp/foo/a", :text=>"Foo"}
[2019-01-29T11:28:08,215][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 1548779288)
[2019-01-29T11:28:08,308][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/tmp/foo/a", "message"=>"Foo", "@timestamp"=>2019-01-29T16:28:08.180Z, [...]}}
[2019-01-29T11:28:08,309][DEBUG][logstash.pipeline        ] output received {"event"=>{"path"=>"/tmp/foo/a", "message"=>"Foo", "@timestamp"=>2019-01-29T16:28:08.180Z, [...]}}

If you are not seeing that, then step the log level up to trace. Here is an example of it reading a file from the beginning:

[2019-01-29T11:44:36,567][TRACE][filewatch.tailmode.handlers.createinitial] handling: a
[2019-01-29T11:44:36,568][TRACE][filewatch.tailmode.handlers.createinitial] opening a
[2019-01-29T11:44:36,569][TRACE][filewatch.tailmode.handlers.createinitial] handle_specifically opened file handle: 86, path: a
[2019-01-29T11:44:36,571][TRACE][filewatch.tailmode.handlers.createinitial] add_new_value_sincedb_collection {"position"=>0, "watched_file details"=>"<FileWatch::WatchedFile: @filename='a', @state='active', @recent_states='[:watched, :watched]', @bytes_read='0', @bytes_unread='8', current_size='8', last_stat_size='8', file_open?='true', @initial=true, @sincedb_key='13268895 0 51714'>"}
[2019-01-29T11:44:36,605][TRACE][filewatch.tailmode.processor] Active - no change {"watched_file"=>"<FileWatch::WatchedFile: @filename='b', @state='active', @recent_states='[:watched, :watched]', @bytes_read='0', @bytes_unread='0', current_size='0', last_stat_size='0', file_open?='true', @initial=false, @sincedb_key='13268896 0 51714'>"}
[2019-01-29T11:44:36,625][TRACE][filewatch.tailmode.processor] Active - file grew: a: new size is 8, bytes read 0
[2019-01-29T11:44:36,627][TRACE][filewatch.tailmode.handlers.grow] handling: a
[2019-01-29T11:44:36,655][TRACE][filewatch.tailmode.handlers.grow] reading... {"iterations"=>1, "amount"=>8, "filename"=>"a"}
[2019-01-29T11:44:36,657][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2019-01-29T11:44:36,693][DEBUG][logstash.inputs.file     ] Received line {:path=>"/tmp/foo/a", :text=>"Foo"}
[2019-01-29T11:44:36,847][DEBUG][logstash.inputs.file     ] Received line {:path=>"/tmp/foo/a", :text=>"Foo"}
[2019-01-29T11:44:36,866][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 1548780276)
[2019-01-29T11:44:36,870][TRACE][filewatch.sincedbcollection] sincedb_write: to: /tmp/logstash-data/29561/plugins/inputs/file/.sincedb_1042830aa90348ed174dc186faffbb0a
[2019-01-29T11:44:36,985][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/tmp/foo/a", "@timestamp"=>2019-01-29T16:44:36.812Z, "message"=>"Foo", [...]}}
[2019-01-29T11:44:36,986][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/tmp/foo/a", "@timestamp"=>2019-01-29T16:44:36.849Z, "message"=>"Foo", [...]}}

Then, about once a second, each file monitor says it is still tailing:

[2019-01-29T11:45:08,068][TRACE][filewatch.tailmode.processor] Active - no change {"watched_file"=>"<FileWatch::WatchedFile: @filename='b', @state='active', @recent_states='[:watched, :watched]', @bytes_read='0', @bytes_unread='0', current_size='0', last_stat_size='0', file_open?='true', @initial=false, @sincedb_key='13268896 0 51714'>"}
[2019-01-29T11:45:08,069][TRACE][filewatch.tailmode.processor] Active - no change {"watched_file"=>"<FileWatch::WatchedFile: @filename='a', @state='active', @recent_states='[:watched, :watched]', @bytes_read='8', @bytes_unread='0', current_size='8', last_stat_size='8', file_open?='true', @initial=false, @sincedb_key='13268895 0 51714'>"}

And when I append a line, I get this:

[2019-01-29T11:45:19,087][TRACE][filewatch.tailmode.processor] Active - file grew: a: new size is 12, bytes read 8
[2019-01-29T11:45:19,088][TRACE][filewatch.tailmode.handlers.grow] handling: a
[2019-01-29T11:45:19,090][TRACE][filewatch.tailmode.handlers.grow] reading... {"iterations"=>1, "amount"=>4, "filename"=>"a"}
[2019-01-29T11:45:19,090][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2019-01-29T11:45:19,093][DEBUG][logstash.inputs.file     ] Received line {:path=>"/tmp/foo/a", :text=>"Bar"}
[2019-01-29T11:45:19,129][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 43)
[2019-01-29T11:45:19,129][TRACE][filewatch.sincedbcollection] sincedb_write: to: /tmp/logstash-data/29561/plugins/inputs/file/.sincedb_1042830aa90348ed174dc186faffbb0a
[2019-01-29T11:45:19,241][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/tmp/foo/a", "@timestamp"=>2019-01-29T16:45:19.125Z, "message"=>"Bar", [...]

I set the log level to TRACE, and I am not seeing any activity when I manually update the log files or hit the app endpoints. Filewatch is alive, though:

logstash_1       | [2019-01-29T18:45:55,067][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
logstash_1       | [2019-01-29T18:45:55,068][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
logstash_1       | [2019-01-29T18:45:55,298][DEBUG][logstash.pipeline        ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0x409ce054 sleep>"}
logstash_1       | [2019-01-29T18:45:59,527][TRACE][filewatch.discoverer     ] discover_files {"count"=>0}
logstash_1       | [2019-01-29T18:45:59,527][TRACE][filewatch.discoverer     ] discover_files {"count"=>0}
logstash_1       | [2019-01-29T18:45:59,528][TRACE][filewatch.discoverer     ] discover_files {"count"=>0}
logstash_1       | [2019-01-29T18:45:59,529][TRACE][filewatch.discoverer     ] discover_files {"count"=>0}

Is there anything glaringly missing from my config?

Thanks again for continuing to work with me.

It's not finding files that match the entries in the path array you supplied in your configuration.

            path => [
                "/Users/me/path/to/logs/appOne.log",
                "/Users/me/path/to/logs/appOneStackTrace.log",
                "/Users/me/path/to/logs/appTwo.log",
                "/Users/me/path/to/logs/appTwoStackTrace.log"
            ]

I would check the basics. Do an ls -l on each file as the user running logstash. Does the case in the path array entry match the case of the actual file?
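For example, from the host, something along these lines (assuming the compose service is named logstash, which the logstash_1 prefix in your output suggests):

docker-compose exec logstash ls -l /Users/me/path/to/logs/appOne.log

If that fails, the files simply aren't visible inside the container.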

I found the issue. Being able to netcat the host's log files in directly via TCP led me astray: I needed to map the local log files into the container. It was a Docker noob issue. Thanks for sticking with me, though - your last question about the path entries is what clued me in!
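For anyone who lands here later, the change amounted to bind-mounting the host log directory into the Logstash container in docker-compose.yml, roughly like this (paths trimmed to match the ones above; the rest of the service definition is unchanged):

services:
  logstash:
    # ...build/image, ports, and config mounts as before...
    volumes:
      # host path : container path (read-only); the container path matches what the pipeline config expects
      - /Users/me/path/to/logs:/Users/me/path/to/logs:ro

With the files visible at the same path inside the container, the file input's path entries finally matched something on disk.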
