Logstash 7.8 doesn't re-parse existing file

Hi team,

I'm using Elastic Stack 7.8. The Logstash configuration below doesn't read the old files in the /usr/data/nsg/ folder.

input {
  file {
    path => "/usr/data/nsg/*"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
    ignore_older => 0
  }
}
output {
  stdout {
   codec => rubydebug
  }
}

Here is the debug log:

    [DEBUG] 2020-07-10 09:49:14.186 [Converge PipelineAction::Create<main>] json - config LogStash::Codecs::JSON/@id = "json_82dfaaf6-5a13-49ce-8d4b-10608037a9ef"
    [DEBUG] 2020-07-10 09:49:14.189 [Converge PipelineAction::Create<main>] json - config LogStash::Codecs::JSON/@enable_metric = true
    [DEBUG] 2020-07-10 09:49:14.191 [Converge PipelineAction::Create<main>] json - config LogStash::Codecs::JSON/@charset = "UTF-8"
    [DEBUG] 2020-07-10 09:49:14.232 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@start_position = "beginning"
    [DEBUG] 2020-07-10 09:49:14.232 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@path = ["/usr/data/nsg/*"]
    [DEBUG] 2020-07-10 09:49:14.240 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@codec = <LogStash::Codecs::JSON id=>"json_82dfaaf6-5a13-49ce-8d4b-10608037a9ef", enable_metric=>true, charset=>"UTF-8">
    [DEBUG] 2020-07-10 09:49:14.240 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@id = "d3310e797726d05295337b7ef7997957226f86065ef90cb6324230db58e707fa"
    [DEBUG] 2020-07-10 09:49:14.240 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@sincedb_path = "/dev/null"
    [DEBUG] 2020-07-10 09:49:14.241 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@ignore_older = 0.0
    [DEBUG] 2020-07-10 09:49:14.241 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@enable_metric = true
    [DEBUG] 2020-07-10 09:49:14.241 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@add_field = {}
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@stat_interval = 1.0
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@discover_interval = 15
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@sincedb_write_interval = 15.0
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@delimiter = "\n"
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@close_older = 3600.0
    [DEBUG] 2020-07-10 09:49:14.242 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@mode = "tail"
    [DEBUG] 2020-07-10 09:49:14.246 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@file_completed_action = "delete"
    [DEBUG] 2020-07-10 09:49:14.246 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@sincedb_clean_after = 1209600.0
    [DEBUG] 2020-07-10 09:49:14.246 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@file_chunk_size = 32768
    [DEBUG] 2020-07-10 09:49:14.246 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@file_chunk_count = 140737488355327
    [DEBUG] 2020-07-10 09:49:14.247 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@file_sort_by = "last_modified"
    [DEBUG] 2020-07-10 09:49:14.247 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@file_sort_direction = "asc"
    [DEBUG] 2020-07-10 09:49:14.247 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@exit_after_read = false
    [DEBUG] 2020-07-10 09:49:14.247 [Converge PipelineAction::Create<main>] file - config LogStash::Inputs::File/@check_archive_validity = false
    [DEBUG] 2020-07-10 09:49:14.285 [Converge PipelineAction::Create<main>] registry - On demand adding plugin to the registry {:name=>"stdout", :type=>"output", :class=>LogStash::Outputs::Stdout}
    [DEBUG] 2020-07-10 09:49:14.307 [Converge PipelineAction::Create<main>] registry - On demand adding plugin to the registry {:name=>"rubydebug", :type=>"codec", :class=>LogStash::Codecs::RubyDebug}
    [DEBUG] 2020-07-10 09:49:14.313 [Converge PipelineAction::Create<main>] rubydebug - config LogStash::Codecs::RubyDebug/@id = "rubydebug_3dd38c72-ab89-4286-9fea-9ae11d547fcf"
    [DEBUG] 2020-07-10 09:49:14.313 [Converge PipelineAction::Create<main>] rubydebug - config LogStash::Codecs::RubyDebug/@enable_metric = true
    [DEBUG] 2020-07-10 09:49:14.313 [Converge PipelineAction::Create<main>] rubydebug - config LogStash::Codecs::RubyDebug/@metadata = false
    [DEBUG] 2020-07-10 09:49:14.395 [Converge PipelineAction::Create<main>] stdout - config LogStash::Outputs::Stdout/@codec = <LogStash::Codecs::RubyDebug id=>"rubydebug_3dd38c72-ab89-4286-9fea-9ae11d547fcf", enable_metric=>true, metadata=>false>
    [DEBUG] 2020-07-10 09:49:14.397 [Converge PipelineAction::Create<main>] stdout - config LogStash::Outputs::Stdout/@id = "8ba6fd2959e72b372718339ec5c263f798ba81a3125c21b28f58c18a8d164272"
    [DEBUG] 2020-07-10 09:49:14.397 [Converge PipelineAction::Create<main>] stdout - config LogStash::Outputs::Stdout/@enable_metric = true
    [DEBUG] 2020-07-10 09:49:14.398 [Converge PipelineAction::Create<main>] stdout - config LogStash::Outputs::Stdout/@workers = 1
    [DEBUG] 2020-07-10 09:49:14.439 [Converge PipelineAction::Create<main>] javapipeline - Starting pipeline {:pipeline_id=>"main"}
    [INFO ] 2020-07-10 09:49:14.539 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/usr/share/logstash/bin/tlocal.conf"], :thread=>"#<Thread:0x2a625cbd run>"}
    [DEBUG] 2020-07-10 09:49:15.309 [Ruby-0-Thread-5: :1] CompiledPipeline - Compiled output
     P[output-stdout{"codec"=>"rubydebug"}|[file]/usr/share/logstash/bin/tlocal.conf:12:3:```
    stdout {
       codec => rubydebug
      }
    ```]
     into
     org.logstash.config.ir.compiler.ComputeStepSyntaxElement@3a13aaf1
    [DEBUG] 2020-07-10 09:49:15.319 [Ruby-0-Thread-6: :1] CompiledPipeline - Compiled output
     P[output-stdout{"codec"=>"rubydebug"}|[file]/usr/share/logstash/bin/tlocal.conf:12:3:```
    stdout {
       codec => rubydebug
      }
    ```]
     into
     org.logstash.config.ir.compiler.ComputeStepSyntaxElement@3a13aaf1
    [INFO ] 2020-07-10 09:49:15.680 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
    [DEBUG] 2020-07-10 09:49:15.689 [Converge PipelineAction::Create<main>] javapipeline - Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x2a625cbd run>"}
    [DEBUG] 2020-07-10 09:49:15.717 [logstash-pipeline-flush] PeriodicFlush - Pushing flush onto pipeline.
    [INFO ] 2020-07-10 09:49:15.748 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    [DEBUG] 2020-07-10 09:49:15.804 [Api Webserver] agent - Starting puma
    [DEBUG] 2020-07-10 09:49:15.814 [Api Webserver] agent - Trying to start WebServer {:port=>9600}
    [INFO ] 2020-07-10 09:49:15.815 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
    [DEBUG] 2020-07-10 09:49:15.897 [Api Webserver] service - [api-service] start
    [INFO ] 2020-07-10 09:49:16.100 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
    [DEBUG] 2020-07-10 09:49:17.618 [pool-3-thread-1] jvm - collector name {:name=>"ParNew"}

In filebeat, setting ignore_older to zero tells filebeat not to filter files based on age.

In a logstash file input, setting ignore_older to zero tells the input to ignore any files more than zero seconds old, which is every file.
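
If the goal is to read everything in the directory regardless of age, the option can simply be omitted, or set to a large value in seconds to skip only very old files. A minimal sketch, with 86400 as an arbitrary example value:

input {
  file {
    path => "/usr/data/nsg/*"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
    # ignore_older => 86400   # optional: skip only files not modified for more than a day
  }
}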

Thanks for your reply.

I removed "ignore_older => 0", but Logstash 7.8 still doesn't read the existing files in "/usr/data/nsg/". Here is the new debug log:

Reflections - Reflections took 45 ms to scan 1 urls, producing 21 keys and 41 values
Reflections - expanded subtype co.elastic.logstash.api.Plugin -> co.elastic.logstash.api.Input
Reflections - expanded subtype co.elastic.logstash.api.Plugin -> co.elastic.logstash.api.Codec
Reflections - expanded subtype org.jruby.RubyBasicObject -> org.jruby.RubyObject
Reflections - expanded subtype java.lang.Cloneable -> org.jruby.RubyBasicObject
Reflections - expanded subtype org.jruby.runtime.builtin.IRubyObject -> org.jruby.RubyBasicObject
Reflections - expanded subtype java.io.Serializable -> org.jruby.RubyBasicObject
Reflections - expanded subtype java.lang.Comparable -> org.jruby.RubyBasicObject
Reflections - expanded subtype org.jruby.runtime.marshal.CoreObjectType -> org.jruby.RubyBasicObject
Reflections - expanded subtype org.jruby.runtime.builtin.InstanceVariables -> org.jruby.RubyBasicObject
Reflections - expanded subtype org.jruby.runtime.builtin.InternalVariables -> org.jruby.RubyBasicObject
Reflections - expanded subtype co.elastic.logstash.api.Plugin -> co.elastic.logstash.api.Output
Reflections - expanded subtype co.elastic.logstash.api.Metric -> co.elastic.logstash.api.NamespacedMetric
Reflections - expanded subtype java.security.SecureClassLoader -> java.net.URLClassLoader
Reflections - expanded subtype java.lang.ClassLoader -> java.security.SecureClassLoader
Reflections - expanded subtype java.io.Closeable -> java.net.URLClassLoader
Reflections - expanded subtype java.lang.AutoCloseable -> java.io.Closeable
Reflections - expanded subtype java.lang.Comparable -> java.lang.Enum
Reflections - expanded subtype java.io.Serializable -> java.lang.Enum
Reflections - expanded subtype co.elastic.logstash.api.Plugin -> co.elastic.logstash.api.Filter
registry - On demand adding plugin to the registry {:name=>"file", :type=>"input", :class=>LogStash::Inputs::File}
registry - On demand adding plugin to the registry {:name=>"json", :type=>"codec", :class=>LogStash::Codecs::JSON}
json - config LogStash::Codecs::JSON/@id = "json_2ff734cf-d281-4a95-a0d9-d8cbc545b730"
json - config LogStash::Codecs::JSON/@enable_metric = true
json - config LogStash::Codecs::JSON/@charset = "UTF-8"
file - config LogStash::Inputs::File/@start_position = "beginning"
file - config LogStash::Inputs::File/@path = ["/usr/data/nsg/*"]
file - config LogStash::Inputs::File/@codec = <LogStash::Codecs::JSON id=>"json_2ff734cf-d281-4a95-a0d9-d8cbc545b730", enable_metric=>tru
file - config LogStash::Inputs::File/@id = "72d69bed517c441e5624a174d4ed2637f7365a0209e521ed2e55963ef165a5ca"
file - config LogStash::Inputs::File/@sincedb_path = "/dev/null"
file - config LogStash::Inputs::File/@enable_metric = true
file - config LogStash::Inputs::File/@add_field = {}
file - config LogStash::Inputs::File/@stat_interval = 1.0
file - config LogStash::Inputs::File/@discover_interval = 15
file - config LogStash::Inputs::File/@sincedb_write_interval = 15.0
file - config LogStash::Inputs::File/@delimiter = "\n"
file - config LogStash::Inputs::File/@close_older = 3600.0
file - config LogStash::Inputs::File/@mode = "tail"
file - config LogStash::Inputs::File/@file_completed_action = "delete"
file - config LogStash::Inputs::File/@sincedb_clean_after = 1209600.0
file - config LogStash::Inputs::File/@file_chunk_size = 32768
file - config LogStash::Inputs::File/@file_chunk_count = 140737488355327
file - config LogStash::Inputs::File/@file_sort_by = "last_modified"
file - config LogStash::Inputs::File/@file_sort_direction = "asc"
file - config LogStash::Inputs::File/@exit_after_read = false
file - config LogStash::Inputs::File/@check_archive_validity = false
registry - On demand adding plugin to the registry {:name=>"stdout", :type=>"output", :class=>LogStash::Outputs::Stdout}
registry - On demand adding plugin to the registry {:name=>"rubydebug", :type=>"codec", :class=>LogStash::Codecs::RubyDebug}
rubydebug - config LogStash::Codecs::RubyDebug/@id = "rubydebug_ee0ca518-1668-4072-ab50-bd2142b5d3a1"
rubydebug - config LogStash::Codecs::RubyDebug/@enable_metric = true
rubydebug - config LogStash::Codecs::RubyDebug/@metadata = false
stdout - config LogStash::Outputs::Stdout/@codec = <LogStash::Codecs::RubyDebug id=>"rubydebug_ee0ca518-1668-4072-ab50-bd2142b5d3a1", ena
stdout - config LogStash::Outputs::Stdout/@id = "47bb3b6796623d1e55a7c65004685a3c476a1173f64251c7cb84467ce5be4123"
stdout - config LogStash::Outputs::Stdout/@enable_metric = true
stdout - config LogStash::Outputs::Stdout/@workers = 1
javapipeline - Starting pipeline {:pipeline_id=>"main"}

Enable trace level logging. You will get a bunch of messages that should tell you what it is doing. Stuff like:

[2019-01-10T16:39:51,066][TRACE][filewatch.tailmode.handlers.createinitial] handle_specifically opened file handle: 85, path: foo.txt
[2019-01-10T16:39:51,093][TRACE][filewatch.tailmode.handlers.createinitial] add_new_value_sincedb_collection {"position"=>0, "watched_file details"=>"<FileWatch::Watched File: @filename='foo.txt', @state='active', @recent_states='[:watched, :watched]', @bytes_read='0', @bytes_unread='954', current_size='954', last_stat_size='954', file_open?='true', @initial=true, @sincedb_key='59521 0 51714'>"}
[2019-01-10T16:39:51,130][TRACE][filewatch.tailmode.processor] Active - file grew: foo.txt: new size is 954, bytes read 0

Thanks Badger.

The file input config part of the trace log:

file - config LogStash::Inputs::File/@start_position = "beginning"
file - config LogStash::Inputs::File/@path = ["/usr/data/nsg/*"]
file - config LogStash::Inputs::File/@codec = <LogStash::Codecs::JSON id=>"json_3cfcc2b3-27ba-4b9a-a84c-63d28e083233", enable_metric=>tr
file - config LogStash::Inputs::File/@id = "20f0cdfa17e0faea50ad3ab101628fef32ea94a8852f643b6955d66b7d7954e4"
file - config LogStash::Inputs::File/@sincedb_path = "/dev/null"
file - config LogStash::Inputs::File/@enable_metric = true
file - config LogStash::Inputs::File/@add_field = {}
file - config LogStash::Inputs::File/@stat_interval = 1.0
file - config LogStash::Inputs::File/@discover_interval = 15
file - config LogStash::Inputs::File/@sincedb_write_interval = 15.0
file - config LogStash::Inputs::File/@delimiter = "\n"
file - config LogStash::Inputs::File/@close_older = 3600.0
file - config LogStash::Inputs::File/@mode = "tail"
file - config LogStash::Inputs::File/@file_completed_action = "delete"
file - config LogStash::Inputs::File/@sincedb_clean_after = 1209600.0
file - config LogStash::Inputs::File/@file_chunk_size = 32768
file - config LogStash::Inputs::File/@file_chunk_count = 140737488355327
file - config LogStash::Inputs::File/@file_sort_by = "last_modified"
file - config LogStash::Inputs::File/@file_sort_direction = "asc"
file - config LogStash::Inputs::File/@exit_after_read = false
file - config LogStash::Inputs::File/@check_archive_validity = false

The file reading part of the trace log:

[TRACE] 2020-07-14 06:12:42.258 [[main]<file] createinitial - handling: PT1H.json
[TRACE] 2020-07-14 06:12:42.276 [[main]<file] createinitial - opening PT1H.json
[TRACE] 2020-07-14 06:12:42.292 [[main]<file] createinitial - handle_specifically opened file handle: 85, path: PT1H.json
[TRACE] 2020-07-14 06:12:42.340 [[main]<file] createinitial - add_new_value_sincedb_collection {"position"=>0, "watched_file details"=>"<FileWatch::WatchedFile: @filename='PT1H.json', @state='active', @recent_states='[:watched, :watched]', @bytes_read='0', @bytes_unread='141968', current_size='141968', last_stat_size='141968', file_open?='true', @initial=true, @sincedb_key='790986 0 2065'>"}
[TRACE] 2020-07-14 06:12:42.425 [[main]<file] processor - Active - file grew: PT1H.json: new size is 141968, bytes read 0
[TRACE] 2020-07-14 06:12:42.428 [[main]<file] grow - handling: PT1H.json
[TRACE] 2020-07-14 06:12:42.444 [[main]<file] grow - reading... {"iterations"=>4, "amount"=>32768, "filename"=>"PT1H.json"}
[DEBUG] 2020-07-14 06:12:42.445 [[main]<file] grow - read_to_eof: get chunk
[TRACE] 2020-07-14 06:12:42.473 [[main]<file] grow - buffer_extract: a delimiter can't be found in current chunk, maybe there are no more delimiters or the delimiter is incorrect or the text before the delimiter, a 'line', is very large, if this message is logged often try increasing the `file_chunk_size` setting. {"delimiter"=>"\n", "read_position"=>0, "bytes_read_count"=>32768, "last_known_file_size"=>141968, "file_path"=>"/usr/data/nsg/PT1H.json"}

The full conf:

input {
  file {
    path => "/usr/data/nsg/*"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
  }
}
output {
  stdout {
   codec => rubydebug
  }
}

A file input reads lines from a file. It did not find a line ending in the first 32768 bytes. If the file does not contain a newline, the file input cannot read it. If there are lines in the file longer than 32768 bytes, then increase file_chunk_size.
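
For the second case, a minimal sketch of what raising file_chunk_size might look like (262144 is an arbitrary example value; it just needs to be larger than the longest line in your files):

input {
  file {
    path => "/usr/data/nsg/*"
    start_position => "beginning"
    codec => "json"
    sincedb_path => "/dev/null"
    file_chunk_size => 262144   # example: 256 KB chunks instead of the 32 KB default
  }
}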

Thanks Badger.

My JSON file looks like the one below; it seems the file input can't parse it as JSON. Is there any way to handle a multi-line JSON document in an xxx.json file?

{
  "array": [
    1,
    2,
    3
  ],
  "boolean": true,
  "color": "gold",
  "null": null,
  "number": 123,
  "object": {
    "a": "b",
    "c": "d"
  },
  "string": "Hello World"
}

Does anyone know the answer?
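
For reference, one common approach for files that each hold a pretty-printed JSON document like the sample above is to re-assemble the document with a multiline codec and then parse it with a json filter. This is only a sketch, assuming every document starts with a { in column 0; the pattern, auto_flush_interval, and max_lines values are illustrative, not taken from this thread:

input {
  file {
    path => "/usr/data/nsg/*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # Append any line that does NOT start a new document (a "{" in column 0)
    # to the previous line, so the whole pretty-printed document becomes one event.
    # auto_flush_interval flushes the last document even though no further "{" arrives.
    codec => multiline {
      pattern => "^\{"
      negate => true
      what => "previous"
      auto_flush_interval => 2
      max_lines => 10000
    }
  }
}
filter {
  # Parse the re-assembled text in the message field as JSON.
  json { source => "message" }
}
output {
  stdout { codec => rubydebug }
}

Note that in tail mode the file input only hands complete, delimiter-terminated lines to the codec, so a file without a trailing newline may still hold back its final line.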

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.