Logstash stopped working after upgrade 6.2.4 --> 6.4.0


(Ron Grosberg) #1

Hi,
Yesterday I tried migrating Logstash from 6.2.4 to 6.4 because I've been waiting for the "read" mode for ages!
Then I found that Logstash did nothing after I started it: the sincedb file is updated when Logstash starts and is never updated again. I waited for a few hours.
I then removed the "read" mode and went back to my old config, but that didn't help either.
I tried removing the sincedb and starting over - a new sincedb file was created with 0 bytes and was never updated.

I then closed Logstash and went back to my old 6.2.4 Logstash, and it just worked with the same config files.
I didn't see any breaking change - am I missing something? How can I debug this?

Just to make it clearer: Logstash 6.2.4 works well with the new Kibana and Elasticsearch 6.4, so the only change here is Logstash itself.


(Ron Grosberg) #2

Does anyone have any suggestions?


(panaman) #3

Are you using the persistent queue? I believe you need to delete the queue completely before you upgrade if you are using it. Something in the queue's on-disk format is incompatible with Logstash > 6.3.0.


(Ron Grosberg) #4

No, I am actually not using persistent queues.... it just doesn't work :frowning:


(Ry Biesemeyer) #5

What input(s) are you using, and what do the Logstash logs say?


#6

I'm suffering from this same problem. I upgraded Logstash from version 6.1.1 to 6.4.1, and all of my file inputs stopped working.

Here's my complete inputs pipeline configuration:

input {
    tcp {
        port => 5000
        codec => json
    }
    file {
        path => "/var/log/containers/*.log"
        sincedb_path => "/var/log/.logstash-containers-since.db"
        start_position => "beginning"
    }
    file {
        path => "/var/log/auth.log"
        sincedb_path => "/var/log/.logstash-auth-since.db"
        start_position => "beginning"
    }
    file {
        path => "/var/log/syslog"
        sincedb_path => "/var/log/.logstash-syslog-since.db"
        start_position => "beginning"
    }
    file {
        path => "/var/log/kern.log"
        sincedb_path => "/var/log/.logstash-kern-since.db"
        start_position => "beginning"
    }
}

After the upgrade, I can watch the Logstash stats API endpoint and see that no events are being processed. The logs say nothing useful, even at debug verbosity. I can, however, send events to the TCP input and see them get processed.

When I downgrade my Logstash back to 6.1.1, the files start processing again.

Like RonGros, I am not using persistent queues.

UPDATE: The feature that I was upgrading for was in version 6.3.1, so I tried that instead of 6.4.1. Everything is working. So if there's a problem here, it seems to be with version 6.4.x.


(Guy Boertje) #8

@RonGros

Please post your read-mode-based Logstash config. Read mode must be turned on explicitly, as "tail" mode is the default.
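
For reference, a minimal read-mode file input looks roughly like this (the paths are placeholders, and file_completed_action => "log" is one choice - by default, read mode deletes files once they are fully read):

input {
    file {
        path => ["/path/to/complete/files/*.json"]
        mode => "read"
        sincedb_path => "/path/to/sincedb"
        file_completed_action => "log"
        file_completed_log_path => "/path/to/completed.log"
    }
}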


(Ron Grosberg) #9

@yaauie @guyboertje I have 3 pipelines, all using the file input. For the sake of the test, I disabled two of them and kept only one, called "crashlog".
Just to make it clear, @guyboertje: I tried it first with read mode, and when that didn't work I went back to tail mode - and nothing happens; no files are being collected. When I use the same config file with Logstash 6.2.4, everything works fine (my config files are in a different directory, so all I do is kill the Logstash process and run the other version).

I do see that the sincedb timestamp is updated when I first run Logstash, but the file itself is never updated.
Also, usually when Logstash runs on my machine (a VM) the CPU hits 100%. With version 6.4 it doesn't - it looks like it just isn't doing anything...

Here are the logs of the initialization:

[2018-09-25T09:41:58,249][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2018-09-25T09:41:58,296][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2018-09-25T09:41:58,327][DEBUG][logstash.outputs.elasticsearch] Found existing Elasticsearch template. Skipping template management {:name=>"logstash"}
[2018-09-25T09:41:58,968][INFO ][filewatch.observingtail ] START, creating Discoverer, Watch with file and sincedb collections
[2018-09-25T09:41:58,983][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"crashlog", :thread=>"#<Thread:0x1f22237b sleep>"}
[2018-09-25T09:41:59,187][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:crashlog], :non_running_pipelines=>[]}
[2018-09-25T09:41:59,640][DEBUG][logstash.agent ] Starting puma
[2018-09-25T09:41:59,874][DEBUG][logstash.agent ] Trying to start WebServer {:port=>9600}
[2018-09-25T09:41:59,937][DEBUG][logstash.api.service ] [api-service] start
[2018-09-25T09:42:01,812][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-09-25T09:42:03,327][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2018-09-25T09:42:03,327][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2018-09-25T09:42:03,905][DEBUG][logstash.pipeline ] Pushing flush onto pipeline {:pipeline_id=>"crashlog", :thread=>"#<Thread:0x1f22237b sleep>"}
[2018-09-25T09:42:08,343][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}

And then I just keep getting these in an endless loop :

[DEBUG][logstash.pipeline ] Pushing flush onto pipeline {:pipeline_id=>"crashlog", :thread=>"#<Thread:0x1f22237b sleep>"}
[DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}

My config file is pretty straightforward:

input {
    file {
        path => ["//files.algotec.co.il/projects/Mediprime/Crash Logs/@AutomaticCollection/*.json"]
        start_position => beginning
        type => "json"
        ignore_older => 0
        sincedb_path => "F:\LogstashConfigFiles\Data\sincedb"
        close_older => 1
        max_open_files => 50000
    }
}
filter {
    json {
        source => "message"
    }
}
output {
    elasticsearch {
        action => index
        index => crashlog
    }
}


(Guy Boertje) #10

Ok. Now show me what you used for the read mode equivalent input config.

Some notes/questions.

  • path => ["//files.algotec.co.il/projects/Mediprime/Crash Logs/@AutomaticCollection/*.json"] - is that an SMB share?
  • When using close_older => 1, max_open_files => 50000 is unnecessarily high. You can use max_open_files => 50 and get the same effect.
  • In 6.4 we corrected a regression with respect to LS 5.6 behaviour regarding start_position => beginning. In LS 1.x through 5.6.x, start_position only governed the action taken when a new file was discovered after LS started; the action for files that already existed at startup was tail -f, i.e. equivalent to start_position => end. I made a mistake when I brought the filewatch code over into the file input and prepared the code for "read" mode: I let start_position govern the action taken on existing files too - because of this, 6.2.4 is actually the wrong behaviour.

That said, "read" mode will take the starting position of any file either from the beginning, if there is no entry in the sincedb, or from the last-read position in the sincedb if there is a matching entry (keyed by the Windows inode equivalent).
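
Applying the notes above, a tail-mode input along those lines would look something like this (a sketch reusing the paths from the config posted earlier, with max_open_files reduced as suggested):

input {
    file {
        path => ["//files.algotec.co.il/projects/Mediprime/Crash Logs/@AutomaticCollection/*.json"]
        start_position => beginning
        sincedb_path => "F:\LogstashConfigFiles\Data\sincedb"
        close_older => 1
        max_open_files => 50
    }
}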


(Ron Grosberg) #11

@guyboertje, thanks for your input!
See my notes...


(Ron Grosberg) #12

Any ideas?


(Guy Boertje) #13

Ok. Now show me what you used for the read mode equivalent input config.


(Ron Grosberg) #14

Guy, can we first try to understand why the normal tail mode is not working, and then move on to read mode?


(Ron Grosberg) #15

Please...?


(Guy Boertje) #16

I don't understand what you mean by tail mode not working.

You say you have been waiting for read mode; that tells me your files do not follow the grow/truncate/grow style of logging associated with a typical log-tailing scenario.

If you are using tail mode to process logs that are already fully written (a remote FS can interfere with detecting this), some steps are needed to achieve it. Due to the regression bug, the behaviour differs between 6.2.4 and 6.4.1.

For both versions you need to use close_older; set it to 1 second so files are closed quickly and more than 4095 can be processed.

We have two scenarios:

A. Files exist in the watched path but have not been read, i.e. using an empty sincedb file

With start_position => "end" (default)

v6.2.4 behaviour (due to the bug)

  1. LS starts up
  2. LS discovers some files
  3. Using start_position => end, LS does not read the files from the beginning, instead it monitors them for file size increases. However, as the files are complete, there is no new content.
  4. LS appears to be doing nothing.

v6.4.1 (bug fixed)

  1. LS starts up
  2. LS discovers some files
  3. Ignoring start_position => end, LS does not read the files from the beginning, instead it monitors them for file size increases. However, as the files are complete, there is no new content.
  4. LS appears to be doing nothing.

With start_position => "beginning"

v6.2.4 behaviour (due to the bug)

  1. LS starts up
  2. LS discovers some files
  3. Using start_position => beginning, LS does read the files from the beginning, all content is read.
  4. LS appears to be doing something.

v6.4.1 (bug fixed)

  1. LS starts up
  2. LS discovers some files
  3. Ignoring start_position => beginning, LS does not read the files from the beginning; instead it monitors them for file size increases. However, as the files are complete, there is no new content.
  4. LS appears to be doing nothing.


B. No files exist in the watched path

With start_position => "end" (default)

v6.2.4 behaviour (due to the bug)

  1. LS starts up
  2. LS discovers no files
  3. A file is copied into the glob path
  4. Using start_position => end, LS does not read the file from the beginning; instead it monitors it for file size increases. However, as the file is complete, there is no new content.
  5. LS appears to be doing nothing.

v6.4.1 (bug fixed)

  1. LS starts up
  2. LS discovers no files
  3. A file is copied into the glob path
  4. Using start_position => end, LS does not read the file from the beginning; instead it monitors it for file size increases. However, as the file is complete, there is no new content.
  5. LS appears to be doing nothing.

With start_position => "beginning"

v6.2.4 behaviour (due to the bug)

  1. LS starts up
  2. LS discovers no files
  3. A file is copied into the glob path
  4. Using start_position => beginning, LS does read the file from the beginning.
  5. LS appears to be doing something.

v6.4.1 (bug fixed)

  1. LS starts up
  2. LS discovers no files
  3. A file is copied into the glob path
  4. Using start_position => beginning, LS does read the file from the beginning.
  5. LS appears to be doing something.
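
Putting the scenarios together: on 6.4.x, files that already exist and are fully written will only be picked up by "read" mode. A sketch of the crashlog input adapted for 6.4 (unverified against the SMB share; file_completed_action => "log" avoids read mode's default of deleting completed files):

input {
    file {
        path => ["//files.algotec.co.il/projects/Mediprime/Crash Logs/@AutomaticCollection/*.json"]
        mode => "read"
        sincedb_path => "F:\LogstashConfigFiles\Data\sincedb"
        file_completed_action => "log"
        file_completed_log_path => "F:\LogstashConfigFiles\Data\completed.log"
    }
}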

(Ron Grosberg) #17

@guyboertje, for me the only scenario is new files that are not in the sincedb.
I have start_position => beginning, yet nothing happens - it does not recognize the new files.
As I said, the same configuration file (in tail mode) works with 6.2.4 and also worked with versions 2 and 5 of Logstash...

By the way - probably not your area, but Kibana 6.4.1 crashes A LOT(!!!). It never used to crash for me on older versions...


(system) #18

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.