Logstash is re-importing entire logfile instead of just new records

I have my configuration set up to extract the date/time from each log record, and it's going into @timestamp successfully.

I ran logstash against my log file and everything imported fine to elasticsearch.

Now I've added a couple of records to the bottom of my log file, saved the file and re-run logstash against it. It has re-imported all the records instead of just the two new ones. Why is this, and how can I control it? I thought populating @timestamp would prevent that from happening.

Thanks,
Kyle

Could you provide more information regarding your setup, and the procedure you used to add those lines?

Logstash typically keeps track of where it is in a file, and stores that information in a hidden file (.sincedb....)

If, for example, you deleted the file and replaced it with a file of the same name on a Linux system, Logstash would consider it to be a new file.

So assuming you have set start_position to "beginning" in your config, logstash will see a new file and process it completely.
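To illustrate the "same name, new file" case, here is a minimal Python sketch. It assumes a filesystem where os.stat reports a meaningful st_ino (e.g. Linux); replace_file and inode_of are hypothetical helper names, and this only mimics the kind of save some tools perform, it is not Logstash code.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number the filesystem reports for path."""
    return os.stat(path).st_ino


def replace_file(path, text):
    """Write text to a temporary file, then swap it over path.

    The name stays the same, but the file behind it is a new one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        log_path = os.path.join(workdir, "app.log")
        with open(log_path, "w") as f:
            f.write("record 1\n")
        before = inode_of(log_path)
        replace_file(log_path, "record 1\nrecord 2\n")
        after = inode_of(log_path)
        print("inode before:", before, "after:", after)
        print("identity changed:", before != after)
```

A reader that tracks files by identity rather than by name would see the result as a file it has never read before.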

More information on how this works is available at https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html

To be honest, I did have start_position set to "beginning" at first, but I've since changed it to 'end' and it still seems to be re-importing the entire file.

  • I'm doing all of this in a Win 7 environment.
  • I'm not deleting my log file

Here's basically the process I've tried:

  • I have my logfile open in Vim; I mention it on the off chance that this is the issue, though it seems unlikely
  • I run logstash against it with start_position set to 'end', it imports entire file
  • I add one or two lines to the bottom of the file and do a ':w!' in Vim to save the file
  • Whether I keep logstash running or stop it first and restart it, it then re-imports the entire log file after I've added those couple lines through Vim.

I'm a bit puzzled, considering that with start_position set to 'end', Logstash should only process entries added to the file after it started, ignoring everything written before that.
I do not know where the .sincedb file is located on your machine, but I would look for it, and also find out which identifier Logstash has recorded for your file.
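As a starting point for that check, here is a rough Python sketch that lists .sincedb files in a directory and prints the recorded identifier and offset. The directory to search is an assumption (the location depends on the Logstash version and on the file input's sincedb_path setting), and the column layout used by the hypothetical parse_sincedb_line helper (identifier first, byte offset fourth) is my reading of the file input documentation, so verify it against your own files.

```python
import glob
import os


def find_sincedb_files(directory):
    """Return paths of regular files in directory named '.sincedb*'."""
    pattern = os.path.join(glob.escape(directory), ".sincedb*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def parse_sincedb_line(line):
    """Split one sincedb line into (identifier, byte_offset).

    Assumed layout: identifier, major device, minor device, offset,
    optionally followed by further columns that are ignored here.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("unexpected sincedb line: %r" % line)
    return fields[0], int(fields[3])


if __name__ == "__main__":
    # Hypothetical search location; adjust to wherever your setup writes sincedb.
    search_dir = os.path.expanduser("~")
    for path in find_sincedb_files(search_dir):
        print(path)
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                if not line.strip():
                    continue
                try:
                    identifier, offset = parse_sincedb_line(line)
                except ValueError:
                    print("  (unrecognised line)", line.strip())
                    continue
                print("  identifier:", identifier, "offset:", offset)
```

If the identifier changes after you save the log file, or the offset never advances, that would point at how the file is being rewritten rather than at start_position.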

When you save a file from an editor, the editor may write a new file that replaces the old one, which makes Logstash treat it as a new file and read it again. Make sure that you append to the existing file instead, perhaps through a script.
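Such a script can be very small. Here is a minimal Python sketch, where append_records is a hypothetical helper name and the log path and record text are made up for illustration. It opens the existing file in append mode, so the file is extended in place rather than recreated.

```python
import os
import tempfile


def append_records(path, records):
    """Append each record as its own line without recreating the file."""
    with open(path, "a", encoding="utf-8") as log:
        for record in records:
            log.write(record.rstrip("\n") + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical log file used only for this demonstration.
        log_path = os.path.join(workdir, "app.log")
        append_records(log_path, ["2015-06-01 10:00:00 first record"])
        size_before = os.path.getsize(log_path)
        append_records(log_path, ["2015-06-01 10:05:00 second record"])
        size_after = os.path.getsize(log_path)
        print("size grew from", size_before, "to", size_after, "bytes")
```

Pointing append_records at your real log file while Logstash is watching it would be one way to test whether only the new lines get picked up.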

Good to know. I'll see about proving it, but I'm glad to have that info, thanks. Once this goes into production, the log file will be appended to in the usual way rather than edited by hand, so hopefully this is my issue.