Logstash is re-importing entire logfile instead of just new records

Kyle_Hanson · September 23, 2016, 1:07pm

I have configured Logstash to extract the date/time from each log record, and it is going into @timestamp successfully.

I ran Logstash against my log file and everything imported fine into Elasticsearch.

Now I've added a couple of records to the bottom of my log file, saved the file, and re-run Logstash against it. It re-imported all the records instead of just the two new ones. Why is this, and how can I control it? I thought populating @timestamp would prevent that from happening.

Thanks,
Kyle

vandamo · September 23, 2016, 2:39pm

Could you provide more information regarding your setup, and the procedure you used to add those lines?

Logstash typically keeps track of its position in a file and stores that information in a hidden file (.sincedb....)

If, for example, you deleted the file and replaced it with a file of the same name on a Linux system, Logstash would consider it to be a new file.

So assuming you have set start_position to "beginning" in your config, Logstash will see a new file and process it completely.
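
To illustrate the point about a replaced file being seen as new, here is a minimal Python sketch (not Logstash code). It assumes the file's identity is its device and inode number, which is how this works on Linux; on Windows the equivalent identifier differs. The helper names and the "app.log" filename are hypothetical.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode), which identifies the file regardless of its name."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def append_line(path, line):
    """Add a line to the end of the existing file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


def replace_with_new_copy(path, extra_line):
    """Mimic a tool that writes a fresh file and renames it over the original."""
    with open(path, encoding="utf-8") as f:
        content = f.read()
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(content + extra_line + "\n")
    os.replace(tmp, path)  # same name afterwards, but a different file on disk


def demo():
    workdir = tempfile.mkdtemp()
    log = os.path.join(workdir, "app.log")  # hypothetical log file
    with open(log, "w", encoding="utf-8") as f:
        f.write("record 1\n")

    original = file_identity(log)
    append_line(log, "record 2")
    after_append = file_identity(log)
    replace_with_new_copy(log, "record 3")
    after_replace = file_identity(log)
    return original, after_append, after_replace


if __name__ == "__main__":
    original, after_append, after_replace = demo()
    print("same file after append: ", original == after_append)
    print("same file after replace:", original == after_replace)
```

Appending keeps the identity, so a reader that remembers its position can carry on from there. Replacing the file keeps the name but changes the identity, so the remembered position no longer applies.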

More information on how this works is available at https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html

Kyle_Hanson · September 23, 2016, 2:45pm

To be honest, I did have start_position set to 'beginning' at first, but I've since changed it to 'end' and it still seems to be re-importing the entire file.

I'm doing all of this in a Windows 7 environment.
I'm not deleting my log file.

Here's basically the process I've tried:

I have my logfile open in Vim (mentioning it on the off chance that this is the issue, though it seems unlikely).
I run Logstash against it with start_position set to 'end', and it imports the entire file.
I add one or two lines to the bottom of the file and do a ':w!' in Vim to save the file.
Whether I keep Logstash running or stop it first and restart it, it then re-imports the entire log file after I've added those couple of lines through Vim.

vandamo · September 23, 2016, 2:59pm

I'm a bit puzzled, considering that setting start_position to 'end' should mean Logstash only processes new entries in the file and ignores all entries made before it started.
I do not know where the .sincedb file is located on your machine, but I would check for it and look up the identifier it records for your file.

Christian_Dahlqvist · September 23, 2016, 3:06pm

If you edit the file using an editor, the editor may create a new file that replaces the old one, which makes Logstash treat it as a new file and read it again. Make sure that you append to the file, perhaps through a script.
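
As a rough sketch of what such a script could look like, the Python below appends records in place and then reads only what was added after a remembered byte offset. The resume step is a simplified stand-in for position tracking, not Logstash's actual implementation, and the function names are hypothetical.

```python
def append_records(path, records):
    """Append records to the end of the log without recreating the file."""
    # Mode "a" writes at the end of the existing file; newline="\n" avoids
    # platform-specific line-ending translation.
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        for record in records:
            f.write(record.rstrip("\n") + "\n")


def read_from(path, offset):
    """Read everything after a saved byte offset; return (lines, new_offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data.decode("utf-8").splitlines(), offset + len(data)
```

A test run might call read_from(path, 0) once, keep the returned offset, call append_records(path, [...]), and then call read_from again with the saved offset. Only the appended records should come back.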

Kyle_Hanson · September 23, 2016, 3:08pm

Good to know. I'll see about proving it, but I'm glad to have that info, thanks. Once this goes into production, the log file will obviously be written to in the more typical way (appended), so hopefully this is my issue.
