How do I know or make sure that Elasticsearch does not add the same log file twice?

When using a file input with an Elasticsearch output,

how do I make sure that each line in the log files I generate and have Logstash process is only added once?
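Something like this minimal pipeline, where the path and host are placeholders for my actual setup:

```
input {
  file {
    # Placeholder path for the log files in question
    path => "/var/log/myapp/*.log"
  }
}

output {
  elasticsearch {
    # Placeholder host; adjust to your cluster
    hosts => ["localhost:9200"]
  }
}
```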

If I add another output, like S3, some time later, will Logstash update that output with the older data? Or will the data need to be reloaded from scratch, with the sincedb erased, and will I need to clear the Elasticsearch indexes?

I'm moving this to the Logstash area as it's more relevant there.

how do I make sure that each line in the log files I generate and have Logstash process is only added once?

Logstash tracks the inodes of the files it has processed, so unless you rotate files by copying them to new files and have start_position => "beginning" set, you should be fine.
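As a sketch of that behavior (paths here are hypothetical), the file input records its read offset per inode in a sincedb file, and start_position only matters for files the sincedb has not seen yet:

```
input {
  file {
    path => "/var/log/myapp/*.log"                     # hypothetical path
    # Only applies the first time a file (inode) is seen;
    # afterwards Logstash resumes from the offset stored in sincedb.
    start_position => "beginning"
    # Optional: pin the sincedb location instead of relying on
    # the default under the Logstash data directory.
    sincedb_path => "/var/lib/logstash/sincedb-myapp"  # hypothetical location
  }
}
```

Because offsets are keyed by inode, a copy-based rotation produces a new inode, which together with start_position => "beginning" can cause the rotated data to be read again; rename-based rotation keeps the inode and avoids that.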

If I add another output, like S3, some time later, will Logstash update that output with the older data? Or will the data need to be reloaded from scratch, with the sincedb erased, and will I need to clear the Elasticsearch indexes?

When you add an additional output, only data that Logstash processes from then on will reach it. So yes, in this case you need to clear the sincedb file and reprocess the files.
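As a sketch, the new output is just another entry in the output block (the bucket name and region are placeholders), but it only receives events from the moment it is added:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]    # placeholder host
  }
  # Newly added output: only sees events processed after it is added.
  s3 {
    bucket => "my-log-archive"     # placeholder bucket name
    region => "us-east-1"
  }
}
```

To backfill S3 from the existing files you would stop Logstash, delete the sincedb file (with start_position => "beginning" set), and restart. Since Elasticsearch would then receive the same events a second time, delete the affected indexes first, e.g. via the index DELETE API, before reprocessing.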