Data from multiple .csv files is being added together

I have a wildcard path to ingest multiple .csv files, but in the Elasticsearch visualizations it's adding the same column items together instead of skipping repeat entries.

I want to create a data dump where new logs only add new items.

I'm pretty sure I read that it does this by default.
Is there a setting in the filter plugin I need to add?

Can you clarify what you mean by that?

I have user information entries in .csv files that are downloaded as reports.

I want those reports to be sent automatically into Logstash, adding only new records.

Currently, when I add a new report, it sees all the entries as brand new and adds them on top of the existing ones.

So if I have 30,000 items in a column and I add a near-identical report to the Logstash folder, the count jumps to 60,000, and so on.

It was my understanding that Logstash could recognize this and only add the new entries.

Nope, Logstash is not a state machine; it treats every event as unique.

You may want to look at creating a unique document ID based on some of the fields. That way if you download another file that has the same entries, Logstash will simply update the existing one in Elasticsearch and not create a new one.
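One common way to do that is the fingerprint filter: hash the fields that uniquely identify a row and use the result as the Elasticsearch document ID. A rough sketch is below; the field names (`user_id`, `email`), file path, and index name are placeholders for whatever your reports actually contain:

```
input {
  file {
    path => "/path/to/reports/*.csv"     # placeholder path for the report folder
    start_position => "beginning"
  }
}

filter {
  csv {
    columns => ["user_id", "email", "last_login"]   # hypothetical column names
  }

  # Build a stable ID from the fields that make a row unique
  fingerprint {
    source => ["user_id", "email"]
    concatenate_sources => true
    method => "SHA256"
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "user-reports"                        # placeholder index name
    document_id => "%{[@metadata][fingerprint]}"   # same row => same ID => overwrite, not duplicate
  }
}
```

Because the ID is derived from the row's content, re-ingesting the same row re-indexes the same document instead of adding a second copy, so your counts stay at 30,000.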
