I'm using a wildcard path to ingest multiple .csv files, but in the Elasticsearch visualizations the same column values are being added together instead of repeat entries being skipped.
I want to build a data set where new log files contribute only new items.
I'm pretty sure I read by default it would do this.
Is there a setting in the filter plugin I need to add?
Nope. Logstash is not a state machine; it treats every event as unique.
You may want to look at creating a unique document ID based on some of the fields. That way, if you ingest another file that contains the same entries, Logstash will simply update the existing document in Elasticsearch instead of creating a new one.
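For example, you could use the fingerprint filter to hash a few fields into a stable ID, then pass that ID to the elasticsearch output's `document_id` option so re-ingested rows overwrite rather than duplicate. A minimal sketch (the field names `host`, `timestamp`, and `message`, the index name, and the hosts value are placeholders for your own):

```
filter {
  fingerprint {
    # Replace with the fields that uniquely identify a row in your CSVs
    source => ["host", "timestamp", "message"]
    concatenate_sources => true
    method => "SHA256"
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "csv-logs"
    # Same fields => same fingerprint => same _id, so duplicates
    # update the existing document instead of creating a new one
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

Writing the hash to `[@metadata][fingerprint]` keeps it out of the stored document while still making it available to the output.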