Hello!
I use ELK to parse log files and CSV files, but yesterday a small problem appeared: all the data in my CSV file were indexed twice, so a 1000-line document is now a 2000-line document. You could simply divide every result by two, but my dashboards are shared on the local network, so it's annoying for the other users.
First of all, I tried to locate the problem, so I changed the output of my Logstash config file to stdout{} and there was no problem with the output. Therefore I think the problem is between Elasticsearch and Logstash.
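To be clear, by "changed the output to stdout{}" I mean a temporary debug output roughly like this (a minimal sketch; the rubydebug codec is just one readable option):

output {
  # temporary debug output: print each event to the console instead of indexing it
  stdout { codec => rubydebug }
}

With this output, each CSV line appeared only once, which is why I suspect the problem is downstream of Logstash.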
I checked Elasticsearch but didn't find anything. I use the same Logstash config file to parse both log and CSV files, so I don't understand why only the CSV files are affected.
Here is my Logstash config file:
input {
  beats {
    port => "5044"
  }
}
filter {
  if "Log" in [tags] {
    ...
  }
  if "Csv" in [tags] {
    ...
  }
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "squid-%{File_Type}"
  }
}
To make sure this does not happen, you should use one of the columns of your CSV file as the _id of the document. That way, if for whatever reason the file gets parsed again, you will just overwrite the existing values instead of creating duplicates.
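As a sketch, assuming your CSV filter already extracts a column that uniquely identifies each line (here a hypothetical field called row_id), you can pass it to the elasticsearch output via the document_id option:

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "squid-%{File_Type}"
    # use the unique column as the Elasticsearch _id, so a re-parse
    # updates the existing document instead of adding a new one
    document_id => "%{row_id}"
  }
}

With the same _id, a second pass over the file results in updates rather than new documents, so the counts in your dashboards stay correct.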
I'm sorry for this delayed reply, but I was on vacation.
Thanks for your answer. I read a similar answer a week ago, but I don't know how to do it. Can you explain it to me, please?
In fact, should I use the line number as the _id, or add an id column to all my CSV files and define it as the _id?