Problem between Elasticsearch and Logstash: CSV files are saved twice

Hello !
I use ELK to parse log files and CSV files, but yesterday a small problem appeared. All the data in my CSV file were saved twice, so a 1000-line document is now a 2000-line document. In practice you just have to divide every result by two, but my dashboards are shared on the local network, so it's annoying for the other users.

First of all, I tried to locate the problem, so I changed the output of my Logstash config file to stdout {} and there was no problem with that output. Therefore I think the problem is between Elasticsearch and Logstash.

I checked Elasticsearch but didn't find anything. I use the same Logstash config file to parse both log and CSV files, so I don't understand why only the CSV files are affected.

Here is my Logstash config file:

input {
	beats {
		port => "5044"
	}
}
filter {
	if "Log" in [tags] {
		# ...
	}
	if "Csv" in [tags] {
		# ...
	}
}
output {
	elasticsearch {
		hosts => [""]
		index => "squid-%{File_Type}"
	}
}

(File_Type is either csv or log)

If you have any idea, I'll take it!
Thank you

To make sure this does not happen, you should use one of the columns of your CSV file as the _id of the document. That way, if for whatever reason the file gets parsed again, you will just overwrite the existing values.
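A minimal sketch of that approach, assuming the csv filter has already parsed a unique column into a field named row_id (a hypothetical name, use whichever column is unique in your file). The elasticsearch output's document_id option sets the _id of each document:

```
output {
	elasticsearch {
		hosts => [""]
		index => "squid-%{File_Type}"
		# Re-use the unique CSV column as the document _id, so re-parsing
		# the same file overwrites existing documents instead of creating duplicates.
		document_id => "%{row_id}"
	}
}
```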


This Logstash configuration file directs Logstash to read Apache error logs!


I'm sorry for the delayed answer, but I took a vacation :sunglasses:
Thanks for your answer. I read a similar answer a week ago, but I don't know how to do it. Can you explain it to me, please?
Should I use the line number as the _id, or add an id column to all my CSV files and define that as the _id?

As with daddonet, I'm sorry for the delayed answer, but thank you for your reply.
What do you mean by:

read apache error logs

Both would work, I guess, but the easiest is to add a column. Note that Logstash won't be able to know the line number, I think.
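If editing every CSV file to add an id column isn't practical, another option (not mentioned above, just a sketch) is Logstash's fingerprint filter, which hashes the raw line into a stable _id, so re-ingesting the same file overwrites rather than duplicates:

```
filter {
	# Hash each raw line; [@metadata] fields are available to the output
	# but are not indexed into the document itself.
	fingerprint {
		source => "message"
		target => "[@metadata][fingerprint]"
		method => "SHA256"
	}
}
output {
	elasticsearch {
		hosts => [""]
		index => "squid-%{File_Type}"
		document_id => "%{[@metadata][fingerprint]}"
	}
}
```

One caveat: if a CSV legitimately contains two identical lines, they will hash to the same _id and collapse into a single document, so this only suits files whose rows are unique.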

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.