Logstash can't recognize csv columns

m.thecoder · August 22, 2019, 7:17pm

Hello Guys,

I've configured Logstash to read multiple CSV files, each with a different set of columns (different column names). It reads the first file successfully, but it never picks up the column names of any CSV file it reads after that one!

Here is my configuration:

input {
  file {
    path => "/tmp/*.csv"
    type => "testcsv"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    autodetect_column_names => true
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{tmp_index}"
  }
}

Badger · August 22, 2019, 8:12pm

autodetect_column_names uses the very first event that logstash sees to determine the column names and never changes them. You could count columns or else do a conditional based on the filename.
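
As a rough illustration of the filename-based conditional, here is a minimal sketch. The file name patterns (`users`, `orders`) and the column lists are hypothetical placeholders, and it assumes the file input stores the source file name in the `path` field (with ECS compatibility enabled it would be `[log][file][path]` instead).

```
filter {
  # Hypothetical patterns and column names; replace with the real ones.
  if [path] =~ /users/ {
    csv {
      separator => ","
      columns => ["id", "name", "email"]
      # Drop the line whose values exactly match the column names above.
      skip_header => true
    }
  } else if [path] =~ /orders/ {
    csv {
      separator => ","
      columns => ["order_id", "user_id", "total"]
      skip_header => true
    }
  }
}
```

Because the column names are listed explicitly per branch, nothing depends on which file logstash happens to see first.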

m.thecoder · August 22, 2019, 8:19pm

Thanks for the reply, but what would I do with the number of fields, or with a conditional on the file name in the path field?
Sorry, I didn't catch your drift!

ruby { code => 'event.set("[@metadata][fields]", 1 + event.get("message").count(","))' }

The listener might pick up files at any time, and they may have the same number of columns but different column names, so I can't hard-code an if/else condition like the one in the post you linked.

Badger · August 22, 2019, 8:34pm

If you want to ingest files with different formats (different sets of columns) the processing has to be conditional on which format a given event has. If you do not want to use if/else then ingest files one at a time.

m.thecoder · August 22, 2019, 8:56pm

Can you give me an example of this part? The files won't be ingested at the same time: I might have one file to ingest now, three more after X minutes, and so on.

Badger · August 23, 2019, 12:01am

That's the post I linked to. In that case the conditional was based on the number of columns, but it could also be based on a regexp match to the filename.
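
For reference, a conditional on the number of columns could look roughly like the sketch below. The counts (3 and 5) and the column names are made up for illustration, and counting commas will miscount any line that has a comma inside a quoted value.

```
filter {
  # Number of fields on the line = number of commas + 1.
  ruby { code => 'event.set("[@metadata][fields]", 1 + event.get("message").count(","))' }

  # Hypothetical formats: one with 3 columns, one with 5.
  if [@metadata][fields] == 3 {
    csv { columns => ["id", "name", "email"] }
  } else if [@metadata][fields] == 5 {
    csv { columns => ["order_id", "user_id", "item", "quantity", "total"] }
  }
}
```

This only helps when the formats differ in column count. Files with the same number of columns but different names would need the filename test instead.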

If you want to use autodetect_column_names then you would have to run logstash one time for each file, which is going to be very expensive for small files (it takes logstash about 45 seconds of CPU to start up on my system).

