Autodetect_column_names is not working as expected in csv filter plugin

Hi,

I have this logstash .conf file:

input {
  file {
    path => "/eee/*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    autodetect_column_names => true
  }
}

I have multiple .csv files containing different data, and all of them need to be placed in the same index.
Example, file1.csv contains:

Name,City,Date,Comment
Josh,city1,2022-01-02,active
John,city2,2022-04-29,passive

And file2.csv contains different headers:
Name,first_seen,last_seen,favorite
Josh,2020-04-05,2022-01-02,yes
John,2019-05-05,2022-04-29,no

When I open the index in Kibana after indexing into Elasticsearch, I see that all of the field names were taken from the first file.
So, for the rows from file2.csv I see:
Name: Josh, City: 2020-04-05, Date: 2022-01-02, Comment: yes

How can I avoid this problem? Big thanks in advance. I am using Logstash 8.7.1, and pipeline.workers is set to 1.

autodetect_column_names sets the column names once and never changes them. If you want multiple sets of column names, then you need multiple csv filters.

You may be able to use a conditional based on the number of fields, or else based on the [log][file][path].
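For example, something like this might work, routing each file to its own csv filter based on the path the file input records in [log][file][path] (the exact paths here are illustrative; adjust them to match your files):

filter {
  if [log][file][path] == "/eee/file1.csv" {
    csv {
      autodetect_column_names => true
    }
  } else if [log][file][path] == "/eee/file2.csv" {
    csv {
      autodetect_column_names => true
    }
  }
}

Since each csv filter instance remembers its own set of detected column names, each file's header row is applied only to events from that file. Alternatively, if the headers are known in advance, you could skip autodetection entirely and list the columns explicitly with the columns option (plus skip_header => true to drop the header row).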

thanks!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.