Autodetect_column_name with 2 different CSV

EliottB · August 31, 2022, 5:09am

Hello all,

I have an issue with the autodetect_column_names option of the csv filter in my pipeline below:

filter {
  csv {
    separator => ","
    autodetect_column_names => true
  }
}

When I process the first CSV message
country,city,name,address
USA,NYC,John,TestStreet

It processes the message correctly:
country => USA,
city => NYC,
name => John,
address => TestStreet

But then when I process a second CSV
country,city,address
Canada,Vancouver,GreenStreet

It garbles the message by reusing the columns detected from the first message:
country => Canada
city => Vancouver
name => GreenStreet

I already set pipeline.workers: 1 in my logstash.yml. It seems to cache the columns it detected from the first message and does not try to autodetect them again for new incoming messages.

Thanks for your help !

Rios · August 31, 2022, 6:47am

Check this

EliottB · August 31, 2022, 7:42am

Hey Rios,

Thanks, I already set pipeline.workers to 1 and the line is uncommented, but it does not work. The ingestion keeps using the columns it detected the very first time and does not refresh them for each new message.

Badger · August 31, 2022, 4:41pm

With autodetect_column_names the code eats the first line to use as column names and never changes them. See this thread for a way to support multiple CSV formats.
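
To make that behaviour concrete, here is a toy Ruby model of a parser that takes its column names from the first line and never resets them. It is only an illustration: the StickyHeaderParser class and its parse method are hypothetical names, not the csv filter's actual source, and unlike the real filter it does no quoting or escaping.

```ruby
# Toy model (hypothetical, not the plugin's code): column names are taken
# from the first line seen and are never detected again.
class StickyHeaderParser
  def initialize
    @columns = nil # filled in by the first line, then left untouched
  end

  # Returns nil for the line consumed as the header, otherwise a Hash
  # mapping the remembered column names to this line's values.
  def parse(line)
    values = line.split(",")
    if @columns.nil?
      @columns = values
      return nil
    end
    # Pair names with values by position; drop names that get no value.
    @columns.zip(values).reject { |_, value| value.nil? }.to_h
  end
end

parser = StickyHeaderParser.new
lines = [
  "country,city,name,address",
  "USA,NYC,John,TestStreet",
  "country,city,address",
  "Canada,Vancouver,GreenStreet"
]
results = lines.map { |line| parser.parse(line) }
results.each { |result| p result }
```

In this model the second header line is treated as ordinary data, and the Canada row ends up with GreenStreet under name, which matches what was reported above.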

EliottB · September 1, 2022, 8:37am

Hey Badger,

Thanks a lot, it works perfectly! Here is the filter I ended up with:

ruby { code => 'event.set("[@metadata][fields]", 1 + event.get("message").count(","))' }
if [@metadata][fields] == 3 {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1", "column2", "column3"]
  }
} else if [@metadata][fields] == 2 {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1", "column2"]
  }
} else {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1"]
  }
}
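
For anyone who wants to try the idea outside Logstash, the same field-count dispatch can be sketched in plain Ruby. This is a minimal sketch under assumptions: LAYOUTS, field_count and parse_by_layout are hypothetical names, and the column lists are taken from the two sample files in the first post rather than from the filter above.

```ruby
# Hypothetical column layouts keyed by the number of fields in a line.
# The names come from the two sample CSV files in the question.
LAYOUTS = {
  4 => %w[country city name address],
  3 => %w[country city address]
}.freeze

# Number of fields = number of separators + 1 (same trick as the ruby filter).
def field_count(message)
  1 + message.count(",")
end

# Returns a Hash of column name => value, or nil when no layout matches.
def parse_by_layout(message)
  columns = LAYOUTS[field_count(message)]
  return nil if columns.nil?
  # A limit of -1 keeps trailing empty fields so positions stay aligned.
  columns.zip(message.split(",", -1)).to_h
end

p parse_by_layout("USA,NYC,John,TestStreet")
p parse_by_layout("Canada,Vancouver,GreenStreet")
```

One caveat with this approach: counting commas miscounts any line where a quoted value contains the separator, so it only suits files whose fields never contain a comma.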

system · September 29, 2022, 8:38am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
