Autodetect_column_name with 2 different CSV

Hello all,

I have an issue with the autodetect_column_name of my pipeline below:

filter {
  csv {
    separator => ","
    autodetect_column_names => true
  }
}

When I process the first CSV message:
country,city,name,address
USA,NYC,John,TestStreet

it processes the message correctly:
country => USA,
city => NYC,
name => John,
address => TestStreet

But then, when I process a second CSV:
country,city,address
Canada,Vancouver,GreenStreet

it mismaps the fields, reusing the columns detected from the first message:
country => Canada
city => Vancouver
name => GreenStreet

I have already set pipeline.workers: 1 in my logstash.yml. It feels like the filter is buffering the columns it detected from the first message and never re-running autodetection for the new messages coming in.

Thanks for your help !

Check this

Hey Rios,

Thanks, I already set pipeline.workers to 1 and it is uncommented, but it does not work. Ingestion keeps using the columns detected the very first time and does not refresh for each new message.

With autodetect_column_names, the filter consumes the first line as the column names and never changes them. See this thread for a way to support multiple CSV formats.
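The caching behavior described above can be sketched in plain Ruby. This is a toy model of the filter's logic, not Logstash code; the class name and structure are invented for illustration:

```ruby
# Toy model of autodetect_column_names: the first line seen becomes the
# column list, and every later line is zipped against that cached list.
class AutodetectCsv
  def initialize
    @columns = nil
  end

  def parse(line)
    fields = line.split(",")
    if @columns.nil?
      @columns = fields # header is consumed once and cached forever
      return nil
    end
    @columns.zip(fields).to_h
  end
end

parser = AutodetectCsv.new
parser.parse("country,city,name,address")    # first header: cached
parser.parse("Canada,Vancouver,GreenStreet")
# => {"country"=>"Canada", "city"=>"Vancouver", "name"=>"GreenStreet", "address"=>nil}
```

The second file's data line is forced into the four cached columns, which reproduces the `name => GreenStreet` mismapping from the original post.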

1 Like

Hey Badger,

Thanks a lot it works perfectly fine !

ruby {
  code => 'event.set("[@metadata][fields]", 1 + event.get("message").count(","))'
}

if [@metadata][fields] == 3 {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1", "column2", "column3"]
  }
} else if [@metadata][fields] == 2 {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1", "column2"]
  }
} else {
  csv {
    skip_header => true
    separator => ","
    columns => ["column1"]
  }
}
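The comma-count expression in the ruby filter is what drives the conditionals. As a minimal plain-Ruby check of that logic (note it miscounts if a quoted field itself contains a comma):

```ruby
# Same expression as the ruby filter above: number of fields = commas + 1.
# Caveat: this miscounts when a quoted field contains a literal comma.
def field_count(message)
  1 + message.count(",")
end

puts field_count("country,city,address")          # 3 fields
puts field_count("Canada,Vancouver,GreenStreet")  # 3 fields
puts field_count("USA,NYC,John,TestStreet")       # 4 fields
```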
                   
               
1 Like