CSV: autodetect_column_names vs. autogenerate_column_names

Aralex · January 14, 2019, 5:56pm

Hi,

1) What are the exact functional differences between autodetect_column_names and autogenerate_column_names in the csv filter?
2) Is it good or bad to set both to true?
3) Is there a recommended order for using both simultaneously?

Neither seems to be working as expected in my case (just a standard csv filled with basic info).

Since we're on this:
4) In my case, Filebeat sends the csv data to Logstash. What are the detailed operational differences if I start Filebeat before Logstash and vice versa?

Thanks.

Badger · January 14, 2019, 7:07pm

If autogenerate_column_names is enabled, it will create its own names for columns where no name is supplied. For example, if we have a csv with two fields and we parse it with

     autogenerate_column_names => false
     columns => [ "Foo" ]

then the events will only have data from the first column, which will be in a field called Foo. If we parse it with

     autogenerate_column_names => true
     columns => [ "Foo" ]

then the events will have two fields of data, one called Foo and one called column2.
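
To try this outside a real pipeline, here is a minimal sketch of a complete config. The stdin input and the stdout output with the rubydebug codec are only assumptions for testing; the csv options are the ones from the example above.

```
input { stdin {} }

filter {
    csv {
        # Name only the first column; the second one is left unnamed.
        columns => [ "Foo" ]
        # true:  the unnamed second column is kept as "column2".
        # false: the unnamed second column is dropped from the event.
        autogenerate_column_names => true
    }
}

output { stdout { codec => rubydebug } }
```

Typing a two-column line such as a,b on stdin should then show an event with both Foo and column2, next to the usual message and metadata fields.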

For autodetect ... suppose we have a file that contains

foo,BAR,baz
1,2,3

If we parse that using

csv {
     autodetect_column_names => true
}

We will get a single event that has these fields in it

       "foo" => "1",
       "baz" => "3",
       "BAR" => "2"

It detects the column names by consuming the header line, which is why the two-line file yields only one event.

Setting both to true might be useful if you were consuming a file that had an incomplete header line, like this

foo,BAR
1,2,3
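
A sketch of a filter for that case, with both options enabled. With this I would expect the header line to supply foo and BAR, and the third value to land in an autogenerated field named column3, following the same columnN naming as column2 above.

```
filter {
    csv {
        # Take the names "foo" and "BAR" from the first line seen.
        # As far as I know, the plugin documentation says this option
        # needs pipeline.workers set to 1 to work reliably.
        autodetect_column_names => true
        # Name any column the header did not cover (here, the third).
        autogenerate_column_names => true
    }
}
```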

Aralex · January 15, 2019, 9:13am

This is incredibly helpful, thank you so much, Badger. It could also explain why I've been getting the same data parsed twice. I had strong suspicions about these two settings but wasn't sure of their exact difference.
Thanks again.

system · February 12, 2019, 9:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
