Import CSV - Column names are also present in the values

Hi,

I've used Logstash to import a CSV file, which worked great. But when I tried another CSV file, I saw something strange: in Kibana, the file's header names also show up among the values, and I have no idea why.

My dataset looks more or less like this (the first row holds the column names):

C1 C2 C3
10 AB 15
20 CD 12
30 EF 16

This is the .conf I use to index a csv-file:

input
{
  file
  {
    path => "/home/denny/test.csv"
    start_position => "beginning"
    ignore_older => 0
    sincedb_path => "/dev/null"
  }
}

filter
{
  csv {
    columns => ["C1", "C2", "C3"]
    separator => ","
  }
}

output
{
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "test"
    workers => 1
  }
  stdout {}
}

Once the file has been indexed, I go to Kibana, pick the new index, and open Discover. When I click on a column, I see that the column's name is also present among its values. So when I visualize a bar chart, the column name gets a bar of its own.

I did not encounter this with the first CSV I indexed with Logstash, so I find it strange that it happens now. I hope someone can enlighten me :)

EDIT: The first CSV file was one I downloaded from the internet to try things out (the Titanic dataset from Kaggle); this dataset I made myself in Excel.

Logstash's file input knows nothing about CSV files and the csv filter doesn't know about skipping the first line. See https://github.com/logstash-plugins/logstash-filter-csv/issues/13.
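
A common workaround is to drop the header event with a conditional placed after the csv filter. This is only a minimal sketch, assuming the column names from the config above (C1, C2, C3) and assuming that no real data row contains the literal text "C1" in its first column:

```
filter
{
  csv {
    columns => ["C1", "C2", "C3"]
    separator => ","
  }

  # The header line is parsed like any other row, so its C1 field
  # holds the text "C1". Assumption: no genuine data row does.
  if [C1] == "C1" {
    drop {}
  }
}
```

With this in place, the header line should never reach the elasticsearch output, so it would no longer show up as a value in Kibana. Later releases of the csv filter also document a skip_header option; whether it is available depends on the plugin version you have installed, so check the documentation for your version before relying on it.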
