How to restrict fields from a CSV file using Logstash?

For example, my CSV file has 100 fields, but I want to load only the 10 specific fields I need.
How can I achieve this using Logstash?

Thanks in advance.

Thanks & Regards,
Shankarananth.T

The csv filter will always extract all fields, but you can follow up with a ruby filter to delete all fields except the ones you specify. See http://stackoverflow.com/a/30343349/414355.
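
For reference, the approach from that answer boils down to something like the following (a minimal sketch; the column and field names here are placeholders for whatever your file actually contains):

filter {
  csv {
    # The csv filter always extracts every column it is told about.
    columns => ["field1", "field2", "field3"]
  }
  ruby {
    code => "
      # Remove every field that isn't in the whitelist.
      wanted_fields = ['field1', 'field3']
      event.to_hash.keys.each { |k|
        event.remove(k) unless wanted_fields.include? k
      }
    "
  }
}

Keep in mind that this also removes housekeeping fields like @timestamp unless you add them to wanted_fields.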

Hi magnusbaeck,

First of all, thank you very much for your reply.

As you mentioned in the link, I'm trying to pull only the fields I need from the CSV.
For that I'm using the filter configuration below in my logstash.conf file.

input {
  file {
    path => "D:\logstatsh\logstash-1.5.4\weblog.csv"
    start_position => "beginning"
  }
}

filter {
  ruby {
    code => "
      wanted_fields = ['ApplicationName', 'BrokerName', 'CounterId']
      event.to_hash.keys.each { |k|
        event.remove(k) unless wanted_fields.include? k
      }
    "
  }
}

output {
  elasticsearch {
    action => "index"
    protocol => "http"
    host => "localhost"
    index => "Web_log_loading"
  }
  stdout {}
}

But the problem is that after running Logstash with the above logstash.conf from the command prompt, no data is loaded into Elasticsearch.
As I'm very new to ELK, I'm not able to figure out what the real problem is.

Thanks and Regards,
Shankarananth.T

Logstash is probably tailing weblog.csv and waiting for additional lines to be added to the file. To read the file from the beginning you need to clear the sincedb file. See the file input documentation and countless other questions that talk about this.
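
For example, during testing you can point the file input's sincedb_path at the null device so the read position is never persisted and the file is re-read from the beginning on every run (a sketch; "NUL" is the null device on Windows, use "/dev/null" elsewhere):

input {
  file {
    path => "D:\logstatsh\logstash-1.5.4\weblog.csv"
    start_position => "beginning"
    # Testing only: never remember the read position.
    sincedb_path => "NUL"
  }
}

Alternatively, delete the .sincedb_* files (stored in your home directory by default) before restarting Logstash.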

Hi magnus,

Sure, I will go through it.
But the filter below loads the data into Elasticsearch completely, and for it I'm using the same path and start_position.
If you have any samples for restricting the fields from a CSV, could you post them here?
I am a newbie, so if I asked anything wrong I'm sorry; kindly correct me.

input {
  file {
    path => "D:\logstatsh\logstash-1.5.4\dataset.csv"
    start_position => "beginning"
  }
}

filter {
  csv {
    separator => ","
    columns => ["Date","Open","High","Low","Close","Volume","Adj Close"]
  }
  mutate { convert => ["High", "float"] }
  mutate { convert => ["Open", "float"] }
  mutate { convert => ["Low", "float"] }
  mutate { convert => ["Close", "float"] }
  mutate { convert => ["Volume", "float"] }
}

output {
  elasticsearch {
    action => "index"
    protocol => "http"
    host => "localhost"
    index => "data"
  }
  stdout {}
}

Regards,
Shankarananth.T

Have you ever tried to have Logstash read dataset.csv before? If not, i.e. if the file was previously unseen, it will be read from the beginning (thanks to start_position => beginning).

No, I have never had Logstash read dataset.csv before. But could this be the issue with restricting the fields from the CSV file?
If you have a sample configuration using a ruby filter to restrict the fields from a CSV file, could you post it here?

Thank you

No, I have never had Logstash read dataset.csv before.

Okay. But you've attempted to process weblog.csv more than once, yes?

If you have a sample configuration using a ruby filter to restrict the fields from a CSV file, could you post it here?

I've already posted a StackOverflow link containing an example of this. I have nothing more to add on that matter.

OK, thank you very much magnus.
Let me try again.

Hi @magnusbaeck, as you said, I tried the ruby example but it gives me an error.

This is my .conf file.

input {
  file {
    path => "/home/keval/Keval/NODE/comparison/flipkart_mobile.csv"
    type => "flipkart"
    start_position => "beginning"
  }
}

filter {
  ruby {
    code => "
      wanted_fields = ['productId', 'title', 'mrp', 'sellingPrice', 'productUrl', 'productFamily']
      event.to_hash.keys.each { |k|
        event.remove(k) unless wanted_fields.include? k
      }
    "
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "127.0.0.1:9200"
    index => "flipkart"
    workers => 1
    document_id => "%{productId}"
  }
  stdout {}
}

Error:

NoMethodError: undefined method `to_iso8601' for nil:NilClass

I think it's an issue with the @timestamp field. Can you please help me restrict the CSV fields?

Without more context it's impossible to tell what's going wrong. Please try the prune filter instead. It's specifically made for this task.
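
A minimal sketch for your configuration might look like this (the whitelist_names entries are regular expressions, and @timestamp is whitelisted as well, since the NoMethodError above is consistent with the ruby filter having removed it):

filter {
  prune {
    # Keep only the whitelisted fields and drop everything else.
    whitelist_names => ["^@timestamp$", "^productId$", "^title$", "^mrp$",
                        "^sellingPrice$", "^productUrl$", "^productFamily$"]
  }
}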

Hi everyone,
I need a solution for a similar problem.
I have a Logstash configuration using the csv filter with columns, say

csv {
  columns => ["A", "B", "C", "D", "E"]
  separator => ","
  skip_empty_columns => true
}

Now say my CSV file has values only for [B, C, D, E], i.e. there is no column named "A".
When I run Logstash with this configuration, it puts the value of column "B" in place of column "A" and so on, so finally there is no value for column "E". Since I have set
skip_empty_columns => true
column "E" is omitted.

I am looking for a solution where Logstash takes the values and maps them to their respective columns, not just in positional order.

@karthikeyan95, please start a new thread for your question.