Generate output index from input filename


#1

Hi,

I'm trying to parse multiple .csv files and load them into Elasticsearch through the Logstash csv plugin. As I have around 4000 CSV files, is there a way to make the Elasticsearch output index name match the input CSV filename? To give some context, every CSV file is named "Amp_N.csv", where "N" is an integer. Is there any way to name the index "Amp_N", with "N" being the corresponding digit(s)?

input {
  file {
    path => "/home/emanuel/ondas/amp/Amp_*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["id", "Amplitude", "time", "dist"]
    remove_field => ["True"]
  }
  mutate { convert => ["id", "integer"] }
  mutate { convert => ["Amplitude", "float"] }
  mutate { convert => ["time", "float"] }
  mutate { convert => ["dist", "float"] }
}
output {
  elasticsearch {
    hosts => "192.168.20.32:9200"
    document_type => "data"
    index => "WHAT TO PUT HERE?"
  }
  stdout { codec => json_lines }
}


(Magnus Bäck) #2

The input filename should end up in the path field so you can use e.g. a grok filter to extract parts of the path into a field. That field can then be referenced in the elasticsearch output's index option. This has been discussed many times before so please consult the archives.
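
As a rough, untested sketch of that approach: the filter below assumes the file input stores the filename in the path field, as described above, and the field name amp_number is made up for illustration. Elasticsearch index names must be lowercase, so the sketch uses amp_ rather than Amp_ as the prefix.

```
filter {
  # Pull the digits out of ".../Amp_<digits>.csv" into a new field.
  # "amp_number" is a hypothetical field name chosen for this example.
  grok {
    match => { "path" => "Amp_%{INT:amp_number}\.csv$" }
  }
}
output {
  elasticsearch {
    hosts => "192.168.20.32:9200"
    # Index names must be lowercase, hence "amp_" instead of "Amp_".
    index => "amp_%{amp_number}"
  }
}
```

If the pattern does not match, grok tags the event with _grokparsefailure and the %{amp_number} reference is left unexpanded in the index name, so it is worth checking for that tag before relying on the output.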

Having one index per input file is probably misguided. Indexes have a fixed overhead and the cost of having 4000 indexes in an ES cluster is not insignificant.
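
If a single index is acceptable, one possible alternative is to keep the file number as a field on each document and filter on it at query time. This is only a sketch, assuming the filename is available in the path field; the field name amp_number and the index name amp are hypothetical.

```
filter {
  # Extract the file number from the filename into a field
  # ("amp_number" is a made-up name for this example).
  grok {
    match => { "path" => "Amp_%{INT:amp_number}\.csv$" }
  }
  # Store it as a number so it can be used in range and term queries.
  mutate {
    convert => { "amp_number" => "integer" }
  }
}
output {
  elasticsearch {
    hosts => "192.168.20.32:9200"
    # One shared index for all files instead of one index per file.
    index => "amp"
  }
}
```

The documents from a given file can then be selected with a query on amp_number, without paying the per-index overhead 4000 times.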

