Hello there,
I'm trying to index a large .csv file into Elasticsearch via Logstash, but it is too slow: about 100 events per second, and the file has over a million rows.
I'm running both Logstash and Elasticsearch locally, version 6.1.3.
My PC configuration:
- Ubuntu 16.04
- Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
- 4GB DDR3 1600MHz
A few example events:
Columns: Date, Key, AFC, AGC, C, TN, Or, De, FCLS, Ftnt, OWRT
2017-01-01,1BANYCLONAXXYD1T1,6424.16,1254.16,BA,1,aaa,bbb,AAA5D1T1,8P,RT
2017-01-01,1BANYCLONAXXY5D1T1,6424.16,1254.16,BA,1,aaa,bbb,AAA5D1T1,8P,RT
2017-01-02,1BANYCLONAXXY5D1T1,6424.16,1254.16,BA,1,aaa,bbb,AAA5D1T1,8P,RT
2017-01-02,1BANYCLONXXY5D1T1,6424.16,1254.16,BA,1,aaa,bbb,AAA5D1T1,8P,RT
The pipeline configuration used:
input {
  file {
    path => "/home/tmp/file.csv"
    start_position => "beginning"
  }
}

filter {
  csv {
    separator => ","
    columns => ["Date", "Key", "AFC", "AGC", "C", "TN", "Or", "De", "FCLS", "Ftnt", "OWRT"]
    remove_field => ["message"]
  }
  date {
    match => ["Date", "yyyy-MM-dd"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "dev_index"
  }
  stdout { codec => dots }
}
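In case it helps with diagnosing this, the per-pipeline event counters can be checked through the Logstash monitoring API on the default port 9600; the ~100 events per second figure is roughly what the counters suggest when polled over a fixed interval:

# Shows in/filtered/out event counts for each pipeline.
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'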
I increased the number of pipeline workers in logstash.yml to 4, but it did not appear to change anything.
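For reference, the relevant part of my logstash.yml now looks roughly like this (the batch size and delay shown are just the 6.x defaults as far as I know; I haven't touched them):

# logstash.yml
pipeline.workers: 4        # the only setting I changed
pipeline.batch.size: 125   # default, untouched
pipeline.batch.delay: 50   # default, untouched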
How can I improve this performance?
Thanks,