How to speed up indexing of a CSV file via Logstash

Hello,
ES 7.3 cluster on k8s, 6 data nodes.
CSV file: 4,500,000 rows, 90 fields.
Indexing takes approximately an hour and a half.
</> logstash conf file:
input {
  stdin {
    type => "stdin-type"
  }
}

filter {
  csv {
    separator => ","
    skip_header => "true"
    columns => [ ..... ]
  }
  date {
    match => [ "my_date", "YYYYMMddHHmmss" ]
    target => "my_date"
  }
  mutate {
    convert => {
      "Speed_Min" => "integer"
      "Speed_Max" => "integer"
      "Speed_Avg" => "integer"
    }
  }
}

output {
  elasticsearch {
    hosts => [ "elasticsearch:9200" ]
    index => "index-%{+YYYY-MM-dd}"
  }
}
</>
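
One Logstash-side knob worth checking before anything else is the pipeline batch size and worker count, which default to fairly conservative values. As a sketch (the specific numbers below are assumptions to experiment with, not recommendations):

</> logstash.yml — settings that commonly affect bulk throughput:
# Number of filter/output worker threads; defaults to the number of CPU cores.
pipeline.workers: 6
# Events per worker per batch; the default is 125. Larger batches mean
# larger bulk requests to Elasticsearch, at the cost of more Logstash heap.
pipeline.batch.size: 1000
# Milliseconds to wait before flushing an undersized batch.
pipeline.batch.delay: 50
</>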
What can be done to improve indexing time?
Thanks in advance

To find that out you first need to identify where the bottleneck is. It could be Logstash or it could be Elasticsearch; it could be CPU, or (in Elasticsearch) disk throughput, or several other things. You need to monitor the components of your system and see which one is limiting throughput. Nobody can predict that without visibility into your servers.
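
If monitoring does point at Elasticsearch, two index settings are commonly relaxed during a one-off bulk load and restored afterwards: the refresh interval and the replica count. A sketch using the update index settings API (the `index-*` pattern matches the daily indices from the Logstash output above; treat it as an assumption about your naming):

</> Dev Tools console — disable refresh and replicas during the load:
PUT index-*/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}
</>

After the load finishes, set `refresh_interval` back to its default (`1s`) and restore the replica count so the data is searchable and redundant again.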
