Help needed improving data ingestion time

Hi

I have a 1.6 GB CSV file. With the default Logstash and Elasticsearch configuration, uploading it via Logstash took more than 15 hours. When I tried again with LS_HEAP_SIZE=2gb, the upload took 4 hours. I am doing a POC on a 64-bit Ubuntu machine with a dual-core CPU and 4 GB of RAM.

Can someone help me improve the upload rate? My real-time data will be more than 1 GB, and we can't wait that long for each upload.

Actually, I used LS_HEAP_SIZE=2gb along with -b 1000 and -w 2.

Are the CPUs saturated during the hours Logstash is working on the data? What kind of filters do you have? What's the event rate? Have you looked into using the monitoring APIs introduced in Logstash 5 to analyze where the bottlenecks are?
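
As a rough way to put a number on the event rate, here is a minimal Python sketch. It assumes one event per CSV line. The 1.6 GB and 4 hours are the figures from the post above; the line count in the usage note is purely hypothetical, and the function names are illustrative.

```python
def count_events(path):
    """Count lines in the file, assuming one event per CSV line."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def throughput(size_bytes, events, hours):
    """Return (events per second, bytes per second) for a finished run."""
    seconds = hours * 3600
    return events / seconds, size_bytes / seconds
```

For example, throughput(1.6e9, 14_400_000, 4) treats the file as 14.4 million lines (a made-up count) and gives 1000 events/s at roughly 111 KB/s. Replace the count with count_events() on the real file.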

Hi Magnus

I am using the following configuration:

input {
  file {
    path => "/home/Projects/kibana/DataSet/test.csv"
    start_position => "beginning"
    # Do not persist the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["time","id_geo","gw","id","server_id","good","responses"]
  }

  # Parse the epoch-seconds "time" column into the "unixtime" field
  date {
    match => ["time","UNIX"]
    target => "unixtime"
  }

  # Convert field types
  mutate {
    convert => {
      "id_geo" => "string"
      "gw" => "string"
      "id" => "integer"
      "server_id" => "integer"
      "good" => "integer"
      "responses" => "integer"
    }
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => 'waittest-point'
  }
  # Prints every event to the console
  stdout {}
}

I am also not sure about the monitoring API. Can you give me pointers on how to use it, and on how to find out whether the CPU is saturated?

Is Logstash even the bottleneck here? Or is it Elasticsearch? Maybe they're competing for the same machine resources.

I am also not sure about the monitoring API. Can you give me pointers on how to use it

Did you look at the documentation?
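
For a starting point, here is a minimal Python sketch of polling the node stats endpoint. It assumes Logstash 5.x with the HTTP API on its default port 9600 and the 5.x response layout (a top-level "pipeline" object holding "events" and "plugins"). Later versions moved this under a different path, and some plugins may not report a duration, so treat the field names as assumptions to check against the documentation for your version. The function names are illustrative.

```python
import json
import time
from urllib.request import urlopen

# Assumption: Logstash 5.x with its HTTP API on the default port 9600.
STATS_URL = "http://localhost:9600/_node/stats/pipeline"


def fetch_stats(url=STATS_URL):
    """Fetch one snapshot of pipeline statistics as a dict."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def events_per_second(before, after, seconds):
    """Average output rate between two snapshots taken `seconds` apart."""
    emitted = after["pipeline"]["events"]["out"] - before["pipeline"]["events"]["out"]
    return emitted / seconds


def slowest_plugins(stats):
    """List (cumulative_ms, kind, name) for filters and outputs, slowest first."""
    rows = []
    plugins = stats["pipeline"].get("plugins", {})
    for kind in ("filters", "outputs"):
        for plugin in plugins.get(kind, []):
            # Missing durations are treated as 0 rather than raising.
            millis = plugin.get("events", {}).get("duration_in_millis", 0)
            rows.append((millis, kind, plugin.get("name", "unknown")))
    return sorted(rows, reverse=True)


def report(interval=10):
    """Print the event rate over `interval` seconds and the per-plugin time."""
    before = fetch_stats()
    time.sleep(interval)
    after = fetch_stats()
    print("events/s:", events_per_second(before, after, interval))
    for millis, kind, name in slowest_plugins(after):
        print(f"{millis:>10} ms  {kind:<8} {name}")
```

Calling report() while the import is running prints the current event rate and shows which filter or output has accumulated the most time.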

how to find out whether the CPU is saturated

Use top? Are the CPUs running at full load? If not, increasing the parallelism could be an option.
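
If you would rather log a number than watch top, here is a minimal Python sketch that computes overall CPU utilization from two readings of the aggregate "cpu" line in /proc/stat. It assumes Linux (as on the Ubuntu machine here) and counts iowait as idle time. The function names are illustrative.

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Fields after the label: user nice system idle iowait irq softirq steal ...
    values = [int(v) for v in line.split()[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    total = sum(values)
    return total - idle, total


def cpu_busy_percent(before, after):
    """Percentage of non-idle CPU time between two 'cpu' lines."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed


def sample_cpu(interval=5.0):
    """Read /proc/stat twice, `interval` seconds apart, and return busy percent."""
    def read_first_line():
        with open("/proc/stat") as handle:
            return handle.readline()

    before = read_first_line()
    time.sleep(interval)
    return cpu_busy_percent(before, read_first_line())
```

A value from sample_cpu() that stays near 100 during the import suggests the two cores are saturated. A clearly lower value suggests there is headroom for more pipeline workers.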
