I have a CSV file of about 1.6 GB, and I used the default configuration of Logstash and Elasticsearch to upload the data via Logstash. It took more than 15 hours; when I tried with LS_HEAP_SIZE=2gb, the data uploaded in 4 hours. I am doing a POC on a 64-bit Ubuntu machine with a dual-core CPU and 4 GB of RAM.
Can someone help me improve the upload rate? My real data will be more than 1 GB, and we can't wait that long for each upload.
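For reference, a minimal sketch of the kind of pipeline I'm running (the file path, column names, and index name here are placeholders, not my actual config):

```
input {
  file {
    path => "/path/to/data.csv"            # hypothetical path
    start_position => "beginning"
    sincedb_path => "/dev/null"            # re-read the whole file each run (POC only)
  }
}
filter {
  csv {
    separator => ","
    columns => ["col1", "col2", "col3"]    # hypothetical column names
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "csv-poc"                     # hypothetical index name
  }
}
```

I've also seen suggestions to tune the pipeline worker count and batch size at startup, e.g. something like:

```
bin/logstash -f csv.conf -w 2 -b 250
```

but I'm not sure what values make sense for a dual-core, 4 GB machine.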
Are the CPUs saturated during the hours Logstash is working on the data? What kind of filters do you have? What's the event rate? Have you looked into using the monitoring APIs introduced in Logstash 5 to analyze where the bottlenecks are?
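Assuming you're on Logstash 5.x with the monitoring API on its default port, something like this shows per-plugin event counts and time spent, which points at the slowest stage of the pipeline:

```
# Node stats API (Logstash 5.x, default port 9600)
curl -s 'localhost:9600/_node/stats/pipeline?pretty'

# Hot threads, useful when the CPUs look saturated
curl -s 'localhost:9600/_node/hot_threads?pretty'
```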