Bottleneck Data Pipeline


(Jonar B) #1

I have a CSV that receives more than 300k rows per hour. I use Filebeat to ship the data into Elasticsearch.
My problem is that shipping is very slow, only about 2k rows every 3-5 minutes, and sometimes it stops for a while.

Are there any config hacks which will make the data shipping faster?

TIA


(Christian Dahlqvist) #2

Where are you sending the data?


(Len Rugen) #3

Also, which OS is Filebeat running on, and is the CSV on a local or a shared disk?


(Jonar B) #4

Filebeat -> Logstash -> Elasticsearch


(Jonar B) #5

I'm using Filebeat on Windows. The CSV is stored on the local disk.


(Len Rugen) #6

Well, I guess the next step is to see whether the delay is in harvesting or publishing. Have you checked the Filebeat and Logstash logs? Do you have other Beats sending OK?


(Jonar B) #7

Yes. There are no errors in the Filebeat logs.


(Christian Dahlqvist) #8

What is the specification of your Elasticsearch cluster? What kind of hardware and storage are you using?

If you want to test whether Elasticsearch is limiting throughput, you can, for example, temporarily replace the Elasticsearch output in Logstash with a file output and see whether the throughput of collected data changes.
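
A minimal sketch of that test as a Logstash pipeline, assuming Filebeat ships to a `beats` input on port 5044. The port and the output path are illustrative assumptions, not details from this thread, so adjust them to your setup:

```
# Hypothetical test pipeline: same Beats input, but events are written
# to a local file instead of being indexed into Elasticsearch.
input {
  beats {
    port => 5044    # assumed port; use the one your Filebeat output points at
  }
}

output {
  file {
    # assumed path; any location the Logstash user can write to will do
    path => "/tmp/filebeat-throughput-test.log"
  }
}
```

If the row rate rises sharply with the file output, Elasticsearch is the likely limiter. If it stays at a few thousand rows per several minutes, the bottleneck is more likely in Filebeat or Logstash.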

