I was running one Logstash JDBC instance to sync 3 tables to ES every 5 minutes, but now we need to sync 8 tables, and syncing all 8 takes too long. So I tried running multiple instances of Logstash on a single server, but it doesn't help. Theoretically, shouldn't running 8 concurrently be much faster?
Is there a way to know how to configure Logstash to make it faster?
Env: Ubuntu 16.04, 4-core Intel CPU, 8 GB RAM, with Logstash and ES on the same server.
What I have tried so far:
changing jvm.options from -Xms256m -Xmx1g to -Xms512m -Xmx512m
Theoretically, shouldn't running 8 concurrently be much faster?
By default Logstash runs as many pipeline worker threads as you have CPU cores. If that's enough to saturate the CPUs then running multiple Logstash instances won't make a difference.
Is there a way to know how to configure Logstash to make it faster?
Are you saturating the CPUs? If so it's not clear that any tweaking of the settings is going to help (apart from speeding up any Logstash filters).
I have reset all Logstash settings to their defaults, but we still need to reduce the total sync time, so I tried 2 cron jobs, each with 4 tables in its queue. It looks like CPU and RAM still have headroom. Any suggestions?
Here is a screenshot of top:
top - 08:13:44 up 181 days, 9:22, 1 user, load average: 0.66, 0.73, 0.65
Tasks: 159 total, 1 running, 158 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 8076772 total, 373748 free, 3266440 used, 4436584 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 4369440 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2202 elastic+ 20 0 9893336 2.597g 689540 S 0.7 33.7 3715:24 java
397 root 20 0 3648420 588128 17896 S 0.3 7.3 0:52.17 java
538 root 20 0 3648420 583300 17908 S 0.3 7.2 0:38.48 java
580 root 20 0 41804 3624 3028 R 0.3 0.0 0:00.02 top
22571 root 20 0 0 0 0 S 0.3 0.0 0:03.60 kworker/1:2
1 root 20 0 119580 4348 2604 S 0.0 0.1 6:23.94 systemd
2 root 20 0 0 0 0 S 0.0 0.0 1:04.63 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:21.63 ksoftirqd/0
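To answer the "are you saturating the CPUs?" question from output like the above, the %Cpu(s) line can be read programmatically. This is only a rough sketch: the function names and the 90% busy threshold are my own assumptions, and it only says something useful if the snapshot was taken while a sync was actually running.

```python
import re


def parse_top_cpu_line(line):
    """Map top's CPU field codes (us, sy, id, wa, ...) to percentages."""
    fields = re.findall(r"(\d+(?:\.\d+)?)\s+([a-z]{2})\b", line)
    return {name: float(value) for value, name in fields}


def is_cpu_saturated(line, busy_threshold=90.0):
    """Treat the CPUs as saturated when the non-idle share reaches the threshold.

    busy_threshold is an arbitrary illustrative cutoff, not a Logstash setting.
    """
    cpu = parse_top_cpu_line(line)
    busy = 100.0 - cpu["id"]
    return busy >= busy_threshold


# The %Cpu(s) line copied from the top output in this thread.
sample = "%Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st"
cpu = parse_top_cpu_line(sample)
print("idle:", cpu["id"], "iowait:", cpu["wa"], "saturated:", is_cpu_saturated(sample))
```

With 99.8 id and 0.0 wa, this snapshot shows neither CPU saturation nor I/O wait, so more worker threads or more instances would not be expected to help at that moment.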
Oh, right. Currently there is no incremental data, because the client is offline.
We don't have much data: each table gets <= 3000 records a day.
If I run one instance, each table takes 50+ s, but when I run 2 instances, each table takes 2+ minutes.
(There is no incremental data, so only Logstash with the JDBC input is running.)
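To put those numbers next to the 5-minute schedule, here is a back-of-the-envelope calculation. It is a sketch under assumptions: tables are split evenly across instances, each instance works through its queue one table at a time, and the "50+ s" and "2+ minutes" figures are taken as lower bounds of 50 s and 120 s. The function name is made up for illustration.

```python
SYNC_INTERVAL_S = 5 * 60  # the sync is scheduled every 5 minutes
TABLES = 8


def total_sync_seconds(tables, seconds_per_table, instances):
    """Wall-clock time when tables are split evenly across parallel instances,
    each instance syncing its own tables sequentially."""
    tables_per_instance = -(-tables // instances)  # ceiling division
    return tables_per_instance * seconds_per_table


one_instance = total_sync_seconds(TABLES, seconds_per_table=50, instances=1)
two_instances = total_sync_seconds(TABLES, seconds_per_table=120, instances=2)

print("1 instance :", one_instance, "s")
print("2 instances:", two_instances, "s")
print("interval   :", SYNC_INTERVAL_S, "s")
```

Under these assumptions both setups overrun the 5-minute interval, and two instances are slower overall than one, which suggests the per-run cost is not dominated by the data itself.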
Should I suggest that my CTO buy more servers?
Hell no, never give up. It turns out JRuby startup needs a lot of entropy. I followed this and it fixed the slow startup.
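For anyone who wants to check whether they are hitting the same thing, here is a small diagnostic sketch. It reads the Linux kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail, which should be meaningful on older kernels such as the one in Ubuntu 16.04. The helper names and the 200-bit threshold are my own assumptions, and this only checks the symptom; it is not the fix referred to above.

```python
from pathlib import Path

# Linux-only kernel interface exposing the current entropy estimate in bits.
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def parse_entropy_avail(text):
    """Parse the contents of entropy_avail (a single integer, in bits)."""
    return int(text.strip())


def entropy_looks_low(bits, threshold=200):
    """threshold is an arbitrary illustrative cutoff, not an official value."""
    return bits < threshold


if __name__ == "__main__":
    path = Path(ENTROPY_PATH)
    if path.exists():
        bits = parse_entropy_avail(path.read_text())
        print("entropy_avail:", bits, "bits, low:", entropy_looks_low(bits))
    else:
        print("entropy_avail is not available on this system")
```

If the value stays low while Logstash is starting, slow startup caused by waiting for entropy is at least plausible.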