How to reduce collection time


I'm using ELK.

After a day of testing, I noticed a problem: the difference between the collection time and the log generation time keeps widening.

For example, the collection time and the log generation time were exactly the same at first, but after a day there was a gap of 9 seconds between the collection time and the log time.

Please suggest some typical ways to narrow the gap.
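For reference, the drift I'm describing can be measured by comparing each event's own timestamp with the time it was collected (e.g. the `@timestamp` that Filebeat/Logstash stamps on the event). A minimal sketch — the function name and timestamps are just illustrative:

```python
from datetime import datetime

def ingest_lag_seconds(log_timestamp: str, collected_at: str) -> float:
    """Return how far collection trails log generation, in seconds.

    Both arguments are ISO-8601 timestamps with a UTC offset, e.g. the
    event's original timestamp vs. the time it was collected.
    """
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    generated = datetime.strptime(log_timestamp, fmt)
    collected = datetime.strptime(collected_at, fmt)
    return (collected - generated).total_seconds()

# At first the two times match exactly...
print(ingest_lag_seconds("2019-06-01T00:00:00+0000",
                         "2019-06-01T00:00:00+0000"))  # 0.0
# ...but a day later collection trails the log time by 9 seconds.
print(ingest_lag_seconds("2019-06-02T00:00:00+0000",
                         "2019-06-02T00:00:09+0000"))  # 9.0
```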

How are you collecting the logs? What is the specification of your Elasticsearch cluster? How much data are you collecting per day?

I'm using Filebeat -> Logstash -> Amazon ES.

The Elasticsearch cluster has five nodes. (There is only one Logstash instance.)

There are about 30,000 logs a day.

If Elasticsearch is not able to keep up with indexing documents, back pressure will be applied. Logstash and Filebeat will eventually adjust and only read logs as fast as Elasticsearch can consume them. The first thing to check is therefore whether or not Elasticsearch is the bottleneck. Look for high disk I/O, high CPU, or signs of long GC pauses or merge throttling in the Elasticsearch logs. You may also be able to get this information through the cluster stats API.
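As a sketch of what to look for: the nodes stats API (`GET _nodes/stats`) exposes old-generation GC counters and merge-throttle time per node. The helper below pulls those fields out of an already-fetched per-node stats document; the sample values are made up, but the field paths follow the standard Elasticsearch node-stats layout.

```python
def indexing_pressure_signals(node_stats: dict) -> dict:
    """Extract common back-pressure indicators from one node's
    _nodes/stats entry: long/frequent old-gen GC and a growing
    merge-throttle time suggest Elasticsearch is the bottleneck."""
    jvm_gc = node_stats["jvm"]["gc"]["collectors"]
    merges = node_stats["indices"]["merges"]
    return {
        "old_gc_count": jvm_gc["old"]["collection_count"],
        "old_gc_time_ms": jvm_gc["old"]["collection_time_in_millis"],
        "merge_throttle_time_ms": merges["total_throttled_time_in_millis"],
    }

# Made-up sample shaped like one node's entry in a _nodes/stats response.
sample = {
    "jvm": {"gc": {"collectors": {"old": {"collection_count": 12,
                                          "collection_time_in_millis": 4500}}}},
    "indices": {"merges": {"total_throttled_time_in_millis": 30000}},
}
print(indexing_pressure_signals(sample))
```

If merge-throttle time or old-GC time keeps climbing between two snapshots taken a few minutes apart, that points at Elasticsearch rather than Logstash or Filebeat.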

If Elasticsearch does not appear to be the bottleneck, move up the chain from there.

What instance types are you using? What type and size of storage do you have?

How much data do these logs result in per day? How many indices and shards are they indexed into?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.